Natural Language Processing with Python


Steven Bird - 2009
    With it, you'll learn how to write Python programs that work with large collections of unstructured text. You'll access richly annotated datasets using a comprehensive range of linguistic data structures, and you'll understand the main algorithms for analyzing the content and structure of written communication.Packed with examples and exercises, Natural Language Processing with Python will help you: Extract information from unstructured text, either to guess the topic or identify "named entities" Analyze linguistic structure in text, including parsing and semantic analysis Access popular linguistic databases, including WordNet and treebanks Integrate techniques drawn from fields as diverse as linguistics and artificial intelligenceThis book will help you gain practical skills in natural language processing using the Python programming language and the Natural Language Toolkit (NLTK) open source library. If you're interested in developing web applications, analyzing multilingual news sources, or documenting endangered languages -- or if you're simply curious to have a programmer's perspective on how human language works -- you'll find Natural Language Processing with Python both fascinating and immensely useful.

Probabilistic Graphical Models: Principles and Techniques


Daphne Koller - 2009
    The framework of probabilistic graphical models, presented in this book, provides a general approach for this task. The approach is model-based, allowing interpretable models to be constructed and then manipulated by reasoning algorithms. These models can also be learned automatically from data, allowing the approach to be used in cases where manually constructing a model is difficult or even impossible. Because uncertainty is an inescapable aspect of most real-world applications, the book focuses on probabilistic models, which make the uncertainty explicit and provide models that are more faithful to reality.Probabilistic Graphical Models discusses a variety of models, spanning Bayesian networks, undirected Markov networks, discrete and continuous models, and extensions to deal with dynamical systems and relational data. For each class of models, the text describes the three fundamental cornerstones: representation, inference, and learning, presenting both basic concepts and advanced techniques. Finally, the book considers the use of the proposed framework for causal reasoning and decision making under uncertainty. The main text in each chapter provides the detailed technical development of the key ideas. Most chapters also include boxes with additional material: skill boxes, which describe techniques; case study boxes, which discuss empirical cases related to the approach described in the text, including applications in computer vision, robotics, natural language understanding, and computational biology; and concept boxes, which present significant concepts drawn from the material in the chapter. Instructors (and readers) can group chapters in various combinations, from core topics to more technically advanced material, to suit their particular needs.

Pattern Recognition and Machine Learning


Christopher M. Bishop - 2006
    However, these activities can be viewed as two facets of the same field, and together they have undergone substantial development over the past ten years. In particular, Bayesian methods have grown from a specialist niche to become mainstream, while graphical models have emerged as a general framework for describing and applying probabilistic models. Also, the practical applicability of Bayesian methods has been greatly enhanced through the development of a range of approximate inference algorithms such as variational Bayes and expectation propagation. Similarly, new models based on kernels have had a significant impact on both algorithms and applications. This new textbook reflects these recent developments while providing a comprehensive introduction to the fields of pattern recognition and machine learning. It is aimed at advanced undergraduates or first-year PhD students, as well as researchers and practitioners, and assumes no previous knowledge of pattern recognition or machine learning concepts. Knowledge of multivariate calculus and basic linear algebra is required, and some familiarity with probabilities would be helpful though not essential as the book includes a self-contained introduction to basic probability theory.

An Introduction to APIs


Brian Cooksey - 2016
    We start off easy, defining some of the tech lingo you may have heard before, but didn’t fully understand. From there, each lesson introduces something new, slowly building up to the point where you are confident about what an API is and, for the brave, could actually take a stab at using one.

The Model Thinker: What You Need to Know to Make Data Work for You


Scott E. Page - 2018
    But as anyone who has ever opened up a spreadsheet packed with seemingly infinite lines of data knows, numbers aren't enough: we need to know how to make those numbers talk. In The Model Thinker, social scientist Scott E. Page shows us the mathematical, statistical, and computational models—from linear regression to random walks and far beyond—that can turn anyone into a genius. At the core of the book is Page's "many-model paradigm," which shows the reader how to apply multiple models to organize the data, leading to wiser choices, more accurate predictions, and more robust designs. The Model Thinker provides a toolkit for business people, students, scientists, pollsters, and bloggers to make them better, clearer thinkers, able to leverage data and information to their advantage.

Pattern Classification


David G. Stork - 1973
    Now with the second edition, readers will find information on key new topics such as neural networks and statistical pattern recognition, the theory of machine learning, and the theory of invariances. Also included are worked examples, comparisons between different methods, extensive graphics, expanded exercises and computer project topics.An Instructor's Manual presenting detailed solutions to all the problems in the book is available from the Wiley editorial department.

Learning From Data: A Short Course


Yaser S. Abu-Mostafa - 2012
    Its techniques are widely applied in engineering, science, finance, and commerce. This book is designed for a short course on machine learning. It is a short course, not a hurried course. From over a decade of teaching this material, we have distilled what we believe to be the core topics that every student of the subject should know. We chose the title `learning from data' that faithfully describes what the subject is about, and made it a point to cover the topics in a story-like fashion. Our hope is that the reader can learn all the fundamentals of the subject by reading the book cover to cover. ---- Learning from data has distinct theoretical and practical tracks. In this book, we balance the theoretical and the practical, the mathematical and the heuristic. Our criterion for inclusion is relevance. Theory that establishes the conceptual framework for learning is included, and so are heuristics that impact the performance of real learning systems. ---- Learning from data is a very dynamic field. Some of the hot techniques and theories at times become just fads, and others gain traction and become part of the field. What we have emphasized in this book are the necessary fundamentals that give any student of learning from data a solid foundation, and enable him or her to venture out and explore further techniques and theories, or perhaps to contribute their own. ---- The authors are professors at California Institute of Technology (Caltech), Rensselaer Polytechnic Institute (RPI), and National Taiwan University (NTU), where this book is the main text for their popular courses on machine learning. The authors also consult extensively with financial and commercial companies on machine learning applications, and have led winning teams in machine learning competitions.

Async in C# 5.0


Alex Davies - 2012
    Along with a clear introduction to asynchronous programming, you get an in-depth look at how the async feature works and why you might want to use it in your application.Written for experienced C# programmers—yet approachable for beginners—this book is packed with code examples that you can extend for your own projects.Write your own asynchronous code, and learn how async saves you from this messy choreDiscover new performance possibilities in ASP.NET web server codeExplore how async and WinRT work together in Windows 8 applicationsLearn the importance of the await keyword in async methodsUnderstand which .NET thread is running your code—and at what points in the programUse the Task-based Asynchronous Pattern (TAP) to write asynchronous APIs in .NETTake advantage of parallel computing in modern machinesMeasure async code performance by comparing it with alternatives

Computer Age Statistical Inference: Algorithms, Evidence, and Data Science


Bradley Efron - 2016
    'Big data', 'data science', and 'machine learning' have become familiar terms in the news, as statistical methods are brought to bear upon the enormous data sets of modern science and commerce. How did we get here? And where are we going? This book takes us on an exhilarating journey through the revolution in data analysis following the introduction of electronic computation in the 1950s. Beginning with classical inferential theories - Bayesian, frequentist, Fisherian - individual chapters take up a series of influential topics: survival analysis, logistic regression, empirical Bayes, the jackknife and bootstrap, random forests, neural networks, Markov chain Monte Carlo, inference after model selection, and dozens more. The distinctly modern approach integrates methodology and algorithms with statistical inference. The book ends with speculation on the future direction of statistics and data science.

Practical Statistics for Data Scientists: 50 Essential Concepts


Peter Bruce - 2017
    Courses and books on basic statistics rarely cover the topic from a data science perspective. This practical guide explains how to apply various statistical methods to data science, tells you how to avoid their misuse, and gives you advice on what's important and what's not.Many data science resources incorporate statistical methods but lack a deeper statistical perspective. If you're familiar with the R programming language, and have some exposure to statistics, this quick reference bridges the gap in an accessible, readable format.With this book, you'll learn:Why exploratory data analysis is a key preliminary step in data scienceHow random sampling can reduce bias and yield a higher quality dataset, even with big dataHow the principles of experimental design yield definitive answers to questionsHow to use regression to estimate outcomes and detect anomaliesKey classification techniques for predicting which categories a record belongs toStatistical machine learning methods that "learn" from dataUnsupervised learning methods for extracting meaning from unlabeled data

The Art of SQL


Stephane Faroult - 2006
    Database performance has become a major headache, and most IT departments believe that developers should provide simple SQL code to solve immediate problems and let DBAs tune any bad SQL later.In The Art of SQL, author and SQL expert Stephane Faroult argues that this safe approach only leads to disaster. His insightful book, named after Art of War by Sun Tzu, contends that writing quick inefficient code is sweeping the dirt under the rug. SQL code may run for 5 to 10 years, surviving several major releases of the database management system and on several generations of hardware. The code must be fast and sound from the start, and that requires a firm understanding of SQL and relational theory.The Art of SQL offers best practices that teach experienced SQL users to focus on strategy rather than specifics. Faroult's approach takes a page from Sun Tzu's classic treatise by viewing database design as a military campaign. You need knowledge, skills, and talent. Talent can't be taught, but every strategist from Sun Tzu to modern-day generals believed that it can be nurtured through the experience of others. They passed on their experience acquired in the field through basic principles that served as guiding stars amid the sound and fury of battle. This is what Faroult does with SQL.Like a successful battle plan, good architectural choices are based on contingencies. What if the volume of this or that table increases unexpectedly? What if, following a merger, the number of users doubles? What if you want to keep several years of data online? Faroult's way of looking at SQL performance may be unconventional and unique, but he's deadly serious about writing good SQL and using SQL well. The Art of SQL is not a cookbook, listing problems and giving recipes. The aim is to get you-and your manager-to raise good questions.

An Introduction to Statistical Learning: With Applications in R


Gareth James - 2013
    This book presents some of the most important modeling and prediction techniques, along with relevant applications. Topics include linear regression, classification, resampling methods, shrinkage approaches, tree- based methods, support vector machines, clustering, and more. Color graphics and real-world examples are used to illustrate the methods presented. Since the goal of this textbook is to facilitate the use of these statistical learning techniques by practitioners in science, industry, and other fields, each chapter contains a tutorial on implementing the analyses and methods presented in R, an extremely popular open source statistical software platform. Two of the authors co-wrote The Elements of Statistical Learning (Hastie, Tibshirani and Friedman, 2nd edition 2009), a popular reference book for statistics and machine learning researchers. An Introduction to Statistical Learning covers many of the same topics, but at a level accessible to a much broader audience. This book is targeted at statisticians and non-statisticians alike who wish to use cutting-edge statistical learning techniques to analyze their data. The text assumes only a previous course in linear regression and no knowledge of matrix algebra.

Artificial Intelligence


Patrick Henry Winston - 1977
    From the book, you learn why the field is important, both as a branch of engineering and as a science. If you are a computer scientist or an engineer, you will enjoy the book, because it provides a cornucopia of new ideas for representing knowledge, using knowledge, and building practical systems. If you are a psychologist, biologist, linguist, or philosopher, you will enjoy the book because it provides an exciting computational perspective on the mystery of intelligence. The Knowledge You Need This completely rewritten and updated edition of Artificial Intelligence reflects the revolutionary progress made since the previous edition was published. Part I is about representing knowledge and about reasoning methods that make use of knowledge. The material covered includes the semantic-net family of representations, describe and match, generate and test, means-ends analysis, problem reduction, basic search, optimal search, adversarial search, rule chaining, the rete algorithm, frame inheritance, topological sorting, constraint propagation, logic, truth

Data Mining: Concepts and Techniques (The Morgan Kaufmann Series in Data Management Systems)


Jiawei Han - 2000
    Not only are all of our business, scientific, and government transactions now computerized, but the widespread use of digital cameras, publication tools, and bar codes also generate data. On the collection side, scanned text and image platforms, satellite remote sensing systems, and the World Wide Web have flooded us with a tremendous amount of data. This explosive growth has generated an even more urgent need for new techniques and automated tools that can help us transform this data into useful information and knowledge.Like the first edition, voted the most popular data mining book by KD Nuggets readers, this book explores concepts and techniques for the discovery of patterns hidden in large data sets, focusing on issues relating to their feasibility, usefulness, effectiveness, and scalability. However, since the publication of the first edition, great progress has been made in the development of new data mining methods, systems, and applications. This new edition substantially enhances the first edition, and new chapters have been added to address recent developments on mining complex types of data- including stream data, sequence data, graph structured data, social network data, and multi-relational data.A comprehensive, practical look at the concepts and techniques you need to know to get the most out of real business dataUpdates that incorporate input from readers, changes in the field, and more material on statistics and machine learningDozens of algorithms and implementation examples, all in easily understood pseudo-code and suitable for use in real-world, large-scale data mining projectsComplete classroom support for instructors at www.mkp.com/datamining2e companion site

AWS Lambda: A Guide to Serverless Microservices


Matthew Fuller - 2016
    Lambda enables users to develop code that executes in response to events - API calls, file uploads, schedules, etc - and upload it without worrying about managing traditional server metrics such as disk space, memory, or CPU usage. With its "per execution" cost model, Lambda can enable organizations to save hundreds or thousands of dollars on computing costs. With in-depth walkthroughs, large screenshots, and complete code samples, the reader is guided through the step-by-step process of creating new functions, responding to infrastructure events, developing API backends, executing code at specified intervals, and much more. Introduction to AWS Computing Evolution of the Computing Workload Lambda Background The Internals The Basics Functions Languages Resource Allocation Getting Set Up Hello World Uploading the Function Working with Events AWS Events Custom Events The Context Object Properties Methods Roles and Permissions Policies Trust Relationships Console Popups Cross Account Access Dependencies and Resources Node Modules OS Dependencies OS Resources OS Commands Logging Searching Logs Testing Your Function Lambda Console Tests Third-Party Testing Libraries Simulating Context Hello S3 Object The Bucket The Role The Code The Event The Trigger Testing When Lambda Isn’t the Answer Host Access Fine-Tuned Configuration Security Long-Running Tasks Where Lambda Excels AWS Event-Driven Tasks Scheduled Events (Cron) Offloading Heavy Processing API Endpoints Infrequently Used Services Real-World Use Cases S3 Image Processing Shutting Down Untagged Instances Triggering CodeDeploy with New S3 Uploads Processing Inbound Email Enforcing Security Policies Detecting Expiring Certificates Utilizing the AWS API Execution Environment The Code Pipeline Cold vs. Hot Execution What is Saved in Memory Scaling and Container Reuse From Development to Deployment Application Design Development Patterns Testing Deployment Monitoring Versioning and Aliasing Costs Short Executions Long-Running Processes High-Memory Applications Free Tier Calculating Pricing CloudFormation Reusable Template with Minimum Permissions Cross Account Access CloudWatch Alerts AWS API Gateway API Gateway Event Creating the Lambda Function Creating a New API, Resource, and Method Initial Configuration Mapping Templates Adding a Query String Using HTTP Request Information Within Lambda Deploying the API Additional Use Cases Lambda Competitors Iron.io StackHut WebTask.io Existing Cloud Providers The Future of Lambda More Resources Conclusion