An Introduction to Statistical Learning: With Applications in R


Gareth James - 2013
    This book presents some of the most important modeling and prediction techniques, along with relevant applications. Topics include linear regression, classification, resampling methods, shrinkage approaches, tree- based methods, support vector machines, clustering, and more. Color graphics and real-world examples are used to illustrate the methods presented. Since the goal of this textbook is to facilitate the use of these statistical learning techniques by practitioners in science, industry, and other fields, each chapter contains a tutorial on implementing the analyses and methods presented in R, an extremely popular open source statistical software platform. Two of the authors co-wrote The Elements of Statistical Learning (Hastie, Tibshirani and Friedman, 2nd edition 2009), a popular reference book for statistics and machine learning researchers. An Introduction to Statistical Learning covers many of the same topics, but at a level accessible to a much broader audience. This book is targeted at statisticians and non-statisticians alike who wish to use cutting-edge statistical learning techniques to analyze their data. The text assumes only a previous course in linear regression and no knowledge of matrix algebra.

Machine Learning: A Probabilistic Perspective


Kevin P. Murphy - 2012
    Machine learning provides these, developing methods that can automatically detect patterns in data and then use the uncovered patterns to predict future data. This textbook offers a comprehensive and self-contained introduction to the field of machine learning, based on a unified, probabilistic approach.The coverage combines breadth and depth, offering necessary background material on such topics as probability, optimization, and linear algebra as well as discussion of recent developments in the field, including conditional random fields, L1 regularization, and deep learning. The book is written in an informal, accessible style, complete with pseudo-code for the most important algorithms. All topics are copiously illustrated with color images and worked examples drawn from such application domains as biology, text processing, computer vision, and robotics. Rather than providing a cookbook of different heuristic methods, the book stresses a principled model-based approach, often using the language of graphical models to specify models in a concise and intuitive way. Almost all the models described have been implemented in a MATLAB software package—PMTK (probabilistic modeling toolkit)—that is freely available online. The book is suitable for upper-level undergraduates with an introductory-level college math background and beginning graduate students.

Superintelligence: Paths, Dangers, Strategies


Nick Bostrom - 2014
    The human brain has some capabilities that the brains of other animals lack. It is to these distinctive capabilities that our species owes its dominant position. If machine brains surpassed human brains in general intelligence, then this new superintelligence could become extremely powerful--possibly beyond our control. As the fate of the gorillas now depends more on humans than on the species itself, so would the fate of humankind depend on the actions of the machine superintelligence.But we have one advantage: we get to make the first move. Will it be possible to construct a seed Artificial Intelligence, to engineer initial conditions so as to make an intelligence explosion survivable? How could one achieve a controlled detonation?

Machine Learning for Absolute Beginners


Oliver Theobald - 2017
    The manner in which computers are now able to mimic human thinking is rapidly exceeding human capabilities in everything from chess to picking the winner of a song contest. In the age of machine learning, computers do not strictly need to receive an ‘input command’ to perform a task, but rather ‘input data’. From the input of data they are able to form their own decisions and take actions virtually as a human would. But as a machine, can consider many more scenarios and execute calculations to solve complex problems. This is the element that excites companies and budding machine learning engineers the most. The ability to solve complex problems never before attempted. This is also perhaps one reason why you are looking at purchasing this book, to gain a beginner's introduction to machine learning. This book provides a plain English introduction to the following topics: - Artificial Intelligence - Big Data - Downloading Free Datasets - Regression - Support Vector Machine Algorithms - Deep Learning/Neural Networks - Data Reduction - Clustering - Association Analysis - Decision Trees - Recommenders - Machine Learning Careers This book has recently been updated following feedback from readers. Version II now includes: - New Chapter: Decision Trees - Cleanup of minor errors

Machine Learning in Action


Peter Harrington - 2011
    "Machine learning," the process of automating tasks once considered the domain of highly-trained analysts and mathematicians, is the key to efficiently extracting useful information from this sea of raw data. Machine Learning in Action is a unique book that blends the foundational theories of machine learning with the practical realities of building tools for everyday data analysis. In it, the author uses the flexible Python programming language to show how to build programs that implement algorithms for data classification, forecasting, recommendations, and higher-level features like summarization and simplification.

Spark: The Definitive Guide: Big Data Processing Made Simple


Bill Chambers - 2018
    With an emphasis on improvements and new features in Spark 2.0, authors Bill Chambers and Matei Zaharia break down Spark topics into distinct sections, each with unique goals. You’ll explore the basic operations and common functions of Spark’s structured APIs, as well as Structured Streaming, a new high-level API for building end-to-end streaming applications. Developers and system administrators will learn the fundamentals of monitoring, tuning, and debugging Spark, and explore machine learning techniques and scenarios for employing MLlib, Spark’s scalable machine-learning library. Get a gentle overview of big data and Spark Learn about DataFrames, SQL, and Datasets—Spark’s core APIs—through worked examples Dive into Spark’s low-level APIs, RDDs, and execution of SQL and DataFrames Understand how Spark runs on a cluster Debug, monitor, and tune Spark clusters and applications Learn the power of Structured Streaming, Spark’s stream-processing engine Learn how you can apply MLlib to a variety of problems, including classification or recommendation

Applied Predictive Modeling


Max Kuhn - 2013
    Non- mathematical readers will appreciate the intuitive explanations of the techniques while an emphasis on problem-solving with real data across a wide variety of applications will aid practitioners who wish to extend their expertise. Readers should have knowledge of basic statistical ideas, such as correlation and linear regression analysis. While the text is biased against complex equations, a mathematical background is needed for advanced topics. Dr. Kuhn is a Director of Non-Clinical Statistics at Pfizer Global R&D in Groton Connecticut. He has been applying predictive models in the pharmaceutical and diagnostic industries for over 15 years and is the author of a number of R packages. Dr. Johnson has more than a decade of statistical consulting and predictive modeling experience in pharmaceutical research and development. He is a co-founder of Arbor Analytics, a firm specializing in predictive modeling and is a former Director of Statistics at Pfizer Global R&D. His scholarly work centers on the application and development of statistical methodology and learning algorithms. Applied Predictive Modeling covers the overall predictive modeling process, beginning with the crucial steps of data preprocessing, data splitting and foundations of model tuning. The text then provides intuitive explanations of numerous common and modern regression and classification techniques, always with an emphasis on illustrating and solving real data problems. Addressing practical concerns extends beyond model fitting to topics such as handling class imbalance, selecting predictors, and pinpointing causes of poor model performance-all of which are problems that occur frequently in practice. The text illustrates all parts of the modeling process through many hands-on, real-life examples. And every chapter contains extensive R code f

The Quick Python Book


Naomi R. Ceder - 2000
    This updated edition includes all the changes in Python 3, itself a significant shift from earlier versions of Python.The book begins with basic but useful programs that teach the core features of syntax, control flow, and data structures. It then moves to larger applications involving code management, object-oriented programming, web development, and converting code from earlier versions of Python.True to his audience of experienced developers, the author covers common programming language features concisely, while giving more detail to those features unique to Python.Purchase of the print book comes with an offer of a free PDF, ePub, and Kindle eBook from Manning. Also available is all code from the book.

Introduction to Information Retrieval


Christopher D. Manning - 2008
    Written from a computer science perspective by three leading experts in the field, it gives an up-to-date treatment of all aspects of the design and implementation of systems for gathering, indexing, and searching documents; methods for evaluating systems; and an introduction to the use of machine learning methods on text collections. All the important ideas are explained using examples and figures, making it perfect for introductory courses in information retrieval for advanced undergraduates and graduate students in computer science. Based on feedback from extensive classroom experience, the book has been carefully structured in order to make teaching more natural and effective. Although originally designed as the primary text for a graduate or advanced undergraduate course in information retrieval, the book will also create a buzz for researchers and professionals alike.

Grokking Deep Learning


Andrew W. Trask - 2017
    Loosely based on neuron behavior inside of human brains, these systems are rapidly catching up with the intelligence of their human creators, defeating the world champion Go player, achieving superhuman performance on video games, driving cars, translating languages, and sometimes even helping law enforcement fight crime. Deep Learning is a revolution that is changing every industry across the globe.Grokking Deep Learning is the perfect place to begin your deep learning journey. Rather than just learn the “black box” API of some library or framework, you will actually understand how to build these algorithms completely from scratch. You will understand how Deep Learning is able to learn at levels greater than humans. You will be able to understand the “brain” behind state-of-the-art Artificial Intelligence. Furthermore, unlike other courses that assume advanced knowledge of Calculus and leverage complex mathematical notation, if you’re a Python hacker who passed high-school algebra, you’re ready to go. And at the end, you’ll even build an A.I. that will learn to defeat you in a classic Atari game.

Machine Learning with R


Brett Lantz - 2014
    This practical guide that covers all of the need to know topics in a very systematic way. For each machine learning approach, each step in the process is detailed, from preparing the data for analysis to evaluating the results. These steps will build the knowledge you need to apply them to your own data science tasks.Intended for those who want to learn how to use R's machine learning capabilities and gain insight from your data. Perhaps you already know a bit about machine learning, but have never used R; or perhaps you know a little R but are new to machine learning. In either case, this book will get you up and running quickly. It would be helpful to have a bit of familiarity with basic programming concepts, but no prior experience is required.

Deep Learning with Python


François Chollet - 2017
    It is the technology behind photo tagging systems at Facebook and Google, self-driving cars, speech recognition systems on your smartphone, and much more.In particular, Deep learning excels at solving machine perception problems: understanding the content of image data, video data, or sound data. Here's a simple example: say you have a large collection of images, and that you want tags associated with each image, for example, "dog," "cat," etc. Deep learning can allow you to create a system that understands how to map such tags to images, learning only from examples. This system can then be applied to new images, automating the task of photo tagging. A deep learning model only has to be fed examples of a task to start generating useful results on new data.

Doing Math with Python


Amit Saha - 2015
    Python is easy to learn, and it's perfect for exploring topics like statistics, geometry, probability, and calculus. You’ll learn to write programs to find derivatives, solve equations graphically, manipulate algebraic expressions, even examine projectile motion.Rather than crank through tedious calculations by hand, you'll learn how to use Python functions and modules to handle the number crunching while you focus on the principles behind the math. Exercises throughout teach fundamental programming concepts, like using functions, handling user input, and reading and manipulating data. As you learn to think computationally, you'll discover new ways to explore and think about math, and gain valuable programming skills that you can use to continue your study of math and computer science.If you’re interested in math but have yet to dip into programming, you’ll find that Python makes it easy to go deeper into the subject—let Python handle the tedious work while you spend more time on the math.

The Fourth Paradigm: Data-Intensive Scientific Discovery


Tony Hey - 2009
    Increasingly, scientific breakthroughs will be powered by advanced computing capabilities that help researchers manipulate and explore massive datasets. The speed at which any given scientific discipline advances will depend on how well its researchers collaborate with one another, and with technologists, in areas of eScience such as databases, workflow management, visualization, and cloud-computing technologies. This collection of essays expands on the vision of pioneering computer scientist Jim Gray for a new, fourth paradigm of discovery based on data-intensive science and offers insights into how it can be fully realized.

Introduction to Machine Learning with Python: A Guide for Data Scientists


Andreas C. Müller - 2015
    If you use Python, even as a beginner, this book will teach you practical ways to build your own machine learning solutions. With all the data available today, machine learning applications are limited only by your imagination.You'll learn the steps necessary to create a successful machine-learning application with Python and the scikit-learn library. Authors Andreas Muller and Sarah Guido focus on the practical aspects of using machine learning algorithms, rather than the math behind them. Familiarity with the NumPy and matplotlib libraries will help you get even more from this book.With this book, you'll learn:Fundamental concepts and applications of machine learningAdvantages and shortcomings of widely used machine learning algorithmsHow to represent data processed by machine learning, including which data aspects to focus onAdvanced methods for model evaluation and parameter tuningThe concept of pipelines for chaining models and encapsulating your workflowMethods for working with text data, including text-specific processing techniquesSuggestions for improving your machine learning and data science skills