Book picks similar to
Statistics, Data Mining, and Machine Learning in Astronomy: A Practical Python Guide for the Analysis of Survey Data by Željko Ivezić
astronomy
data-science
coding
computer-science
Make Your Own Neural Network
Tariq Rashid - 2016
Neural networks are a key element of deep learning and artificial intelligence, which today is capable of some truly impressive feats. Yet too few really understand how neural networks actually work. This guide will take you on a fun and unhurried journey, starting from very simple ideas, and gradually building up an understanding of how neural networks work. You won't need any mathematics beyond secondary school, and an accessible introduction to calculus is also included. The ambition of this guide is to make neural networks as accessible as possible to as many readers as possible - there are enough texts for advanced readers already! You'll learn to code in Python and make your own neural network, teaching it to recognise human handwritten numbers, and performing as well as professionally developed networks. Part 1 is about ideas. We introduce the mathematical ideas underlying the neural networks, gently with lots of illustrations and examples. Part 2 is practical. We introduce the popular and easy to learn Python programming language, and gradually builds up a neural network which can learn to recognise human handwritten numbers, easily getting it to perform as well as networks made by professionals. Part 3 extends these ideas further. We push the performance of our neural network to an industry leading 98% using only simple ideas and code, test the network on your own handwriting, take a privileged peek inside the mysterious mind of a neural network, and even get it all working on a Raspberry Pi. All the code in this has been tested to work on a Raspberry Pi Zero.
R for Data Science: Import, Tidy, Transform, Visualize, and Model Data
Hadley Wickham - 2016
This book introduces you to R, RStudio, and the tidyverse, a collection of R packages designed to work together to make data science fast, fluent, and fun. Suitable for readers with no previous programming experience, R for Data Science is designed to get you doing data science as quickly as possible.
Authors Hadley Wickham and Garrett Grolemund guide you through the steps of importing, wrangling, exploring, and modeling your data and communicating the results. You’ll get a complete, big-picture understanding of the data science cycle, along with basic tools you need to manage the details. Each section of the book is paired with exercises to help you practice what you’ve learned along the way.
You’ll learn how to:
Wrangle—transform your datasets into a form convenient for analysis
Program—learn powerful R tools for solving data problems with greater clarity and ease
Explore—examine your data, generate hypotheses, and quickly test them
Model—provide a low-dimensional summary that captures true "signals" in your dataset
Communicate—learn R Markdown for integrating prose, code, and results
Machine Learning for Hackers
Drew Conway - 2012
Authors Drew Conway and John Myles White help you understand machine learning and statistics tools through a series of hands-on case studies, instead of a traditional math-heavy presentation.Each chapter focuses on a specific problem in machine learning, such as classification, prediction, optimization, and recommendation. Using the R programming language, you'll learn how to analyze sample datasets and write simple machine learning algorithms. "Machine Learning for Hackers" is ideal for programmers from any background, including business, government, and academic research.Develop a naive Bayesian classifier to determine if an email is spam, based only on its textUse linear regression to predict the number of page views for the top 1,000 websitesLearn optimization techniques by attempting to break a simple letter cipherCompare and contrast U.S. Senators statistically, based on their voting recordsBuild a "whom to follow" recommendation system from Twitter data
Automate This: How Algorithms Came to Rule Our World
Christopher Steiner - 2012
It used to be that to diagnose an illness, interpret legal documents, analyze foreign policy, or write a newspaper article you needed a human being with specific skills—and maybe an advanced degree or two. These days, high-level tasks are increasingly being handled by algorithms that can do precise work not only with speed but also with nuance. These “bots” started with human programming and logic, but now their reach extends beyond what their creators ever expected. In this fascinating, frightening book, Christopher Steiner tells the story of how algorithms took over—and shows why the “bot revolution” is about to spill into every aspect of our lives, often silently, without our knowledge. The May 2010 “Flash Crash” exposed Wall Street’s reliance on trading bots to the tune of a 998-point market drop and $1 trillion in vanished market value. But that was just the beginning. In Automate This, we meet bots that are driving cars, penning haiku, and writing music mistaken for Bach’s. They listen in on our customer service calls and figure out what Iran would do in the event of a nuclear standoff. There are algorithms that can pick out the most cohesive crew of astronauts for a space mission or identify the next Jeremy Lin. Some can even ingest statistics from baseball games and spit out pitch-perfect sports journalism indistinguishable from that produced by humans. The interaction of man and machine can make our lives easier. But what will the world look like when algorithms control our hospitals, our roads, our culture, and our national security? What happens to businesses when we automate judgment and eliminate human instinct? And what role will be left for doctors, lawyers, writers, truck drivers, and many others? Who knows—maybe there’s a bot learning to do your job this minute.
Deep Learning with Python
François Chollet - 2017
It is the technology behind photo tagging systems at Facebook and Google, self-driving cars, speech recognition systems on your smartphone, and much more.In particular, Deep learning excels at solving machine perception problems: understanding the content of image data, video data, or sound data. Here's a simple example: say you have a large collection of images, and that you want tags associated with each image, for example, "dog," "cat," etc. Deep learning can allow you to create a system that understands how to map such tags to images, learning only from examples. This system can then be applied to new images, automating the task of photo tagging. A deep learning model only has to be fed examples of a task to start generating useful results on new data.
Think Stats
Allen B. Downey - 2011
This concise introduction shows you how to perform statistical analysis computationally, rather than mathematically, with programs written in Python.You'll work with a case study throughout the book to help you learn the entire data analysis process—from collecting data and generating statistics to identifying patterns and testing hypotheses. Along the way, you'll become familiar with distributions, the rules of probability, visualization, and many other tools and concepts.Develop your understanding of probability and statistics by writing and testing codeRun experiments to test statistical behavior, such as generating samples from several distributionsUse simulations to understand concepts that are hard to grasp mathematicallyLearn topics not usually covered in an introductory course, such as Bayesian estimationImport data from almost any source using Python, rather than be limited to data that has been cleaned and formatted for statistics toolsUse statistical inference to answer questions about real-world data
Bayes Theorem Examples: An Intuitive Guide
Scott Hartshorn - 2016
Essentially, you are estimating a probability, but then updating that estimate based on other things that you know. This book is designed to give you an intuitive understanding of how to use Bayes Theorem. It starts with the definition of what Bayes Theorem is, but the focus of the book is on providing examples that you can follow and duplicate. Most of the examples are calculated in Excel, which is useful for updating probability if you have dozens or hundreds of data points to roll in.
Data Science for Business: What you need to know about data mining and data-analytic thinking
Foster Provost - 2013
This guide also helps you understand the many data-mining techniques in use today.Based on an MBA course Provost has taught at New York University over the past ten years, Data Science for Business provides examples of real-world business problems to illustrate these principles. You’ll not only learn how to improve communication between business stakeholders and data scientists, but also how participate intelligently in your company’s data science projects. You’ll also discover how to think data-analytically, and fully appreciate how data science methods can support business decision-making.Understand how data science fits in your organization—and how you can use it for competitive advantageTreat data as a business asset that requires careful investment if you’re to gain real valueApproach business problems data-analytically, using the data-mining process to gather good data in the most appropriate wayLearn general concepts for actually extracting knowledge from dataApply data science principles when interviewing data science job candidates
Causality: Models, Reasoning, and Inference
Judea Pearl - 2000
It shows how causality has grown from a nebulous concept into a mathematical theory with significant applications in the fields of statistics, artificial intelligence, philosophy, cognitive science, and the health and social sciences. Pearl presents a unified account of the probabilistic, manipulative, counterfactual and structural approaches to causation, and devises simple mathematical tools for analyzing the relationships between causal connections, statistical associations, actions and observations. The book will open the way for including causal analysis in the standard curriculum of statistics, artifical intelligence, business, epidemiology, social science and economics. Students in these areas will find natural models, simple identification procedures, and precise mathematical definitions of causal concepts that traditional texts have tended to evade or make unduly complicated. This book will be of interest to professionals and students in a wide variety of fields. Anyone who wishes to elucidate meaningful relationships from data, predict effects of actions and policies, assess explanations of reported events, or form theories of causal understanding and causal speech will find this book stimulating and invaluable. Professor of Computer Science at the UCLA, Judea Pearl is the winner of the 2008 Benjamin Franklin Award in Computers and Cognitive Science.
An Introduction to Statistical Learning: With Applications in R
Gareth James - 2013
This book presents some of the most important modeling and prediction techniques, along with relevant applications. Topics include linear regression, classification, resampling methods, shrinkage approaches, tree- based methods, support vector machines, clustering, and more. Color graphics and real-world examples are used to illustrate the methods presented. Since the goal of this textbook is to facilitate the use of these statistical learning techniques by practitioners in science, industry, and other fields, each chapter contains a tutorial on implementing the analyses and methods presented in R, an extremely popular open source statistical software platform. Two of the authors co-wrote The Elements of Statistical Learning (Hastie, Tibshirani and Friedman, 2nd edition 2009), a popular reference book for statistics and machine learning researchers. An Introduction to Statistical Learning covers many of the same topics, but at a level accessible to a much broader audience. This book is targeted at statisticians and non-statisticians alike who wish to use cutting-edge statistical learning techniques to analyze their data. The text assumes only a previous course in linear regression and no knowledge of matrix algebra.
Data Analysis with Open Source Tools: A Hands-On Guide for Programmers and Data Scientists
Philipp K. Janert - 2010
With this insightful book, intermediate to experienced programmers interested in data analysis will learn techniques for working with data in a business environment. You'll learn how to look at data to discover what it contains, how to capture those ideas in conceptual models, and then feed your understanding back into the organization through business plans, metrics dashboards, and other applications.Along the way, you'll experiment with concepts through hands-on workshops at the end of each chapter. Above all, you'll learn how to think about the results you want to achieve -- rather than rely on tools to think for you.Use graphics to describe data with one, two, or dozens of variablesDevelop conceptual models using back-of-the-envelope calculations, as well asscaling and probability argumentsMine data with computationally intensive methods such as simulation and clusteringMake your conclusions understandable through reports, dashboards, and other metrics programsUnderstand financial calculations, including the time-value of moneyUse dimensionality reduction techniques or predictive analytics to conquer challenging data analysis situationsBecome familiar with different open source programming environments for data analysisFinally, a concise reference for understanding how to conquer piles of data.--Austin King, Senior Web Developer, MozillaAn indispensable text for aspiring data scientists.--Michael E. Driscoll, CEO/Founder, Dataspora
The Master Algorithm: How the Quest for the Ultimate Learning Machine Will Remake Our World
Pedro Domingos - 2015
In The Master Algorithm, Pedro Domingos lifts the veil to give us a peek inside the learning machines that power Google, Amazon, and your smartphone. He assembles a blueprint for the future universal learner--the Master Algorithm--and discusses what it will mean for business, science, and society. If data-ism is today's philosophy, this book is its bible.
Machine Learning
Tom M. Mitchell - 1986
Mitchell covers the field of machine learning, the study of algorithms that allow computer programs to automatically improve through experience and that automatically infer general laws from specific data.
Learning From Data: A Short Course
Yaser S. Abu-Mostafa - 2012
Its techniques are widely applied in engineering, science, finance, and commerce. This book is designed for a short course on machine learning. It is a short course, not a hurried course. From over a decade of teaching this material, we have distilled what we believe to be the core topics that every student of the subject should know. We chose the title `learning from data' that faithfully describes what the subject is about, and made it a point to cover the topics in a story-like fashion. Our hope is that the reader can learn all the fundamentals of the subject by reading the book cover to cover. ---- Learning from data has distinct theoretical and practical tracks. In this book, we balance the theoretical and the practical, the mathematical and the heuristic. Our criterion for inclusion is relevance. Theory that establishes the conceptual framework for learning is included, and so are heuristics that impact the performance of real learning systems. ---- Learning from data is a very dynamic field. Some of the hot techniques and theories at times become just fads, and others gain traction and become part of the field. What we have emphasized in this book are the necessary fundamentals that give any student of learning from data a solid foundation, and enable him or her to venture out and explore further techniques and theories, or perhaps to contribute their own. ---- The authors are professors at California Institute of Technology (Caltech), Rensselaer Polytechnic Institute (RPI), and National Taiwan University (NTU), where this book is the main text for their popular courses on machine learning. The authors also consult extensively with financial and commercial companies on machine learning applications, and have led winning teams in machine learning competitions.
All of Statistics: A Concise Course in Statistical Inference
Larry Wasserman - 2003
But in spirit, the title is apt, as the book does cover a much broader range of topics than a typical introductory book on mathematical statistics. This book is for people who want to learn probability and statistics quickly. It is suitable for graduate or advanced undergraduate students in computer science, mathematics, statistics, and related disciplines. The book includes modern topics like nonparametric curve estimation, bootstrapping, and clas- sification, topics that are usually relegated to follow-up courses. The reader is presumed to know calculus and a little linear algebra. No previous knowledge of probability and statistics is required. Statistics, data mining, and machine learning are all concerned with collecting and analyzing data. For some time, statistics research was con- ducted in statistics departments while data mining and machine learning re- search was conducted in computer science departments. Statisticians thought that computer scientists were reinventing the wheel. Computer scientists thought that statistical theory didn't apply to their problems. Things are changing. Statisticians now recognize that computer scientists are making novel contributions while computer scientists now recognize the generality of statistical theory and methodology. Clever data mining algo- rithms are more scalable than statisticians ever thought possible. Formal sta- tistical theory is more pervasive than computer scientists had realized.