Practical Statistics for Data Scientists: 50 Essential Concepts


Peter Bruce - 2017
    Courses and books on basic statistics rarely cover the topic from a data science perspective. This practical guide explains how to apply various statistical methods to data science, tells you how to avoid their misuse, and gives you advice on what's important and what's not.Many data science resources incorporate statistical methods but lack a deeper statistical perspective. If you're familiar with the R programming language, and have some exposure to statistics, this quick reference bridges the gap in an accessible, readable format.With this book, you'll learn:Why exploratory data analysis is a key preliminary step in data scienceHow random sampling can reduce bias and yield a higher quality dataset, even with big dataHow the principles of experimental design yield definitive answers to questionsHow to use regression to estimate outcomes and detect anomaliesKey classification techniques for predicting which categories a record belongs toStatistical machine learning methods that "learn" from dataUnsupervised learning methods for extracting meaning from unlabeled data

Statistics in a Nutshell: A Desktop Quick Reference


Sarah Boslaugh - 2008
    This book gives you a solid understanding of statistics without being too simple, yet without the numbing complexity of most college texts. You get a firm grasp of the fundamentals and a hands-on understanding of how to apply them before moving on to the more advanced material that follows. Each chapter presents you with easy-to-follow descriptions illustrated by graphics, formulas, and plenty of solved examples. Before you know it, you'll learn to apply statistical reasoning and statistical techniques, from basic concepts of probability and hypothesis testing to multivariate analysis. Organized into four distinct sections, Statistics in a Nutshell offers you:Introductory material: Different ways to think about statistics Basic concepts of measurement and probability theoryData management for statistical analysis Research design and experimental design How to critique statistics presented by others Basic inferential statistics: Basic concepts of inferential statistics The concept of correlation, when it is and is not an appropriate measure of association Dichotomous and categorical data The distinction between parametric and nonparametric statistics Advanced inferential techniques: The General Linear Model Analysis of Variance (ANOVA) and MANOVA Multiple linear regression Specialized techniques: Business and quality improvement statistics Medical and public health statistics Educational and psychological statistics Unlike many introductory books on the subject, Statistics in a Nutshell doesn't omit important material in an effort to dumb it down. And this book is far more practical than most college texts, which tend to over-emphasize calculation without teaching you when and how to apply different statistical tests. With Statistics in a Nutshell, you learn how to perform most common statistical analyses, and understand statistical techniques presented in research articles. If you need to know how to use a wide range of statistical techniques without getting in over your head, this is the book you want.

The Signal and the Noise: Why So Many Predictions Fail—But Some Don't


Nate Silver - 2012
    He solidified his standing as the nation's foremost political forecaster with his near perfect prediction of the 2012 election. Silver is the founder and editor in chief of FiveThirtyEight.com. Drawing on his own groundbreaking work, Silver examines the world of prediction, investigating how we can distinguish a true signal from a universe of noisy data. Most predictions fail, often at great cost to society, because most of us have a poor understanding of probability and uncertainty. Both experts and laypeople mistake more confident predictions for more accurate ones. But overconfidence is often the reason for failure. If our appreciation of uncertainty improves, our predictions can get better too. This is the "prediction paradox": The more humility we have about our ability to make predictions, the more successful we can be in planning for the future.In keeping with his own aim to seek truth from data, Silver visits the most successful forecasters in a range of areas, from hurricanes to baseball, from the poker table to the stock market, from Capitol Hill to the NBA. He explains and evaluates how these forecasters think and what bonds they share. What lies behind their success? Are they good-or just lucky? What patterns have they unraveled? And are their forecasts really right? He explores unanticipated commonalities and exposes unexpected juxtapositions. And sometimes, it is not so much how good a prediction is in an absolute sense that matters but how good it is relative to the competition. In other cases, prediction is still a very rudimentary-and dangerous-science.Silver observes that the most accurate forecasters tend to have a superior command of probability, and they tend to be both humble and hardworking. They distinguish the predictable from the unpredictable, and they notice a thousand little details that lead them closer to the truth. Because of their appreciation of probability, they can distinguish the signal from the noise.

Dear Data


Giorgia Lupi - 2016
    The result is described as “a thought-provoking visual feast”.

Probability And Statistics For Engineers And Scientists


Ronald E. Walpole - 1978
     Offers extensively updated coverage, new problem sets, and chapter-ending material to enhance the book’s relevance to today’s engineers and scientists. Includes new problem sets demonstrating updated applications to engineering as well as biological, physical, and computer science. Emphasizes key ideas as well as the risks and hazards associated with practical application of the material. Includes new material on topics including: difference between discrete and continuous measurements; binary data; quartiles; importance of experimental design; “dummy” variables; rules for expectations and variances of linear functions; Poisson distribution; Weibull and lognormal distributions; central limit theorem, and data plotting. Introduces Bayesian statistics, including its applications to many fields. For those interested in learning more about probability and statistics.

Reinforcement Learning: An Introduction


Richard S. Sutton - 1998
    Their discussion ranges from the history of the field's intellectual foundations to the most recent developments and applications.Reinforcement learning, one of the most active research areas in artificial intelligence, is a computational approach to learning whereby an agent tries to maximize the total amount of reward it receives when interacting with a complex, uncertain environment. In Reinforcement Learning, Richard Sutton and Andrew Barto provide a clear and simple account of the key ideas and algorithms of reinforcement learning. Their discussion ranges from the history of the field's intellectual foundations to the most recent developments and applications. The only necessary mathematical background is familiarity with elementary concepts of probability.The book is divided into three parts. Part I defines the reinforcement learning problem in terms of Markov decision processes. Part II provides basic solution methods: dynamic programming, Monte Carlo methods, and temporal-difference learning. Part III presents a unified view of the solution methods and incorporates artificial neural networks, eligibility traces, and planning; the two final chapters present case studies and consider the future of reinforcement learning.

Science and Method


Henri Poincaré - 1908
    Drawing on examples from many fields, it explains how scientists analyze and choose their working facts, and it explores the nature of experimentation, theory, and the mind. 1914 edition.

Mathematical Statistics with Applications (Mathematical Statistics (W/ Applications))


Dennis D. Wackerly - 1995
    Premiere authors Dennis Wackerly, William Mendenhall, and Richard L. Scheaffer present a solid foundation in statistical theory while conveying the relevance and importance of the theory in solving practical problems in the real world. The authors' use of practical applications and excellent exercises helps readers discover the nature of statistics and understand its essential role in scientific research.

Applied Multiple Regression/Correlation Analysis for the Behavioral Sciences


Jacob Cohen - 1975
    Readers profit from its verbal-conceptual exposition and frequent use of examples.The applied emphasis provides clear illustrations of the principles and provides worked examples of the types of applications that are possible. Researchers learn how to specify regression models that directly address their research questions. An overview of the fundamental ideas of multiple regression and a review of bivariate correlation and regression and other elementary statistical concepts provide a strong foundation for understanding the rest of the text. The third edition features an increased emphasis on graphics and the use of confidence intervals and effect size measures, and an accompanying website with data for most of the numerical examples along with the computer code for SPSS, SAS, and SYSTAT, at www.psypress.com/9780805822236 .Applied Multiple Regression serves as both a textbook for graduate students and as a reference tool for researchers in psychology, education, health sciences, communications, business, sociology, political science, anthropology, and economics. An introductory knowledge of statistics is required. Self-standing chapters minimize the need for researchers to refer to previous chapters.

Theoretical Neuroscience: Computational and Mathematical Modeling of Neural Systems


Peter Dayan - 2001
    This text introduces the basic mathematical and computational methods of theoretical neuroscience and presents applications in a variety of areas including vision, sensory-motor integration, development, learning, and memory.The book is divided into three parts. Part I discusses the relationship between sensory stimuli and neural responses, focusing on the representation of information by the spiking activity of neurons. Part II discusses the modeling of neurons and neural circuits on the basis of cellular and synaptic biophysics. Part III analyzes the role of plasticity in development and learning. An appendix covers the mathematical methods used, and exercises are available on the book's Web site.

Probability and Statistics


Morris H. DeGroot - 1975
    Other new features include a chapter on simulation, a section on Gibbs sampling, what you should know boxes at the end of each chapter, and remarks to highlight difficult concepts.

Machine Learning for Hackers


Drew Conway - 2012
    Authors Drew Conway and John Myles White help you understand machine learning and statistics tools through a series of hands-on case studies, instead of a traditional math-heavy presentation.Each chapter focuses on a specific problem in machine learning, such as classification, prediction, optimization, and recommendation. Using the R programming language, you'll learn how to analyze sample datasets and write simple machine learning algorithms. "Machine Learning for Hackers" is ideal for programmers from any background, including business, government, and academic research.Develop a naive Bayesian classifier to determine if an email is spam, based only on its textUse linear regression to predict the number of page views for the top 1,000 websitesLearn optimization techniques by attempting to break a simple letter cipherCompare and contrast U.S. Senators statistically, based on their voting recordsBuild a "whom to follow" recommendation system from Twitter data

Mathematical Statistics and Data Analysis


John A. Rice - 1988
    The book's approach interweaves traditional topics with data analysis and reflects the use of the computer with close ties to the practice of statistics. The author stresses analysis of data, examines real problems with real data, and motivates the theory. The book's descriptive statistics, graphical displays, and realistic applications stand in strong contrast to traditional texts which are set in abstract settings.

Real Analysis


H.L. Royden - 1963
    Dealing with measure theory and Lebesque integration, this is an introductory graduate text.

Abstract Algebra


I.N. Herstein - 1986
    Providing a concise introduction to abstract algebra, this work unfolds some of the fundamental systems with the aim of reaching applicable, significant results.