Statistics Done Wrong: The Woefully Complete Guide


Alex Reinhart - 2013
    Politicians and marketers present shoddy evidence for dubious claims all the time. But smart people make mistakes too, and when it comes to statistics, plenty of otherwise great scientists--yes, even those published in peer-reviewed journals--are doing statistics wrong."Statistics Done Wrong" comes to the rescue with cautionary tales of all-too-common statistical fallacies. It'll help you see where and why researchers often go wrong and teach you the best practices for avoiding their mistakes.In this book, you'll learn: - Why "statistically significant" doesn't necessarily imply practical significance- Ideas behind hypothesis testing and regression analysis, and common misinterpretations of those ideas- How and how not to ask questions, design experiments, and work with data- Why many studies have too little data to detect what they're looking for-and, surprisingly, why this means published results are often overestimates- Why false positives are much more common than "significant at the 5% level" would suggestBy walking through colorful examples of statistics gone awry, the book offers approachable lessons on proper methodology, and each chapter ends with pro tips for practicing scientists and statisticians. No matter what your level of experience, "Statistics Done Wrong" will teach you how to be a better analyst, data scientist, or researcher.

Bayesian Data Analysis


Andrew Gelman - 1995
    Its world-class authors provide guidance on all aspects of Bayesian data analysis and include examples of real statistical analyses, based on their own research, that demonstrate how to solve complicated problems. Changes in the new edition include:Stronger focus on MCMC Revision of the computational advice in Part III New chapters on nonlinear models and decision analysis Several additional applied examples from the authors' recent research Additional chapters on current models for Bayesian data analysis such as nonlinear models, generalized linear mixed models, and more Reorganization of chapters 6 and 7 on model checking and data collectionBayesian computation is currently at a stage where there are many reasonable ways to compute any given posterior distribution. However, the best approach is not always clear ahead of time. Reflecting this, the new edition offers a more pluralistic presentation, giving advice on performing computations from many perspectives while making clear the importance of being aware that there are different ways to implement any given iterative simulation computation. The new approach, additional examples, and updated information make Bayesian Data Analysis an excellent introductory text and a reference that working scientists will use throughout their professional life.

Applied Multivariate Statistical Analysis


Richard A. Johnson - 1982
    of Wisconsin-Madison) and Wichern (Texas A&M U.) present the newest edition of this college text on the statistical methods for describing and analyzing multivariate data, designed for students who have taken two or more statistics courses. The fifth edition includes the addition of seve

Statistical Inference


George Casella - 2001
    Starting from the basics of probability, the authors develop the theory of statistical inference using techniques, definitions, and concepts that are statistical and are natural extensions and consequences of previous concepts. This book can be used for readers who have a solid mathematics background. It can also be used in a way that stresses the more practical uses of statistical theory, being more concerned with understanding basic statistical concepts and deriving reasonable statistical procedures for a variety of situations, and less concerned with formal optimality investigations.

Probability Theory: The Logic of Science


E.T. Jaynes - 1999
    It discusses new results, along with applications of probability theory to a variety of problems. The book contains many exercises and is suitable for use as a textbook on graduate-level courses involving data analysis. Aimed at readers already familiar with applied mathematics at an advanced undergraduate level or higher, it is of interest to scientists concerned with inference from incomplete information.

Think Bayes


Allen B. Downey - 2012
    

An Introduction to Statistical Learning: With Applications in R


Gareth James - 2013
    This book presents some of the most important modeling and prediction techniques, along with relevant applications. Topics include linear regression, classification, resampling methods, shrinkage approaches, tree- based methods, support vector machines, clustering, and more. Color graphics and real-world examples are used to illustrate the methods presented. Since the goal of this textbook is to facilitate the use of these statistical learning techniques by practitioners in science, industry, and other fields, each chapter contains a tutorial on implementing the analyses and methods presented in R, an extremely popular open source statistical software platform. Two of the authors co-wrote The Elements of Statistical Learning (Hastie, Tibshirani and Friedman, 2nd edition 2009), a popular reference book for statistics and machine learning researchers. An Introduction to Statistical Learning covers many of the same topics, but at a level accessible to a much broader audience. This book is targeted at statisticians and non-statisticians alike who wish to use cutting-edge statistical learning techniques to analyze their data. The text assumes only a previous course in linear regression and no knowledge of matrix algebra.

Bayesian Statistics the Fun Way: Understanding Statistics and Probability with Star Wars, Lego, and Rubber Ducks


Will Kurt - 2019
    But many people use data in ways they don't even understand, meaning they aren't getting the most from it. Bayesian Statistics the Fun Way will change that.This book will give you a complete understanding of Bayesian statistics through simple explanations and un-boring examples. Find out the probability of UFOs landing in your garden, how likely Han Solo is to survive a flight through an asteroid shower, how to win an argument about conspiracy theories, and whether a burglary really was a burglary, to name a few examples.By using these off-the-beaten-track examples, the author actually makes learning statistics fun. And you'll learn real skills, like how to:- How to measure your own level of uncertainty in a conclusion or belief- Calculate Bayes theorem and understand what it's useful for- Find the posterior, likelihood, and prior to check the accuracy of your conclusions- Calculate distributions to see the range of your data- Compare hypotheses and draw reliable conclusions from themNext time you find yourself with a sheaf of survey results and no idea what to do with them, turn to Bayesian Statistics the Fun Way to get the most value from your data.

The Cartoon Guide to Statistics


Larry Gonick - 1993
    Never again will you order the Poisson Distribution in a French restaurant!This updated version features all new material.

The Elements of Statistical Learning: Data Mining, Inference, and Prediction


Trevor Hastie - 2001
    With it has come vast amounts of data in a variety of fields such as medicine, biology, finance, and marketing. The challenge of understanding these data has led to the development of new tools in the field of statistics, and spawned new areas such as data mining, machine learning, and bioinformatics. Many of these tools have common underpinnings but are often expressed with different terminology. This book describes the important ideas in these areas in a common conceptual framework. While the approach is statistical, the emphasis is on concepts rather than mathematics. Many examples are given, with a liberal use of color graphics. It should be a valuable resource for statisticians and anyone interested in data mining in science or industry. The book's coverage is broad, from supervised learning (prediction) to unsupervised learning. The many topics include neural networks, support vector machines, classification trees and boosting—the first comprehensive treatment of this topic in any book. Trevor Hastie, Robert Tibshirani, and Jerome Friedman are professors of statistics at Stanford University. They are prominent researchers in this area: Hastie and Tibshirani developed generalized additive models and wrote a popular book of that title. Hastie wrote much of the statistical modeling software in S-PLUS and invented principal curves and surfaces. Tibshirani proposed the Lasso and is co-author of the very successful An Introduction to the Bootstrap. Friedman is the co-inventor of many data-mining tools including CART, MARS, and projection pursuit.

Information Theory, Inference and Learning Algorithms


David J.C. MacKay - 2002
    These topics lie at the heart of many exciting areas of contemporary science and engineering - communication, signal processing, data mining, machine learning, pattern recognition, computational neuroscience, bioinformatics, and cryptography. This textbook introduces theory in tandem with applications. Information theory is taught alongside practical communication systems, such as arithmetic coding for data compression and sparse-graph codes for error-correction. A toolbox of inference techniques, including message-passing algorithms, Monte Carlo methods, and variational approximations, are developed alongside applications of these tools to clustering, convolutional codes, independent component analysis, and neural networks. The final part of the book describes the state of the art in error-correcting codes, including low-density parity-check codes, turbo codes, and digital fountain codes -- the twenty-first century standards for satellite communications, disk drives, and data broadcast. Richly illustrated, filled with worked examples and over 400 exercises, some with detailed solutions, David MacKay's groundbreaking book is ideal for self-learning and for undergraduate or graduate courses. Interludes on crosswords, evolution, and sex provide entertainment along the way. In sum, this is a textbook on information, communication, and coding for a new generation of students, and an unparalleled entry point into these subjects for professionals in areas as diverse as computational biology, financial engineering, and machine learning.

Linear Algebra Done Right


Sheldon Axler - 1995
    The novel approach taken here banishes determinants to the end of the book and focuses on the central goal of linear algebra: understanding the structure of linear operators on vector spaces. The author has taken unusual care to motivate concepts and to simplify proofs. For example, the book presents - without having defined determinants - a clean proof that every linear operator on a finite-dimensional complex vector space (or an odd-dimensional real vector space) has an eigenvalue. A variety of interesting exercises in each chapter helps students understand and manipulate the objects of linear algebra. This second edition includes a new section on orthogonal projections and minimization problems. The sections on self-adjoint operators, normal operators, and the spectral theorem have been rewritten. New examples and new exercises have been added, several proofs have been simplified, and hundreds of minor improvements have been made throughout the text.

Deep Learning


Ian Goodfellow - 2016
    Because the computer gathers knowledge from experience, there is no need for a human computer operator to formally specify all the knowledge that the computer needs. The hierarchy of concepts allows the computer to learn complicated concepts by building them out of simpler ones; a graph of these hierarchies would be many layers deep. This book introduces a broad range of topics in deep learning.The text offers mathematical and conceptual background, covering relevant concepts in linear algebra, probability theory and information theory, numerical computation, and machine learning. It describes deep learning techniques used by practitioners in industry, including deep feedforward networks, regularization, optimization algorithms, convolutional networks, sequence modeling, and practical methodology; and it surveys such applications as natural language processing, speech recognition, computer vision, online recommendation systems, bioinformatics, and videogames. Finally, the book offers research perspectives, covering such theoretical topics as linear factor models, autoencoders, representation learning, structured probabilistic models, Monte Carlo methods, the partition function, approximate inference, and deep generative models.Deep Learning can be used by undergraduate or graduate students planning careers in either industry or research, and by software engineers who want to begin using deep learning in their products or platforms. A website offers supplementary material for both readers and instructors.

Data Science for Business: What you need to know about data mining and data-analytic thinking


Foster Provost - 2013
    This guide also helps you understand the many data-mining techniques in use today.Based on an MBA course Provost has taught at New York University over the past ten years, Data Science for Business provides examples of real-world business problems to illustrate these principles. You’ll not only learn how to improve communication between business stakeholders and data scientists, but also how participate intelligently in your company’s data science projects. You’ll also discover how to think data-analytically, and fully appreciate how data science methods can support business decision-making.Understand how data science fits in your organization—and how you can use it for competitive advantageTreat data as a business asset that requires careful investment if you’re to gain real valueApproach business problems data-analytically, using the data-mining process to gather good data in the most appropriate wayLearn general concepts for actually extracting knowledge from dataApply data science principles when interviewing data science job candidates

The Hundred-Page Machine Learning Book


Andriy Burkov - 2019
    During that week, you will learn almost everything modern machine learning has to offer. The author and other practitioners have spent years learning these concepts.Companion wiki — the book has a continuously updated wiki that extends some book chapters with additional information: Q&A, code snippets, further reading, tools, and other relevant resources.Flexible price and formats — choose from a variety of formats and price options: Kindle, hardcover, paperback, EPUB, PDF. If you buy an EPUB or a PDF, you decide the price you pay!Read first, buy later — download book chapters for free, read them and share with your friends and colleagues. Only if you liked the book or found it useful in your work, study or business, then buy it.