An Introduction to Statistical Learning: With Applications in R


Gareth James - 2013
    This book presents some of the most important modeling and prediction techniques, along with relevant applications. Topics include linear regression, classification, resampling methods, shrinkage approaches, tree-based methods, support vector machines, clustering, and more. Color graphics and real-world examples are used to illustrate the methods presented. Since the goal of this textbook is to facilitate the use of these statistical learning techniques by practitioners in science, industry, and other fields, each chapter contains a tutorial on implementing the analyses and methods presented in R, an extremely popular open source statistical software platform. Two of the authors co-wrote The Elements of Statistical Learning (Hastie, Tibshirani and Friedman, 2nd edition 2009), a popular reference book for statistics and machine learning researchers. An Introduction to Statistical Learning covers many of the same topics, but at a level accessible to a much broader audience. This book is targeted at statisticians and non-statisticians alike who wish to use cutting-edge statistical learning techniques to analyze their data. The text assumes only a previous course in linear regression and no knowledge of matrix algebra.

Statistical Inference


George Casella - 2001
    Starting from the basics of probability, the authors develop the theory of statistical inference using techniques, definitions, and concepts that are statistical and are natural extensions and consequences of previous concepts. This book can be used for readers who have a solid mathematics background. It can also be used in a way that stresses the more practical uses of statistical theory, being more concerned with understanding basic statistical concepts and deriving reasonable statistical procedures for a variety of situations, and less concerned with formal optimality investigations.

Mostly Harmless Econometrics: An Empiricist's Companion


Joshua D. Angrist - 2008
    In the modern experimentalist paradigm, these techniques address clear causal questions such as: Do smaller classes increase learning? Should wife batterers be arrested? How much does education raise wages? Mostly Harmless Econometrics shows how the basic tools of applied econometrics allow the data to speak. In addition to econometric essentials, Mostly Harmless Econometrics covers important new extensions--regression-discontinuity designs and quantile regression--as well as how to get standard errors right. Joshua Angrist and Jorn-Steffen Pischke explain why fancier econometric techniques are typically unnecessary and even dangerous. The applied econometric methods emphasized in this book are easy to use and relevant for many areas of contemporary social science.
    - An irreverent review of econometric essentials
    - A focus on tools that applied researchers use most
    - Chapters on regression-discontinuity designs, quantile regression, and standard errors
    - Many empirical examples
    - A clear and concise resource with wide applications

R Cookbook: Proven Recipes for Data Analysis, Statistics, and Graphics


Paul Teetor - 2011
    The R language provides everything you need to do statistical work, but its structure can be difficult to master. This collection of concise, task-oriented recipes makes you productive with R immediately, with solutions ranging from basic tasks to input and output, general statistics, graphics, and linear regression. Each recipe addresses a specific problem, with a discussion that explains the solution and offers insight into how it works. If you're a beginner, R Cookbook will help get you started. If you're an experienced data programmer, it will jog your memory and expand your horizons. You'll get the job done faster and learn more about R in the process.
    - Create vectors, handle variables, and perform other basic functions
    - Input and output data
    - Tackle data structures such as matrices, lists, factors, and data frames
    - Work with probability, probability distributions, and random variables
    - Calculate statistics and confidence intervals, and perform statistical tests
    - Create a variety of graphic displays
    - Build statistical models with linear regressions and analysis of variance (ANOVA)
    - Explore advanced statistical techniques, such as finding clusters in your data
    "Wonderfully readable, R Cookbook serves not only as a solutions manual of sorts, but as a truly enjoyable way to explore the R language--one practical example at a time." --Jeffrey Ryan, software consultant and R package author

More Letters From The Pit: Stories of a Physician’s Odyssey in Emergency Medicine


Patrick J. Crocker - 2020
    

The Cult of Statistical Significance: How the Standard Error Costs Us Jobs, Justice, and Lives


Stephen Thomas Ziliak - 2008
    If it takes a book to get it across, I hope this book will do it. It ought to.”—Thomas Schelling, Distinguished University Professor, School of Public Policy, University of Maryland, and 2005 Nobel Prize Laureate in Economics “With humor, insight, piercing logic and a nod to history, Ziliak and McCloskey show how economists—and other scientists—suffer from a mass delusion about statistical analysis. The quest for statistical significance that pervades science today is a deeply flawed substitute for thoughtful analysis. . . . Yet few participants in the scientific bureaucracy have been willing to admit what Ziliak and McCloskey make clear: the emperor has no clothes.”—Kenneth Rothman, Professor of Epidemiology, Boston University School of Health The Cult of Statistical Significance shows, field by field, how “statistical significance,” a technique that dominates many sciences, has been a huge mistake. The authors find that researchers in a broad spectrum of fields, from agronomy to zoology, employ “testing” that doesn’t test and “estimating” that doesn’t estimate. The facts will startle the outside reader: how could a group of brilliant scientists wander so far from scientific magnitudes? This study will encourage scientists who want to know how to get the statistical sciences back on track and fulfill their quantitative promise. The book shows for the first time how wide the disaster is, and how bad for science, and it traces the problem to its historical, sociological, and philosophical roots. Stephen T. Ziliak is the author or editor of many articles and two books. He currently lives in Chicago, where he is Professor of Economics at Roosevelt University. Deirdre N. McCloskey, Distinguished Professor of Economics, History, English, and Communication at the University of Illinois at Chicago, is the author of twenty books and three hundred scholarly articles. She has held Guggenheim and National Humanities Fellowships. 
She is best known for How to Be Human* Though an Economist (University of Michigan Press, 2000) and her most recent book, The Bourgeois Virtues: Ethics for an Age of Commerce (2006).

Project Puffin: The Improbable Quest to Bring a Beloved Seabird Back to Egg Rock


Stephen W. Kress - 1997
    As a young ornithology instructor at the Hog Island Audubon Camp, Dr. Stephen W. Kress learned that puffins had nested on nearby islands until extirpated by hunters in the late 1800s. To right this environmental wrong, he resolved to bring puffins back to one such island—Eastern Egg Rock. Yet bringing the plan to reality meant convincing skeptics, finding resources, and inventing restoration methods at a time when many believed in “letting nature take its course.” Today, Project Puffin has restored more than 1,000 puffin pairs to three Maine islands. But even more exciting, techniques developed during the project have helped to restore rare and endangered seabirds worldwide. Further, reestablished puffins now serve as a window into the effects of global warming. The success of Dr. Kress’s project offers hope that people can restore lost wildlife populations and the habitats that support them. The need for such inspiration has never been greater.

Information Theory, Inference and Learning Algorithms


David J.C. MacKay - 2002
    These topics lie at the heart of many exciting areas of contemporary science and engineering - communication, signal processing, data mining, machine learning, pattern recognition, computational neuroscience, bioinformatics, and cryptography. This textbook introduces theory in tandem with applications. Information theory is taught alongside practical communication systems, such as arithmetic coding for data compression and sparse-graph codes for error-correction. A toolbox of inference techniques, including message-passing algorithms, Monte Carlo methods, and variational approximations, is developed alongside applications of these tools to clustering, convolutional codes, independent component analysis, and neural networks. The final part of the book describes the state of the art in error-correcting codes, including low-density parity-check codes, turbo codes, and digital fountain codes -- the twenty-first century standards for satellite communications, disk drives, and data broadcast. Richly illustrated, filled with worked examples and over 400 exercises, some with detailed solutions, David MacKay's groundbreaking book is ideal for self-learning and for undergraduate or graduate courses. Interludes on crosswords, evolution, and sex provide entertainment along the way. In sum, this is a textbook on information, communication, and coding for a new generation of students, and an unparalleled entry point into these subjects for professionals in areas as diverse as computational biology, financial engineering, and machine learning.

The Best American Science and Nature Writing 2001


Edward O. Wilson - 2001
    Wilson, promises to be another “eclectic, provocative collection” (Entertainment Weekly) that is both a science reader’s dream and a nature lover’s sustenance.
    Contents: Iterations of immortality / David Berlinski -- To save a watering hole / Mark Cherrington -- New life in a death trap / Edwin Dobb -- Abortion and brain waves / Gregg Easterbrook -- Baby steps / Malcolm Gladwell -- In the forests of Gombe / Jane Goodall -- The doubting disease / Jerome Groopman -- The recycled generation / Stephen S. Hall -- Endurance predator / Bernd Heinrich -- Harpy eagles / Edward Hoagland -- Why the future doesn't need us / Bill Joy -- A killing at dawn / Ted Kerasote -- Seeing scarlet / Barbara Kingsolver and Steven Hopp -- The best clock in the world / Verlyn Klinkenborg -- The wild world's Scotland Yard / Jon R. Luoma -- Breeding discontent / Cynthia Mills -- Ice station Vostok / Oliver Morton -- Being prey / Val Plumwood -- Troubled waters / Sandra Postel -- The genome warrior / Richard Preston -- Megatransect / David Quammen -- Inside the volcano / Donovan Webster

Statistical Rethinking: A Bayesian Course with Examples in R and Stan


Richard McElreath - 2015
    Reflecting the need for even minor programming in today's model-based statistics, the book pushes readers to perform step-by-step calculations that are usually automated. This unique computational approach ensures that readers understand enough of the details to make reasonable choices and interpretations in their own modeling work. The text presents generalized linear multilevel models from a Bayesian perspective, relying on a simple logical interpretation of Bayesian probability and maximum entropy. Its coverage ranges from the basics of regression to multilevel models. The author also discusses measurement error, missing data, and Gaussian process models for spatial and network autocorrelation. By using complete R code examples throughout, this book provides a practical foundation for performing statistical inference. Designed for both PhD students and seasoned professionals in the natural and social sciences, it prepares them for more advanced or specialized statistical modeling.
    Web Resource: The book is accompanied by an R package (rethinking) that is available on the author's website and GitHub. The two core functions (map and map2stan) of this package allow a variety of statistical models to be constructed from standard model formulas.

Data Science for Business: What you need to know about data mining and data-analytic thinking


Foster Provost - 2013
    This guide also helps you understand the many data-mining techniques in use today. Based on an MBA course Provost has taught at New York University over the past ten years, Data Science for Business provides examples of real-world business problems to illustrate these principles. You’ll not only learn how to improve communication between business stakeholders and data scientists, but also how to participate intelligently in your company’s data science projects. You’ll also discover how to think data-analytically, and fully appreciate how data science methods can support business decision-making.
    - Understand how data science fits in your organization—and how you can use it for competitive advantage
    - Treat data as a business asset that requires careful investment if you’re to gain real value
    - Approach business problems data-analytically, using the data-mining process to gather good data in the most appropriate way
    - Learn general concepts for actually extracting knowledge from data
    - Apply data science principles when interviewing data science job candidates

Quantifying the User Experience: Practical Statistics for User Research


Jeff Sauro - 2012
    Many designers and researchers view usability and design as qualitative activities, which do not require attention to formulas and numbers. However, usability practitioners and user researchers are increasingly expected to quantify the benefits of their efforts. The impact of good and bad designs can be quantified in terms of conversions, completion rates, completion times, perceived satisfaction, recommendations, and sales. The book discusses ways to quantify user research; summarize data and compute margins of error; determine appropriate sample sizes; standardize usability questionnaires; and settle controversies in measurement and statistics. Each chapter concludes with a list of key points and references. Most chapters also include a set of problems and answers that enable readers to test their understanding of the material. This book is a valuable resource for those engaged in measuring the behavior and attitudes of people during their interaction with interfaces.

Silence of the Songbirds: How We Are Losing the World's Songbirds and What We Can Do to Save Them


Bridget Stutchbury - 2007
    Wood thrush, Kentucky warbler, the Eastern kingbird--migratory songbirds are disappearing at a frightening rate. By some estimates, we may already have lost almost half of the songbirds that filled the skies only forty years ago. Renowned biologist Bridget Stutchbury convincingly argues that songbirds truly are the "canaries in the coal mine"--except the coal mine looks a lot like Earth and we are the hapless excavators. Following the birds on their six-thousand-mile migratory journey, Stutchbury leads us on an ecological field trip to explore firsthand the major threats to songbirds: pesticides, still a major concern decades after Rachel Carson first raised the alarm; the destruction of vital habitat, from the boreal forests of Canada to the diminishing continuous forests of the United States to the grasslands of Argentina; coffee plantations, which push birds out of their forest refuges so we can have our morning fix; the bright lights and structures in our cities, which prove a minefield for migrating birds; and global warming. We could well wake up in the near future and hear no songbirds singing. But we won't just be missing their cheery calls, we'll be missing a vital part of our ecosystem. Without songbirds, our forests would face uncontrolled insect infestations, and our trees, flowers, and gardens would lose a crucial element in their reproductive cycle. As Stutchbury shows, saving songbirds means protecting our ecosystem and ultimately ourselves.
    Some of the threats to songbirds:
    - The U.S. annually uses 4-5 million pounds of active ingredient acephate, an insecticide that, even in small quantities, throws off the navigation systems of White-throated sparrows and other songbirds, making them unable to tell north from south.
    - The U.S. Fish and Wildlife Service conservatively estimated that 4-5 million birds are killed by crashing into communication towers each year.
    - A Michigan study found that 600 domestic cats killed more than 6,000 birds during a typical 10-week breeding season.

Data Visualisation: A Handbook for Data Driven Design


Andy Kirk - 2016
    Scholars and students need to be able to analyze, design and curate information into useful tools of communication, insight and understanding. This book is the starting point in learning the process and skills of data visualization, teaching the concepts and skills of how to present data and inspiring effective visual design. Benefits of this book:
    - A flexible step-by-step journey that equips you to achieve great data visualization
    - A curated collection of classic and contemporary examples, giving illustrations of good and bad practice
    - Examples on every page to give creative inspiration
    - Illustrations of good and bad practice show you how to critically evaluate and improve your own work
    - Advice and experience from the best designers in the field
    - Loads of online practical help, checklists, case studies and exercises make this the most comprehensive text available

Schaum's Outline of Probability and Statistics


Murray R. Spiegel - 1975
    Its big-picture, calculus-based approach makes it an especially authoritative reference for engineering and science majors. Now thoroughly updated, this second edition includes vital new coverage of order statistics, best critical regions, likelihood ratio tests, and other key topics.