How To: Absurd Scientific Advice for Common Real-World Problems


Randall Munroe - 2019
    How To is a guide to the third kind of approach. It's full of highly impractical advice for everything from landing a plane to digging a hole.Bestselling author and cartoonist Randall Munroe explains how to predict the weather by analyzing the pixels of your Facebook photos. He teaches you how to tell if you're a baby boomer or a 90's kid by measuring the radioactivity of your teeth. He offers tips for taking a selfie with a telescope, crossing a river by boiling it, and powering your house by destroying the fabric of space-time. And if you want to get rid of the book once you're done with it, he walks you through your options for proper disposal, including dissolving it in the ocean, converting it to a vapor, using tectonic plates to subduct it into the Earth's mantle, or launching it into the Sun.By exploring the most complicated ways to do simple tasks, Munroe doesn't just make things difficult for himself and his readers. As he did so brilliantly in What If?, Munroe invites us to explore the most absurd reaches of the possible. Full of clever infographics and amusing illustrations, How To is a delightfully mind-bending way to better understand the science and technology underlying the things we do every day.

The Seven Pillars of Statistical Wisdom


Stephen M. Stigler - 2016
    It allows one to gain information by discarding information, namely, the individuality of the observations. Stigler s second pillar, information measurement, challenges the importance of big data by noting that observations are not all equally important: the amount of information in a data set is often proportional to only the square root of the number of observations, not the absolute number. The third idea is likelihood, the calibration of inferences with the use of probability. Intercomparison is the principle that statistical comparisons do not need to be made with respect to an external standard. The fifth pillar is regression, both a paradox (tall parents on average produce shorter children; tall children on average have shorter parents) and the basis of inference, including Bayesian inference and causal reasoning. The sixth concept captures the importance of experimental design for example, by recognizing the gains to be had from a combinatorial approach with rigorous randomization. The seventh idea is the residual the notion that a complicated phenomenon can be simplified by subtracting the effect of known causes, leaving a residual phenomenon that can be explained more easily.The Seven Pillars of Statistical Wisdom presents an original, unified account of statistical science that will fascinate the interested layperson and engage the professional statistician."

The Beginning of Infinity: Explanations That Transform the World


David Deutsch - 2011
    Taking us on a journey through every fundamental field of science, as well as the history of civilization, art, moral values, and the theory of political institutions, Deutsch tracks how we form new explanations and drop bad ones, explaining the conditions under which progress—which he argues is potentially boundless—can and cannot happen. Hugely ambitious and highly original, The Beginning of Infinity explores and establishes deep connections between the laws of nature, the human condition, knowledge, and the possibility for progress.

Programming Collective Intelligence: Building Smart Web 2.0 Applications


Toby Segaran - 2002
    With the sophisticated algorithms in this book, you can write smart programs to access interesting datasets from other web sites, collect data from users of your own applications, and analyze and understand the data once you've found it.Programming Collective Intelligence takes you into the world of machine learning and statistics, and explains how to draw conclusions about user experience, marketing, personal tastes, and human behavior in general -- all from information that you and others collect every day. Each algorithm is described clearly and concisely with code that can immediately be used on your web site, blog, Wiki, or specialized application. This book explains:Collaborative filtering techniques that enable online retailers to recommend products or media Methods of clustering to detect groups of similar items in a large dataset Search engine features -- crawlers, indexers, query engines, and the PageRank algorithm Optimization algorithms that search millions of possible solutions to a problem and choose the best one Bayesian filtering, used in spam filters for classifying documents based on word types and other features Using decision trees not only to make predictions, but to model the way decisions are made Predicting numerical values rather than classifications to build price models Support vector machines to match people in online dating sites Non-negative matrix factorization to find the independent features in a dataset Evolving intelligence for problem solving -- how a computer develops its skill by improving its own code the more it plays a game Each chapter includes exercises for extending the algorithms to make them more powerful. Go beyond simple database-backed applications and put the wealth of Internet data to work for you. "Bravo! I cannot think of a better way for a developer to first learn these algorithms and methods, nor can I think of a better way for me (an old AI dog) to reinvigorate my knowledge of the details."-- Dan Russell, Google "Toby's book does a great job of breaking down the complex subject matter of machine-learning algorithms into practical, easy-to-understand examples that can be directly applied to analysis of social interaction across the Web today. If I had this book two years ago, it would have saved precious time going down some fruitless paths."-- Tim Wolters, CTO, Collective Intellect

Cybernetics: or the Control and Communication in the Animal and the Machine


Norbert Wiener - 1948
    It is a ‘ must’ book for those in every branch of science . . . in addition, economists, politicians, statesmen, and businessmen cannot afford to overlook cybernetics and its tremendous, even terrifying implications. "It is a beautifully written book, lucid, direct, and despite its complexity, as readable by the layman as the trained scientist." -- John B. Thurston, "The Saturday Review of Literature" Acclaimed one of the "seminal books . . . comparable in ultimate importance to . . . Galileo or Malthus or Rousseau or Mill," "Cybernetics" was judged by twenty-seven historians, economists, educators, and philosophers to be one of those books published during the "past four decades", which may have a substantial impact on public thought and action in the years ahead." -- Saturday Review

All of Statistics: A Concise Course in Statistical Inference


Larry Wasserman - 2003
    But in spirit, the title is apt, as the book does cover a much broader range of topics than a typical introductory book on mathematical statistics. This book is for people who want to learn probability and statistics quickly. It is suitable for graduate or advanced undergraduate students in computer science, mathematics, statistics, and related disciplines. The book includes modern topics like nonparametric curve estimation, bootstrapping, and clas- sification, topics that are usually relegated to follow-up courses. The reader is presumed to know calculus and a little linear algebra. No previous knowledge of probability and statistics is required. Statistics, data mining, and machine learning are all concerned with collecting and analyzing data. For some time, statistics research was con- ducted in statistics departments while data mining and machine learning re- search was conducted in computer science departments. Statisticians thought that computer scientists were reinventing the wheel. Computer scientists thought that statistical theory didn't apply to their problems. Things are changing. Statisticians now recognize that computer scientists are making novel contributions while computer scientists now recognize the generality of statistical theory and methodology. Clever data mining algo- rithms are more scalable than statisticians ever thought possible. Formal sta- tistical theory is more pervasive than computer scientists had realized.

Prisoner's Dilemma: John von Neumann, Game Theory, and the Puzzle of the Bomb


William Poundstone - 1992
    Though the answers may seem simple, their profound implications make the prisoner's dilemma one of the great unifying concepts of science. Watching players bluff in a poker game inspired John von Neumann--father of the modern computer and one of the sharpest minds of the century--to construct game theory, a mathematical study of conflict and deception. Game theory was readily embraced at the RAND Corporation, the archetypical think tank charged with formulating military strategy for the atomic age, and in 1950 two RAND scientists made a momentous discovery.Called the prisoner's dilemma, it is a disturbing and mind-bending game where two or more people may betray the common good for individual gain. Introduced shortly after the Soviet Union acquired the atomic bomb, the prisoner's dilemma quickly became a popular allegory of the nuclear arms race. Intellectuals such as von Neumann and Bertrand Russell joined military and political leaders in rallying to the preventive war movement, which advocated a nuclear first strike against the Soviet Union. Though the Truman administration rejected preventive war the United States entered into an arms race with the Soviets and game theory developed into a controversial tool of public policy--alternately accused of justifying arms races and touted as the only hope of preventing them.A masterful work of science writing, Prisoner's Dilemma weaves together a biography of the brilliant and tragic von Neumann, a history of pivotal phases of the cold war, and an investigation of game theory's far-reaching influence on public policy today. Most important, Prisoner's Dilemma is the incisive story of a revolutionary idea that has been hailed as a landmark of twentieth-century thought.

Mostly Harmless Econometrics: An Empiricist's Companion


Joshua D. Angrist - 2008
    In the modern experimentalist paradigm, these techniques address clear causal questions such as: Do smaller classes increase learning? Should wife batterers be arrested? How much does education raise wages? Mostly Harmless Econometrics shows how the basic tools of applied econometrics allow the data to speak.In addition to econometric essentials, Mostly Harmless Econometrics covers important new extensions--regression-discontinuity designs and quantile regression--as well as how to get standard errors right. Joshua Angrist and Jorn-Steffen Pischke explain why fancier econometric techniques are typically unnecessary and even dangerous. The applied econometric methods emphasized in this book are easy to use and relevant for many areas of contemporary social science.An irreverent review of econometric essentials A focus on tools that applied researchers use most Chapters on regression-discontinuity designs, quantile regression, and standard errors Many empirical examples A clear and concise resource with wide applications

Statistics Without Tears: An Introduction for Non-Mathematicians


Derek Rowntree - 1981
    With it you can prime yourself with the key concepts of statistics before getting involved in the associated calculations. Using words and diagrams instead of figures, formulae and equations, Derek Rowntree makes statistics accessible to those who are non-mathematicians. And just to get you into the spirit of things. Rowntree has included questions in his argument; answer them as you go and you will be able to tell how far you have mastered the subject.

The Elements of Data Analytic Style


Jeffrey Leek - 2015
    This book is focused on the details of data analysis that sometimes fall through the cracks in traditional statistics classes and textbooks. It is based in part on the authors blog posts, lecture materials, and tutorials. The author is one of the co-developers of the Johns Hopkins Specialization in Data Science the largest data science program in the world that has enrolled more than 1.76 million people. The book is useful as a companion to introductory courses in data science or data analysis. It is also a useful reference tool for people tasked with reading and critiquing data analyses. It is based on the authors popular open-source guides available through his Github account (https://github.com/jtleek). The paper is also available through Leanpub (https://leanpub.com/datastyle), if the book is purchased on that platform you are entitled to lifetime free updates.

The Second Machine Age: Work, Progress, and Prosperity in a Time of Brilliant Technologies


Erik Brynjolfsson - 2014
    Digital technologies—with hardware, software, and networks at their core—will in the near future diagnose diseases more accurately than doctors can, apply enormous data sets to transform retailing, and accomplish many tasks once considered uniquely human.In The Second Machine Age MIT’s Erik Brynjolfsson and Andrew McAfee—two thinkers at the forefront of their field—reveal the forces driving the reinvention of our lives and our economy. As the full impact of digital technologies is felt, we will realize immense bounty in the form of dazzling personal technology, advanced infrastructure, and near-boundless access to the cultural items that enrich our lives.Amid this bounty will also be wrenching change. Professions of all kinds—from lawyers to truck drivers—will be forever upended. Companies will be forced to transform or die. Recent economic indicators reflect this shift: fewer people are working, and wages are falling even as productivity and profits soar.Drawing on years of research and up-to-the-minute trends, Brynjolfsson and McAfee identify the best strategies for survival and offer a new path to prosperity. These include revamping education so that it prepares people for the next economy instead of the last one, designing new collaborations that pair brute processing power with human ingenuity, and embracing policies that make sense in a radically transformed landscape.A fundamentally optimistic book, The Second Machine Age alters how we think about issues of technological, societal, and economic progress.

Numbers Rule Your World: The Hidden Influence of Probabilities and Statistics on Everything You Do


Kaiser Fung - 2010
    This is how engineers calculate your quality of living, how corporations determine your needs, and how politicians estimate your opinions. These are the numbers you never think about-even though they play a crucial role in every single aspect of your life.What you learn may surprise you, amuse you, or even enrage you. But there's one thing you won't be able to deny: Numbers Rule Your World...An easy read with a big benefit. --Fareed Zakaria, CNNFor those who have anxiety about how organization data-mining is impacting their world, Kaiser Fung pulls back the curtain to reveal the good and the bad of predictive analytics. --Ian Ayres, Yale professor and author of Super Crunchers: Why Thinking By Numbers is the New Way to Be Smart A book that engages us with stories that a journalist would write, the compelling stories behind the stories as illuminated by the numbers, and the dynamics that the numbers reveal. --John Sall, Executive Vice President, SAS InstituteLittle did I suspect, when I picked up Kaiser Fung's book, that I would become so entranced by it - an illuminating and accessible exploration of the power of statistical analysis for those of us who have no prior training in a field that he explores so ably. --Peter Clarke, author of Keynes: The Rise, Fall, and Return of the 20th Century's Most Influential EconomistA tremendous book. . . . If you want to understand how to use statistics, how to think with numbers and yet to do this without getting lost in equations, if you've been looking for the book to unlock the door to logical thinking about problems, well, you will be pleased to know that you are holding that book in your hands. --Daniel Finkelstein, Executive Editor, The Times of LondonI thoroughly enjoyed this accessible book and enthusiastically recommend it to anyone looking to understand and appreciate the role of statistics and data analysis in solving problems and in creating a better world. --Michael Sherman, Texas A&M University, American Statistician

Dear Data


Giorgia Lupi - 2016
    The result is described as “a thought-provoking visual feast”.

Code: The Hidden Language of Computer Hardware and Software


Charles Petzold - 1999
    And through CODE, we see how this ingenuity and our very human compulsion to communicate have driven the technological innovations of the past two centuries. Using everyday objects and familiar language systems such as Braille and Morse code, author Charles Petzold weaves an illuminating narrative for anyone who’s ever wondered about the secret inner life of computers and other smart machines. It’s a cleverly illustrated and eminently comprehensible story—and along the way, you’ll discover you’ve gained a real context for understanding today’s world of PCs, digital media, and the Internet. No matter what your level of technical savvy, CODE will charm you—and perhaps even awaken the technophile within.

Machine Learning: A Probabilistic Perspective


Kevin P. Murphy - 2012
    Machine learning provides these, developing methods that can automatically detect patterns in data and then use the uncovered patterns to predict future data. This textbook offers a comprehensive and self-contained introduction to the field of machine learning, based on a unified, probabilistic approach.The coverage combines breadth and depth, offering necessary background material on such topics as probability, optimization, and linear algebra as well as discussion of recent developments in the field, including conditional random fields, L1 regularization, and deep learning. The book is written in an informal, accessible style, complete with pseudo-code for the most important algorithms. All topics are copiously illustrated with color images and worked examples drawn from such application domains as biology, text processing, computer vision, and robotics. Rather than providing a cookbook of different heuristic methods, the book stresses a principled model-based approach, often using the language of graphical models to specify models in a concise and intuitive way. Almost all the models described have been implemented in a MATLAB software package—PMTK (probabilistic modeling toolkit)—that is freely available online. The book is suitable for upper-level undergraduates with an introductory-level college math background and beginning graduate students.