Big Data: A Revolution That Will Transform How We Live, Work, and Think


Viktor Mayer-Schönberger - 2013
    “Big data” refers to our burgeoning ability to crunch vast collections of information, analyze it instantly, and draw sometimes profoundly surprising conclusions from it. This emerging science can translate myriad phenomena—from the price of airline tickets to the text of millions of books—into searchable form, and uses our increasing computing power to unearth epiphanies that we never could have seen before. A revolution on par with the Internet or perhaps even the printing press, big data will change the way we think about business, health, politics, education, and innovation in the years to come. It also poses fresh threats, from the inevitable end of privacy as we know it to the prospect of being penalized for things we haven’t even done yet, based on big data’s ability to predict our future behavior.In this brilliantly clear, often surprising work, two leading experts explain what big data is, how it will change our lives, and what we can do to protect ourselves from its hazards. Big Data is the first big book about the next big thing.www.big-data-book.com

The Theory That Would Not Die: How Bayes' Rule Cracked the Enigma Code, Hunted Down Russian Submarines, and Emerged Triumphant from Two Centuries of Controversy


Sharon Bertsch McGrayne - 2011
    To its adherents, it is an elegant statement about learning from experience. To its opponents, it is subjectivity run amok.In the first-ever account of Bayes' rule for general readers, Sharon Bertsch McGrayne explores this controversial theorem and the human obsessions surrounding it. She traces its discovery by an amateur mathematician in the 1740s through its development into roughly its modern form by French scientist Pierre Simon Laplace. She reveals why respected statisticians rendered it professionally taboo for 150 years—at the same time that practitioners relied on it to solve crises involving great uncertainty and scanty information (Alan Turing's role in breaking Germany's Enigma code during World War II), and explains how the advent of off-the-shelf computer technology in the 1980s proved to be a game-changer. Today, Bayes' rule is used everywhere from DNA de-coding to Homeland Security.Drawing on primary source material and interviews with statisticians and other scientists, The Theory That Would Not Die is the riveting account of how a seemingly simple theorem ignited one of the greatest controversies of all time.

Visualize This: The FlowingData Guide to Design, Visualization, and Statistics


Nathan Yau - 2011
    Wouldn't it be wonderful if we could actually visualize data in such a way that we could maximize its potential and tell a story in a clear, concise manner? Thanks to the creative genius of Nathan Yau, we can. With this full-color book, data visualization guru and author Nathan Yau uses step-by-step tutorials to show you how to visualize and tell stories with data. He explains how to gather, parse, and format data and then design high quality graphics that help you explore and present patterns, outliers, and relationships.Presents a unique approach to visualizing and telling stories with data, from a data visualization expert and the creator of flowingdata.com, Nathan Yau Offers step-by-step tutorials and practical design tips for creating statistical graphics, geographical maps, and information design to find meaning in the numbers Details tools that can be used to visualize data-native graphics for the Web, such as ActionScript, Flash libraries, PHP, and JavaScript and tools to design graphics for print, such as R and Illustrator Contains numerous examples and descriptions of patterns and outliers and explains how to show them Visualize This demonstrates how to explain data visually so that you can present your information in a way that is easy to understand and appealing.

Think Stats


Allen B. Downey - 2011
    This concise introduction shows you how to perform statistical analysis computationally, rather than mathematically, with programs written in Python.You'll work with a case study throughout the book to help you learn the entire data analysis process—from collecting data and generating statistics to identifying patterns and testing hypotheses. Along the way, you'll become familiar with distributions, the rules of probability, visualization, and many other tools and concepts.Develop your understanding of probability and statistics by writing and testing codeRun experiments to test statistical behavior, such as generating samples from several distributionsUse simulations to understand concepts that are hard to grasp mathematicallyLearn topics not usually covered in an introductory course, such as Bayesian estimationImport data from almost any source using Python, rather than be limited to data that has been cleaned and formatted for statistics toolsUse statistical inference to answer questions about real-world data

Quantum Computing Since Democritus


Scott Aaronson - 2013
    Full of insights, arguments and philosophical perspectives, the book covers an amazing array of topics. Beginning in antiquity with Democritus, it progresses through logic and set theory, computability and complexity theory, quantum computing, cryptography, the information content of quantum states and the interpretation of quantum mechanics. There are also extended discussions about time travel, Newcomb's Paradox, the anthropic principle and the views of Roger Penrose. Aaronson's informal style makes this fascinating book accessible to readers with scientific backgrounds, as well as students and researchers working in physics, computer science, mathematics and philosophy.

Python Crash Course: A Hands-On, Project-Based Introduction to Programming


Eric Matthes - 2015
    You'll also learn how to make your programs interactive and how to test your code safely before adding it to a project. In the second half of the book, you'll put your new knowledge into practice with three substantial projects: a Space Invaders-inspired arcade game, data visualizations with Python's super-handy libraries, and a simple web app you can deploy online.As you work through Python Crash Course, you'll learn how to: Use powerful Python libraries and tools, including matplotlib, NumPy, and PygalMake 2D games that respond to keypresses and mouse clicks, and that grow more difficult as the game progressesWork with data to generate interactive visualizationsCreate and customize simple web apps and deploy them safely onlineDeal with mistakes and errors so you can solve your own programming problemsIf you've been thinking seriously about digging into programming, Python Crash Course will get you up to speed and have you writing real programs fast. Why wait any longer? Start your engines and code!

Humble Pi: A Comedy of Maths Errors


Matt Parker - 2019
    Most of the time this math works quietly behind the scenes . . . until it doesn't. All sorts of seemingly innocuous mathematical mistakes can have significant consequences.Math is easy to ignore until a misplaced decimal point upends the stock market, a unit conversion error causes a plane to crash, or someone divides by zero and stalls a battleship in the middle of the ocean.Exploring and explaining a litany of glitches, near misses, and mathematical mishaps involving the internet, big data, elections, street signs, lotteries, the Roman Empire, and an Olympic team, Matt Parker uncovers the bizarre ways math trips us up, and what this reveals about its essential place in our world. Getting it wrong has never been more fun.

Calling Bullshit: The Art of Skepticism in a Data-Driven World


Carl T. Bergstrom - 2020
    Now, two science professors give us the tools to dismantle misinformation and think clearly in a world of fake news and bad data.It's increasingly difficult to know what's true. Misinformation, disinformation, and fake news abound. Our media environment has become hyperpartisan. Science is conducted by press release. Startup culture elevates bullshit to high art. We are fairly well equipped to spot the sort of old-school bullshit that is based in fancy rhetoric and weasel words, but most of us don't feel qualified to challenge the avalanche of new-school bullshit presented in the language of math, science, or statistics. In Calling Bullshit, Professors Carl Bergstrom and Jevin West give us a set of powerful tools to cut through the most intimidating data.You don't need a lot of technical expertise to call out problems with data. Are the numbers or results too good or too dramatic to be true? Is the claim comparing like with like? Is it confirming your personal bias? Drawing on a deep well of expertise in statistics and computational biology, Bergstrom and West exuberantly unpack examples of selection bias and muddled data visualization, distinguish between correlation and causation, and examine the susceptibility of science to modern bullshit.We have always needed people who call bullshit when necessary, whether within a circle of friends, a community of scholars, or the citizenry of a nation. Now that bullshit has evolved, we need to relearn the art of skepticism.

Everybody Lies: Big Data, New Data, and What the Internet Can Tell Us About Who We Really Are


Seth Stephens-Davidowitz - 2017
    This staggering amount of information—unprecedented in history—can tell us a great deal about who we are—the fears, desires, and behaviors that drive us, and the conscious and unconscious decisions we make. From the profound to the mundane, we can gain astonishing knowledge about the human psyche that less than twenty years ago, seemed unfathomable.Everybody Lies offers fascinating, surprising, and sometimes laugh-out-loud insights into everything from economics to ethics to sports to race to sex, gender and more, all drawn from the world of big data. What percentage of white voters didn’t vote for Barack Obama because he’s black? Does where you go to school effect how successful you are in life? Do parents secretly favor boy children over girls? Do violent films affect the crime rate? Can you beat the stock market? How regularly do we lie about our sex lives and who’s more self-conscious about sex, men or women?Investigating these questions and a host of others, Seth Stephens-Davidowitz offers revelations that can help us understand ourselves and our lives better. Drawing on studies and experiments on how we really live and think, he demonstrates in fascinating and often funny ways the extent to which all the world is indeed a lab. With conclusions ranging from strange-but-true to thought-provoking to disturbing, he explores the power of this digital truth serum and its deeper potential—revealing biases deeply embedded within us, information we can use to change our culture, and the questions we’re afraid to ask that might be essential to our health—both emotional and physical. All of us are touched by big data everyday, and its influence is multiplying. Everybody Lies challenges us to think differently about how we see it and the world.

A Field Guide to Lies: Critical Thinking in the Information Age


Daniel J. Levitin - 2016
    We are bombarded with more information each day than our brains can process—especially in election season. It's raining bad data, half-truths, and even outright lies. New York Times bestselling author Daniel J. Levitin shows how to recognize misleading announcements, statistics, graphs, and written reports revealing the ways lying weasels can use them. It's becoming harder to separate the wheat from the digital chaff. How do we distinguish misinformation, pseudo-facts, distortions, and outright lies from reliable information? Levitin groups his field guide into two categories—statistical infomation and faulty arguments—ultimately showing how science is the bedrock of critical thinking. Infoliteracy means understanding that there are hierarchies of source quality and bias that variously distort our information feeds via every media channel, including social media. We may expect newspapers, bloggers, the government, and Wikipedia to be factually and logically correct, but they so often aren't. We need to think critically about the words and numbers we encounter if we want to be successful at work, at play, and in making the most of our lives. This means checking the plausibility and reasoning—not passively accepting information, repeating it, and making decisions based on it. Readers learn to avoid the extremes of passive gullibility and cynical rejection. Levitin's charming, entertaining, accessible guide can help anyone wake up to a whole lot of things that aren't so. And catch some lying weasels in their tracks!

Mining of Massive Datasets


Anand Rajaraman - 2011
    This book focuses on practical algorithms that have been used to solve key problems in data mining and which can be used on even the largest datasets. It begins with a discussion of the map-reduce framework, an important tool for parallelizing algorithms automatically. The authors explain the tricks of locality-sensitive hashing and stream processing algorithms for mining data that arrives too fast for exhaustive processing. The PageRank idea and related tricks for organizing the Web are covered next. Other chapters cover the problems of finding frequent itemsets and clustering. The final chapters cover two applications: recommendation systems and Web advertising, each vital in e-commerce. Written by two authorities in database and Web technologies, this book is essential reading for students and practitioners alike.

Gödel, Escher, Bach: An Eternal Golden Braid


Douglas R. Hofstadter - 1979
    However, according to Hofstadter, the formal system that underlies all mental activity transcends the system that supports it. If life can grow out of the formal chemical substrate of the cell, if consciousness can emerge out of a formal system of firing neurons, then so too will computers attain human intelligence. Gödel, Escher, Bach is a wonderful exploration of fascinating ideas at the heart of cognitive science: meaning, reduction, recursion, and much more.

The Art of Doing Science and Engineering: Learning to Learn


Richard Hamming - 1996
    By presenting actual experiences and analyzing them as they are described, the author conveys the developmental thought processes employed and shows a style of thinking that leads to successful results is something that can be learned. Along with spectacular successes, the author also conveys how failures contributed to shaping the thought processes. Provides the reader with a style of thinking that will enhance a person's ability to function as a problem-solver of complex technical issues. Consists of a collection of stories about the author's participation in significant discoveries, relating how those discoveries came about and, most importantly, provides analysis about the thought processes and reasoning that took place as the author and his associates progressed through engineering problems.

The Model Thinker: What You Need to Know to Make Data Work for You


Scott E. Page - 2018
    But as anyone who has ever opened up a spreadsheet packed with seemingly infinite lines of data knows, numbers aren't enough: we need to know how to make those numbers talk. In The Model Thinker, social scientist Scott E. Page shows us the mathematical, statistical, and computational models—from linear regression to random walks and far beyond—that can turn anyone into a genius. At the core of the book is Page's "many-model paradigm," which shows the reader how to apply multiple models to organize the data, leading to wiser choices, more accurate predictions, and more robust designs. The Model Thinker provides a toolkit for business people, students, scientists, pollsters, and bloggers to make them better, clearer thinkers, able to leverage data and information to their advantage.

Hands-On Machine Learning with Scikit-Learn and TensorFlow


Aurélien Géron - 2017
    Now that machine learning is thriving, even programmers who know close to nothing about this technology can use simple, efficient tools to implement programs capable of learning from data. This practical book shows you how.By using concrete examples, minimal theory, and two production-ready Python frameworks—Scikit-Learn and TensorFlow—author Aurélien Géron helps you gain an intuitive understanding of the concepts and tools for building intelligent systems. You’ll learn how to use a range of techniques, starting with simple Linear Regression and progressing to Deep Neural Networks. If you have some programming experience and you’re ready to code a machine learning project, this guide is for you.This hands-on book shows you how to use:Scikit-Learn, an accessible framework that implements many algorithms efficiently and serves as a great machine learning entry pointTensorFlow, a more complex library for distributed numerical computation, ideal for training and running very large neural networksPractical code examples that you can apply without learning excessive machine learning theory or algorithm details