Data Smart: Using Data Science to Transform Information into Insight


John W. Foreman - 2013
    Major retailers are predicting everything from when their customers are pregnant to when they want a new pair of Chuck Taylors. It's a brave new world where seemingly meaningless data can be transformed into valuable insight to drive smart business decisions.But how does one exactly do data science? Do you have to hire one of these priests of the dark arts, the "data scientist," to extract this gold from your data? Nope.Data science is little more than using straight-forward steps to process raw data into actionable insight. And in Data Smart, author and data scientist John Foreman will show you how that's done within the familiar environment of a spreadsheet. Why a spreadsheet? It's comfortable! You get to look at the data every step of the way, building confidence as you learn the tricks of the trade. Plus, spreadsheets are a vendor-neutral place to learn data science without the hype. But don't let the Excel sheets fool you. This is a book for those serious about learning the analytic techniques, the math and the magic, behind big data.Each chapter will cover a different technique in a spreadsheet so you can follow along: - Mathematical optimization, including non-linear programming and genetic algorithms- Clustering via k-means, spherical k-means, and graph modularity- Data mining in graphs, such as outlier detection- Supervised AI through logistic regression, ensemble models, and bag-of-words models- Forecasting, seasonal adjustments, and prediction intervals through monte carlo simulation- Moving from spreadsheets into the R programming languageYou get your hands dirty as you work alongside John through each technique. But never fear, the topics are readily applicable and the author laces humor throughout. You'll even learn what a dead squirrel has to do with optimization modeling, which you no doubt are dying to know.

Data Science from Scratch: First Principles with Python


Joel Grus - 2015
    In this book, you’ll learn how many of the most fundamental data science tools and algorithms work by implementing them from scratch. If you have an aptitude for mathematics and some programming skills, author Joel Grus will help you get comfortable with the math and statistics at the core of data science, and with hacking skills you need to get started as a data scientist. Today’s messy glut of data holds answers to questions no one’s even thought to ask. This book provides you with the know-how to dig those answers out. Get a crash course in Python Learn the basics of linear algebra, statistics, and probability—and understand how and when they're used in data science Collect, explore, clean, munge, and manipulate data Dive into the fundamentals of machine learning Implement models such as k-nearest Neighbors, Naive Bayes, linear and logistic regression, decision trees, neural networks, and clustering Explore recommender systems, natural language processing, network analysis, MapReduce, and databases

The Art of Statistics: How to Learn from Data


David Spiegelhalter - 2019
      Statistics are everywhere, as integral to science as they are to business, and in the popular media hundreds of times a day. In this age of big data, a basic grasp of statistical literacy is more important than ever if we want to separate the fact from the fiction, the ostentatious embellishments from the raw evidence -- and even more so if we hope to participate in the future, rather than being simple bystanders. In The Art of Statistics, world-renowned statistician David Spiegelhalter shows readers how to derive knowledge from raw data by focusing on the concepts and connections behind the math. Drawing on real world examples to introduce complex issues, he shows us how statistics can help us determine the luckiest passenger on the Titanic, whether a notorious serial killer could have been caught earlier, and if screening for ovarian cancer is beneficial. The Art of Statistics not only shows us how mathematicians have used statistical science to solve these problems -- it teaches us how we too can think like statisticians. We learn how to clarify our questions, assumptions, and expectations when approaching a problem, and -- perhaps even more importantly -- we learn how to responsibly interpret the answers we receive. Combining the incomparable insight of an expert with the playful enthusiasm of an aficionado, The Art of Statistics is the definitive guide to stats that every modern person needs.

Social Physics: How Good Ideas Spread— The Lessons from a New Science


Alex Pentland - 2014
    Over years of groundbreaking experiments, he has distilled remarkable discoveries significant enough to become the bedrock of a whole new scientific field: social physics. Humans have more in common with bees than we like to admit: We’re social creatures first and foremost. Our most important habits of action—and most basic notions of common sense—are wired into us through our coordination in social groups. Social physics is about idea flow, the way human social networks spread ideas and transform those ideas into behaviors. Thanks to the millions of digital bread crumbs people leave behind via smartphones, GPS devices, and the Internet, the amount of new information we have about human activity is truly profound. Until now, sociologists have depended on limited data sets and surveys that tell us how people say they think and behave, rather than what they actually do. As a result, we’ve been stuck with the same stale social structures—classes, markets—and a focus on individual actors, data snapshots, and steady states. Pentland shows that, in fact, humans respond much more powerfully to social incentives that involve rewarding others and strengthening the ties that bind than incentives that involve only their own economic self-interest. Pentland and his teams have found that they can study patterns of information exchange in a social network without any knowledge of the actual content of the information and predict with stunning accuracy how productive and effective that network is, whether it’s a business or an entire city. We can maximize a group’s collective intelligence to improve performance and use social incentives to create new organizations and guide them through disruptive change in a way that maximizes the good. At every level of interaction, from small groups to large cities, social networks can be tuned to increase exploration and engagement, thus vastly improving idea flow.  Social Physics will change the way we think about how we learn and how our social groups work—and can be made to work better, at every level of society. Pentland leads readers to the edge of the most important revolution in the study of social behavior in a generation, an entirely new way to look at life itself.

Quantum Computing Since Democritus


Scott Aaronson - 2013
    Full of insights, arguments and philosophical perspectives, the book covers an amazing array of topics. Beginning in antiquity with Democritus, it progresses through logic and set theory, computability and complexity theory, quantum computing, cryptography, the information content of quantum states and the interpretation of quantum mechanics. There are also extended discussions about time travel, Newcomb's Paradox, the anthropic principle and the views of Roger Penrose. Aaronson's informal style makes this fascinating book accessible to readers with scientific backgrounds, as well as students and researchers working in physics, computer science, mathematics and philosophy.

Being Wrong: Adventures in the Margin of Error


Kathryn Schulz - 2010
    Kathryn Schulz, editor of Grist magazine, argues that error is the fundamental human condition and should be celebrated as such. Guiding the reader through the history and psychology of error, from Socrates to Alan Greenspan, Being Wrong will change the way you perceive screw-ups, both of the mammoth and daily variety, forever.

How to Lie with Statistics


Darrell Huff - 1954
    Darrell Huff runs the gamut of every popularly used type of statistic, probes such things as the sample study, the tabulation method, the interview technique, or the way the results are derived from the figures, and points up the countless number of dodges which are used to fool rather than to inform.

The New Digital Age: Reshaping the Future of People, Nations and Business


Eric Schmidt - 2013
    And, the Director of Google Ideas, Jared Cohen, formerly an advisor to both Secretaries of State Condoleezza Rice and Hillary Clinton.Never before has the future been so vividly and transparently imagined. From technologies that will change lives (information systems that greatly increase productivity, safety and our quality of life, thought controlled motion technology that can revolutionize medical procedures, and near-perfect translation technology that allows us to have more diversified interactions) to our most important future considerations (curating our online identity and fighting those who would do harm with it) to the widespread political change that will transform the globe (through transformations in conflict, increasingly active and global citizenries, a new wave of cyber-terrorism and states operating simultaneously in the physical and virtual realms) to the ever present threats to our privacy and security, Schmidt and Cohen outline in great detail and scope all the promise and peril awaiting us in the coming decades.

The Theory That Would Not Die: How Bayes' Rule Cracked the Enigma Code, Hunted Down Russian Submarines, and Emerged Triumphant from Two Centuries of Controversy


Sharon Bertsch McGrayne - 2011
    To its adherents, it is an elegant statement about learning from experience. To its opponents, it is subjectivity run amok.In the first-ever account of Bayes' rule for general readers, Sharon Bertsch McGrayne explores this controversial theorem and the human obsessions surrounding it. She traces its discovery by an amateur mathematician in the 1740s through its development into roughly its modern form by French scientist Pierre Simon Laplace. She reveals why respected statisticians rendered it professionally taboo for 150 years—at the same time that practitioners relied on it to solve crises involving great uncertainty and scanty information (Alan Turing's role in breaking Germany's Enigma code during World War II), and explains how the advent of off-the-shelf computer technology in the 1980s proved to be a game-changer. Today, Bayes' rule is used everywhere from DNA de-coding to Homeland Security.Drawing on primary source material and interviews with statisticians and other scientists, The Theory That Would Not Die is the riveting account of how a seemingly simple theorem ignited one of the greatest controversies of all time.

An Ugly Truth: Inside Facebook's Battle for Domination


Sheera Frenkel - 2021
     Once one of Silicon Valley’s greatest success stories, Facebook has been under constant fire for the past five years, roiled by controversies and crises. It turns out that while the tech giant was connecting the world, they were also mishandling users’ data, spreading fake news, and amplifying dangerous, polarizing hate speech. The company, many said, had simply lost its way. But the truth is far more complex. Leadership decisions enabled, and then attempted to deflect attention from, the crises. Time after time, Facebook’s engineers were instructed to create tools that encouraged people to spend as much time on the platform as possible, even as those same tools boosted inflammatory rhetoric, conspiracy theories, and partisan filter bubbles. And while consumers and lawmakers focused their outrage on privacy breaches and misinformation, Facebook solidified its role as the world’s most voracious data-mining machine, posting record profits, and shoring up its dominance via aggressive lobbying efforts. Drawing on their unrivaled sources, Sheera Frenkel and Cecilia Kang take readers inside the complex court politics, alliances and rivalries within the company to shine a light on the fatal cracks in the architecture of the tech behemoth. Their explosive, exclusive reporting led them to a shocking conclusion: The missteps of the last five years were not an anomaly but an inevitability—this is how Facebook was built to perform. In a period of great upheaval, growth has remained the one constant under the leadership of Mark Zuckerberg and Sheryl Sandberg. Both have been held up as archetypes of uniquely 21st century executives—he the tech “boy genius” turned billionaire, she the ultimate woman in business, an inspiration to millions through her books and speeches. But sealed off in tight circles of advisers and hobbled by their own ambition and hubris, each has stood by as their technology is coopted by hate-mongers, criminals and corrupt political regimes across the globe, with devastating consequences. In An Ugly Truth, they are at last held accountable.

Data Feminism


Catherine D’Ignazio - 2020
    It has been used to expose injustice, improve health outcomes, and topple governments. But it has also been used to discriminate, police, and surveil. This potential for good, on the one hand, and harm, on the other, makes it essential to ask: Data science by whom? Data science for whom? Data science with whose interests in mind? The narratives around big data and data science are overwhelmingly white, male, and techno-heroic. In Data Feminism, Catherine D'Ignazio and Lauren Klein present a new way of thinking about data science and data ethics—one that is informed by intersectional feminist thought.Illustrating data feminism in action, D'Ignazio and Klein show how challenges to the male/female binary can help challenge other hierarchical (and empirically wrong) classification systems. They explain how, for example, an understanding of emotion can expand our ideas about effective data visualization, and how the concept of invisible labor can expose the significant human efforts required by our automated systems. And they show why the data never, ever “speak for themselves.”Data Feminism offers strategies for data scientists seeking to learn how feminism can help them work toward justice, and for feminists who want to focus their efforts on the growing field of data science. But Data Feminism is about much more than gender. It is about power, about who has it and who doesn't, and about how those differentials of power can be challenged and changed.

Statistics in Plain English


Timothy C. Urdan - 2001
    Each self-contained chapter consists of three sections. The first describes the statistic, including how it is used and what information it provides. The second section reviews how it works, how to calculate the formula, the strengths and weaknesses of the technique, and the conditions needed for its use. The final section provides examples that use and interpret the statistic. A glossary of terms and symbols is also included.New features in the second edition include:an interactive CD with PowerPoint presentations and problems for each chapter including an overview of the problem's solution; new chapters on basic research concepts including sampling, definitions of different types of variables, and basic research designs and one on nonparametric statistics; more graphs and more precise descriptions of each statistic; and a discussion of confidence intervals.This brief paperback is an ideal supplement for statistics, research methods, courses that use statistics, or as a reference tool to refresh one's memory about key concepts. The actual research examples are from psychology, education, and other social and behavioral sciences.Materials formerly available with this book on CD-ROM are now available for download from our website www.psypress.com. Go to the book's page and look for the 'Download' link in the right-hand column.

Turing's Cathedral: The Origins of the Digital Universe


George Dyson - 2012
    In Turing’s Cathedral, George Dyson focuses on a small group of men and women, led by John von Neumann at the Institute for Advanced Study in Princeton, New Jersey, who built one of the first computers to realize Alan Turing’s vision of a Universal Machine. Their work would break the distinction between numbers that mean things and numbers that do things—and our universe would never be the same. Using five kilobytes of memory (the amount allocated to displaying the cursor on a computer desktop of today), they achieved unprecedented success in both weather prediction and nuclear weapons design, while tackling, in their spare time, problems ranging from the evolution of viruses to the evolution of stars. Dyson’s account, both historic and prophetic, sheds important new light on how the digital universe exploded in the aftermath of World War II. The proliferation of both codes and machines was paralleled by two historic developments: the decoding of self-replicating sequences in biology and the invention of the hydrogen bomb. It’s no coincidence that the most destructive and the most constructive of human inventions appeared at exactly the same time.  How did code take over the world? In retracing how Alan Turing’s one-dimensional model became John von Neumann’s two-dimensional implementation, Turing’s Cathedral offers a series of provocative suggestions as to where the digital universe, now fully three-dimensional, may be heading next.

Innumeracy: Mathematical Illiteracy and Its Consequences


John Allen Paulos - 1988
    Dozens of examples in innumeracy show us how it affects not only personal economics and travel plans, but explains mis-chosen mates, inappropriate drug-testing, and the allure of pseudo-science.

Wikinomics: How Mass Collaboration Changes Everything


Don Tapscott - 2006
     Today, encyclopedias, jetliners, operating systems, mutual funds, and many other items are being created by teams numbering in the thousands or even millions. While some leaders fear the heaving growth of these massive online communities, Wikinomics proves this fear is folly. Smart firms can harness collective capability and genius to spur innovation, growth, and success. A brilliant guide to one of the most profound changes of our time, Wikinomics challenges our most deeply-rooted assumptions about business and will prove indispensable to anyone who wants to understand competitiveness in the twenty-first century. Based on a $9 million research project led by bestselling author Don Tapscott, Wikinomics shows how masses of people can participate in the economy like never before. They are creating TV news stories, sequencing the human genome, remixing their favorite music, designing software, finding a cure for disease, editing school texts, inventing new cosmetics, or even building motorcycles. You'll read about: • Rob McEwen, the Goldcorp, Inc. CEO who used open source tactics and an online competition to save his company and breathe new life into an old-fashioned industry. • Flickr, Second Life, YouTube, and other thriving online communities that transcend social networking to pioneer a new form of collaborative production. • Mature companies like Procter & Gamble that cultivate nimble, trust-based relationships with external collaborators to form vibrant business ecosystems. An important look into the future, Wikinomics will be your road map for doing business in the twenty-first century.