Naked Statistics: Stripping the Dread from the Data


Charles Wheelan - 2012
    How can we catch schools that cheat on standardized tests? How does Netflix know which movies you’ll like? What is causing the rising incidence of autism? As best-selling author Charles Wheelan shows us in Naked Statistics, the right data and a few well-chosen statistical tools can help us answer these questions and more. For those who slept through Stats 101, this book is a lifesaver. Wheelan strips away the arcane and technical details and focuses on the underlying intuition that drives statistical analysis. He clarifies key concepts such as inference, correlation, and regression analysis, reveals how biased or careless parties can manipulate or misrepresent data, and shows us how brilliant and creative researchers are exploiting the valuable data from natural experiments to tackle thorny questions. And in Wheelan’s trademark style, there’s not a dull page in sight. You’ll encounter clever Schlitz Beer marketers leveraging basic probability, an International Sausage Festival illuminating the tenets of the central limit theorem, and a head-scratching choice from the famous game show Let’s Make a Deal--and you’ll come away with insights each time. With the wit, accessibility, and sheer fun that turned Naked Economics into a bestseller, Wheelan defies the odds yet again by bringing another essential, formerly unglamorous discipline to life.

The Big Book of Dashboards: Visualizing Your Data Using Real-World Business Scenarios


Steve Wexler - 2017
    It's great to have theory and evidence-based research at your disposal, but what will you do when somebody asks you to make your dashboard 'cooler' by adding packed bubbles and donut charts? The expert authors have a combined 30-plus years of hands-on experience helping people in hundreds of organizations build effective visualizations. They have fought many 'best practices' battles and, having endured, bring an uncommon empathy to help you, the reader of this book, survive and thrive in the data visualization world. A well-designed dashboard can point out risks, opportunities, and more; but common challenges and misconceptions can make your dashboard useless at best, and misleading at worst. The Big Book of Dashboards gives you the tools, guidance, and models you need to produce great dashboards that inform, enlighten, and engage.

Beautiful Data: The Stories Behind Elegant Data Solutions (Theory In Practice, #31)


Toby Segaran - 2009
    Join 39 contributors as they explain how they developed simple and elegant solutions on projects ranging from the Mars lander to a Radiohead video. With Beautiful Data, you will: Explore the opportunities and challenges involved in working with the vast number of datasets made available by the Web. Learn how to visualize trends in urban crime, using maps and data mashups. Discover the challenges of designing a data processing system that works within the constraints of space travel. Learn how crowdsourcing and transparency have combined to advance the state of drug research. Understand how new data can automatically trigger alerts when it matches or overlaps pre-existing data. Learn about the massive infrastructure required to create, capture, and process DNA data. That's only a small sample of what you'll find in Beautiful Data. For anyone who handles data, this is a truly fascinating book. Contributors include: Nathan Yau; Jonathan Follett and Matt Holm; J.M. Hughes; Raghu Ramakrishnan, Brian Cooper, and Utkarsh Srivastava; Jeff Hammerbacher; Jason Dykes and Jo Wood; Jeff Jonas and Lisa Sokol; Jud Valeski; Alon Halevy and Jayant Madhavan; Aaron Koblin with Valdean Klump; Michal Migurski; Jeff Heer; Coco Krumme; Peter Norvig; Matt Wood and Ben Blackburne; Jean-Claude Bradley, Rajarshi Guha, Andrew Lang, Pierre Lindenbaum, Cameron Neylon, Antony Williams, and Egon Willighagen; Lukas Biewald and Brendan O'Connor; Hadley Wickham, Deborah Swayne, and David Poole; Andrew Gelman, Jonathan P. Kastellec, and Yair Ghitza; Toby Segaran.

Fundamentals of Data Visualization: A Primer on Making Informative and Compelling Figures


Claus O. Wilke - 2019
    But with the increasing power of visualization software today, scientists, engineers, and business analysts often have to navigate a bewildering array of visualization choices and options. This practical book takes you through many commonly encountered visualization problems, and it provides guidelines on how to turn large datasets into clear and compelling figures. What visualization type is best for the story you want to tell? How do you make informative figures that are visually pleasing? Author Claus O. Wilke teaches you the elements most critical to successful data visualization. Explore the basic concepts of color as a tool to highlight, distinguish, or represent a value. Understand the importance of redundant coding to ensure you provide key information in multiple ways. Use the book's visualizations directory, a graphical guide to commonly used types of data visualizations. Get extensive examples of good and bad figures. Learn how to use figures in a document or report and how to employ them effectively to tell a compelling story.

Semiology of Graphics


Jacques Bertin - 1967
    Founded on Jacques Bertin’s practical experience as a cartographer, Part One of this work is an unprecedented attempt to synthesize principles of graphic communication with the logic of standard rules applied to writing and topography. Part Two brings Bertin’s theory to life, presenting a close study of graphic techniques including shape, orientation, color, texture, volume, and size in an array of more than 1,000 maps and diagrams.

Better Presentations: A Guide for Scholars, Researchers, and Wonks


Jonathan Schwabish - 2016
    Most of us approach this task by converting a written document into slides, but the result is often a text-heavy presentation saddled with bullet points, stock images, and graphs too complex for an audience to decipher--much less understand. Presenting is fundamentally different from writing, and with only a little more time, a little more effort, and a little more planning, you can communicate your work with force and clarity. Designed for presenters of scholarly or data-intensive content, "Better Presentations" details essential strategies for developing clear, sophisticated, and visually captivating presentations. Following three core principles--visualize, unify, and focus--"Better Presentations" describes how to visualize data effectively, find and use images appropriately, choose sensible fonts and colors, edit text for powerful delivery, and restructure a written argument for maximum engagement and persuasion. With a range of clear examples for what to do (and what not to do), the practical package offered in "Better Presentations" shares the best techniques to display work and the best tactics for winning over audiences. It pushes presenters past the frustration and intimidation of the process to more effective, memorable, and persuasive presentations.

Data Science for Business: What you need to know about data mining and data-analytic thinking


Foster Provost - 2013
    This guide also helps you understand the many data-mining techniques in use today. Based on an MBA course Provost has taught at New York University over the past ten years, Data Science for Business provides examples of real-world business problems to illustrate these principles. You’ll not only learn how to improve communication between business stakeholders and data scientists, but also how to participate intelligently in your company’s data science projects. You’ll also discover how to think data-analytically, and fully appreciate how data science methods can support business decision-making. Understand how data science fits in your organization--and how you can use it for competitive advantage. Treat data as a business asset that requires careful investment if you’re to gain real value. Approach business problems data-analytically, using the data-mining process to gather good data in the most appropriate way. Learn general concepts for actually extracting knowledge from data. Apply data science principles when interviewing data science job candidates.

The Signal and the Noise: Why So Many Predictions Fail—But Some Don't


Nate Silver - 2012
    He solidified his standing as the nation's foremost political forecaster with his near perfect prediction of the 2012 election. Silver is the founder and editor in chief of FiveThirtyEight.com. Drawing on his own groundbreaking work, Silver examines the world of prediction, investigating how we can distinguish a true signal from a universe of noisy data. Most predictions fail, often at great cost to society, because most of us have a poor understanding of probability and uncertainty. Both experts and laypeople mistake more confident predictions for more accurate ones. But overconfidence is often the reason for failure. If our appreciation of uncertainty improves, our predictions can get better too. This is the "prediction paradox": The more humility we have about our ability to make predictions, the more successful we can be in planning for the future. In keeping with his own aim to seek truth from data, Silver visits the most successful forecasters in a range of areas, from hurricanes to baseball, from the poker table to the stock market, from Capitol Hill to the NBA. He explains and evaluates how these forecasters think and what bonds they share. What lies behind their success? Are they good--or just lucky? What patterns have they unraveled? And are their forecasts really right? He explores unanticipated commonalities and exposes unexpected juxtapositions. And sometimes, it is not so much how good a prediction is in an absolute sense that matters but how good it is relative to the competition. In other cases, prediction is still a very rudimentary--and dangerous--science. Silver observes that the most accurate forecasters tend to have a superior command of probability, and they tend to be both humble and hardworking. They distinguish the predictable from the unpredictable, and they notice a thousand little details that lead them closer to the truth. Because of their appreciation of probability, they can distinguish the signal from the noise.

Numbersense: How to Use Big Data to Your Advantage


Kaiser Fung - 2013
    Virtually every choice we make hinges on how someone generates data . . . and how someone else interprets it--whether we realize it or not. Where do you send your child for the best education? Big Data. Which airline should you choose to ensure a timely arrival? Big Data. Who will you vote for in the next election? Big Data. The problem is, the more data we have, the more difficult it is to interpret it. From world leaders to average citizens, everyone is prone to making critical decisions based on poor data interpretations. In Numbersense, expert statistician Kaiser Fung explains when you should accept the conclusions of the Big Data experts--and when you should say, Wait . . . what? He delves deeply into a wide range of topics, offering the answers to important questions, such as: How does the college ranking system really work? Can an obesity measure solve America's biggest healthcare crisis? Should you trust current unemployment data issued by the government? How do you improve your fantasy sports team? Should you worry about businesses that track your data? Don't take for granted statements made in the media, by our leaders, or even by your best friend. We're on information overload today, and there's a lot of bad information out there. Numbersense gives you the insight into how Big Data interpretation works--and how it too often doesn't work. You won't come away with the skills of a professional statistician. But you will have a keen understanding of the data traps even the best statisticians can fall into, and you'll trust the mental alarm that goes off in your head when something just doesn't seem to add up. Praise for Numbersense: "Numbersense correctly puts the emphasis not on the size of big data, but on the analysis of it. Lots of fun stories, plenty of lessons learned--in short, a great way to acquire your own sense of numbers!" (Thomas H. Davenport, coauthor of Competing on Analytics and President's Distinguished Professor of IT and Management, Babson College) "Kaiser's accessible business book will blow your mind like no other. You'll be smarter, and you won't even realize it. Buy. It. Now." (Avinash Kaushik, Digital Marketing Evangelist, Google, and author, Web Analytics 2.0) "Each story in Numbersense goes deep into what you have to think about before you trust the numbers. Kaiser Fung ably demonstrates that it takes skill and resourcefulness to make the numbers confess their meaning." (John Sall, Executive Vice President, SAS Institute) "Kaiser Fung breaks the bad news--a ton more data is no panacea--but then has got your back, revealing the pitfalls of analysis with stimulating stories from the front lines of business, politics, health care, government, and education. The remedy isn't an advanced degree, nor is it common sense. You need Numbersense." (Eric Siegel, founder, Predictive Analytics World, and author, Predictive Analytics) "I laughed my way through this superb-useful-fun book and learned and relearned a lot. Highly recommended!" (Tom Peters, author of In Search of Excellence)

R for Data Science: Import, Tidy, Transform, Visualize, and Model Data


Hadley Wickham - 2016
    This book introduces you to R, RStudio, and the tidyverse, a collection of R packages designed to work together to make data science fast, fluent, and fun. Suitable for readers with no previous programming experience, R for Data Science is designed to get you doing data science as quickly as possible. Authors Hadley Wickham and Garrett Grolemund guide you through the steps of importing, wrangling, exploring, and modeling your data and communicating the results. You’ll get a complete, big-picture understanding of the data science cycle, along with basic tools you need to manage the details. Each section of the book is paired with exercises to help you practice what you’ve learned along the way. You’ll learn how to: Wrangle: transform your datasets into a form convenient for analysis. Program: learn powerful R tools for solving data problems with greater clarity and ease. Explore: examine your data, generate hypotheses, and quickly test them. Model: provide a low-dimensional summary that captures true "signals" in your dataset. Communicate: learn R Markdown for integrating prose, code, and results.

Data Smart: Using Data Science to Transform Information into Insight


John W. Foreman - 2013
    Major retailers are predicting everything from when their customers are pregnant to when they want a new pair of Chuck Taylors. It's a brave new world where seemingly meaningless data can be transformed into valuable insight to drive smart business decisions. But how does one exactly do data science? Do you have to hire one of these priests of the dark arts, the "data scientist," to extract this gold from your data? Nope. Data science is little more than using straightforward steps to process raw data into actionable insight. And in Data Smart, author and data scientist John Foreman will show you how that's done within the familiar environment of a spreadsheet. Why a spreadsheet? It's comfortable! You get to look at the data every step of the way, building confidence as you learn the tricks of the trade. Plus, spreadsheets are a vendor-neutral place to learn data science without the hype. But don't let the Excel sheets fool you. This is a book for those serious about learning the analytic techniques, the math and the magic, behind big data. Each chapter will cover a different technique in a spreadsheet so you can follow along: mathematical optimization, including non-linear programming and genetic algorithms; clustering via k-means, spherical k-means, and graph modularity; data mining in graphs, such as outlier detection; supervised AI through logistic regression, ensemble models, and bag-of-words models; forecasting, seasonal adjustments, and prediction intervals through Monte Carlo simulation; and moving from spreadsheets into the R programming language. You get your hands dirty as you work alongside John through each technique. But never fear, the topics are readily applicable and the author laces humor throughout. You'll even learn what a dead squirrel has to do with optimization modeling, which you no doubt are dying to know.

Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy


Cathy O'Neil - 2016
    Increasingly, the decisions that affect our lives--where we go to school, whether we can get a job or a loan, how much we pay for health insurance--are being made not by humans, but by machines. In theory, this should lead to greater fairness: Everyone is judged according to the same rules. But as mathematician and data scientist Cathy O'Neil reveals, the mathematical models being used today are unregulated and uncontestable, even when they're wrong. Most troubling, they reinforce discrimination--propping up the lucky, punishing the downtrodden, and undermining our democracy in the process.

The Data Detective: Ten Easy Rules to Make Sense of Statistics


Tim Harford - 2020
    That’s a mistake, Tim Harford says in The Data Detective. We shouldn’t be suspicious of statistics--we need to understand what they mean and how they can improve our lives: they are, at heart, human behavior seen through the prism of numbers and are often “the only way of grasping much of what is going on around us.” If we can toss aside our fears and learn to approach them clearly--understanding how our own preconceptions lead us astray--statistics can point to ways we can live better and work smarter. As “perhaps the best popular economics writer in the world” (New Statesman), Tim Harford is an expert at taking complicated ideas and untangling them for millions of readers. In The Data Detective, he uses new research in science and psychology to set out ten strategies for using statistics to erase our biases and replace them with new ideas that use virtues like patience, curiosity, and good sense to better understand ourselves and the world. As a result, The Data Detective is a big-idea book about statistics and human behavior that is fresh, unexpected, and insightful.

Deep Learning


Ian Goodfellow - 2016
    Because the computer gathers knowledge from experience, there is no need for a human computer operator to formally specify all the knowledge that the computer needs. The hierarchy of concepts allows the computer to learn complicated concepts by building them out of simpler ones; a graph of these hierarchies would be many layers deep. This book introduces a broad range of topics in deep learning. The text offers mathematical and conceptual background, covering relevant concepts in linear algebra, probability theory and information theory, numerical computation, and machine learning. It describes deep learning techniques used by practitioners in industry, including deep feedforward networks, regularization, optimization algorithms, convolutional networks, sequence modeling, and practical methodology; and it surveys such applications as natural language processing, speech recognition, computer vision, online recommendation systems, bioinformatics, and videogames. Finally, the book offers research perspectives, covering such theoretical topics as linear factor models, autoencoders, representation learning, structured probabilistic models, Monte Carlo methods, the partition function, approximate inference, and deep generative models. Deep Learning can be used by undergraduate or graduate students planning careers in either industry or research, and by software engineers who want to begin using deep learning in their products or platforms. A website offers supplementary material for both readers and instructors.

Design for Information: An Introduction to the Histories, Theories, and Best Practices Behind Effective Information Visualizations


Isabel Meirelles - 2013
    Design for Information critically examines other design solutions, current and historic, helping you gain a larger understanding of how to solve specific problems. This book is designed to help you foster the development of a repertoire of existing methods and concepts to help you overcome design problems. Learn the ins and outs of data visualization with this informative book that provides you with a series of current visualization case studies. The visualizations discussed are analyzed for their design principles and methods, giving you valuable critical and analytical tools to further develop your design process. The case study format of this book is perfect for discussing the histories, theories and best practices in the field through real-world, effective visualizations. The selection represents a fraction of effective visualizations that we encounter in this burgeoning field, allowing you the opportunity to extend your study to other solutions in your specific field(s) of practice. This book is also helpful to students in other disciplines who are involved with visualizing information, such as those in the digital humanities and most of the sciences.