Data Analysis with Open Source Tools: A Hands-On Guide for Programmers and Data Scientists


Philipp K. Janert - 2010
    With this insightful book, intermediate to experienced programmers interested in data analysis will learn techniques for working with data in a business environment. You'll learn how to look at data to discover what it contains, how to capture those ideas in conceptual models, and then feed your understanding back into the organization through business plans, metrics dashboards, and other applications.Along the way, you'll experiment with concepts through hands-on workshops at the end of each chapter. Above all, you'll learn how to think about the results you want to achieve -- rather than rely on tools to think for you.Use graphics to describe data with one, two, or dozens of variablesDevelop conceptual models using back-of-the-envelope calculations, as well asscaling and probability argumentsMine data with computationally intensive methods such as simulation and clusteringMake your conclusions understandable through reports, dashboards, and other metrics programsUnderstand financial calculations, including the time-value of moneyUse dimensionality reduction techniques or predictive analytics to conquer challenging data analysis situationsBecome familiar with different open source programming environments for data analysisFinally, a concise reference for understanding how to conquer piles of data.--Austin King, Senior Web Developer, MozillaAn indispensable text for aspiring data scientists.--Michael E. Driscoll, CEO/Founder, Dataspora

Lean Analytics: Use Data to Build a Better Startup Faster


Alistair Croll - 2013
    Lean Analytics steers you in the right direction.This book shows you how to validate your initial idea, find the right customers, decide what to build, how to monetize your business, and how to spread the word. Packed with more than thirty case studies and insights from over a hundred business experts, Lean Analytics provides you with hard-won, real-world information no entrepreneur can afford to go without.Understand Lean Startup, analytics fundamentals, and the data-driven mindsetLook at six sample business models and how they map to new ventures of all sizesFind the One Metric That Matters to youLearn how to draw a line in the sand, so you’ll know it’s time to move forwardApply Lean Analytics principles to large enterprises and established products

The Tyranny of Metrics


Jerry Z. Muller - 2017
    But in our zeal to instill the evaluation process with scientific rigor, we've gone from measuring performance to fixating on measuring itself. The result is a tyranny of metrics that threatens the quality of our lives and most important institutions. In this timely and powerful book, Jerry Muller uncovers the damage our obsession with metrics is causing--and shows how we can begin to fix the problem.Filled with examples from education, medicine, business and finance, government, the police and military, and philanthropy and foreign aid, this brief and accessible book explains why the seemingly irresistible pressure to quantify performance distorts and distracts, whether by encouraging "gaming the stats" or "teaching to the test." That's because what can and does get measured is not always worth measuring, may not be what we really want to know, and may draw effort away from the things we care about. Along the way, we learn why paying for measured performance doesn't work, why surgical scorecards may increase deaths, and much more. But metrics can be good when used as a complement to--rather than a replacement for--judgment based on personal experience, and Muller also gives examples of when metrics have been beneficial.Complete with a checklist of when and how to use metrics, The Tyranny of Metrics is an essential corrective to a rarely questioned trend that increasingly affects us all.

The Theory That Would Not Die: How Bayes' Rule Cracked the Enigma Code, Hunted Down Russian Submarines, and Emerged Triumphant from Two Centuries of Controversy


Sharon Bertsch McGrayne - 2011
    To its adherents, it is an elegant statement about learning from experience. To its opponents, it is subjectivity run amok.In the first-ever account of Bayes' rule for general readers, Sharon Bertsch McGrayne explores this controversial theorem and the human obsessions surrounding it. She traces its discovery by an amateur mathematician in the 1740s through its development into roughly its modern form by French scientist Pierre Simon Laplace. She reveals why respected statisticians rendered it professionally taboo for 150 years—at the same time that practitioners relied on it to solve crises involving great uncertainty and scanty information (Alan Turing's role in breaking Germany's Enigma code during World War II), and explains how the advent of off-the-shelf computer technology in the 1980s proved to be a game-changer. Today, Bayes' rule is used everywhere from DNA de-coding to Homeland Security.Drawing on primary source material and interviews with statisticians and other scientists, The Theory That Would Not Die is the riveting account of how a seemingly simple theorem ignited one of the greatest controversies of all time.

The Big Nine: How the Tech Titans and Their Thinking Machines Could Warp Humanity


Amy Webb - 2019
    We like to think that we are in control of the future of "artificial" intelligence. The reality, though, is that we -- the everyday people whose data powers AI -- aren't actually in control of anything. When, for example, we speak with Alexa, we contribute that data to a system we can't see and have no input into -- one largely free from regulation or oversight. The big nine corporations -- Amazon, Google, Facebook, Tencent, Baidu, Alibaba, Microsoft, IBM and Apple--are the new gods of AI and are short-changing our futures to reap immediate financial gain. In this book, Amy Webb reveals the pervasive, invisible ways in which the foundations of AI -- the people working on the system, their motivations, the technology itself -- is broken. Within our lifetimes, AI will, by design, begin to behave unpredictably, thinking and acting in ways which defy human logic. The big nine corporations may be inadvertently building and enabling vast arrays of intelligent systems that don't share our motivations, desires, or hopes for the future of humanity. Much more than a passionate, human-centered call-to-arms, this book delivers a strategy for changing course, and provides a path for liberating us from algorithmic decision-makers and powerful corporations.

Reinforcement Learning: An Introduction


Richard S. Sutton - 1998
    Their discussion ranges from the history of the field's intellectual foundations to the most recent developments and applications.Reinforcement learning, one of the most active research areas in artificial intelligence, is a computational approach to learning whereby an agent tries to maximize the total amount of reward it receives when interacting with a complex, uncertain environment. In Reinforcement Learning, Richard Sutton and Andrew Barto provide a clear and simple account of the key ideas and algorithms of reinforcement learning. Their discussion ranges from the history of the field's intellectual foundations to the most recent developments and applications. The only necessary mathematical background is familiarity with elementary concepts of probability.The book is divided into three parts. Part I defines the reinforcement learning problem in terms of Markov decision processes. Part II provides basic solution methods: dynamic programming, Monte Carlo methods, and temporal-difference learning. Part III presents a unified view of the solution methods and incorporates artificial neural networks, eligibility traces, and planning; the two final chapters present case studies and consider the future of reinforcement learning.

Code: The Hidden Language of Computer Hardware and Software


Charles Petzold - 1999
    And through CODE, we see how this ingenuity and our very human compulsion to communicate have driven the technological innovations of the past two centuries. Using everyday objects and familiar language systems such as Braille and Morse code, author Charles Petzold weaves an illuminating narrative for anyone who’s ever wondered about the secret inner life of computers and other smart machines. It’s a cleverly illustrated and eminently comprehensible story—and along the way, you’ll discover you’ve gained a real context for understanding today’s world of PCs, digital media, and the Internet. No matter what your level of technical savvy, CODE will charm you—and perhaps even awaken the technophile within.

Statistics Done Wrong: The Woefully Complete Guide


Alex Reinhart - 2013
    Politicians and marketers present shoddy evidence for dubious claims all the time. But smart people make mistakes too, and when it comes to statistics, plenty of otherwise great scientists--yes, even those published in peer-reviewed journals--are doing statistics wrong."Statistics Done Wrong" comes to the rescue with cautionary tales of all-too-common statistical fallacies. It'll help you see where and why researchers often go wrong and teach you the best practices for avoiding their mistakes.In this book, you'll learn: - Why "statistically significant" doesn't necessarily imply practical significance- Ideas behind hypothesis testing and regression analysis, and common misinterpretations of those ideas- How and how not to ask questions, design experiments, and work with data- Why many studies have too little data to detect what they're looking for-and, surprisingly, why this means published results are often overestimates- Why false positives are much more common than "significant at the 5% level" would suggestBy walking through colorful examples of statistics gone awry, the book offers approachable lessons on proper methodology, and each chapter ends with pro tips for practicing scientists and statisticians. No matter what your level of experience, "Statistics Done Wrong" will teach you how to be a better analyst, data scientist, or researcher.

Introduction to Machine Learning with Python: A Guide for Data Scientists


Andreas C. Müller - 2015
    If you use Python, even as a beginner, this book will teach you practical ways to build your own machine learning solutions. With all the data available today, machine learning applications are limited only by your imagination.You'll learn the steps necessary to create a successful machine-learning application with Python and the scikit-learn library. Authors Andreas Muller and Sarah Guido focus on the practical aspects of using machine learning algorithms, rather than the math behind them. Familiarity with the NumPy and matplotlib libraries will help you get even more from this book.With this book, you'll learn:Fundamental concepts and applications of machine learningAdvantages and shortcomings of widely used machine learning algorithmsHow to represent data processed by machine learning, including which data aspects to focus onAdvanced methods for model evaluation and parameter tuningThe concept of pipelines for chaining models and encapsulating your workflowMethods for working with text data, including text-specific processing techniquesSuggestions for improving your machine learning and data science skills

WTF?: What's the Future and Why It's Up to Us


Tim O'Reilly - 2017
    In today’s economy, we have far too much dismay along with our amazement, and technology bears some of the blame. In this combination of memoir, business strategy guide, and call to action, Tim O'Reilly, Silicon Valley’s leading intellectual and the founder of O’Reilly Media, explores the upside and the potential downsides of today's WTF? technologies. What is the future when an increasing number of jobs can be performed by intelligent machines instead of people, or done only by people in partnership with those machines? What happens to our consumer based societies—to workers and to the companies that depend on their purchasing power? Is income inequality and unemployment an inevitable consequence of technological advancement, or are there paths to a better future? What will happen to business when technology-enabled networks and marketplaces are better at deploying talent than traditional companies? How should companies organize themselves to take advantage of these new tools? What’s the future of education when on-demand learning outperforms traditional institutions? How can individuals continue to adapt and retrain? Will the fundamental social safety nets of the developed world survive the transition, and if not, what will replace them? O'Reilly is "the man who can really can make a whole industry happen," according to Eric Schmidt, Executive Chairman of Alphabet (Google.) His genius over the past four decades has been to identify and to help shape our response to emerging technologies with world shaking potential—the World Wide Web, Open Source Software, Web 2.0, Open Government data, the Maker Movement, Big Data, and now AI. O’Reilly shares the techniques he's used at O’Reilly Media  to make sense of and predict past innovation waves and applies those same techniques to provide a framework for thinking about how today’s world-spanning platforms and networks, on-demand services, and artificial intelligence are changing the nature of business, education, government, financial markets, and the economy as a whole. He provides tools for understanding how all the parts of modern digital businesses work together to create marketplace advantage and customer value, and why ultimately, they cannot succeed unless their ecosystem succeeds along with them.The core of the book's call to action is an exhortation to businesses to DO MORE with technology rather than just using it to cut costs and enrich their shareholders. Robots are going to take our jobs, they say. O'Reilly replies, “Only if that’s what we ask them to do! Technology is the solution to human problems, and we won’t run out of work till we run out of problems." Entrepreneurs need to set their sights on how they can use big data, sensors, and AI to create amazing human experiences and the economy of the future, making us all richer in the same way the tools of the first industrial revolution did. Yes, technology can eliminate labor and make things cheaper, but at its best, we use it to do things that were previously unimaginable! What is our poverty of imagination? What are the entrepreneurial leaps that will allow us to use the technology of today to build a better future, not just a more efficient one? Whether technology brings the WTF? of wonder or the WTF? of dismay isn't inevitable. It's up to us!

The Inevitable: Understanding the 12 Technological Forces That Will Shape Our Future


Kevin Kelly - 2016
    In this fascinating, provocative new book, Kevin Kelly provides an optimistic road map for the future, showing how the coming changes in our lives—from virtual reality in the home to an on-demand economy to artificial intelligence embedded in everything we manufacture—can be understood as the result of a few long-term, accelerating forces. Kelly both describes these deep trends—flowing, screening, accessing, sharing, filtering, remixing, tracking, and questioning—and demonstrates how they overlap and are codependent on one another. These larger forces will completely revolutionize the way we buy, work, learn, and communicate with each other. By understanding and embracing them, says Kelly, it will be easier for us to remain on top of the coming wave of changes and to arrange our day-to-day relationships with technology in ways that bring forth maximum benefits. Kelly’s bright, hopeful book will be indispensable to anyone who seeks guidance on where their business, industry, or life is heading—what to invent, where to work, in what to invest, how to better reach customers, and what to begin to put into place—as this new world emerges.

Nudge: Improving Decisions About Health, Wealth, and Happiness


Richard H. Thaler - 2008
    Thaler, and Cass R. Sunstein: a revelatory look at how we make decisionsNew York Times bestsellerNamed a Best Book of the Year by The Economist and the Financial Times Every day we make choices—about what to buy or eat, about financial investments or our children’s health and education, even about the causes we champion or the planet itself. Unfortunately, we often choose poorly. Nudge is about how we make these choices and how we can make better ones. Using dozens of eye-opening examples and drawing on decades of behavioral science research, Nobel Prize winner Richard H. Thaler and Harvard Law School professor Cass R. Sunstein show that no choice is ever presented to us in a neutral way, and that we are all susceptible to biases that can lead us to make bad decisions. But by knowing how people think, we can use sensible “choice architecture” to nudge people toward the best decisions for ourselves, our families, and our society, without restricting our freedom of choice.

Whiplash: How to Survive Our Faster Future


Joichi Ito - 2016
    The world is more complex and volatile today than at any other time in our history. The tools of our modern existence are getting faster, cheaper, and smaller at an exponential rate, transforming every aspect of society, from business to culture and from the public sphere to our most private moments. The people who succeed will be the ones who learn to think differently. In Whiplash, Joi Ito and Jeff Howe distill that logic into nine organizing principles for navigating and surviving this tumultuous period: Emergence over AuthorityPull over PushCompasses over MapsRisk over SafetyDisobedience over CompliancePractice over TheoryDiversity over AbilityResilience over StrengthSystems over Objects Filled with incredible case studies and cutting-edge research and philosophies from the MIT Media Lab and beyond, Whiplash will help you adapt and succeed in this unpredictable world.

Practical Statistics for Data Scientists: 50 Essential Concepts


Peter Bruce - 2017
    Courses and books on basic statistics rarely cover the topic from a data science perspective. This practical guide explains how to apply various statistical methods to data science, tells you how to avoid their misuse, and gives you advice on what's important and what's not.Many data science resources incorporate statistical methods but lack a deeper statistical perspective. If you're familiar with the R programming language, and have some exposure to statistics, this quick reference bridges the gap in an accessible, readable format.With this book, you'll learn:Why exploratory data analysis is a key preliminary step in data scienceHow random sampling can reduce bias and yield a higher quality dataset, even with big dataHow the principles of experimental design yield definitive answers to questionsHow to use regression to estimate outcomes and detect anomaliesKey classification techniques for predicting which categories a record belongs toStatistical machine learning methods that "learn" from dataUnsupervised learning methods for extracting meaning from unlabeled data

Python Data Science Handbook: Tools and Techniques for Developers


Jake Vanderplas - 2016
    Several resources exist for individual pieces of this data science stack, but only with the Python Data Science Handbook do you get them all—IPython, NumPy, Pandas, Matplotlib, Scikit-Learn, and other related tools.Working scientists and data crunchers familiar with reading and writing Python code will find this comprehensive desk reference ideal for tackling day-to-day issues: manipulating, transforming, and cleaning data; visualizing different types of data; and using data to build statistical or machine learning models. Quite simply, this is the must-have reference for scientific computing in Python.With this handbook, you’ll learn how to use: * IPython and Jupyter: provide computational environments for data scientists using Python * NumPy: includes the ndarray for efficient storage and manipulation of dense data arrays in Python * Pandas: features the DataFrame for efficient storage and manipulation of labeled/columnar data in Python * Matplotlib: includes capabilities for a flexible range of data visualizations in Python * Scikit-Learn: for efficient and clean Python implementations of the most important and established machine learning algorithms