Data Science from Scratch: First Principles with Python


Joel Grus - 2015
    In this book, you’ll learn how many of the most fundamental data science tools and algorithms work by implementing them from scratch. If you have an aptitude for mathematics and some programming skills, author Joel Grus will help you get comfortable with the math and statistics at the core of data science, and with hacking skills you need to get started as a data scientist. Today’s messy glut of data holds answers to questions no one’s even thought to ask. This book provides you with the know-how to dig those answers out. Get a crash course in Python Learn the basics of linear algebra, statistics, and probability—and understand how and when they're used in data science Collect, explore, clean, munge, and manipulate data Dive into the fundamentals of machine learning Implement models such as k-nearest Neighbors, Naive Bayes, linear and logistic regression, decision trees, neural networks, and clustering Explore recommender systems, natural language processing, network analysis, MapReduce, and databases

Web Analytics 2.0: The Art of Online Accountability & Science of Customer Centricity [With CDROM]


Avinash Kaushik - 2009
    "Web Analytics 2.0" presents a new framework that will permanently change how you think about analytics. It provides specific recommendations for creating an actionable strategy, applying analytical techniques correctly, solving challenges such as measuring social media and multichannel campaigns, achieving optimal success by leveraging experimentation, and employing tactics for truly listening to your customers. The book will help your organization become more data driven while you become a super analysis ninja Note: CD-ROM/DVD and other supplementary materials are not included as part of eBook file.

The Art of Data Science: A Guide for Anyone Who Works with Data


Roger D. Peng - 2015
    The authors have extensive experience both managing data analysts and conducting their own data analyses, and have carefully observed what produces coherent results and what fails to produce useful insights into data. This book is a distillation of their experience in a format that is applicable to both practitioners and managers in data science.

Scrum: The Art of Doing Twice the Work in Half the Time


Jeff Sutherland - 2014
    It already drives most of the world’s top technology companies. And now it’s starting to spread to every domain where leaders wrestle with complex projects. If you’ve ever been startled by how fast the world is changing, Scrum is one of the reasons why. Productivity gains of as much as 1200% have been recorded, and there’s no more lucid – or compelling – explainer of Scrum and its bright promise than Jeff Sutherland, the man who put together the first Scrum team more than twenty years ago. The thorny problem Jeff began tackling back then boils down to this: people are spectacularly bad at doing things with agility and efficiency. Best laid plans go up in smoke. Teams often work at cross purposes to each other. And when the pressure rises, unhappiness soars. Drawing on his experience as a West Point-educated fighter pilot, biometrics expert, early innovator of ATM technology, and V.P. of engineering or CTO at eleven different technology companies, Jeff began challenging those dysfunctional realities, looking for solutions that would have global impact. In this book you’ll journey to Scrum’s front lines where Jeff’s system of deep accountability, team interaction, and constant iterative improvement is, among other feats, bringing the FBI into the 21st century, perfecting the design of an affordable 140 mile per hour/100 mile per gallon car, helping NPR report fast-moving action in the Middle East, changing the way pharmacists interact with patients, reducing poverty in the Third World, and even helping people plan their weddings and accomplish weekend chores. Woven with insights from martial arts, judicial decision making, advanced aerial combat, robotics, and many other disciplines, Scrum is consistently riveting. But the most important reason to read this book is that it may just help you achieve what others consider unachievable – whether it be inventing a trailblazing technology, devising a new system of education, pioneering a way to feed the hungry, or, closer to home, a building a foundation for your family to thrive and prosper.

An Introduction to Statistical Learning: With Applications in R


Gareth James - 2013
    This book presents some of the most important modeling and prediction techniques, along with relevant applications. Topics include linear regression, classification, resampling methods, shrinkage approaches, tree- based methods, support vector machines, clustering, and more. Color graphics and real-world examples are used to illustrate the methods presented. Since the goal of this textbook is to facilitate the use of these statistical learning techniques by practitioners in science, industry, and other fields, each chapter contains a tutorial on implementing the analyses and methods presented in R, an extremely popular open source statistical software platform. Two of the authors co-wrote The Elements of Statistical Learning (Hastie, Tibshirani and Friedman, 2nd edition 2009), a popular reference book for statistics and machine learning researchers. An Introduction to Statistical Learning covers many of the same topics, but at a level accessible to a much broader audience. This book is targeted at statisticians and non-statisticians alike who wish to use cutting-edge statistical learning techniques to analyze their data. The text assumes only a previous course in linear regression and no knowledge of matrix algebra.

Data and Goliath: The Hidden Battles to Collect Your Data and Control Your World


Bruce Schneier - 2015
    Your online and in-store purchasing patterns are recorded, and reveal if you're unemployed, sick, or pregnant. Your e-mails and texts expose your intimate and casual friends. Google knows what you’re thinking because it saves your private searches. Facebook can determine your sexual orientation without you ever mentioning it.The powers that surveil us do more than simply store this information. Corporations use surveillance to manipulate not only the news articles and advertisements we each see, but also the prices we’re offered. Governments use surveillance to discriminate, censor, chill free speech, and put people in danger worldwide. And both sides share this information with each other or, even worse, lose it to cybercriminals in huge data breaches.Much of this is voluntary: we cooperate with corporate surveillance because it promises us convenience, and we submit to government surveillance because it promises us protection. The result is a mass surveillance society of our own making. But have we given up more than we’ve gained? In Data and Goliath, security expert Bruce Schneier offers another path, one that values both security and privacy. He brings his bestseller up-to-date with a new preface covering the latest developments, and then shows us exactly what we can do to reform government surveillance programs, shake up surveillance-based business models, and protect our individual privacy. You'll never look at your phone, your computer, your credit cards, or even your car in the same way again.

Practical Statistics for Data Scientists: 50 Essential Concepts


Peter Bruce - 2017
    Courses and books on basic statistics rarely cover the topic from a data science perspective. This practical guide explains how to apply various statistical methods to data science, tells you how to avoid their misuse, and gives you advice on what's important and what's not.Many data science resources incorporate statistical methods but lack a deeper statistical perspective. If you're familiar with the R programming language, and have some exposure to statistics, this quick reference bridges the gap in an accessible, readable format.With this book, you'll learn:Why exploratory data analysis is a key preliminary step in data scienceHow random sampling can reduce bias and yield a higher quality dataset, even with big dataHow the principles of experimental design yield definitive answers to questionsHow to use regression to estimate outcomes and detect anomaliesKey classification techniques for predicting which categories a record belongs toStatistical machine learning methods that "learn" from dataUnsupervised learning methods for extracting meaning from unlabeled data

Big Data: Using SMART Big Data, Analytics and Metrics To Make Better Decisions and Improve Performance


Bernard Marr - 2015
    We all need to know what it is and how it works - that much is obvious. But is a basic understanding of the theory enough to hold your own in strategy meetings? Probably. But what will set you apart from the rest is actually knowing how to USE big data to get solid, real-world business results - and putting that in place to improve performance. Big Data will give you a clear understanding, blueprint, and step-by-step approach to building your own big data strategy. This is a well-needed practical introduction to actually putting the topic into practice. Illustrated with numerous real-world examples from a cross section of companies and organisations, Big Data will take you through the five steps of the SMART model: Start with Strategy, Measure Metrics and Data, Apply Analytics, Report Results, Transform. Discusses how companies need to clearly define what it is they need to know Outlines how companies can collect relevant data and measure the metrics that will help them answer their most important business questions Addresses how the results of big data analytics can be visualised and communicated to ensure key decisions-makers understand them Includes many high-profile case studies from the author's work with some of the world's best known brands

Prediction Machines: The Simple Economics of Artificial Intelligence


Ajay Agrawal - 2018
    But facing the sea change that AI will bring can be paralyzing. How should companies set strategies, governments design policies, and people plan their lives for a world so different from what we know? In the face of such uncertainty, many analysts either cower in fear or predict an impossibly sunny future.But in Prediction Machines, three eminent economists recast the rise of AI as a drop in the cost of prediction. With this single, masterful stroke, they lift the curtain on the AI-is-magic hype and show how basic tools from economics provide clarity about the AI revolution and a basis for action by CEOs, managers, policy makers, investors, and entrepreneurs.When AI is framed as cheap prediction, its extraordinary potential becomes clear: Prediction is at the heart of making decisions under uncertainty. Our businesses and personal lives are riddled with such decisions. Prediction tools increase productivity--operating machines, handling documents, communicating with customers. Uncertainty constrains strategy. Better prediction creates opportunities for new business structures and strategies to compete. Penetrating, fun, and always insightful and practical, Prediction Machines follows its inescapable logic to explain how to navigate the changes on the horizon. The impact of AI will be profound, but the economic framework for understanding it is surprisingly simple.

Give and Take: A Revolutionary Approach to Success


Adam M. Grant - 2013
    But today, success is increasingly dependent on how we interact with others. It turns out that at work, most people operate as either takers, matchers, or givers. Whereas takers strive to get as much as possible from others and matchers aim to trade evenly, givers are the rare breed of people who contribute to others without expecting anything in return. Using his own pioneering research as Wharton's youngest tenured professor, Grant shows that these styles have a surprising impact on success. Although some givers get exploited and burn out, the rest achieve extraordinary results across a wide range of industries. Combining cutting-edge evidence with captivating stories, this landmark book shows how one of America's best networkers developed his connections, why the creative genius behind one of the most popular shows in television history toiled for years in anonymity, how a basketball executive responsible for multiple draft busts transformed his franchise into a winner, and how we could have anticipated Enron's demise four years before the company collapsed - without ever looking at a single number. Praised by bestselling authors such as Dan Pink, Tony Hsieh, Dan Ariely, Susan Cain, Dan Gilbert, Gretchen Rubin, Bob Sutton, David Allen, Robert Cialdini, and Seth Godin-as well as senior leaders from Google, McKinsey, Merck, Estee Lauder, Nike, and NASA - Give and Take highlights what effective networking, collaboration, influence, negotiation, and leadership skills have in common. This landmark book opens up an approach to success that has the power to transform not just individuals and groups, but entire organizations and communities.

Data Analysis Using SQL and Excel


Gordon S. Linoff - 2007
    This book helps you use SQL and Excel to extract business information from relational databases and use that data to define business dimensions, store transactions about customers, produce results, and more. Each chapter explains when and why to perform a particular type of business analysis in order to obtain useful results, how to design and perform the analysis using SQL and Excel, and what the results should look like.

Calling Bullshit: The Art of Skepticism in a Data-Driven World


Carl T. Bergstrom - 2020
    Now, two science professors give us the tools to dismantle misinformation and think clearly in a world of fake news and bad data.It's increasingly difficult to know what's true. Misinformation, disinformation, and fake news abound. Our media environment has become hyperpartisan. Science is conducted by press release. Startup culture elevates bullshit to high art. We are fairly well equipped to spot the sort of old-school bullshit that is based in fancy rhetoric and weasel words, but most of us don't feel qualified to challenge the avalanche of new-school bullshit presented in the language of math, science, or statistics. In Calling Bullshit, Professors Carl Bergstrom and Jevin West give us a set of powerful tools to cut through the most intimidating data.You don't need a lot of technical expertise to call out problems with data. Are the numbers or results too good or too dramatic to be true? Is the claim comparing like with like? Is it confirming your personal bias? Drawing on a deep well of expertise in statistics and computational biology, Bergstrom and West exuberantly unpack examples of selection bias and muddled data visualization, distinguish between correlation and causation, and examine the susceptibility of science to modern bullshit.We have always needed people who call bullshit when necessary, whether within a circle of friends, a community of scholars, or the citizenry of a nation. Now that bullshit has evolved, we need to relearn the art of skepticism.

Superforecasting: The Art and Science of Prediction


Philip E. Tetlock - 2015
    Unfortunately, people tend to be terrible forecasters. As Wharton professor Philip Tetlock showed in a landmark 2005 study, even experts’ predictions are only slightly better than chance. However, an important and underreported conclusion of that study was that some experts do have real foresight, and Tetlock has spent the past decade trying to figure out why. What makes some people so good? And can this talent be taught?   In Superforecasting, Tetlock and coauthor Dan Gardner offer a masterwork on prediction, drawing on decades of research and the results of a massive, government-funded forecasting tournament. The Good Judgment Project involves tens of thousands of ordinary people—including a Brooklyn filmmaker, a retired pipe installer, and a former ballroom dancer—who set out to forecast global events. Some of the volunteers have turned out to be astonishingly good. They’ve beaten other benchmarks, competitors, and prediction markets. They’ve even beaten the collective judgment of intelligence analysts with access to classified information. They are "superforecasters."   In this groundbreaking and accessible book, Tetlock and Gardner show us how we can learn from this elite group. Weaving together stories of forecasting successes (the raid on Osama bin Laden’s compound) and failures (the Bay of Pigs) and interviews with a range of high-level decision makers, from David Petraeus to Robert Rubin, they show that good forecasting doesn’t require powerful computers or arcane methods. It involves gathering evidence from a variety of sources, thinking probabilistically, working in teams, keeping score, and being willing to admit error and change course. Superforecasting offers the first demonstrably effective way to improve our ability to predict the future—whether in business, finance, politics, international affairs, or daily life—and is destined to become a modern classic.

The Unwritten Laws of Business


W.J. King - 1944
    The Unwritten Laws of Business is such a book. Originally published over 60 years ago as The Unwritten Laws of Engineering, it has sold over 100,000 copies, despite the fact that it has never been available before to general readers. Fully revised for business readers today, here are but a few of the gems you’ll find in this little-known business classic: If you take care of your present job well, the future will take care of itself.The individual who says nothing is usually credited with having nothing to say.Whenever you are performing someone else’s function, you are probably neglecting your own.Martyrdom only rarely makes heroes, and in the business world, such heroes and martyrs often find themselves unemployed.Refreshingly free of the latest business fads and jargon, this is a book that is wise and insightful, capturing and distilling the timeless truths and principles that underlie management and business the world over.The little book with the big history.In the summer of 2005, Business 2.0 published a cover story on Raytheon CEO William Swanson’s self-published pamphlet, Swanson’s Unwritten Rules of Management. Lauded by such chief executives as Jack Welch and Warren Buffett, the booklet becamea quiet phenomenon. As it turned out, much of Swanson’s book drew from a classic of business literature that has been in print for more than sixty years. Now, in a new edition revised and updated for business readers today, we are reissuing the 1944 classic that inspired a number of Swanson’s “rules”: The Unwritten Laws of Business. Filled with sage advice and written in a spare, engaging style, The Unwritten Laws of Business offers insights on working with others, reporting to a boss, organizing a project, running a meeting, advancing your career, and more. Here’s just a sprinkling of the old-fashioned, yet surprisingly relevant, wisdom you’ll find in these pages:If you have no intention of listening to, considering, and perhaps using, someone’s opinion, don’t ask for it.Count any meeting a failure that does not end up with a definite understanding as to what’s going to be done, who’s going to do it, and when.The common belief that everyone can do anything if they just try hard enough is a formula for inefficiency at best and for complete failure at worst.It is natural enough to “look out for Number One first,” but when you do, your associates will be noticeably disinclined to look out for you.Whether you’re a corporate neophyte or seasoned manager, this charming book reveals everything you need to know about the “unwritten” laws of business.

Data Feminism


Catherine D’Ignazio - 2020
    It has been used to expose injustice, improve health outcomes, and topple governments. But it has also been used to discriminate, police, and surveil. This potential for good, on the one hand, and harm, on the other, makes it essential to ask: Data science by whom? Data science for whom? Data science with whose interests in mind? The narratives around big data and data science are overwhelmingly white, male, and techno-heroic. In Data Feminism, Catherine D'Ignazio and Lauren Klein present a new way of thinking about data science and data ethics—one that is informed by intersectional feminist thought.Illustrating data feminism in action, D'Ignazio and Klein show how challenges to the male/female binary can help challenge other hierarchical (and empirically wrong) classification systems. They explain how, for example, an understanding of emotion can expand our ideas about effective data visualization, and how the concept of invisible labor can expose the significant human efforts required by our automated systems. And they show why the data never, ever “speak for themselves.”Data Feminism offers strategies for data scientists seeking to learn how feminism can help them work toward justice, and for feminists who want to focus their efforts on the growing field of data science. But Data Feminism is about much more than gender. It is about power, about who has it and who doesn't, and about how those differentials of power can be challenged and changed.