What Is Data Science?


Mike Loukides - 2011
    Five years ago, in What is Web 2.0, Tim O'Reilly said that "data is the next Intel Inside." But what does that statement mean? Why do we suddenly care about statistics and about data? This report examines the many sides of data science -- the technologies, the companies and the unique skill sets. The web is full of "data-driven apps." Almost any e-commerce application is a data-driven application. There's a database behind a web front end, and middleware that talks to a number of other databases and data services (credit card processing companies, banks, and so on). But merely using data isn't really what we mean by "data science." A data application acquires its value from the data itself, and creates more data as a result. It's not just an application with data; it's a data product. Data science enables the creation of data products.

Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, or Die


Eric Siegel - 2013
    Rather than a "how to" for hands-on techies, the book entices lay-readers and experts alike by covering new case studies and the latest state-of-the-art techniques. You have been predicted -- by companies, governments, law enforcement, hospitals, and universities. Their computers say, "I knew you were going to do that!" These institutions are seizing upon the power to predict whether you're going to click, buy, lie, or die. Why? For good reason: predicting human behavior combats financial risk, fortifies healthcare, conquers spam, toughens crime fighting, and boosts sales. How? Prediction is powered by the world's most potent, booming unnatural resource: data. Accumulated in large part as the by-product of routine tasks, data is the unsalted, flavorless residue deposited en masse as organizations churn away. Surprise! This heap of refuse is a gold mine. Big data embodies an extraordinary wealth of experience from which to learn. Predictive analytics unleashes the power of data. With this technology, the computer literally learns from data how to predict the future behavior of individuals. Perfect prediction is not possible, but putting odds on the future -- lifting a bit of the fog off our hazy view of tomorrow -- means pay dirt. In this rich, entertaining primer, former Columbia University professor and Predictive Analytics World founder Eric Siegel reveals the power and perils of prediction:
    - What type of mortgage risk Chase Bank predicted before the recession.
    - Predicting which people will drop out of school, cancel a subscription, or get divorced before they are even aware of it themselves.
    - Why early retirement decreases life expectancy and vegetarians miss fewer flights.
    - Five reasons why organizations predict death, including one health insurance company.
    - How U.S. Bank, European wireless carrier Telenor, and Obama's 2012 campaign calculated the way to most strongly influence each individual.
    - How IBM's Watson computer used predictive modeling to answer questions and beat the human champs on TV's Jeopardy!
    - How companies ascertain untold, private truths -- how Target figures out you're pregnant and Hewlett-Packard deduces you're about to quit your job.
    - How judges and parole boards rely on crime-predicting computers to decide who stays in prison and who goes free.
    - What's predicted by the BBC, Citibank, ConEd, Facebook, Ford, Google, IBM, the IRS, Match.com, MTV, Netflix, Pandora, PayPal, Pfizer, and Wikipedia.
    A truly omnipresent science, predictive analytics affects everyone, every day. Although largely unseen, it drives millions of decisions, determining whom to call, mail, investigate, incarcerate, set up on a date, or medicate. Predictive analytics transcends human perception. This book's final chapter answers the riddle: What often happens to you that cannot be witnessed, and that you can't even be sure has happened afterward -- but that can be predicted in advance? Whether you are a consumer of it -- or consumed by it -- get a handle on the power of Predictive Analytics.

The Mathematical Theory of Communication


Claude Shannon - 1949
    Republished in book form shortly thereafter, it has since gone through four hardcover and sixteen paperback printings. It is a revolutionary work, astounding in its foresight and contemporaneity. The University of Illinois Press is pleased and honored to issue this commemorative reprinting of a classic.

The Book of Why: The New Science of Cause and Effect


Judea Pearl - 2018
    Today, that taboo is dead. The causal revolution, instigated by Judea Pearl and his colleagues, has cut through a century of confusion and established causality -- the study of cause and effect -- on a firm scientific basis. His work explains how we can know easy things, like whether it was rain or a sprinkler that made a sidewalk wet; and how to answer hard questions, like whether a drug cured an illness. Pearl's work enables us to know not just whether one thing causes another: it lets us explore the world that is and the worlds that could have been. It shows us the essence of human thought and key to artificial intelligence. Anyone who wants to understand either needs The Book of Why.

Normal Accidents: Living with High-Risk Technologies


Charles Perrow - 1984
    Charles Perrow argues that the conventional engineering approach to ensuring safety--building in more warnings and safeguards--fails because systems complexity makes failures inevitable. He asserts that typical precautions, by adding to complexity, may help create new categories of accidents. (At Chernobyl, tests of a new safety system helped produce the meltdown and subsequent fire.) By recognizing two dimensions of risk--complex versus linear interactions, and tight versus loose coupling--this book provides a powerful framework for analyzing risks and the organizations that insist we run them. The first edition fulfilled one reviewer's prediction that it may mark the beginning of accident research. In the new afterword to this edition Perrow reviews the extensive work on the major accidents of the last fifteen years, including Bhopal, Chernobyl, and the Challenger disaster. The new postscript probes what the author considers to be the quintessential 'Normal Accident' of our time: the Y2K computer problem.

Data-ism: The Revolution Transforming Decision Making, Consumer Behavior, and Almost Everything Else


Steve Lohr - 2015
    Today, data is the vital raw material of the information economy. The explosive abundance of this digital asset, more than doubling every two years, is creating a new world of opportunity and challenge. Data-ism is about this next phase, in which vast, Internet-scale data sets are used for discovery and prediction in virtually every field. It is a journey across this emerging world with people, illuminating narrative examples, and insights. It shows that, if exploited, this new revolution will change the way decisions are made, relying more on data and analysis and less on intuition and experience, and transform the nature of leadership and management. Lohr explains how individuals and institutions will need to exploit, protect, and manage their data to stay competitive in the coming years. Filled with rich examples and anecdotes of the various ways in which the rise of Big Data is affecting everyday life, it raises provocative questions about policy and practice that have wide implications for all of our lives.

Overconnected: The Promise and Threat of the Internet


William H. Davidow - 2010
    The benefits of our recently arrived-at state of connectivity have been myriad, from the ease with which it has been possible to buy a new house to the convenience of borrowing and investing money profitably. But the luxuries of the connected age have taken on a momentum all of their own. By counter-intuitively anatomizing how being overconnected tends to create systems of positive feedback that have largely negative consequences, Davidow explains everything from the recent subprime mortgage crisis to the meltdown of Iceland, from the loss of people's privacy to the spectacular fall of the stock market. All because we were so miraculously wired together. Explaining how such symptoms of Internet connection as unforeseeable accidents and thought contagions acted to accelerate the downfall and make us permanently vulnerable to catastrophe, Davidow places our recent experience in historical perspective and offers a set of practical steps to minimize similar disasters in the future. William Davidow is a successful Silicon Valley venture capitalist, philanthropist, and author, and as a senior vice-president of Intel Corporation, he was responsible for the design of the Intel microprocessor chip. He has written three previous books: Marketing High Technology (The Free Press, 1986) and Total Customer Service (Harper, 1989), both with Bro Uttal, and The Virtual Corporation (Harper, 1992), with Michael Malone, as well as columns for Forbes and numerous op-ed pieces. He graduated from Dartmouth College, has a master's degree from the California Institute of Technology, and a Ph.D. from Stanford University. He serves on the boards of Cal Tech, the California Nature Conservancy, and the Stanford Institute for Economic Policy Research.

New Dark Age: Technology and the End of the Future


James Bridle - 2018
    Underlying this trend is a single idea: the belief that our existence is understandable through computation, and more data is enough to help us build a better world. In actual fact, we are lost in a sea of information, increasingly divided by fundamentalism, simplistic narratives, conspiracy theories, and post-factual politics. Meanwhile, those in power use our lack of understanding to further their own interests. Despite the accessibility of information, we’re living in a new Dark Age. From rogue financial systems to shopping algorithms, from artificial intelligence to state secrecy, we no longer understand how our world is governed or presented to us. The media is filled with unverifiable speculation, much of it generated by anonymous software, while companies dominate their employees through surveillance and the threat of automation. In his brilliant new work, leading artist and writer James Bridle excavates the limits of technology and how it aids our understanding of the world. Surveying the history of art, technology, and information systems, he explores the dark clouds that gather over our dreams of the digital sublime.

Social Network Analysis: A Handbook


John P. Scott - 1991
    It gives a clear and authoritative guide to the general framework of network analysis, explaining the basic concepts and technical measures and reviewing the available computer programs. The book outlines both the theoretical basis of network analysis and the key techniques for using it as a research tool. Building upon definitions of points, lines and paths, John Scott demonstrates their use in clarifying such measures as density, fragmentation and centralization. He identifies the various cliques, components and circles into which networks are formed, and outlines

Technically Wrong: Sexist Apps, Biased Algorithms, and Other Threats of Toxic Tech


Sara Wachter-Boettcher - 2017
    But few of us realize just how many oversights, biases, and downright ethical nightmares are baked inside the tech products we use every day. It’s time we change that. In Technically Wrong, Sara Wachter-Boettcher demystifies the tech industry, leaving those of us on the other side of the screen better prepared to make informed choices about the services we use, and to demand more from the companies behind them.

Piracy: The Intellectual Property Wars from Gutenberg to Gates


Adrian Johns - 2010
    The Motion Picture Association of America, for instance, claimed that in 2005 the film industry lost $2.3 billion in revenue to piracy online. But here Adrian Johns shows that piracy has a much longer and more vital history than we have realized—one that has been largely forgotten and is little understood. Piracy explores the intellectual property wars from the advent of print culture in the fifteenth century to the reign of the Internet in the twenty-first. Brimming with broader implications for today’s debates over open access, fair use, free culture, and the like, Johns’s book ultimately argues that piracy has always stood at the center of our attempts to reconcile creativity and commerce—and that piracy has been an engine of social, technological, and intellectual innovations as often as it has been their adversary. From Cervantes to Sonny Bono, from Maria Callas to Microsoft, from Grub Street to Google, no chapter in the story of piracy evades Johns’s graceful analysis in what will be the definitive history of the subject for years to come.

The Power of Experiments: Decision Making in a Data-Driven World


Michael Luca - 2020
    Once an esoteric tool for academic research, the randomized controlled trial has gone mainstream. No tech company worth its salt (or its share price) would dare make major changes to its platform without first running experiments to understand how they would influence user behavior. In this book, Michael Luca and Max Bazerman explain the importance of experiments for decision making in a data-driven world. Luca and Bazerman describe the central role experiments play in the tech sector, drawing lessons and best practices from the experiences of such companies as StubHub, Alibaba, and Uber. Successful experiments can save companies money--eBay, for example, discovered how to cut $50 million from its yearly advertising budget--or bring to light something previously ignored, as when Airbnb was forced to confront rampant discrimination by its hosts. Moving beyond tech, Luca and Bazerman consider experimenting for the social good--different ways that governments are using experiments to influence or "nudge" behavior ranging from voter apathy to school absenteeism. Experiments, they argue, are part of any leader's toolkit. With this book, readers can become part of "the experimental revolution."

The Coding Manual for Qualitative Researchers


Johnny Saldana - 2009
    In total, 29 different approaches to coding are covered, ranging in complexity from beginner to advanced level and covering the full range of types of qualitative data from interview transcripts to field notes. For each approach profiled, Johnny Saldana discusses the method's origins in the professional literature and provides a description of the method, recommendations for practical applications, and a clearly illustrated example. Also included in the book is an introduction to how codes and coding initiate qualitative data analysis, their applications with qualitative data analysis software, the writing of supplemental analytic memos, and recommendations for how to best use the manual for particular studies.

D is for Digital: What a well-informed person should know about computers and communications


Brian W. Kernighan - 2011
    

Seven Databases in Seven Weeks: A Guide to Modern Databases and the NoSQL Movement


Eric Redmond - 2012
    As a modern application developer you need to understand the emerging field of data management, both RDBMS and NoSQL. Seven Databases in Seven Weeks takes you on a tour of some of the hottest open source databases today. In the tradition of Bruce A. Tate's Seven Languages in Seven Weeks, this book goes beyond your basic tutorial to explore the essential concepts at the core of each technology: Redis, Neo4J, CouchDB, MongoDB, HBase, Riak and Postgres. With each database, you'll tackle a real-world data problem that highlights the concepts and features that make it shine. You'll explore the five data models employed by these databases (relational, key/value, columnar, document and graph) and which kinds of problems are best suited to each. You'll learn how MongoDB and CouchDB are strikingly different, and discover the Dynamo heritage at the heart of Riak. Make your applications faster with Redis and more connected with Neo4J. Use MapReduce to solve Big Data problems. Build clusters of servers using scalable services like Amazon's Elastic Compute Cloud (EC2). Discover the CAP theorem and its implications for your distributed data. Understand the tradeoffs between consistency and availability, and when you can use them to your advantage. Use multiple databases in concert to create a platform that's more than the sum of its parts, or find one that meets all your needs at once. Seven Databases in Seven Weeks will take you on a deep dive into each of the databases, their strengths and weaknesses, and how to choose the ones that fit your needs. What You Need: To get the most out of this book you'll have to follow along, and that means you'll need a *nix shell (Mac OSX or Linux preferred, Windows users will need Cygwin), and Java 6 (or greater) and Ruby 1.8.7 (or greater). Each chapter will list the downloads required for that database.