Head First Data Analysis: A Learner's Guide to Big Numbers, Statistics, and Good Decisions


Michael G. Milton - 2009
    If your job requires you to manage and analyze all kinds of data, turn to Head First Data Analysis, where you'll quickly learn how to collect and organize data, sort the distractions from the truth, find meaningful patterns, draw conclusions, predict the future, and present your findings to others. Whether you're a product developer researching the market viability of a new product or service, a marketing manager gauging or predicting the effectiveness of a campaign, a salesperson who needs data to support product presentations, or a lone entrepreneur responsible for all of these data-intensive functions and more, the unique approach in Head First Data Analysis is by far the most efficient way to learn what you need to know to convert raw data into a vital business tool. You'll learn how to: determine which data sources to use for collecting information; assess data quality and distinguish signal from noise; build basic data models to illuminate patterns, and assimilate new information into the models; cope with ambiguous information; design experiments to test hypotheses and draw conclusions; use segmentation to organize your data within discrete market groups; visualize data distributions to reveal new relationships and persuade others; predict the future with sampling and probability models; clean your data to make it useful; and communicate the results of your analysis to your audience. Using the latest research in cognitive science and learning theory to craft a multi-sensory learning experience, Head First Data Analysis uses a visually rich format designed for the way your brain works, not a text-heavy approach that puts you to sleep.

Deep Learning


Ian Goodfellow - 2016
    Because the computer gathers knowledge from experience, there is no need for a human computer operator to formally specify all the knowledge that the computer needs. The hierarchy of concepts allows the computer to learn complicated concepts by building them out of simpler ones; a graph of these hierarchies would be many layers deep. This book introduces a broad range of topics in deep learning. The text offers mathematical and conceptual background, covering relevant concepts in linear algebra, probability theory and information theory, numerical computation, and machine learning. It describes deep learning techniques used by practitioners in industry, including deep feedforward networks, regularization, optimization algorithms, convolutional networks, sequence modeling, and practical methodology; and it surveys such applications as natural language processing, speech recognition, computer vision, online recommendation systems, bioinformatics, and videogames. Finally, the book offers research perspectives, covering such theoretical topics as linear factor models, autoencoders, representation learning, structured probabilistic models, Monte Carlo methods, the partition function, approximate inference, and deep generative models. Deep Learning can be used by undergraduate or graduate students planning careers in either industry or research, and by software engineers who want to begin using deep learning in their products or platforms. A website offers supplementary material for both readers and instructors.

Human Compatible: Artificial Intelligence and the Problem of Control


Stuart Russell - 2019
    Conflict between humans and machines is seen as inevitable and its outcome all too predictable. In this groundbreaking book, distinguished AI researcher Stuart Russell argues that this scenario can be avoided, but only if we rethink AI from the ground up. Russell begins by exploring the idea of intelligence in humans and in machines. He describes the near-term benefits we can expect, from intelligent personal assistants to vastly accelerated scientific research, and outlines the AI breakthroughs that still have to happen before we reach superhuman AI. He also spells out the ways humans are already finding to misuse AI, from lethal autonomous weapons to viral sabotage. If the predicted breakthroughs occur and superhuman AI emerges, we will have created entities far more powerful than ourselves. How can we ensure they never, ever, have power over us? Russell suggests that we can rebuild AI on a new foundation, according to which machines are designed to be inherently uncertain about the human preferences they are required to satisfy. Such machines would be humble, altruistic, and committed to pursue our objectives, not theirs. This new foundation would allow us to create machines that are provably deferential and provably beneficial. In a 2014 editorial co-authored with Stephen Hawking, Russell wrote, "Success in creating AI would be the biggest event in human history. Unfortunately, it might also be the last." Solving the problem of control over AI is not just possible; it is the key that unlocks a future of unlimited promise.

Hadoop: The Definitive Guide


Tom White - 2009
    Ideal for processing large datasets, the Apache Hadoop framework is an open source implementation of the MapReduce algorithm on which Google built its empire. This comprehensive resource demonstrates how to use Hadoop to build reliable, scalable, distributed systems: programmers will find details for analyzing large datasets, and administrators will learn how to set up and run Hadoop clusters. Complete with case studies that illustrate how Hadoop solves specific problems, this book helps you: use the Hadoop Distributed File System (HDFS) for storing large datasets, and run distributed computations over those datasets using MapReduce; become familiar with Hadoop's data and I/O building blocks for compression, data integrity, serialization, and persistence; discover common pitfalls and advanced features for writing real-world MapReduce programs; design, build, and administer a dedicated Hadoop cluster, or run Hadoop in the cloud; use Pig, a high-level query language for large-scale data processing; take advantage of HBase, Hadoop's database for structured and semi-structured data; and learn ZooKeeper, a toolkit of coordination primitives for building distributed systems. If you have lots of data -- whether it's gigabytes or petabytes -- Hadoop is the perfect solution. Hadoop: The Definitive Guide is the most thorough book available on the subject. "Now you have the opportunity to learn about Hadoop from a master -- not only of the technology, but also of common sense and plain talk." -- Doug Cutting, Hadoop Founder, Yahoo!

The Signal and the Noise: Why So Many Predictions Fail—But Some Don't


Nate Silver - 2012
    He solidified his standing as the nation's foremost political forecaster with his near perfect prediction of the 2012 election. Silver is the founder and editor in chief of FiveThirtyEight.com. Drawing on his own groundbreaking work, Silver examines the world of prediction, investigating how we can distinguish a true signal from a universe of noisy data. Most predictions fail, often at great cost to society, because most of us have a poor understanding of probability and uncertainty. Both experts and laypeople mistake more confident predictions for more accurate ones. But overconfidence is often the reason for failure. If our appreciation of uncertainty improves, our predictions can get better too. This is the "prediction paradox": The more humility we have about our ability to make predictions, the more successful we can be in planning for the future. In keeping with his own aim to seek truth from data, Silver visits the most successful forecasters in a range of areas, from hurricanes to baseball, from the poker table to the stock market, from Capitol Hill to the NBA. He explains and evaluates how these forecasters think and what bonds they share. What lies behind their success? Are they good -- or just lucky? What patterns have they unraveled? And are their forecasts really right? He explores unanticipated commonalities and exposes unexpected juxtapositions. And sometimes, it is not so much how good a prediction is in an absolute sense that matters but how good it is relative to the competition. In other cases, prediction is still a very rudimentary -- and dangerous -- science. Silver observes that the most accurate forecasters tend to have a superior command of probability, and they tend to be both humble and hardworking. They distinguish the predictable from the unpredictable, and they notice a thousand little details that lead them closer to the truth. Because of their appreciation of probability, they can distinguish the signal from the noise.

Artificial Intelligence and Machine Learning for Business: A No-Nonsense Guide to Data Driven Technologies


Steven Finlay - 2021
    They are being applied across many industries to increase profits, reduce costs, save lives and improve customer experiences. Consequently, organizations that understand these tools and know how to use them are benefiting at the expense of their rivals. Artificial Intelligence and Machine Learning for Business cuts through the hype and technical jargon that is often associated with these subjects. It delivers a simple and concise introduction for managers and business people. The focus is on practical application and how to work with technical specialists (data scientists) to maximize the benefits of these technologies. This revised and fully updated edition contains several new sections and chapters, covering a broader set of topics than before, but retains the no-nonsense style of the original. Steven Finlay is a data scientist and author with more than 20 years’ experience of developing practical, business focused, analytical solutions. He holds a PhD in management science and is an honorary research fellow at Lancaster University in the UK.

Sensemaking: The Power of the Humanities in the Age of the Algorithm


Christian Madsbjerg - 2017
    Humans have become subservient to algorithms. Every day brings a new Moneyball fix--a math whiz who will crack open an industry with clean fact-based analysis rather than human intuition and experience. As a result, we have stopped thinking. Machines do it for us. Christian Madsbjerg argues that our fixation with data often masks stunning deficiencies, and the risks for humankind are enormous. Blind devotion to number crunching imperils our businesses, our educations, our governments, and our life savings. Too many companies have lost touch with the humanity of their customers, while marginalizing workers with liberal arts-based skills. Contrary to popular thinking, Madsbjerg shows how many of today's biggest success stories stem not from "quant" thinking but from deep, nuanced engagement with culture, language, and history. He calls his method sensemaking. In this landmark book, Madsbjerg lays out five principles for how business leaders, entrepreneurs, and individuals can use it to solve their thorniest problems. He profiles companies using sensemaking to connect with new customers, and takes readers inside the work process of sensemaking "connoisseurs" like investor George Soros, architect Bjarke Ingels, and others. Both practical and philosophical, Sensemaking is a powerful rejoinder to corporate groupthink and an indispensable resource for leaders and innovators who want to stand out from the pack.

Data Analysis Using Regression and Multilevel/Hierarchical Models


Andrew Gelman - 2006
    The book introduces a wide variety of models, whilst at the same time instructing the reader in how to fit these models using available software packages. The book illustrates the concepts by working through scores of real data examples that have arisen from the authors' own applied research, with programming codes provided for each one. Topics covered include causal inference, including regression, poststratification, matching, regression discontinuity, and instrumental variables, as well as multilevel logistic regression and missing-data imputation. Practical tips regarding building, fitting, and understanding are provided throughout. Author resource page: http://www.stat.columbia.edu/~gelman/arm/

The Hundred-Page Machine Learning Book


Andriy Burkov - 2019
    During that week, you will learn almost everything modern machine learning has to offer. The author and other practitioners have spent years learning these concepts. Companion wiki: the book has a continuously updated wiki that extends some book chapters with additional information: Q&A, code snippets, further reading, tools, and other relevant resources. Flexible price and formats: choose from a variety of formats and price options: Kindle, hardcover, paperback, EPUB, PDF. If you buy an EPUB or a PDF, you decide the price you pay! Read first, buy later: download book chapters for free, read them and share with your friends and colleagues. Buy the book only if you liked it or found it useful in your work, study or business.

Bad Data Handbook: Cleaning Up The Data So You Can Get Back To Work


Q. Ethan McCallum - 2012
    In this handbook, data expert Q. Ethan McCallum has gathered 19 colleagues from every corner of the data arena to reveal how they’ve recovered from nasty data problems. From cranky storage to poor representation to misguided policy, there are many paths to bad data. Bottom line? Bad data is data that gets in the way. This book explains effective ways to get around it. Among the many topics covered, you’ll discover how to: test drive your data to see if it’s ready for analysis; work spreadsheet data into a usable form; handle encoding problems that lurk in text data; develop a successful web-scraping effort; use NLP tools to reveal the real sentiment of online reviews; address cloud computing issues that can impact your analysis effort; avoid policies that create data analysis roadblocks; and take a systematic approach to data quality analysis.

The Master Switch: The Rise and Fall of Information Empires


Tim Wu - 2010
    With all our media now traveling a single network, an unprecedented potential is building for centralized control over what Americans see and hear. Could history repeat itself with the next industrial consolidation? Could the Internet—the entire flow of American information—come to be ruled by one corporate leviathan in possession of “the master switch”? That is the big question of Tim Wu’s pathbreaking book. As Wu’s sweeping history shows, each of the new media of the twentieth century—radio, telephone, television, and film—was born free and open. Each invited unrestricted use and enterprising experiment until some would-be mogul battled his way to total domination. Here are stories of an uncommon will to power, the power over information: Adolph Zukor, who took a technology once used as commonly as YouTube is today and made it the exclusive prerogative of a kingdom called Hollywood . . . NBC’s founder, David Sarnoff, who, to save his broadcast empire from disruptive visionaries, bullied one inventor (of electronic television) into alcoholic despair and another (this one of FM radio, and his boyhood friend) into suicide . . . And foremost, Theodore Vail, founder of the Bell System, the greatest information empire of all time, and a capitalist whose faith in Soviet-style central planning set the course of every information industry thereafter. Explaining how invention begets industry and industry begets empire—a progress often blessed by government, typically with stifling consequences for free expression and technical innovation alike—Wu identifies a time-honored pattern in the maneuvers of today’s great information powers: Apple, Google, and an eerily resurgent AT&T.
A battle royal looms for the Internet’s future, and with almost every aspect of our lives now dependent on that network, this is one war we dare not tune out. Part industrial exposé, part meditation on what freedom requires in the information age, The Master Switch is a stirring illumination of a drama that has played out over decades in the shadows of our national life and now culminates with terrifying implications for our future.

Atlas of AI: Power, Politics, and the Planetary Costs of Artificial Intelligence


Kate Crawford - 2020
    It draws our attention away from the bright shiny objects of the new colonialism through elucidating the social, material and political dimensions of Artificial Intelligence.”—Geoffrey C. Bowker, University of California, Irvine What happens when artificial intelligence saturates political life and depletes the planet? How is AI shaping our understanding of ourselves and our societies? In this book Kate Crawford reveals how this planetary network is fueling a shift toward undemocratic governance and increased racial, gender, and economic inequality. Drawing on more than a decade of research, award-winning science, and technology, Crawford reveals how AI is a technology of extraction: from the energy and minerals needed to build and sustain its infrastructure, to the exploited workers behind “automated” services, to the data AI collects from us. Rather than taking a narrow focus on code and algorithms, Crawford offers us a political and a material perspective on what it takes to make artificial intelligence and where it goes wrong. While technical systems present a veneer of objectivity, they are always systems of power. This is an urgent account of what is at stake as technology companies use artificial intelligence to reshape the world.

Thinking with Data


Max Shron - 2014
    In this practical guide, data strategy consultant Max Shron shows you how to put the why before the how, through an often-overlooked set of analytical skills. Thinking with Data helps you learn techniques for turning data into knowledge you can use. You’ll learn a framework for defining your project, including the data you want to collect, and how you intend to approach, organize, and analyze the results. You’ll also learn patterns of reasoning that will help you unveil the real problem that needs to be solved. Learn a framework for scoping data projects. Understand how to pin down the details of an idea, receive feedback, and begin prototyping. Use the tools of arguments to ask good questions, build projects in stages, and communicate results. Explore data-specific patterns of reasoning and learn how to build more useful arguments. Delve into causal reasoning and learn how it permeates data work. Put everything together, using extended examples to see the method of full problem thinking in action.

The Social Life of Information


John Seely Brown - 2000
    John Seely Brown and Paul Duguid argue that the gap between digerati hype and end-user gloom is largely due to the "tunnel vision" that information-driven technologies breed. We've become so focused on where we think we ought to be--a place where technology empowers individuals and obliterates social organizations--that we often fail to see where we're really going. The Social Life of Information shows us how to look beyond our obsession with information and individuals to include the critical social networks of which these are always a part.

Python for Kids


Jason R. Briggs - 2012
    Jason Briggs, author of the popular online tutorial "Snake Wrangling for Kids," begins with the basics of how to install Python and write simple commands. In bite-sized chapters, he instructs readers on the essentials of Python, including how to use Python's extensive standard library, the difference between strings and lists, and using for-loops and while-loops. By the end of the book, readers have built a game and created drawings with Python's graphics library, Turtle. Each chapter closes with fun and relevant exercises that challenge the reader to put their newly acquired knowledge to the test.