Book picks similar to
On Being a Data Skeptic by Cathy O'Neil
nonfiction
non-fiction
data-science
technology
Machine Learning With Random Forests And Decision Trees: A Mostly Intuitive Guide, But Also Some Python
Scott Hartshorn - 2016
They are typically used to categorize something based on other data that you have. The purpose of this book is to help you understand how Random Forests work, as well as the different options that you have when using them to analyze a problem. Additionally, since Decision Trees are a fundamental part of Random Forests, this book explains how they work. This book is focused on understanding Random Forests at the conceptual level: knowing how they work, why they work the way that they do, and what options are available to improve results. This book covers how Random Forests work in an intuitive way, and also explains the equations behind many of the functions, but it only has a small amount of actual code (in Python). This book is focused on giving examples and providing analogies for the most fundamental aspects of how random forests and decision trees work. The reason is that those are easy to understand and they stick with you. There are also some really interesting aspects of random forests, such as information gain, feature importances, or out-of-bag error, that simply cannot be well covered without diving into the equations of how they work. For those the focus is providing the information in a straightforward and easy to understand way.
Natural Language Processing with Python
Steven Bird - 2009
With it, you'll learn how to write Python programs that work with large collections of unstructured text. You'll access richly annotated datasets using a comprehensive range of linguistic data structures, and you'll understand the main algorithms for analyzing the content and structure of written communication. Packed with examples and exercises, Natural Language Processing with Python will help you:
* Extract information from unstructured text, either to guess the topic or identify "named entities"
* Analyze linguistic structure in text, including parsing and semantic analysis
* Access popular linguistic databases, including WordNet and treebanks
* Integrate techniques drawn from fields as diverse as linguistics and artificial intelligence
This book will help you gain practical skills in natural language processing using the Python programming language and the Natural Language Toolkit (NLTK) open source library. If you're interested in developing web applications, analyzing multilingual news sources, or documenting endangered languages -- or if you're simply curious to have a programmer's perspective on how human language works -- you'll find Natural Language Processing with Python both fascinating and immensely useful.
Learning From Data: A Short Course
Yaser S. Abu-Mostafa - 2012
Its techniques are widely applied in engineering, science, finance, and commerce. This book is designed for a short course on machine learning. It is a short course, not a hurried course. From over a decade of teaching this material, we have distilled what we believe to be the core topics that every student of the subject should know. We chose the title 'learning from data' that faithfully describes what the subject is about, and made it a point to cover the topics in a story-like fashion. Our hope is that the reader can learn all the fundamentals of the subject by reading the book cover to cover. Learning from data has distinct theoretical and practical tracks. In this book, we balance the theoretical and the practical, the mathematical and the heuristic. Our criterion for inclusion is relevance. Theory that establishes the conceptual framework for learning is included, and so are heuristics that impact the performance of real learning systems. Learning from data is a very dynamic field. Some of the hot techniques and theories at times become just fads, and others gain traction and become part of the field. What we have emphasized in this book are the necessary fundamentals that give any student of learning from data a solid foundation, and enable him or her to venture out and explore further techniques and theories, or perhaps to contribute their own. The authors are professors at California Institute of Technology (Caltech), Rensselaer Polytechnic Institute (RPI), and National Taiwan University (NTU), where this book is the main text for their popular courses on machine learning. The authors also consult extensively with financial and commercial companies on machine learning applications, and have led winning teams in machine learning competitions.
Numbers Don't Lie: 71 Things You Need to Know About the World
Vaclav Smil - 2020
There's a wonderful mix of science, history and wit, all in bite-sized chapters on a broad range of topics. Urgent and essential, Numbers Don't Lie inspires readers to interrogate what they take to be true in these significant times. Smil is on a mission to make facts matter, because after all, numbers may not lie, but which truth do they convey? 'The best book to read to better understand our world. Once in a while a book comes along that helps us see our planet more clearly. By showing us numbers about science, health, green technology and more, Smil's book does just that. It should be on every bookshelf!' Linda Yueh, author of The Great Economists 'He is rigorously numeric, using data to illuminate every topic he writes about. The word "polymath" was invented to describe people like him' Bill Gates 'Important' Mark Zuckerberg, on Energy 'One of the world's foremost thinkers on development history and a master of statistical analysis . . . The nerd's nerd' Guardian 'There is perhaps no other academic who paints pictures with numbers like Smil' Guardian 'In a world of specialized intellectuals, Smil is an ambitious and astonishing polymath who swings for fences . . . They're among the most data-heavy books you'll find, with a remarkable way of framing basic facts' Wired 'He's a slayer of bullshit' David Keith, Gordon McKay Professor of Applied Physics & Professor of Public Policy, Harvard University. Vaclav Smil is Distinguished Professor Emeritus at the University of Manitoba. He is the author of over forty books on topics including energy, environmental and population change, food production and nutrition, technical innovation, risk assessment and public policy. No other living scientist has had more books (on a wide variety of topics) reviewed in Nature. A Fellow of the Royal Society of Canada, in 2010 he was named by Foreign Policy as one of the Top 100 Global Thinkers. This is his first book for a more general readership.
Technopoly: The Surrender of Culture to Technology
Neil Postman - 1992
In this witty, often terrifying work of cultural criticism, the author of Amusing Ourselves to Death chronicles our transformation into a Technopoly: a society that no longer merely uses technology as a support system but instead is shaped by it--with radical consequences for the meanings of politics, art, education, intelligence, and truth.
The Mythical Man-Month: Essays on Software Engineering
Frederick P. Brooks Jr. - 1975
With a blend of software engineering facts and thought-provoking opinions, Fred Brooks offers insight for anyone managing complex projects. These essays draw from his experience as project manager for the IBM System/360 computer family and then for OS/360, its massive software system. Now, 20 years after the initial publication of his book, Brooks has revisited his original ideas and added new thoughts and advice, both for readers already familiar with his work and for readers discovering it for the first time. The added chapters contain (1) a crisp condensation of all the propositions asserted in the original book, including Brooks' central argument in The Mythical Man-Month: that large programming projects suffer management problems different from small ones due to the division of labor; that the conceptual integrity of the product is therefore critical; and that it is difficult but possible to achieve this unity; (2) Brooks' view of these propositions a generation later; (3) a reprint of his classic 1986 paper "No Silver Bullet"; and (4) today's thoughts on the 1986 assertion, "There will be no silver bullet within ten years."
How Innovation Works: Serendipity, Energy and the Saving of Time
Matt Ridley - 2020
Forget short-term symptoms like Donald Trump and Brexit, it is innovation itself that explains them and that will itself shape the 21st century for good and ill. Yet innovation remains a mysterious process, poorly understood by policy makers and businessmen, hard to summon into existence to order, yet inevitable and inexorable when it does happen. Matt Ridley argues in this book that we need to change the way we think about innovation, to see it as an incremental, bottom-up, fortuitous process that happens to society as a direct result of the human habit of exchange, rather than an orderly, top-down process developing according to a plan. Innovation is crucially different from invention, because it is the turning of inventions into things of practical and affordable use to people. It speeds up in some sectors and slows down in others. It is always a collective, collaborative phenomenon, not a matter of lonely genius. It is gradual, serendipitous, recombinant, inexorable, contagious, experimental and unpredictable. It happens mainly in just a few parts of the world at any one time. It still cannot be modelled properly by economists, but it can easily be discouraged by politicians. Far from there being too much innovation, we may be on the brink of an innovation famine. Ridley derives these and other lessons, not with abstract argument, but from telling the lively stories of scores of innovations, how they started and why they succeeded or in some cases failed. He goes back millions of years and leaps forward into the near future.
Some of the innovation stories he tells are about steam engines, jet engines, search engines, airships, coffee, potatoes, vaping, vaccines, cuisine, antibiotics, mosquito nets, turbines, propellers, fertiliser, zero, computers, dogs, farming, fire, genetic engineering, gene editing, container shipping, railways, cars, safety rules, wheeled suitcases, mobile phones, corrugated iron, powered flight, chlorinated water, toilets, vacuum cleaners, shale gas, the telegraph, radio, social media, blockchain, the sharing economy, artificial intelligence, fake bomb detectors, phantom games consoles, fraudulent blood tests, faddish diets, hyperloop tubes, herbicides, copyright and even – a biological innovation – life itself.
Python Data Science Handbook: Tools and Techniques for Developers
Jake Vanderplas - 2016
Several resources exist for individual pieces of this data science stack, but only with the Python Data Science Handbook do you get them all—IPython, NumPy, Pandas, Matplotlib, Scikit-Learn, and other related tools. Working scientists and data crunchers familiar with reading and writing Python code will find this comprehensive desk reference ideal for tackling day-to-day issues: manipulating, transforming, and cleaning data; visualizing different types of data; and using data to build statistical or machine learning models. Quite simply, this is the must-have reference for scientific computing in Python. With this handbook, you’ll learn how to use:
* IPython and Jupyter: provide computational environments for data scientists using Python
* NumPy: includes the ndarray for efficient storage and manipulation of dense data arrays in Python
* Pandas: features the DataFrame for efficient storage and manipulation of labeled/columnar data in Python
* Matplotlib: includes capabilities for a flexible range of data visualizations in Python
* Scikit-Learn: for efficient and clean Python implementations of the most important and established machine learning algorithms
The Wealth of Networks: How Social Production Transforms Markets and Freedom
Yochai Benkler - 2006
The phenomenon he describes as social production is reshaping markets, while at the same time offering new opportunities to enhance individual freedom, cultural diversity, political discourse, and justice. But these results are by no means inevitable: a systematic campaign to protect the entrenched industrial information economy of the last century threatens the promise of today’s emerging networked information environment. In this comprehensive social theory of the Internet and the networked information economy, Benkler describes how patterns of information, knowledge, and cultural production are changing—and shows that the way information and knowledge are made available can either limit or enlarge the ways people can create and express themselves. He describes the range of legal and policy choices that confront us and maintains that there is much to be gained—or lost—by the decisions we make today.
The Algorithm Design Manual
Steven S. Skiena - 1997
Drawing heavily on the author's own real-world experiences, the book stresses design and analysis. Coverage is divided into two parts, the first being a general guide to techniques for the design and analysis of computer algorithms. The second is a reference section, which includes a catalog of the 75 most important algorithmic problems. By browsing this catalog, readers can quickly identify what the problem they have encountered is called, what is known about it, and how they should proceed if they need to solve it. This book is ideal for the working professional who uses algorithms on a daily basis and has need for a handy reference. This work can also readily be used in an upper-division course or as a student reference guide. THE ALGORITHM DESIGN MANUAL comes with a CD-ROM that contains:
* a complete hypertext version of the full printed book.
* the source code and URLs for all cited implementations.
* over 30 hours of audio lectures on the design and analysis of algorithms, all keyed to on-line lecture notes.
Pattern Recognition and Machine Learning
Christopher M. Bishop - 2006
However, these activities can be viewed as two facets of the same field, and together they have undergone substantial development over the past ten years. In particular, Bayesian methods have grown from a specialist niche to become mainstream, while graphical models have emerged as a general framework for describing and applying probabilistic models. Also, the practical applicability of Bayesian methods has been greatly enhanced through the development of a range of approximate inference algorithms such as variational Bayes and expectation propagation. Similarly, new models based on kernels have had a significant impact on both algorithms and applications. This new textbook reflects these recent developments while providing a comprehensive introduction to the fields of pattern recognition and machine learning. It is aimed at advanced undergraduates or first-year PhD students, as well as researchers and practitioners, and assumes no previous knowledge of pattern recognition or machine learning concepts. Knowledge of multivariate calculus and basic linear algebra is required, and some familiarity with probabilities would be helpful though not essential as the book includes a self-contained introduction to basic probability theory.
Factfulness: Ten Reasons We're Wrong About the World – and Why Things Are Better Than You Think
Hans Rosling - 2018
So wrong that a chimpanzee choosing answers at random will consistently outguess teachers, journalists, Nobel laureates, and investment bankers. In Factfulness, Professor of International Health and global TED phenomenon Hans Rosling, together with his two long-time collaborators, Anna and Ola, offers a radical new explanation of why this happens. They reveal the ten instincts that distort our perspective—from our tendency to divide the world into two camps (usually some version of us and them) to the way we consume media (where fear rules) to how we perceive progress (believing that most things are getting worse). Our problem is that we don’t know what we don’t know, and even our guesses are informed by unconscious and predictable biases. It turns out that the world, for all its imperfections, is in a much better state than we might think. That doesn’t mean there aren’t real concerns. But when we worry about everything all the time instead of embracing a worldview based on facts, we can lose our ability to focus on the things that threaten us most. Inspiring and revelatory, filled with lively anecdotes and moving stories, Factfulness is an urgent and essential book that will change the way you see the world and empower you to respond to the crises and opportunities of the future.
Life After Google: The Fall of Big Data and the Rise of the Blockchain Economy
George Gilder - 2018
Gilder says or writes is ever delivered at anything less than the fullest philosophical decibel... Mr. Gilder sounds less like a tech guru than a poet, and his words tumble out in a romantic cascade." “Google’s algorithms assume the world’s future is nothing more than the next moment in a random process. George Gilder shows how deep this assumption goes, what motivates people to make it, and why it’s wrong: the future depends on human action.” — Peter Thiel, founder of PayPal and Palantir Technologies and author of Zero to One: Notes on Startups, or How to Build the Future The Age of Google, built on big data and machine intelligence, has been an awesome era. But it’s coming to an end. In Life after Google, George Gilder—the peerless visionary of technology and culture—explains why Silicon Valley is suffering a nervous breakdown and what to expect as the post-Google age dawns. Google’s astonishing ability to “search and sort” attracts the entire world to its search engine and countless other goodies—videos, maps, email, calendars…. And everything it offers is free, or so it seems. Instead of paying directly, users submit to advertising. The system of “aggregate and advertise” works—for a while—if you control an empire of data centers, but a market without prices strangles entrepreneurship and turns the Internet into a wasteland of ads. The crisis is not just economic. Even as advances in artificial intelligence induce delusions of omnipotence and transcendence, Silicon Valley has pretty much given up on security. The Internet firewalls supposedly protecting all those passwords and personal information have proved hopelessly permeable. The crisis cannot be solved within the current computer and network architecture. The future lies with the “cryptocosm”—the new architecture of the blockchain and its derivatives.
Enabling cryptocurrencies such as bitcoin and ether, NEO and Hashgraph, it will provide the Internet a secure global payments system, ending the aggregate-and-advertise Age of Google. Silicon Valley, long dominated by a few giants, faces a “great unbundling,” which will disperse computer power and commerce and transform the economy and the Internet. Life after Google is almost here. For fans of "Wealth and Poverty," "Knowledge and Power," and "The Scandal of Money."
Cool Infographics: Effective Communication with Data Visualization and Design
Randy Krum - 2013
This innovative book presents the design process and the best software tools for creating infographics that communicate. Including a special section on how to construct the increasingly popular infographic resume, the book offers graphic designers, marketers, and business professionals vital information on the most effective ways to present data.
* Explains why infographics and data visualizations work
* Shares the tools and techniques for creating great infographics
* Covers online infographics used for marketing, including social media and search engine optimization (SEO)
* Shows how to market your skills with a visual, infographic resume
* Explores the many internal business uses of infographics, including board meeting presentations, annual reports, consumer research statistics, marketing strategies, business plans, and visual explanations of products and services to your customers
With Cool Infographics, you'll learn to create infographics to successfully reach your target audience and tell clear stories with your data.
Humble Pi: A Comedy of Maths Errors
Matt Parker - 2019
Most of the time this math works quietly behind the scenes . . . until it doesn't. All sorts of seemingly innocuous mathematical mistakes can have significant consequences. Math is easy to ignore until a misplaced decimal point upends the stock market, a unit conversion error causes a plane to crash, or someone divides by zero and stalls a battleship in the middle of the ocean. Exploring and explaining a litany of glitches, near misses, and mathematical mishaps involving the internet, big data, elections, street signs, lotteries, the Roman Empire, and an Olympic team, Matt Parker uncovers the bizarre ways math trips us up, and what this reveals about its essential place in our world. Getting it wrong has never been more fun.