Data Science from Scratch: First Principles with Python


Joel Grus - 2015
    In this book, you’ll learn how many of the most fundamental data science tools and algorithms work by implementing them from scratch. If you have an aptitude for mathematics and some programming skills, author Joel Grus will help you get comfortable with the math and statistics at the core of data science, and with hacking skills you need to get started as a data scientist. Today’s messy glut of data holds answers to questions no one’s even thought to ask. This book provides you with the know-how to dig those answers out. Get a crash course in Python Learn the basics of linear algebra, statistics, and probability—and understand how and when they're used in data science Collect, explore, clean, munge, and manipulate data Dive into the fundamentals of machine learning Implement models such as k-nearest Neighbors, Naive Bayes, linear and logistic regression, decision trees, neural networks, and clustering Explore recommender systems, natural language processing, network analysis, MapReduce, and databases

Information Dashboard Design: The Effective Visual Communication of Data


Stephen Few - 2006
    Although dashboards are potentially powerful, this potential is rarely realized. The greatest display technology in the world won't solve this if you fail to use effective visual design. And if a dashboard fails to tell you precisely what you need to know in an instant, you'll never use it, even if it's filled with cute gauges, meters, and traffic lights. Don't let your investment in dashboard technology go to waste.This book will teach you the visual design skills you need to create dashboards that communicate clearly, rapidly, and compellingly. Information Dashboard Design will explain how to:Avoid the thirteen mistakes common to dashboard design Provide viewers with the information they need quickly and clearly Apply what we now know about visual perception to the visual presentation of information Minimize distractions, cliches, and unnecessary embellishments that create confusion Organize business information to support meaning and usability Create an aesthetically pleasing viewing experience Maintain consistency of design to provide accurate interpretation Optimize the power of dashboard technology by pairing it with visual effectiveness Stephen Few has over 20 years of experience as an IT innovator, consultant, and educator. As Principal of the consultancy Perceptual Edge, Stephen focuses on data visualization for analyzing and communicating quantitative business information. He provides consulting and training services, speaks frequently at conferences, and teaches in the MBA program at the University of California in Berkeley. He is also the author of Show Me the Numbers: Designing Tables and Graphs to Enlighten. Visit his website at www.perceptualedge.com.

Beautiful Visualization: Looking at Data through the Eyes of Experts


Julie Steele - 2010
    Think of the familiar map of the New York City subway system, or a diagram of the human brain. Successful visualizations are beautiful not only for their aesthetic design, but also for elegant layers of detail that efficiently generate insight and new understanding.This book examines the methods of two dozen visualization experts who approach their projects from a variety of perspectives -- as artists, designers, commentators, scientists, analysts, statisticians, and more. Together they demonstrate how visualization can help us make sense of the world.Explore the importance of storytelling with a simple visualization exerciseLearn how color conveys information that our brains recognize before we're fully aware of itDiscover how the books we buy and the people we associate with reveal clues to our deeper selvesRecognize a method to the madness of air travel with a visualization of civilian air trafficFind out how researchers investigate unknown phenomena, from initial sketches to published papers Contributors include:Nick Bilton, Michael E. Driscoll, Jonathan Feinberg, Danyel Fisher, Jessica Hagy, Gregor Hochmuth, Todd Holloway, Noah Iliinsky, Eddie Jabbour, Valdean Klump, Aaron Koblin, Robert Kosara, Valdis Krebs, JoAnn Kuchera-Morin et al., Andrew Odewahn, Adam Perer, Anders Persson, Maximilian Schich, Matthias Shapiro, Julie Steele, Moritz Stefaner, Jer Thorp, Fernanda Viegas, Martin Wattenberg, and Michael Young.

Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, or Die


Eric Siegel - 2013
    Rather than a "how to" for hands-on techies, the book entices lay-readers and experts alike by covering new case studies and the latest state-of-the-art techniques.You have been predicted — by companies, governments, law enforcement, hospitals, and universities. Their computers say, "I knew you were going to do that!" These institutions are seizing upon the power to predict whether you're going to click, buy, lie, or die.Why? For good reason: predicting human behavior combats financial risk, fortifies healthcare, conquers spam, toughens crime fighting, and boosts sales.How? Prediction is powered by the world's most potent, booming unnatural resource: data. Accumulated in large part as the by-product of routine tasks, data is the unsalted, flavorless residue deposited en masse as organizations churn away. Surprise! This heap of refuse is a gold mine. Big data embodies an extraordinary wealth of experience from which to learn.Predictive analytics unleashes the power of data. With this technology, the computer literally learns from data how to predict the future behavior of individuals. Perfect prediction is not possible, but putting odds on the future — lifting a bit of the fog off our hazy view of tomorrow — means pay dirt.In this rich, entertaining primer, former Columbia University professor and Predictive Analytics World founder Eric Siegel reveals the power and perils of prediction: -What type of mortgage risk Chase Bank predicted before the recession. -Predicting which people will drop out of school, cancel a subscription, or get divorced before they are even aware of it themselves. -Why early retirement decreases life expectancy and vegetarians miss fewer flights. -Five reasons why organizations predict death, including one health insurance company. -How U.S. Bank, European wireless carrier Telenor, and Obama's 2012 campaign calculated the way to most strongly influence each individual. -How IBM's Watson computer used predictive modeling to answer questions and beat the human champs on TV's Jeopardy! -How companies ascertain untold, private truths — how Target figures out you're pregnant and Hewlett-Packard deduces you're about to quit your job. -How judges and parole boards rely on crime-predicting computers to decide who stays in prison and who goes free. -What's predicted by the BBC, Citibank, ConEd, Facebook, Ford, Google, IBM, the IRS, Match.com, MTV, Netflix, Pandora, PayPal, Pfizer, and Wikipedia. A truly omnipresent science, predictive analytics affects everyone, every day. Although largely unseen, it drives millions of decisions, determining whom to call, mail, investigate, incarcerate, set up on a date, or medicate.Predictive analytics transcends human perception. This book's final chapter answers the riddle: What often happens to you that cannot be witnessed, and that you can't even be sure has happened afterward — but that can be predicted in advance?Whether you are a consumer of it — or consumed by it — get a handle on the power of Predictive Analytics.

Pro Git


Scott Chacon - 2009
    It took the open source world by storm since its inception in 2005, and is used by small development shops and giants like Google, Red Hat, and IBM, and of course many open source projects.A book by Git experts to turn you into a Git expert. Introduces the world of distributed version control Shows how to build a Git development workflow.

Deep Learning with Python


François Chollet - 2017
    It is the technology behind photo tagging systems at Facebook and Google, self-driving cars, speech recognition systems on your smartphone, and much more.In particular, Deep learning excels at solving machine perception problems: understanding the content of image data, video data, or sound data. Here's a simple example: say you have a large collection of images, and that you want tags associated with each image, for example, "dog," "cat," etc. Deep learning can allow you to create a system that understands how to map such tags to images, learning only from examples. This system can then be applied to new images, automating the task of photo tagging. A deep learning model only has to be fed examples of a task to start generating useful results on new data.

Hands-On Machine Learning with Scikit-Learn and TensorFlow


Aurélien Géron - 2017
    Now that machine learning is thriving, even programmers who know close to nothing about this technology can use simple, efficient tools to implement programs capable of learning from data. This practical book shows you how.By using concrete examples, minimal theory, and two production-ready Python frameworks—Scikit-Learn and TensorFlow—author Aurélien Géron helps you gain an intuitive understanding of the concepts and tools for building intelligent systems. You’ll learn how to use a range of techniques, starting with simple Linear Regression and progressing to Deep Neural Networks. If you have some programming experience and you’re ready to code a machine learning project, this guide is for you.This hands-on book shows you how to use:Scikit-Learn, an accessible framework that implements many algorithms efficiently and serves as a great machine learning entry pointTensorFlow, a more complex library for distributed numerical computation, ideal for training and running very large neural networksPractical code examples that you can apply without learning excessive machine learning theory or algorithm details

How Charts Lie: Getting Smarter about Visual Information


Alberto Cairo - 2019
    While such visualizations can better inform us, they can also deceive by displaying incomplete or inaccurate data, suggesting misleading patterns—or simply misinform us by being poorly designed, such as the confusing “eye of the storm” maps shown on TV every hurricane season.Many of us are ill equipped to interpret the visuals that politicians, journalists, advertisers, and even employers present each day, enabling bad actors to easily manipulate visuals to promote their own agendas. Public conversations are increasingly driven by numbers, and to make sense of them we must be able to decode and use visual information. By examining contemporary examples ranging from election-result infographics to global GDP maps and box-office record charts, How Charts Lie teaches us how to do just that.

Python Crash Course: A Hands-On, Project-Based Introduction to Programming


Eric Matthes - 2015
    You'll also learn how to make your programs interactive and how to test your code safely before adding it to a project. In the second half of the book, you'll put your new knowledge into practice with three substantial projects: a Space Invaders-inspired arcade game, data visualizations with Python's super-handy libraries, and a simple web app you can deploy online.As you work through Python Crash Course, you'll learn how to: Use powerful Python libraries and tools, including matplotlib, NumPy, and PygalMake 2D games that respond to keypresses and mouse clicks, and that grow more difficult as the game progressesWork with data to generate interactive visualizationsCreate and customize simple web apps and deploy them safely onlineDeal with mistakes and errors so you can solve your own programming problemsIf you've been thinking seriously about digging into programming, Python Crash Course will get you up to speed and have you writing real programs fast. Why wait any longer? Start your engines and code!

Data Feminism


Catherine D’Ignazio - 2020
    It has been used to expose injustice, improve health outcomes, and topple governments. But it has also been used to discriminate, police, and surveil. This potential for good, on the one hand, and harm, on the other, makes it essential to ask: Data science by whom? Data science for whom? Data science with whose interests in mind? The narratives around big data and data science are overwhelmingly white, male, and techno-heroic. In Data Feminism, Catherine D'Ignazio and Lauren Klein present a new way of thinking about data science and data ethics—one that is informed by intersectional feminist thought.Illustrating data feminism in action, D'Ignazio and Klein show how challenges to the male/female binary can help challenge other hierarchical (and empirically wrong) classification systems. They explain how, for example, an understanding of emotion can expand our ideas about effective data visualization, and how the concept of invisible labor can expose the significant human efforts required by our automated systems. And they show why the data never, ever “speak for themselves.”Data Feminism offers strategies for data scientists seeking to learn how feminism can help them work toward justice, and for feminists who want to focus their efforts on the growing field of data science. But Data Feminism is about much more than gender. It is about power, about who has it and who doesn't, and about how those differentials of power can be challenged and changed.

Remix: Making Art and Commerce Thrive in the Hybrid Economy


Lawrence Lessig - 2008
    Lessig reveals the solutions to this impasse offered by a collaborative yet profitable “hybrid economy”.Lawrence Lessig, the reigning authority on intellectual property in the Internet age, spotlights the newest and possibly the most harmful culture war—a war waged against our kids and others who create and consume art. America’s copyright laws have ceased to perform their original, beneficial role: protecting artists’ creations while allowing them to build on previous creative works. In fact, our system now criminalizes those very actions. For many, new technologies have made it irresistible to flout these unreasonable and ultimately untenable laws. Some of today’s most talented artists are felons, and so are our kids, who see no reason why they shouldn’t do what their computers and the Web let them do, from burning a copyrighted CD for a friend to “biting” riffs from films, videos, songs, etc and making new art from them.Criminalizing our children and others is exactly what our society should not do, and Lessig shows how we can and must end this conflict—a war as ill conceived and unwinnable as the war on drugs. By embracing “read-write culture,” which allows its users to create art as readily as they consume it, we can ensure that creators get the support—artistic, commercial, and ethical—that they deserve and need. Indeed, we can already see glimmers of a new hybrid economy that combines the profit motives of traditional business with the “sharing economy” evident in such Web sites as Wikipedia and YouTube. The hybrid economy will become ever more prominent in every creative realm—from news to music—and Lessig shows how we can and should use it to benefit those who make and consume culture.Remix is an urgent, eloquent plea to end a war that harms our children and other intrepid creative users of new technologies. It also offers an inspiring vision of the post-war world where enormous opportunities await those who view art as a resource to be shared openly rather than a commodity to be hoarded.

The Filter Bubble: What the Internet is Hiding From You


Eli Pariser - 2011
    Instead of giving you the most broadly popular result, Google now tries to predict what you are most likely to click on. According to MoveOn.org board president Eli Pariser, Google's change in policy is symptomatic of the most significant shift to take place on the Web in recent years - the rise of personalization. In this groundbreaking investigation of the new hidden Web, Pariser uncovers how this growing trend threatens to control how we consume and share information as a society-and reveals what we can do about it.Though the phenomenon has gone largely undetected until now, personalized filters are sweeping the Web, creating individual universes of information for each of us. Facebook - the primary news source for an increasing number of Americans - prioritizes the links it believes will appeal to you so that if you are a liberal, you can expect to see only progressive links. Even an old-media bastion like "The Washington Post" devotes the top of its home page to a news feed with the links your Facebook friends are sharing. Behind the scenes a burgeoning industry of data companies is tracking your personal information to sell to advertisers, from your political leanings to the color you painted your living room to the hiking boots you just browsed on Zappos.In a personalized world, we will increasingly be typed and fed only news that is pleasant, familiar, and confirms our beliefs - and because these filters are invisible, we won't know what is being hidden from us. Our past interests will determine what we are exposed to in the future, leaving less room for the unexpected encounters that spark creativity, innovation, and the democratic exchange of ideas.While we all worry that the Internet is eroding privacy or shrinking our attention spans, Pariser uncovers a more pernicious and far-reaching trend on the Internet and shows how we can - and must - change course. With vivid detail and remarkable scope, The Filter Bubble reveals how personalization undermines the Internet's original purpose as an open platform for the spread of ideas and could leave us all in an isolated, echoing world.

Deep Learning


Ian Goodfellow - 2016
    Because the computer gathers knowledge from experience, there is no need for a human computer operator to formally specify all the knowledge that the computer needs. The hierarchy of concepts allows the computer to learn complicated concepts by building them out of simpler ones; a graph of these hierarchies would be many layers deep. This book introduces a broad range of topics in deep learning.The text offers mathematical and conceptual background, covering relevant concepts in linear algebra, probability theory and information theory, numerical computation, and machine learning. It describes deep learning techniques used by practitioners in industry, including deep feedforward networks, regularization, optimization algorithms, convolutional networks, sequence modeling, and practical methodology; and it surveys such applications as natural language processing, speech recognition, computer vision, online recommendation systems, bioinformatics, and videogames. Finally, the book offers research perspectives, covering such theoretical topics as linear factor models, autoencoders, representation learning, structured probabilistic models, Monte Carlo methods, the partition function, approximate inference, and deep generative models.Deep Learning can be used by undergraduate or graduate students planning careers in either industry or research, and by software engineers who want to begin using deep learning in their products or platforms. A website offers supplementary material for both readers and instructors.

Analyzing the Analyzers


Harlan Harris - 2013
    

Naked Statistics: Stripping the Dread from the Data


Charles Wheelan - 2012
    How can we catch schools that cheat on standardized tests? How does Netflix know which movies you’ll like? What is causing the rising incidence of autism? As best-selling author Charles Wheelan shows us in Naked Statistics, the right data and a few well-chosen statistical tools can help us answer these questions and more.For those who slept through Stats 101, this book is a lifesaver. Wheelan strips away the arcane and technical details and focuses on the underlying intuition that drives statistical analysis. He clarifies key concepts such as inference, correlation, and regression analysis, reveals how biased or careless parties can manipulate or misrepresent data, and shows us how brilliant and creative researchers are exploiting the valuable data from natural experiments to tackle thorny questions.And in Wheelan’s trademark style, there’s not a dull page in sight. You’ll encounter clever Schlitz Beer marketers leveraging basic probability, an International Sausage Festival illuminating the tenets of the central limit theorem, and a head-scratching choice from the famous game show Let’s Make a Deal—and you’ll come away with insights each time. With the wit, accessibility, and sheer fun that turned Naked Economics into a bestseller, Wheelan defies the odds yet again by bringing another essential, formerly unglamorous discipline to life.