Book picks similar to
The Human Face of Big Data by Rick Smolan
non-fiction
data-science
big-data
business
Code Complete
Steve McConnell - 1993
Now this classic book has been fully updated and revised with leading-edge practices--and hundreds of new code samples--illustrating the art and science of software construction. Capturing the body of knowledge available from research, academia, and everyday commercial practice, McConnell synthesizes the most effective techniques and must-know principles into clear, pragmatic guidance. No matter what your experience level, development environment, or project size, this book will inform and stimulate your thinking--and help you build the highest quality code. Discover the timeless techniques and strategies that help you: Design for minimum complexity and maximum creativity Reap the benefits of collaborative development Apply defensive programming techniques to reduce and flush out errors Exploit opportunities to refactor--or evolve--code, and do it safely Use construction practices that are right-weight for your project Debug problems quickly and effectively Resolve critical construction issues early and correctly Build quality into the beginning, middle, and end of your project
Spark: The Definitive Guide: Big Data Processing Made Simple
Bill Chambers - 2018
With an emphasis on improvements and new features in Spark 2.0, authors Bill Chambers and Matei Zaharia break down Spark topics into distinct sections, each with unique goals.
You’ll explore the basic operations and common functions of Spark’s structured APIs, as well as Structured Streaming, a new high-level API for building end-to-end streaming applications. Developers and system administrators will learn the fundamentals of monitoring, tuning, and debugging Spark, and explore machine learning techniques and scenarios for employing MLlib, Spark’s scalable machine-learning library.
Get a gentle overview of big data and Spark
Learn about DataFrames, SQL, and Datasets—Spark’s core APIs—through worked examples
Dive into Spark’s low-level APIs, RDDs, and execution of SQL and DataFrames
Understand how Spark runs on a cluster
Debug, monitor, and tune Spark clusters and applications
Learn the power of Structured Streaming, Spark’s stream-processing engine
Learn how you can apply MLlib to a variety of problems, including classification or recommendation
Machine Learning: A Visual Starter Course For Beginner's
Oliver Theobald - 2017
If you have ever found yourself lost halfway through other introductory materials on this topic, this is the book for you. If you don't understand set terminology such as vectors, hyperplanes, and centroids, then this is also the book for you. This starter course isn't a picture story book but does include many visual examples that break algorithms down into a digestible and practical format. As a starter course, this book connects the dots and offers the crash course I wish I had when I first started. The kind of guide I wish had before I started taking on introductory courses that presume you’re two days away from an advanced mathematics exam. That’s why this introductory course doesn’t go further on the subject than other introductory books, but rather, goes a step back. A half-step back in order to help everyone make his or her first strides in machine learning and is an ideal study companion for the visual learner. In this step-by-step guide you will learn: - How to download free datasets - What tools and software packages you need - Data scrubbing techniques, including one-hot encoding, binning and dealing with missing data - Preparing data for analysis, including k-fold Validation - Regression analysis to create trend lines - Clustering, including k-means and k-nearest Neighbors - Naive Bayes Classifier to predict new classes - Anomaly detection and SVM algorithms to combat anomalies and outliers - The basics of Neural Networks - Bias/Variance to improve your machine learning model - Decision Trees to decode classification
Please feel welcome to join this starter course by buying a copy, or sending a free sample to your preferred device.
Automating Inequality: How High-Tech Tools Profile, Police, and Punish the Poor
Virginia Eubanks - 2018
In Pittsburgh, a child welfare agency uses a statistical model to try to predict which children might be future victims of abuse or neglect.Since the dawn of the digital age, decision-making in finance, employment, politics, health and human services has undergone revolutionary change. Today, automated systems—rather than humans—control which neighborhoods get policed, which families attain needed resources, and who is investigated for fraud. While we all live under this new regime of data, the most invasive and punitive systems are aimed at the poor.In Automating Inequality, Virginia Eubanks systematically investigates the impacts of data mining, policy algorithms, and predictive risk models on poor and working-class people in America. The book is full of heart-wrenching and eye-opening stories, from a woman in Indiana whose benefits are literally cut off as she lays dying to a family in Pennsylvania in daily fear of losing their daughter because they fit a certain statistical profile.The U.S. has always used its most cutting-edge science and technology to contain, investigate, discipline and punish the destitute. Like the county poorhouse and scientific charity before them, digital tracking and automated decision-making hide poverty from the middle-class public and give the nation the ethical distance it needs to make inhumane choices: which families get food and which starve, who has housing and who remains homeless, and which families are broken up by the state. In the process, they weaken democracy and betray our most cherished national values.This deeply researched and passionate book could not be more timely.Naomi Klein: "This book is downright scary."Ethan Zuckerman, MIT: "Should be required reading."Dorothy Roberts, author of Killing the Black Body: "A must-read for everyone concerned about modern tools of inequality in America."Astra Taylor, author of The People's Platform: "This is the single most important book about technology you will read this year."
A New Kind of Science
Stephen Wolfram - 1997
Wolfram lets the world see his work in A New Kind of Science, a gorgeous, 1,280-page tome more than a decade in the making. With patience, insight, and self-confidence to spare, Wolfram outlines a fundamental new way of modeling complex systems. On the frontier of complexity science since he was a boy, Wolfram is a champion of cellular automata--256 "programs" governed by simple nonmathematical rules. He points out that even the most complex equations fail to accurately model biological systems, but the simplest cellular automata can produce results straight out of nature--tree branches, stream eddies, and leopard spots, for instance. The graphics in A New Kind of Science show striking resemblance to the patterns we see in nature every day. Wolfram wrote the book in a distinct style meant to make it easy to read, even for nontechies; a basic familiarity with logic is helpful but not essential. Readers will find themselves swept away by the elegant simplicity of Wolfram's ideas and the accidental artistry of the cellular automaton models. Whether or not Wolfram's revolution ultimately gives us the keys to the universe, his new science is absolutely awe-inspiring. --Therese Littleton
Blockchain Revolution: How the Technology Behind Bitcoin Is Changing Money, Business, and the World
Don Tapscott - 2016
But it is much more than that, too. It is a public ledger to which everyone has access, but which no single person controls. It allows for companies and individuals to collaborate with an unprecedented degree of trust and transparency. It is cryptographically secure, but fundamentally open. And soon it will be everywhere.In Blockchain Revolution, Don and Alex Tapscott reveal how this game-changing technology will shape the future of the world economy, dramatically improving everything from healthcare records to online voting, and from insurance claims to artist royalty payments. Brilliantly researched and highly accessible, this is the essential text on the next major paradigm shift. Read it, or be left behind.
JavaScript: The Definitive Guide
David Flanagan - 1996
This book is both an example-driven programmer's guide and a keep-on-your-desk reference, with new chapters that explain everything you need to know to get the most out of JavaScript, including:Scripted HTTP and Ajax XML processing Client-side graphics using the canvas tag Namespaces in JavaScript--essential when writing complex programs Classes, closures, persistence, Flash, and JavaScript embedded in Java applicationsPart I explains the core JavaScript language in detail. If you are new to JavaScript, it will teach you the language. If you are already a JavaScript programmer, Part I will sharpen your skills and deepen your understanding of the language.Part II explains the scripting environment provided by web browsers, with a focus on DOM scripting with unobtrusive JavaScript. The broad and deep coverage of client-side JavaScript is illustrated with many sophisticated examples that demonstrate how to:Generate a table of contents for an HTML document Display DHTML animations Automate form validation Draw dynamic pie charts Make HTML elements draggable Define keyboard shortcuts for web applications Create Ajax-enabled tool tips Use XPath and XSLT on XML documents loaded with Ajax And much morePart III is a complete reference for core JavaScript. It documents every class, object, constructor, method, function, property, and constant defined by JavaScript 1.5 and ECMAScript Version 3.Part IV is a reference for client-side JavaScript, covering legacy web browser APIs, the standard Level 2 DOM API, and emerging standards such as the XMLHttpRequest object and the canvas tag.More than 300,000 JavaScript programmers around the world have made this their indispensable reference book for building JavaScript applications."A must-have reference for expert JavaScript programmers...well-organized and detailed."-- Brendan Eich, creator of JavaScript
Code: Version 2.0
Lawrence Lessig - 1999
Harvard Professor Lawrence Lessig warns that, if we're not careful we'll wake up one day to discover that the character of cyberspace has changed from under us. Cyberspace will no longer be a world of relative freedom; instead it will be a world of perfect control where our identities, actions, and desires are monitored, tracked, and analyzed for the latest market research report. Commercial forces will dictate the change, and architecture—the very structure of cyberspace itself—will dictate the form our interactions can and cannot take. Code And Other Laws of Cyberspace is an exciting examination of how the core values of cyberspace as we know it—intellectual property, free speech, and privacy-—are being threatened and what we can do to protect them. Lessig shows how code—the architecture and law of cyberspace—can make a domain, site, or network free or restrictive; how technological architectures influence people's behavior and the values they adopt; and how changes in code can have damaging consequences for individual freedoms. Code is not just for lawyers and policymakers; it is a must-read for everyone concerned with survival of democratic values in the Information Age.
Googled: The End of the World as We Know It
Ken Auletta - 2009
This is a ride on the Google wave, and the fullest account of how it formed and crashed into traditional media businesses. With unprecedented access to Google's founders and executives, as well as to those in media who are struggling to keep their heads above water, Ken Auletta reveals how the industry is being disrupted and redefined.Auletta goes inside Google's closed-door meetings, introducing Google's notoriously private founders, Larry Page and Sergey Brin, as well as those who work with - and against - them. In Googled, the reader discovers the 'secret sauce' of the company's success and why the worlds of 'new' and 'old' media often communicate as if residents of different planets. It may send chills down traditionalists' spines, but it's a crucial roadmap to the future of media business: the Google story may well be the canary in the coal mine.Googled is candid, objective and authoritative. Crucially, it's not just a history or reportage: it's ahead of the curve and unlike any other Google books, which tend to have been near-histories, somewhat starstruck, now out of date or which fail to look at the full synthesis of business and technology.
Are You Smart Enough to Work at Google?
William Poundstone - 2012
The blades start moving in 60 seconds. What do you do? If you want to work at Google, or any of America's best companies, you need to have an answer to this and other puzzling questions. Are You Smart Enough to Work at Google? guides readers through the surprising solutions to dozens of the most challenging interview questions. The book covers the importance of creative thinking, ways to get a leg up on the competition, what your Facebook page says about you, and much more. Are You Smart Enough to Work at Google? is a must-read for anyone who wants to succeed in today's job market.
Cassandra: The Definitive Guide
Eben Hewitt - 2010
Cassandra: The Definitive Guide provides the technical details and practical examples you need to assess this database management system and put it to work in a production environment.Author Eben Hewitt demonstrates the advantages of Cassandra's nonrelational design, and pays special attention to data modeling. If you're a developer, DBA, application architect, or manager looking to solve a database scaling issue or future-proof your application, this guide shows you how to harness Cassandra's speed and flexibility.Understand the tenets of Cassandra's column-oriented structureLearn how to write, update, and read Cassandra dataDiscover how to add or remove nodes from the cluster as your application requiresExamine a working application that translates from a relational model to Cassandra's data modelUse examples for writing clients in Java, Python, and C#Use the JMX interface to monitor a cluster's usage, memory patterns, and moreTune memory settings, data storage, and caching for better performance
Problem Solving with Algorithms and Data Structures Using Python
Bradley N. Miller - 2005
It is also about Python. However, there is much more. The study of algorithms and data structures is central to understanding what computer science is all about. Learning computer science is not unlike learning any other type of difficult subject matter. The only way to be successful is through deliberate and incremental exposure to the fundamental ideas. A beginning computer scientist needs practice so that there is a thorough understanding before continuing on to the more complex parts of the curriculum. In addition, a beginner needs to be given the opportunity to be successful and gain confidence. This textbook is designed to serve as a text for a first course on data structures and algorithms, typically taught as the second course in the computer science curriculum. Even though the second course is considered more advanced than the first course, this book assumes you are beginners at this level. You may still be struggling with some of the basic ideas and skills from a first computer science course and yet be ready to further explore the discipline and continue to practice problem solving. We cover abstract data types and data structures, writing algorithms, and solving problems. We look at a number of data structures and solve classic problems that arise. The tools and techniques that you learn here will be applied over and over as you continue your study of computer science.
The Psychology of Computer Programming
Gerald M. Weinberg - 1971
Weinberg adds new insights and highlights the similarities and differences between now and then. Using a conversational style that invites the reader to join him, Weinberg reunites with some of his most insightful writings on the human side of software engineering.Topics include egoless programming, intelligence, psychological measurement, personality factors, motivation, training, social problems on large projects, problem-solving ability, programming language design, team formation, the programming environment, and much more.Dorset House Publishing is proud to make this important text available to new generations of programmers -- and to encourage readers of the first edition to return to its valuable lessons.
Machine Learning Yearning
Andrew Ng
But building a machine learning system requires that you make practical decisions: Should you collect more training data? Should you use end-to-end deep learning? How do you deal with your training set not matching your test set? and many more. Historically, the only way to learn how to make these "strategy" decisions has been a multi-year apprenticeship in a graduate program or company. This is a book to help you quickly gain this skill, so that you can become better at building AI systems.
Applied Predictive Modeling
Max Kuhn - 2013
Non- mathematical readers will appreciate the intuitive explanations of the techniques while an emphasis on problem-solving with real data across a wide variety of applications will aid practitioners who wish to extend their expertise. Readers should have knowledge of basic statistical ideas, such as correlation and linear regression analysis. While the text is biased against complex equations, a mathematical background is needed for advanced topics. Dr. Kuhn is a Director of Non-Clinical Statistics at Pfizer Global R&D in Groton Connecticut. He has been applying predictive models in the pharmaceutical and diagnostic industries for over 15 years and is the author of a number of R packages. Dr. Johnson has more than a decade of statistical consulting and predictive modeling experience in pharmaceutical research and development. He is a co-founder of Arbor Analytics, a firm specializing in predictive modeling and is a former Director of Statistics at Pfizer Global R&D. His scholarly work centers on the application and development of statistical methodology and learning algorithms. Applied Predictive Modeling covers the overall predictive modeling process, beginning with the crucial steps of data preprocessing, data splitting and foundations of model tuning. The text then provides intuitive explanations of numerous common and modern regression and classification techniques, always with an emphasis on illustrating and solving real data problems. Addressing practical concerns extends beyond model fitting to topics such as handling class imbalance, selecting predictors, and pinpointing causes of poor model performance-all of which are problems that occur frequently in practice. The text illustrates all parts of the modeling process through many hands-on, real-life examples. And every chapter contains extensive R code f