Book picks similar to
#MakeoverMonday: Improving How We Visualize and Analyze Data, One Chart at a Time by Andy Kriebel
data-viz
data-science
data
non-fiction
Hadoop: The Definitive Guide
Tom White - 2009
Ideal for processing large datasets, the Apache Hadoop framework is an open source implementation of the MapReduce algorithm on which Google built its empire. This comprehensive resource demonstrates how to use Hadoop to build reliable, scalable, distributed systems: programmers will find details for analyzing large datasets, and administrators will learn how to set up and run Hadoop clusters. Complete with case studies that illustrate how Hadoop solves specific problems, this book helps you:Use the Hadoop Distributed File System (HDFS) for storing large datasets, and run distributed computations over those datasets using MapReduce Become familiar with Hadoop's data and I/O building blocks for compression, data integrity, serialization, and persistence Discover common pitfalls and advanced features for writing real-world MapReduce programs Design, build, and administer a dedicated Hadoop cluster, or run Hadoop in the cloud Use Pig, a high-level query language for large-scale data processing Take advantage of HBase, Hadoop's database for structured and semi-structured data Learn ZooKeeper, a toolkit of coordination primitives for building distributed systems If you have lots of data -- whether it's gigabytes or petabytes -- Hadoop is the perfect solution. Hadoop: The Definitive Guide is the most thorough book available on the subject. "Now you have the opportunity to learn about Hadoop from a master-not only of the technology, but also of common sense and plain talk." -- Doug Cutting, Hadoop Founder, Yahoo!
Business Intelligence for Dummies
Swain Scheps - 2007
But you've heard at least a dozen definitions of what it is, and heard of at least that many BI tools. Where do you start? Business Intelligence For Dummies makes BI understandable! It takes you step by step through the technologies and the alphabet soup, so you can choose the right technology and implement a successful BI environment. You'll see how the applications and technologies work together to access, analyze, and present data that you can use to make better decisions about your products, customers, competitors, and more.You'll find out how to:Understand the principles and practical elements of BI Determine what your business needs Compare different approaches to BI Build a solid BI architecture and roadmap Design, develop, and deploy your BI plan Relate BI to data warehousing, ERP, CRM, and e-commerce Analyze emerging trends and developing BI tools to see what else may be useful Whether you're the business owner or the person charged with developing and implementing a BI strategy, checking out Business Intelligence For Dummies is a good business decision.
R Graphics Cookbook: Practical Recipes for Visualizing Data
Winston Chang - 2012
Each recipe tackles a specific problem with a solution you can apply to your own project, and includes a discussion of how and why the recipe works.Most of the recipes use the ggplot2 package, a powerful and flexible way to make graphs in R. If you have a basic understanding of the R language, you're ready to get started.Use R's default graphics for quick exploration of dataCreate a variety of bar graphs, line graphs, and scatter plotsSummarize data distributions with histograms, density curves, box plots, and other examplesProvide annotations to help viewers interpret dataControl the overall appearance of graphicsRender data groups alongside each other for easy comparisonUse colors in plotsCreate network graphs, heat maps, and 3D scatter plotsStructure data for graphing
Learning From Data: A Short Course
Yaser S. Abu-Mostafa - 2012
Its techniques are widely applied in engineering, science, finance, and commerce. This book is designed for a short course on machine learning. It is a short course, not a hurried course. From over a decade of teaching this material, we have distilled what we believe to be the core topics that every student of the subject should know. We chose the title `learning from data' that faithfully describes what the subject is about, and made it a point to cover the topics in a story-like fashion. Our hope is that the reader can learn all the fundamentals of the subject by reading the book cover to cover. ---- Learning from data has distinct theoretical and practical tracks. In this book, we balance the theoretical and the practical, the mathematical and the heuristic. Our criterion for inclusion is relevance. Theory that establishes the conceptual framework for learning is included, and so are heuristics that impact the performance of real learning systems. ---- Learning from data is a very dynamic field. Some of the hot techniques and theories at times become just fads, and others gain traction and become part of the field. What we have emphasized in this book are the necessary fundamentals that give any student of learning from data a solid foundation, and enable him or her to venture out and explore further techniques and theories, or perhaps to contribute their own. ---- The authors are professors at California Institute of Technology (Caltech), Rensselaer Polytechnic Institute (RPI), and National Taiwan University (NTU), where this book is the main text for their popular courses on machine learning. The authors also consult extensively with financial and commercial companies on machine learning applications, and have led winning teams in machine learning competitions.
Artificial Intelligence: A Modern Approach
Stuart Russell - 1994
The long-anticipated revision of this best-selling text offers the most comprehensive, up-to-date introduction to the theory and practice of artificial intelligence. *NEW-Nontechnical learning material-Accompanies each part of the book. *NEW-The Internet as a sample application for intelligent systems-Added in several places including logical agents, planning, and natural language. *NEW-Increased coverage of material - Includes expanded coverage of: default reasoning and truth maintenance systems, including multi-agent/distributed AI and game theory; probabilistic approaches to learning including EM; more detailed descriptions of probabilistic inference algorithms. *NEW-Updated and expanded exercises-75% of the exercises are revised, with 100 new exercises. *NEW-On-line Java software. *Makes it easy for students to do projects on the web using intelligent agents. *A unified, agent-based approach to AI-Organizes the material around the task of building intelligent agents. *Comprehensive, up-to-date coverage-Includes a unified view of the field organized around the rational decision making pa
Elasticsearch: The Definitive Guide: A Distributed Real-Time Search and Analytics Engine
Clinton Gormley - 2014
This practical guide not only shows you how to search, analyze, and explore data with Elasticsearch, but also helps you deal with the complexities of human language, geolocation, and relationships.If you're a newcomer to both search and distributed systems, you'll quickly learn how to integrate Elasticsearch into your application. More experienced users will pick up lots of advanced techniques. Throughout the book, you'll follow a problem-based approach to learn why, when, and how to use Elasticsearch features.Understand how Elasticsearch interprets data in your documentsIndex and query your data to take advantage of search concepts such as relevance and word proximityHandle human language through the effective use of analyzers and queriesSummarize and group data to show overall trends, with aggregations and analyticsUse geo-points and geo-shapes--Elasticsearch's approaches to geolocationModel your data to take advantage of Elasticsearch's horizontal scalabilityLearn how to configure and monitor your cluster in production
Coders: The Making of a New Tribe and the Remaking of the World
Clive Thompson - 2019
And this may sound weirdly obvious, but every single one of those pieces of software was written by a programmer. Programmers are thus among the most quietly influential people on the planet. As we live in a world made of software, they're the architects. The decisions they make guide our behavior. When they make something newly easy to do, we do a lot more of it. If they make it hard or impossible to do something, we do less of it.If we want to understand how today's world works, we ought to understand something about coders. Who exactly are the people that are building today's world? What makes them tick? What type of personality is drawn to writing software? And perhaps most interestingly -- what does it do to them?One of the first pieces of coding a newbie learns is the program to make the computer say "Hello, world!" Like that piece of code, Clive Thompson's book is a delightful place to begin to understand this vocation, which is both a profession and a way of life, and which essentially didn't exist little more than a generation ago, but now is considered just about the only safe bet we can make about what the future holds. Thompson takes us close to some of the great coders of our time, and unpacks the surprising history of the field, beginning with the first great coders, who were women. Ironically, if we're going to traffic in stereotypes, women are arguably "naturally" better at coding than men, but they were written out of the history, and shoved out of the seats, for reasons that are illuminating. Now programming is indeed, if not a pure brotopia, at least an awfully homogenous community, which attracts people from a very narrow band of backgrounds and personality types. As Thompson learns, the consequences of that are significant - not least being a fetish for disruption at scale that doesn't leave much time for pondering larger moral issues of collateral damage. At the same time, coding is a marvelous new art form that has improved the world in innumerable ways, and Thompson reckons deeply, as no one before him has, with what great coding in fact looks like, who creates it, and where they come from. To get as close to his subject has he can, he picks up the thread of his own long-abandoned coding practice, and tries his mightiest to up his game, with some surprising results.More and more, any serious engagement with the world demands an engagement with code and its consequences, and to understand code, we must understand coders. In that regard, Clive Thompson's Hello, World! is a marvelous and delightful master class.
Spark: The Definitive Guide: Big Data Processing Made Simple
Bill Chambers - 2018
With an emphasis on improvements and new features in Spark 2.0, authors Bill Chambers and Matei Zaharia break down Spark topics into distinct sections, each with unique goals.
You’ll explore the basic operations and common functions of Spark’s structured APIs, as well as Structured Streaming, a new high-level API for building end-to-end streaming applications. Developers and system administrators will learn the fundamentals of monitoring, tuning, and debugging Spark, and explore machine learning techniques and scenarios for employing MLlib, Spark’s scalable machine-learning library.
Get a gentle overview of big data and Spark
Learn about DataFrames, SQL, and Datasets—Spark’s core APIs—through worked examples
Dive into Spark’s low-level APIs, RDDs, and execution of SQL and DataFrames
Understand how Spark runs on a cluster
Debug, monitor, and tune Spark clusters and applications
Learn the power of Structured Streaming, Spark’s stream-processing engine
Learn how you can apply MLlib to a variety of problems, including classification or recommendation
The Ethical Algorithm: The Science of Socially Aware Algorithm Design
Michael Kearns - 2019
Algorithms have made our lives more efficient, more entertaining, and, sometimes, better informed. At the same time, complex algorithms are increasingly violating the basic rights of individual citizens. Allegedly anonymized datasets routinely leak our most sensitive personal information; statistical models for everything from mortgages to college admissions reflect racial and gender bias. Meanwhile, users manipulate algorithms to "game" search engines, spam filters, online reviewing services, and navigation apps.Understanding and improving the science behind the algorithms that run our lives is rapidly becoming one of the most pressing issues of this century. Traditional fixes, such as laws, regulations and watchdog groups, have proven woefully inadequate. Reporting from the cutting edge of scientific research, The Ethical Algorithm offers a new approach: a set of principled solutions based on the emerging and exciting science of socially aware algorithm design. Michael Kearns and Aaron Roth explain how we can better embed human principles into machine code - without halting the advance of data-driven scientific exploration. Weaving together innovative research with stories of citizens, scientists, and activists on the front lines, The Ethical Algorithm offers a compelling vision for a future, one in which we can better protect humans from the unintended impacts of algorithms while continuing to inspire wondrous advances in technology.
Smart Machines: IBM's Watson and the Era of Cognitive Computing
John E. Kelly III - 2013
The victory of IBM's Watson on the television quiz show Jeopardy! revealed how scientists and engineers at IBM and elsewhere are pushing the boundaries of science and technology to create machines that sense, learn, reason, and interact with people in new ways to provide insight and advice.In Smart Machines, John E. Kelly III, director of IBM Research, and Steve Hamm, a writer at IBM and a former business and technology journalist, introduce the fascinating world of "cognitive systems" to general audiences and provide a window into the future of computing. Cognitive systems promise to penetrate complexity and assist people and organizations in better decision making. They can help doctors evaluate and treat patients, augment the ways we see, anticipate major weather events, and contribute to smarter urban planning. Kelly and Hamm's comprehensive perspective describes this technology inside and out and explains how it will help us conquer the harnessing and understanding of "big data," one of the major computing challenges facing businesses and governments in the coming decades. Absorbing and impassioned, their book will inspire governments, academics, and the global tech industry to work together to power this exciting wave in innovation.
Quantum Computing Since Democritus
Scott Aaronson - 2013
Full of insights, arguments and philosophical perspectives, the book covers an amazing array of topics. Beginning in antiquity with Democritus, it progresses through logic and set theory, computability and complexity theory, quantum computing, cryptography, the information content of quantum states and the interpretation of quantum mechanics. There are also extended discussions about time travel, Newcomb's Paradox, the anthropic principle and the views of Roger Penrose. Aaronson's informal style makes this fascinating book accessible to readers with scientific backgrounds, as well as students and researchers working in physics, computer science, mathematics and philosophy.
Experimentation Works: The Surprising Power of Business Experiments
Stefan H. Thomke - 2020
Whether it's improving customer experiences, trying out new business models, or developing new products, even the most experienced managers often get it wrong. This is especially true in the online world, where predicting customer behavior is virtually impossible.Managers can, however, discover whether a new product, service, or business model will fail or succeed--by subjecting it to rigorous experimentation. Think about it. A pharmaceutical company would never introduce a new drug without first conducting a round of experiments based on established scientific protocols. Yet that's essentially what many companies do when they roll out new products and services.As Harvard Business School professor Stefan Thomke shows in this eye-opening and essential book, the "best guess" approach to innovation is changing fast. There are now leading companies that conduct more than ten thousand online controlled experiments annually, engaging millions of users. These organizations have discovered that an "experiment with everything" approach has a big payoff, giving them a considerable competitive advantage.How can you do this at your company? Leaders and managers need to create an "experimentation organization" that masters the science of testing and puts the discipline of experimentation at the center of the innovation process. It used to take companies years to build the infrastructure and develop the expertise to run hundreds of experiments each day. But Thomke shows how, with advances in technology, these capabilities are at the fingertips of almost any business professional. By combining the power of software and the rigor of controlled experiments, today's managers can make better decisions, create better customer experiences, and generate huge financial returns.Filled with engaging and instructive stories of leading experimentation organizations, Experimentation Works will be your guidebook to a truly new way of thinking and innovating.
Super Crunchers: Why Thinking-By-Numbers Is the New Way to Be Smart
Ian Ayres - 2007
In this lively and groundbreaking new book, economist Ian Ayres shows how today's best and brightest organizations are analyzing massive databases at lightening speed to provide greater insights into human behavior. They are the Super Crunchers. From internet sites like Google and Amazon that know your tastes better than you do, to a physician's diagnosis and your child's education, to boardrooms and government agencies, this new breed of decision makers are calling the shots. And they are delivering staggeringly accurate results. How can a football coach evaluate a player without ever seeing him play? Want to know whether the price of an airline ticket will go up or down before you buy? How can a formula outpredict wine experts in determining the best vintages? Super crunchers have the answers. In this brave new world of equation versus expertise, Ayres shows us the benefits and risks, who loses and who wins, and how super crunching can be used to help, not manipulate us.Gone are the days of solely relying on intuition to make decisions. No businessperson, consumer, or student who wants to stay ahead of the curve should make another keystroke without reading Super Crunchers.
The Humane Interface: New Directions for Designing Interactive Systems
Jef Raskin - 2000
The Humane Interface is a gourmet dish from a master chef. Five mice! --Jakob Nielsen, Nielsen Norman Group Author of Designing Web Usability: The Practice of Simplicity This unique guide to interactive system design reflects the experience and vision of Jef Raskin, the creator of the Apple Macintosh. Other books may show how to use todays widgets and interface ideas effectively. Raskin, however, demonstrates that many current interface paradigms are dead ends, and that to make computers significantly easier to use requires new approaches. He explains how to effect desperately needed changes, offering a wealth of innovative and specific interface ideas for software designers, developers, and product managers. The Apple Macintosh helped to introduce a previous revolution in computer interface design, drawing on the best available technology to establish many of the interface techniques and methods now universal in the computer industry. With this book, Raskin proves again both his farsightedness and his practicality. He also demonstrates how design ideas must be bui