Book picks similar to
Big Data Now: 2012 Edition by O'Reilly Media Inc.
technology
non-fiction
tech
big-data
Planning for Big Data
Edd Wilder-James - 2004
From creating new data-driven products through to increasing operational efficiency, big data has the potential to makeyour organization both more competitive and more innovative.As this emerging field transitions from the bleeding edge to enterprise infrastructure, it's vital to understand not only the technologies involved, but the organizational and cultural demands of being data-driven.Written by O'Reilly Radar's experts on big data, this anthology describes:- The broad industry changes heralded by the big data era- What big data is, what it means to your business, and how to start solving data problems- The software that makes up the Hadoop big data stack, and the major enterprise vendors' Hadoop solutions- The landscape of NoSQL databases and their relative merits- How visualization plays an important part in data work
Disruptive Possibilities: How Big Data Changes Everything
Jeffrey Needham - 2013
As author Jeffrey Needham points out in this eye-opening book, big data can provide unprecedented insight into user habits, giving enterprises a huge market advantage. It will also inspire organizations to change the way they function."Disruptive Possibilities: How Big Data Changes Everything" takes you on a journey of discovery into the emerging world of big data, from its relatively simple technology to the ways it differs from cloud computing. But the big story of big data is the disruption of enterprise status quo, especially vendor-driven technology silos and budget-driven departmental silos. In the highly collaborative environment needed to make big data work, silos simply don't fit.Internet-scale computing offers incredible opportunity and a tremendous challenge--and it will soon become standard operating procedure in the enterprise. This book shows you what to expect.
Hadoop Explained
Aravind Shenoy - 2014
Hadoop allowed small and medium sized companies to store huge amounts of data on cheap commodity servers in racks. The introduction of Big Data has allowed businesses to make decisions based on quantifiable analysis. Hadoop is now implemented in major organizations such as Amazon, IBM, Cloudera, and Dell to name a few. This book introduces you to Hadoop and to concepts such as ‘MapReduce’, ‘Rack Awareness’, ‘Yarn’ and ‘HDFS Federation’, which will help you get acquainted with the technology.
Big Data Now: Current Perspectives from O'Reilly Radar
O'Reilly Radar Team - 2011
Mike Loukides kicked things off in June 2010 with “What is data science?” and from there we’ve pursued the various threads and themes that naturally emerged. Now, roughly a year later, we can look back over all we’ve covered and identify a number of core data areas: Data issues -- The opportunities and ambiguities of the data space are evident in discussions around privacy, the implications of data-centric industries, and the debate about the phrase “data science” itself. The application of data: products and processes – A “data product” can emerge from virtually any domain, including everything from data startups to established enterprises to media/journalism to education and research. Data science and data tools -- The tools and technologies that drive data science are of course essential to this space, but the varied techniques being applied are also key to understanding the big data arena.The business of data – Take a closer look at the actions connected to data -- the finding, organizing, and analyzing that provide organizations of all sizes with the information they need to compete.
Real-Time Big Data Analytics: Emerging Architecture
Mike Barlow - 2013
The data world was revolutionized a few years ago when Hadoop and other tools made it possible to getthe results from queries in minutes. But the revolution continues. Analysts now demand sub-second, near real-time query results. Fortunately, we have the tools to deliver them. This report examines tools and technologies that are driving real-time big data analytics.
Data Science for Business: What you need to know about data mining and data-analytic thinking
Foster Provost - 2013
This guide also helps you understand the many data-mining techniques in use today.Based on an MBA course Provost has taught at New York University over the past ten years, Data Science for Business provides examples of real-world business problems to illustrate these principles. You’ll not only learn how to improve communication between business stakeholders and data scientists, but also how participate intelligently in your company’s data science projects. You’ll also discover how to think data-analytically, and fully appreciate how data science methods can support business decision-making.Understand how data science fits in your organization—and how you can use it for competitive advantageTreat data as a business asset that requires careful investment if you’re to gain real valueApproach business problems data-analytically, using the data-mining process to gather good data in the most appropriate wayLearn general concepts for actually extracting knowledge from dataApply data science principles when interviewing data science job candidates
Data Science from Scratch: First Principles with Python
Joel Grus - 2015
In this book, you’ll learn how many of the most fundamental data science tools and algorithms work by implementing them from scratch.
If you have an aptitude for mathematics and some programming skills, author Joel Grus will help you get comfortable with the math and statistics at the core of data science, and with hacking skills you need to get started as a data scientist. Today’s messy glut of data holds answers to questions no one’s even thought to ask. This book provides you with the know-how to dig those answers out.
Get a crash course in Python
Learn the basics of linear algebra, statistics, and probability—and understand how and when they're used in data science
Collect, explore, clean, munge, and manipulate data
Dive into the fundamentals of machine learning
Implement models such as k-nearest Neighbors, Naive Bayes, linear and logistic regression, decision trees, neural networks, and clustering
Explore recommender systems, natural language processing, network analysis, MapReduce, and databases
Data and Goliath: The Hidden Battles to Collect Your Data and Control Your World
Bruce Schneier - 2015
Your online and in-store purchasing patterns are recorded, and reveal if you're unemployed, sick, or pregnant. Your e-mails and texts expose your intimate and casual friends. Google knows what you’re thinking because it saves your private searches. Facebook can determine your sexual orientation without you ever mentioning it.The powers that surveil us do more than simply store this information. Corporations use surveillance to manipulate not only the news articles and advertisements we each see, but also the prices we’re offered. Governments use surveillance to discriminate, censor, chill free speech, and put people in danger worldwide. And both sides share this information with each other or, even worse, lose it to cybercriminals in huge data breaches.Much of this is voluntary: we cooperate with corporate surveillance because it promises us convenience, and we submit to government surveillance because it promises us protection. The result is a mass surveillance society of our own making. But have we given up more than we’ve gained? In Data and Goliath, security expert Bruce Schneier offers another path, one that values both security and privacy. He brings his bestseller up-to-date with a new preface covering the latest developments, and then shows us exactly what we can do to reform government surveillance programs, shake up surveillance-based business models, and protect our individual privacy. You'll never look at your phone, your computer, your credit cards, or even your car in the same way again.
Python for Data Analysis
Wes McKinney - 2011
It is also a practical, modern introduction to scientific computing in Python, tailored for data-intensive applications. This is a book about the parts of the Python language and libraries you'll need to effectively solve a broad set of data analysis problems. This book is not an exposition on analytical methods using Python as the implementation language.Written by Wes McKinney, the main author of the pandas library, this hands-on book is packed with practical cases studies. It's ideal for analysts new to Python and for Python programmers new to scientific computing.Use the IPython interactive shell as your primary development environmentLearn basic and advanced NumPy (Numerical Python) featuresGet started with data analysis tools in the pandas libraryUse high-performance tools to load, clean, transform, merge, and reshape dataCreate scatter plots and static or interactive visualizations with matplotlibApply the pandas groupby facility to slice, dice, and summarize datasetsMeasure data by points in time, whether it's specific instances, fixed periods, or intervalsLearn how to solve problems in web analytics, social sciences, finance, and economics, through detailed examples
Building Data Science Teams
D.J. Patil - 2011
In this in-depth report, data scientist DJ Patil explains the skills, perspectives, tools and processes that position data science teams for success.Topics include: What it means to be "data driven." The unique roles of data scientists. The four essential qualities of data scientists. Patil's first-hand experience building the LinkedIn data science team.
An Introduction to Statistical Learning: With Applications in R
Gareth James - 2013
This book presents some of the most important modeling and prediction techniques, along with relevant applications. Topics include linear regression, classification, resampling methods, shrinkage approaches, tree- based methods, support vector machines, clustering, and more. Color graphics and real-world examples are used to illustrate the methods presented. Since the goal of this textbook is to facilitate the use of these statistical learning techniques by practitioners in science, industry, and other fields, each chapter contains a tutorial on implementing the analyses and methods presented in R, an extremely popular open source statistical software platform. Two of the authors co-wrote The Elements of Statistical Learning (Hastie, Tibshirani and Friedman, 2nd edition 2009), a popular reference book for statistics and machine learning researchers. An Introduction to Statistical Learning covers many of the same topics, but at a level accessible to a much broader audience. This book is targeted at statisticians and non-statisticians alike who wish to use cutting-edge statistical learning techniques to analyze their data. The text assumes only a previous course in linear regression and no knowledge of matrix algebra.
Big Data: A Revolution That Will Transform How We Live, Work, and Think
Viktor Mayer-Schönberger - 2013
“Big data” refers to our burgeoning ability to crunch vast collections of information, analyze it instantly, and draw sometimes profoundly surprising conclusions from it. This emerging science can translate myriad phenomena—from the price of airline tickets to the text of millions of books—into searchable form, and uses our increasing computing power to unearth epiphanies that we never could have seen before. A revolution on par with the Internet or perhaps even the printing press, big data will change the way we think about business, health, politics, education, and innovation in the years to come. It also poses fresh threats, from the inevitable end of privacy as we know it to the prospect of being penalized for things we haven’t even done yet, based on big data’s ability to predict our future behavior.In this brilliantly clear, often surprising work, two leading experts explain what big data is, how it will change our lives, and what we can do to protect ourselves from its hazards. Big Data is the first big book about the next big thing.www.big-data-book.com
The Visual Display of Quantitative Information
Edward R. Tufte - 1983
Theory and practice in the design of data graphics, 250 illustrations of the best (and a few of the worst) statistical graphics, with detailed analysis of how to display data for precise, effective, quick analysis. Design of the high-resolution displays, small multiples. Editing and improving graphics. The data-ink ratio. Time-series, relational graphics, data maps, multivariate designs. Detection of graphical deception: design variation vs. data variation. Sources of deception. Aesthetics and data graphical displays. This is the second edition of The Visual Display of Quantitative Information. Recently published, this new edition provides excellent color reproductions of the many graphics of William Playfair, adds color to other images, and includes all the changes and corrections accumulated during 17 printings of the first edition.