Data Science for Business: What you need to know about data mining and data-analytic thinking


Foster Provost - 2013
    This guide also helps you understand the many data-mining techniques in use today.Based on an MBA course Provost has taught at New York University over the past ten years, Data Science for Business provides examples of real-world business problems to illustrate these principles. You’ll not only learn how to improve communication between business stakeholders and data scientists, but also how participate intelligently in your company’s data science projects. You’ll also discover how to think data-analytically, and fully appreciate how data science methods can support business decision-making.Understand how data science fits in your organization—and how you can use it for competitive advantageTreat data as a business asset that requires careful investment if you’re to gain real valueApproach business problems data-analytically, using the data-mining process to gather good data in the most appropriate wayLearn general concepts for actually extracting knowledge from dataApply data science principles when interviewing data science job candidates

Data Smart: Using Data Science to Transform Information into Insight


John W. Foreman - 2013
    Major retailers are predicting everything from when their customers are pregnant to when they want a new pair of Chuck Taylors. It's a brave new world where seemingly meaningless data can be transformed into valuable insight to drive smart business decisions.But how does one exactly do data science? Do you have to hire one of these priests of the dark arts, the "data scientist," to extract this gold from your data? Nope.Data science is little more than using straight-forward steps to process raw data into actionable insight. And in Data Smart, author and data scientist John Foreman will show you how that's done within the familiar environment of a spreadsheet. Why a spreadsheet? It's comfortable! You get to look at the data every step of the way, building confidence as you learn the tricks of the trade. Plus, spreadsheets are a vendor-neutral place to learn data science without the hype. But don't let the Excel sheets fool you. This is a book for those serious about learning the analytic techniques, the math and the magic, behind big data.Each chapter will cover a different technique in a spreadsheet so you can follow along: - Mathematical optimization, including non-linear programming and genetic algorithms- Clustering via k-means, spherical k-means, and graph modularity- Data mining in graphs, such as outlier detection- Supervised AI through logistic regression, ensemble models, and bag-of-words models- Forecasting, seasonal adjustments, and prediction intervals through monte carlo simulation- Moving from spreadsheets into the R programming languageYou get your hands dirty as you work alongside John through each technique. But never fear, the topics are readily applicable and the author laces humor throughout. You'll even learn what a dead squirrel has to do with optimization modeling, which you no doubt are dying to know.

An Introduction to Statistical Learning: With Applications in R


Gareth James - 2013
    This book presents some of the most important modeling and prediction techniques, along with relevant applications. Topics include linear regression, classification, resampling methods, shrinkage approaches, tree- based methods, support vector machines, clustering, and more. Color graphics and real-world examples are used to illustrate the methods presented. Since the goal of this textbook is to facilitate the use of these statistical learning techniques by practitioners in science, industry, and other fields, each chapter contains a tutorial on implementing the analyses and methods presented in R, an extremely popular open source statistical software platform. Two of the authors co-wrote The Elements of Statistical Learning (Hastie, Tibshirani and Friedman, 2nd edition 2009), a popular reference book for statistics and machine learning researchers. An Introduction to Statistical Learning covers many of the same topics, but at a level accessible to a much broader audience. This book is targeted at statisticians and non-statisticians alike who wish to use cutting-edge statistical learning techniques to analyze their data. The text assumes only a previous course in linear regression and no knowledge of matrix algebra.

The Visual Display of Quantitative Information


Edward R. Tufte - 1983
    Theory and practice in the design of data graphics, 250 illustrations of the best (and a few of the worst) statistical graphics, with detailed analysis of how to display data for precise, effective, quick analysis. Design of the high-resolution displays, small multiples. Editing and improving graphics. The data-ink ratio. Time-series, relational graphics, data maps, multivariate designs. Detection of graphical deception: design variation vs. data variation. Sources of deception. Aesthetics and data graphical displays. This is the second edition of The Visual Display of Quantitative Information. Recently published, this new edition provides excellent color reproductions of the many graphics of William Playfair, adds color to other images, and includes all the changes and corrections accumulated during 17 printings of the first edition.

Storytelling with Data: A Data Visualization Guide for Business Professionals


Cole Nussbaumer Knaflic - 2015
    You'll discover the power of storytelling and the way to make data a pivotal point in your story. The lessons in this illuminative text are grounded in theory, but made accessible through numerous real-world examples--ready for immediate application to your next graph or presentation.Storytelling is not an inherent skill, especially when it comes to data visualization, and the tools at our disposal don't make it any easier. This book demonstrates how to go beyond conventional tools to reach the root of your data, and how to use your data to create an engaging, informative, compelling story. Specifically, you'll learn how to:Understand the importance of context and audience Determine the appropriate type of graph for your situation Recognize and eliminate the clutter clouding your information Direct your audience's attention to the most important parts of your data Think like a designer and utilize concepts of design in data visualization Leverage the power of storytelling to help your message resonate with your audience Together, the lessons in this book will help you turn your data into high impact visual stories that stick with your audience. Rid your world of ineffective graphs, one exploding 3D pie chart at a time. There is a story in your data--Storytelling with Data will give you the skills and power to tell it!

Python for Data Analysis


Wes McKinney - 2011
    It is also a practical, modern introduction to scientific computing in Python, tailored for data-intensive applications. This is a book about the parts of the Python language and libraries you'll need to effectively solve a broad set of data analysis problems. This book is not an exposition on analytical methods using Python as the implementation language.Written by Wes McKinney, the main author of the pandas library, this hands-on book is packed with practical cases studies. It's ideal for analysts new to Python and for Python programmers new to scientific computing.Use the IPython interactive shell as your primary development environmentLearn basic and advanced NumPy (Numerical Python) featuresGet started with data analysis tools in the pandas libraryUse high-performance tools to load, clean, transform, merge, and reshape dataCreate scatter plots and static or interactive visualizations with matplotlibApply the pandas groupby facility to slice, dice, and summarize datasetsMeasure data by points in time, whether it's specific instances, fixed periods, or intervalsLearn how to solve problems in web analytics, social sciences, finance, and economics, through detailed examples

Creating a Data-Driven Organization: Practical Advice from the Trenches


Carl Anderson - 2015
    This practical book shows you how true data-drivenness involves processes that require genuine buy-in across your company, from analysts and management to the C-Suite and the board.Through interviews and examples from data scientists and analytics leaders in a variety of industries, author Carl Anderson explains the analytics value chain you need to adopt when building predictive business models—from data collection and analysis to the insights and leadership that drive concrete actions. You’ll learn what works and what doesn’t, and why creating a data-driven culture throughout your organization is essential. Start from the bottom up: learn how to collect the right data the right way Hire analysts with the right skills, and organize them into teams Examine statistical and visualization tools, and fact-based story-telling methods Collect and analyze data while respecting privacy and ethics Understand how analysts and their managers can help spur a data-driven culture Learn the importance of data leadership and C-level positions such as chief data officer and chief analytics officer

Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, or Die


Eric Siegel - 2013
    Rather than a "how to" for hands-on techies, the book entices lay-readers and experts alike by covering new case studies and the latest state-of-the-art techniques.You have been predicted — by companies, governments, law enforcement, hospitals, and universities. Their computers say, "I knew you were going to do that!" These institutions are seizing upon the power to predict whether you're going to click, buy, lie, or die.Why? For good reason: predicting human behavior combats financial risk, fortifies healthcare, conquers spam, toughens crime fighting, and boosts sales.How? Prediction is powered by the world's most potent, booming unnatural resource: data. Accumulated in large part as the by-product of routine tasks, data is the unsalted, flavorless residue deposited en masse as organizations churn away. Surprise! This heap of refuse is a gold mine. Big data embodies an extraordinary wealth of experience from which to learn.Predictive analytics unleashes the power of data. With this technology, the computer literally learns from data how to predict the future behavior of individuals. Perfect prediction is not possible, but putting odds on the future — lifting a bit of the fog off our hazy view of tomorrow — means pay dirt.In this rich, entertaining primer, former Columbia University professor and Predictive Analytics World founder Eric Siegel reveals the power and perils of prediction: -What type of mortgage risk Chase Bank predicted before the recession. -Predicting which people will drop out of school, cancel a subscription, or get divorced before they are even aware of it themselves. -Why early retirement decreases life expectancy and vegetarians miss fewer flights. -Five reasons why organizations predict death, including one health insurance company. -How U.S. Bank, European wireless carrier Telenor, and Obama's 2012 campaign calculated the way to most strongly influence each individual. -How IBM's Watson computer used predictive modeling to answer questions and beat the human champs on TV's Jeopardy! -How companies ascertain untold, private truths — how Target figures out you're pregnant and Hewlett-Packard deduces you're about to quit your job. -How judges and parole boards rely on crime-predicting computers to decide who stays in prison and who goes free. -What's predicted by the BBC, Citibank, ConEd, Facebook, Ford, Google, IBM, the IRS, Match.com, MTV, Netflix, Pandora, PayPal, Pfizer, and Wikipedia. A truly omnipresent science, predictive analytics affects everyone, every day. Although largely unseen, it drives millions of decisions, determining whom to call, mail, investigate, incarcerate, set up on a date, or medicate.Predictive analytics transcends human perception. This book's final chapter answers the riddle: What often happens to you that cannot be witnessed, and that you can't even be sure has happened afterward — but that can be predicted in advance?Whether you are a consumer of it — or consumed by it — get a handle on the power of Predictive Analytics.

ggplot2: Elegant Graphics for Data Analysis


Hadley Wickham - 2009
    1 Welcome to ggplot2 ggplot2 is an R package for producing statistical, or data, graphics, but it is unlike most other graphics packages because it has a deep underlying grammar. This grammar, based on the Grammar of Graphics (Wilkinson, 2005), is composed of a set of independent components that can be composed in many di?erent ways. This makesggplot2 very powerful, because you are not limited to a set of pre-speci?ed graphics, but you can create new graphics that are precisely tailored for your problem. This may sound overwhelming, but because there is a simple set of core principles and very few special cases, ggplot2 is also easy to learn (although it may take a little time to forget your preconceptions from other graphics tools). Practically, ggplot2 provides beautiful, hassle-free plots, that take care of ?ddly details like drawing legends. The plots can be built up iteratively and edited later. A carefully chosen set of defaults means that most of the time you can produce a publication-quality graphic in seconds, but if you do have special formatting requirements, a comprehensive theming system makes it easy to do what you want. Instead of spending time making your graph look pretty, you can focus on creating a graph that best reveals the messages in your data

Competing on Analytics: The New Science of Winning


Thomas H. Davenport - 2007
    But are you using it to “out-think” your rivals? If not, you may be missing out on a potent competitive tool.In Competing on Analytics: The New Science of Winning, Thomas H. Davenport and Jeanne G. Harris argue that the frontier for using data to make decisions has shifted dramatically. Certain high-performing enterprises are now building their competitive strategies around data-driven insights that in turn generate impressive business results. Their secret weapon? Analytics: sophisticated quantitative and statistical analysis and predictive modeling.Exemplars of analytics are using new tools to identify their most profitable customers and offer them the right price, to accelerate product innovation, to optimize supply chains, and to identify the true drivers of financial performance. A wealth of examples—from organizations as diverse as Amazon, Barclay’s, Capital One, Harrah’s, Procter & Gamble, Wachovia, and the Boston Red Sox—illuminate how to leverage the power of analytics.

Planning for Big Data


Edd Wilder-James - 2004
    From creating new data-driven products through to increasing operational efficiency, big data has the potential to makeyour organization both more competitive and more innovative.As this emerging field transitions from the bleeding edge to enterprise infrastructure, it's vital to understand not only the technologies involved, but the organizational and cultural demands of being data-driven.Written by O'Reilly Radar's experts on big data, this anthology describes:- The broad industry changes heralded by the big data era- What big data is, what it means to your business, and how to start solving data problems- The software that makes up the Hadoop big data stack, and the major enterprise vendors' Hadoop solutions- The landscape of NoSQL databases and their relative merits- How visualization plays an important part in data work

Predictive Analytics for Dummies


Anasse Bari - 2013
    Predictive Analytics For Dummies explores the power of predictive analytics and how you can use it to make valuable predictions for your business, or in fields such as advertising, fraud detection, politics, and others. This practical book does not bog you down with loads of mathematical or scientific theory, but instead helps you quickly see how to use the right algorithms and tools to collect and analyze data and apply it to make predictions.Topics include using structured and unstructured data, building models, creating a predictive analysis roadmap, setting realistic goals, budgeting, and much more.Shows readers how to use Big Data and data mining to discover patterns and make predictions for tech-savvy businesses Helps readers see how to shepherd predictive analytics projects through their companies Explains just enough of the science and math, but also focuses on practical issues such as protecting project budgets, making good presentations, and more Covers nuts-and-bolts topics including predictive analytics basics, using structured and unstructured data, data mining, and algorithms and techniques for analyzing data Also covers clustering, association, and statistical models; creating a predictive analytics roadmap; and applying predictions to the web, marketing, finance, health care, and elsewhere Propose, produce, and protect predictive analytics projects through your company with Predictive Analytics For Dummies.

Hands-On Machine Learning with Scikit-Learn and TensorFlow


Aurélien Géron - 2017
    Now that machine learning is thriving, even programmers who know close to nothing about this technology can use simple, efficient tools to implement programs capable of learning from data. This practical book shows you how.By using concrete examples, minimal theory, and two production-ready Python frameworks—Scikit-Learn and TensorFlow—author Aurélien Géron helps you gain an intuitive understanding of the concepts and tools for building intelligent systems. You’ll learn how to use a range of techniques, starting with simple Linear Regression and progressing to Deep Neural Networks. If you have some programming experience and you’re ready to code a machine learning project, this guide is for you.This hands-on book shows you how to use:Scikit-Learn, an accessible framework that implements many algorithms efficiently and serves as a great machine learning entry pointTensorFlow, a more complex library for distributed numerical computation, ideal for training and running very large neural networksPractical code examples that you can apply without learning excessive machine learning theory or algorithm details

Big Data Now: 2012 Edition


O'Reilly Media Inc. - 2012
    It's not just a technical book or just a businessguide. Data is ubiquitous and it doesn't pay much attention toborders, so we've calibrated our coverage to follow it wherever itgoes.In the first edition of Big Data Now, the O'Reilly team tracked thebirth and early development of data tools and data science. Now, withthis second edition, we're seeing what happens when big data grows up:how it's being applied, where it's playing a role, and theconsequences -- good and bad alike -- of data's ascendance.We've organized the second edition of Big Data Now into five areas:Getting Up to Speed With Big Data -- Essential information on thestructures and definitions of big data.Big Data Tools, Techniques, and Strategies -- Expert guidance forturning big data theories into big data products.The Application of Big Data -- Examples of big data in action,including a look at the downside of data.What to Watch for in Big Data -- Thoughts on how big data will evolveand the role it will play across industries and domains.Big Data and Health Care -- A special section exploring thepossibilities that arise when data and health care come together.

Information Dashboard Design: The Effective Visual Communication of Data


Stephen Few - 2006
    Although dashboards are potentially powerful, this potential is rarely realized. The greatest display technology in the world won't solve this if you fail to use effective visual design. And if a dashboard fails to tell you precisely what you need to know in an instant, you'll never use it, even if it's filled with cute gauges, meters, and traffic lights. Don't let your investment in dashboard technology go to waste.This book will teach you the visual design skills you need to create dashboards that communicate clearly, rapidly, and compellingly. Information Dashboard Design will explain how to:Avoid the thirteen mistakes common to dashboard design Provide viewers with the information they need quickly and clearly Apply what we now know about visual perception to the visual presentation of information Minimize distractions, cliches, and unnecessary embellishments that create confusion Organize business information to support meaning and usability Create an aesthetically pleasing viewing experience Maintain consistency of design to provide accurate interpretation Optimize the power of dashboard technology by pairing it with visual effectiveness Stephen Few has over 20 years of experience as an IT innovator, consultant, and educator. As Principal of the consultancy Perceptual Edge, Stephen focuses on data visualization for analyzing and communicating quantitative business information. He provides consulting and training services, speaks frequently at conferences, and teaches in the MBA program at the University of California in Berkeley. He is also the author of Show Me the Numbers: Designing Tables and Graphs to Enlighten. Visit his website at www.perceptualedge.com.