
Book picks similar to
Avoiding Data Pitfalls: How to Steer Clear of Common Blunders When Working with Data and Presenting Analysis and Visualizations by Ben Jones

data

non-fiction

work

data-visualization

Hands-On Machine Learning with Scikit-Learn and TensorFlow

Aurélien Géron - 2017

A series of Deep Learning breakthroughs have boosted the whole field of machine learning over the last decade.

Now that machine learning is thriving, even programmers who know close to nothing about this technology can use simple, efficient tools to implement programs capable of learning from data. This practical book shows you how.

By using concrete examples, minimal theory, and two production-ready Python frameworks (Scikit-Learn and TensorFlow), author Aurélien Géron helps you gain an intuitive understanding of the concepts and tools for building intelligent systems. You’ll learn how to use a range of techniques, starting with simple Linear Regression and progressing to Deep Neural Networks. If you have some programming experience and you’re ready to code a machine learning project, this guide is for you.

This hands-on book shows you how to use:
* Scikit-Learn, an accessible framework that implements many algorithms efficiently and serves as a great machine learning entry point
* TensorFlow, a more complex library for distributed numerical computation, ideal for training and running very large neural networks
* Practical code examples that you can apply without learning excessive machine learning theory or algorithm details

Naked Statistics: Stripping the Dread from the Data

Charles Wheelan - 2012

Once considered tedious, the field of statistics is rapidly evolving into a discipline Hal Varian, chief economist at Google, has actually called “sexy.” From batting averages and political polls to game shows and medical research, the real-world application of statistics continues to grow by leaps and bounds.

How can we catch schools that cheat on standardized tests? How does Netflix know which movies you’ll like? What is causing the rising incidence of autism? As best-selling author Charles Wheelan shows us in Naked Statistics, the right data and a few well-chosen statistical tools can help us answer these questions and more.

For those who slept through Stats 101, this book is a lifesaver. Wheelan strips away the arcane and technical details and focuses on the underlying intuition that drives statistical analysis. He clarifies key concepts such as inference, correlation, and regression analysis, reveals how biased or careless parties can manipulate or misrepresent data, and shows us how brilliant and creative researchers are exploiting the valuable data from natural experiments to tackle thorny questions.

And in Wheelan’s trademark style, there’s not a dull page in sight. You’ll encounter clever Schlitz Beer marketers leveraging basic probability, an International Sausage Festival illuminating the tenets of the central limit theorem, and a head-scratching choice from the famous game show Let’s Make a Deal, and you’ll come away with insights each time. With the wit, accessibility, and sheer fun that turned Naked Economics into a bestseller, Wheelan defies the odds yet again by bringing another essential, formerly unglamorous discipline to life.

Introducing Microsoft Power BI

Alberto Ferrari - 2016

Get started quickly with Microsoft Power BI! Experts Alberto Ferrari and Marco Russo will help you bring your data to life, transforming your company’s data into rich visuals for you to collect and organize, allowing you to focus on what matters most to you.

Stay in the know, spot trends as they happen, and push your business to new limits. This e-book introduces Microsoft Power BI basics through a practical, scenario-based guided tour of the tool, showing you how to build analytical solutions using Power BI. Get an overview of Power BI, or dig deeper and follow along on your PC using the book's examples.

Information Graphics

Sandra Rendgen - 2011

How complex ideas can be communicated via graphics

“If you can’t explain it simply, you don’t understand it well enough.” - Albert Einstein

Our everyday lives are filled with a massive flow of information that we must interpret in order to understand the world we live in.

Considering this complex variety of data floating around us, sometimes the best (or even only) way to communicate is visually. This unique book presents a fascinating historical perspective on the subject, highlighting the work of the masters of the profession who have created a number of breakthroughs that have changed the way we communicate. Information Graphics has been conceived and designed not just for designers or graphics professionals, but for anyone interested in the history and practice of communicating visually.

The in-depth introductory section, illustrated with over 60 images (each accompanied by an explanatory caption), features essays by Sandra Rendgen, Paolo Ciuccarelli, Richard Saul Wurman, and Simon Rogers; looking back all the way to primitive cave paintings as a means of communication, this introductory section gives readers an excellent overview of the subject. The second part of the book is entirely dedicated to contemporary works by the current most renowned professionals, presenting 200 graphics projects with over 400 examples, each with a fact sheet and an explanation of methods and objectives, divided into chapters by the subjects Location, Time, Category, and Hierarchy.

Features:
* 200 projects and over 400 examples of contemporary information graphics from all over the world, ranging from journalism to art, government, education, business and much more
* Historical essays about the development of information graphics since its beginnings
* Exclusive poster (673 x 475 mm / 26.5 x 18.7 in) by Nigel Holmes, who during his 20 years as graphics director for TIME revolutionized the way the magazine used information graphics

Deep Learning

machine-learning

computer-science

data-science

Ian Goodfellow - 2016

An introduction to a broad range of topics in deep learning, covering mathematical and conceptual background, deep learning techniques used in industry, and research perspectives.

Deep learning is a form of machine learning that enables computers to learn from experience and understand the world in terms of a hierarchy of concepts.

Because the computer gathers knowledge from experience, there is no need for a human computer operator to formally specify all the knowledge that the computer needs. The hierarchy of concepts allows the computer to learn complicated concepts by building them out of simpler ones; a graph of these hierarchies would be many layers deep. This book introduces a broad range of topics in deep learning.

The text offers mathematical and conceptual background, covering relevant concepts in linear algebra, probability theory and information theory, numerical computation, and machine learning. It describes deep learning techniques used by practitioners in industry, including deep feedforward networks, regularization, optimization algorithms, convolutional networks, sequence modeling, and practical methodology; and it surveys such applications as natural language processing, speech recognition, computer vision, online recommendation systems, bioinformatics, and videogames. Finally, the book offers research perspectives, covering such theoretical topics as linear factor models, autoencoders, representation learning, structured probabilistic models, Monte Carlo methods, the partition function, approximate inference, and deep generative models.

Deep Learning can be used by undergraduate or graduate students planning careers in either industry or research, and by software engineers who want to begin using deep learning in their products or platforms. A website offers supplementary material for both readers and instructors.

The Data Warehouse Toolkit: The Complete Guide to Dimensional Modeling

Ralph Kimball - 1996

The latest edition of the single most authoritative guide on dimensional modeling for data warehousing! Dimensional modeling has become the most widely accepted approach for data warehouse design.

Here is a complete library of dimensional modeling techniques, the most comprehensive collection ever written. Greatly expanded to cover both basic and advanced techniques for optimizing data warehouse design, this second edition to Ralph Kimball's classic guide is more than sixty percent updated.

The authors begin with fundamental design recommendations and gradually progress step-by-step through increasingly complex scenarios. Clear-cut guidelines for designing dimensional models are illustrated using real-world data warehouse case studies drawn from a variety of business application areas and industries, including:
* Retail sales and e-commerce
* Inventory management
* Procurement
* Order management
* Customer relationship management (CRM)
* Human resources management
* Accounting
* Financial services
* Telecommunications and utilities
* Education
* Transportation
* Health care and insurance

By the end of the book, you will have mastered the full range of powerful techniques for designing dimensional databases that are easy to understand and provide fast query response. You will also learn how to create an architected framework that integrates the distributed data warehouse using standardized dimensions and facts.

This book is also available as part of the Kimball's Data Warehouse Toolkit Classics Box Set (ISBN: 9780470479575) with the following 3 books:
* The Data Warehouse Toolkit, 2nd Edition (9780471200246)
* The Data Warehouse Lifecycle Toolkit, 2nd Edition (9780470149775)
* The Data Warehouse ETL Toolkit (9780764567575)

R Programming for Data Science

Roger D. Peng - 2015

Practical Statistics for Data Scientists: 50 Essential Concepts

Peter Bruce - 2017

Statistical methods are a key part of data science, yet very few data scientists have any formal statistics training.

Courses and books on basic statistics rarely cover the topic from a data science perspective. This practical guide explains how to apply various statistical methods to data science, tells you how to avoid their misuse, and gives you advice on what's important and what's not.

Many data science resources incorporate statistical methods but lack a deeper statistical perspective. If you're familiar with the R programming language, and have some exposure to statistics, this quick reference bridges the gap in an accessible, readable format.

With this book, you'll learn:
* Why exploratory data analysis is a key preliminary step in data science
* How random sampling can reduce bias and yield a higher quality dataset, even with big data
* How the principles of experimental design yield definitive answers to questions
* How to use regression to estimate outcomes and detect anomalies
* Key classification techniques for predicting which categories a record belongs to
* Statistical machine learning methods that "learn" from data
* Unsupervised learning methods for extracting meaning from unlabeled data

Data Visualization: A Practical Introduction

Kieran Healy - 2018

An accessible primer on how to create effective graphics from data

This book provides students and researchers a hands-on introduction to the principles and practice of data visualization.

It explains what makes some graphs succeed while others fail, how to make high-quality figures from data using powerful and reproducible methods, and how to think about data visualization in an honest and effective way.

Data Visualization builds the reader's expertise in ggplot2, a versatile visualization library for the R programming language. Through a series of worked examples, this accessible primer then demonstrates how to create plots piece by piece, beginning with summaries of single variables and moving on to more complex graphics. Topics include plotting continuous and categorical variables; layering information on graphics; producing effective "small multiple" plots; grouping, summarizing, and transforming data for plotting; creating maps; working with the output of statistical models; and refining plots to make them more comprehensible.

Effective graphics are essential to communicating ideas and a great way to better understand data. This book provides the practical skills students and practitioners need to visualize quantitative data and get the most out of their research findings.

* Provides hands-on instruction using R and ggplot2
* Shows how the "tidyverse" of data analysis tools makes working with R easier and more consistent
* Includes a library of data sets, code, and functions

The Pragmatic Programmer: From Journeyman to Master

Andy Hunt - 1999

Straight from the programming trenches, The Pragmatic Programmer cuts through the increasing specialization and technicalities of modern software development to examine the core process--taking a requirement and producing working, maintainable code that delights its users.

It covers topics ranging from personal responsibility and career development to architectural techniques for keeping your code flexible and easy to adapt and reuse. Read this book, and you'll learn how to:
* Fight software rot
* Avoid the trap of duplicating knowledge
* Write flexible, dynamic, and adaptable code
* Avoid programming by coincidence
* Bullet-proof your code with contracts, assertions, and exceptions
* Capture real requirements
* Test ruthlessly and effectively
* Delight your users
* Build teams of pragmatic programmers
* Make your developments more precise with automation

Written as a series of self-contained sections and filled with entertaining anecdotes, thoughtful examples, and interesting analogies, The Pragmatic Programmer illustrates the best practices and major pitfalls of many different aspects of software development. Whether you're a new coder, an experienced programmer, or a manager responsible for software projects, use these lessons daily, and you'll quickly see improvements in personal productivity, accuracy, and job satisfaction. You'll learn skills and develop habits and attitudes that form the foundation for long-term success in your career. You'll become a Pragmatic Programmer.

DAX Formulas for PowerPivot: The Excel Pro's Guide to Mastering DAX

Rob Collie - 2012

Microsoft PowerPivot is a free add-on to Excel from Microsoft that allows users to produce new kinds of reports and analyses that were simply impossible before, and this book is the first to tackle DAX formulas, the core capability of PowerPivot, from the perspective of the Excel audience.

Written by the world’s foremost PowerPivot blogger and practitioner, the book’s concepts and approach are introduced in a simple, step-by-step manner tailored to the learning style of Excel users everywhere. The techniques presented allow users to produce, in hours or even minutes, results that formerly would have taken entire teams weeks or months to produce. They include lessons on the difference between calculated columns and measures, how formulas can be reused across reports of completely different shapes, how to merge disjointed sets of data into unified reports, how to make certain columns in a pivot behave as if the pivot were filtered while other columns do not, and how to create time-intelligent calculations in pivot tables such as “Year over Year” and “Moving Averages” whether they use a standard, fiscal, or a complete custom calendar. The “pattern-like” techniques and best practices contained in this book have been developed and refined over two years of onsite training with Excel users around the world, and the key lessons from those seminars costing thousands of dollars per day are now available within the pages of this easy-to-follow guide.

Fundamentals of Data Visualization: A Primer on Making Informative and Compelling Figures

Claus O. Wilke - 2019

Effective visualization is the best way to communicate information from the increasingly large and complex datasets in the natural and social sciences.

But with the increasing power of visualization software today, scientists, engineers, and business analysts often have to navigate a bewildering array of visualization choices and options.

This practical book takes you through many commonly encountered visualization problems, and it provides guidelines on how to turn large datasets into clear and compelling figures. What visualization type is best for the story you want to tell? How do you make informative figures that are visually pleasing? Author Claus O. Wilke teaches you the elements most critical to successful data visualization.

* Explore the basic concepts of color as a tool to highlight, distinguish, or represent a value
* Understand the importance of redundant coding to ensure you provide key information in multiple ways
* Use the book's visualizations directory, a graphical guide to commonly used types of data visualizations
* Get extensive examples of good and bad figures
* Learn how to use figures in a document or report and how to employ them effectively to tell a compelling story

Data Analysis with Open Source Tools: A Hands-On Guide for Programmers and Data Scientists

Philipp K. Janert - 2010

Collecting data is relatively easy, but turning raw information into something useful requires that you know how to extract precisely what you need.

With this insightful book, intermediate to experienced programmers interested in data analysis will learn techniques for working with data in a business environment. You'll learn how to look at data to discover what it contains, how to capture those ideas in conceptual models, and then feed your understanding back into the organization through business plans, metrics dashboards, and other applications.

Along the way, you'll experiment with concepts through hands-on workshops at the end of each chapter. Above all, you'll learn how to think about the results you want to achieve -- rather than rely on tools to think for you.

* Use graphics to describe data with one, two, or dozens of variables
* Develop conceptual models using back-of-the-envelope calculations, as well as scaling and probability arguments
* Mine data with computationally intensive methods such as simulation and clustering
* Make your conclusions understandable through reports, dashboards, and other metrics programs
* Understand financial calculations, including the time-value of money
* Use dimensionality reduction techniques or predictive analytics to conquer challenging data analysis situations
* Become familiar with different open source programming environments for data analysis

"Finally, a concise reference for understanding how to conquer piles of data." -- Austin King, Senior Web Developer, Mozilla

"An indispensable text for aspiring data scientists." -- Michael E. Driscoll, CEO/Founder, Dataspora

Clean Code: A Handbook of Agile Software Craftsmanship

Robert C. Martin - 2007

Even bad code can function.

But if code isn't clean, it can bring a development organization to its knees. Every year, countless hours and significant resources are lost because of poorly written code. But it doesn't have to be that way. Noted software expert Robert C. Martin presents a revolutionary paradigm with Clean Code: A Handbook of Agile Software Craftsmanship. Martin has teamed up with his colleagues from Object Mentor to distill their best agile practice of cleaning code on the fly into a book that will instill within you the values of a software craftsman and make you a better programmer, but only if you work at it.

What kind of work will you be doing? You'll be reading code - lots of code. And you will be challenged to think about what's right about that code, and what's wrong with it. More importantly, you will be challenged to reassess your professional values and your commitment to your craft.

Clean Code is divided into three parts. The first describes the principles, patterns, and practices of writing clean code. The second part consists of several case studies of increasing complexity. Each case study is an exercise in cleaning up code - of transforming a code base that has some problems into one that is sound and efficient. The third part is the payoff: a single chapter containing a list of heuristics and "smells" gathered while creating the case studies. The result is a knowledge base that describes the way we think when we write, read, and clean code.

Readers will come away from this book understanding
‣ How to tell the difference between good and bad code
‣ How to write good code and how to transform bad code into good code
‣ How to create good names, good functions, good objects, and good classes
‣ How to format code for maximum readability
‣ How to implement complete error handling without obscuring code logic
‣ How to unit test and practice test-driven development

This book is a must for any developer, software engineer, project manager, team lead, or systems analyst with an interest in producing better code.

Pattern Recognition and Machine Learning

machine-learning

computer-science

data-science

Christopher M. Bishop - 2006

Pattern recognition has its origins in engineering, whereas machine learning grew out of computer science.

However, these activities can be viewed as two facets of the same field, and together they have undergone substantial development over the past ten years. In particular, Bayesian methods have grown from a specialist niche to become mainstream, while graphical models have emerged as a general framework for describing and applying probabilistic models. Also, the practical applicability of Bayesian methods has been greatly enhanced through the development of a range of approximate inference algorithms such as variational Bayes and expectation propagation. Similarly, new models based on kernels have had a significant impact on both algorithms and applications. This new textbook reflects these recent developments while providing a comprehensive introduction to the fields of pattern recognition and machine learning. It is aimed at advanced undergraduates or first-year PhD students, as well as researchers and practitioners, and assumes no previous knowledge of pattern recognition or machine learning concepts. Knowledge of multivariate calculus and basic linear algebra is required, and some familiarity with probabilities would be helpful though not essential as the book includes a self-contained introduction to basic probability theory.
