Book picks similar to
Beautiful Visualization: Looking at Data through the Eyes of Experts by Julie Steele
data-visualization
design
visualization
data-science
Automate the Boring Stuff with Python: Practical Programming for Total Beginners
Al Sweigart - 2014
But what if you could have your computer do them for you?In "Automate the Boring Stuff with Python," you'll learn how to use Python to write programs that do in minutes what would take you hours to do by hand no prior programming experience required. Once you've mastered the basics of programming, you'll create Python programs that effortlessly perform useful and impressive feats of automation to: Search for text in a file or across multiple filesCreate, update, move, and rename files and foldersSearch the Web and download online contentUpdate and format data in Excel spreadsheets of any sizeSplit, merge, watermark, and encrypt PDFsSend reminder emails and text notificationsFill out online formsStep-by-step instructions walk you through each program, and practice projects at the end of each chapter challenge you to improve those programs and use your newfound skills to automate similar tasks.Don't spend your time doing work a well-trained monkey could do. Even if you've never written a line of code, you can make your computer do the grunt work. Learn how in "Automate the Boring Stuff with Python.""
Database Design for Mere Mortals: A Hands-On Guide to Relational Database Design
Michael J. Hernandez - 1996
You d be up to your neck in normal forms before you even had a chance to wade. When Michael J. Hernandez needed a database design book to teach mere mortals like himself, there were none. So he began a personal quest to learn enough to write one. And he did.Now in its Second Edition, Database Design for Mere Mortals is a miracle for today s generation of database users who don t have the background -- or the time -- to learn database design the hard way. It s also a secret pleasure for working pros who are occasionally still trying to figure out what they were taught.Drawing on 13 years of database teaching experience, Hernandez has organized database design into several key principles that are surprisingly easy to understand and remember. He illuminates those principles using examples that are generic enough to help you with virtually any application.Hernandez s goals are simple. You ll learn how to create a sound database structure as easily as possible. You ll learn how to optimize your structure for efficiency and data integrity. You ll learn how to avoid problems like missing, incorrect, mismatched, or inaccurate data. You ll learn how to relate tables together to make it possible to get whatever answers you need in the future -- even if you haven t thought of the questions yet.If -- as is often the case -- you already have a database, Hernandez explains how to analyze it -- and leverage it. You ll learn how to identify new information requirements, determine new business rules that need to be applied, and apply them.Hernandez starts with an introduction to databases, relational databases, and the idea and objectives of database design. Next, you ll walk through the key elements of the database design process: establishing table structures and relationships, assigning primary keys, setting field specifications, and setting up views. Hernandez s extensive coverage of data integrity includes a full chapter on establishing business rules and using validation tables.Hernandez surveys bad design techniques in a chapter on what not to do -- and finally, helps you identify those rare instances when it makes sense to bend or even break the conventional rules of database design.There s plenty that s new in this edition. Hernandez has gone over his text and illustrations with a fine-tooth comb to improve their already impressive clarity. You ll find updates to reflect new advances in technology, including web database applications. There are expanded and improved discussions of nulls and many-to-many relationships; multivalued fields; primary keys; and SQL data type fields. There s a new Quick Reference database design flowchart. A new glossary. New review questions at the end of every chapter.Finally, it s worth mentioning what this book isn t. It isn t a guide to any specific database platform -- so you can use it whether you re running Access, SQL Server, or Oracle, MySQL or PostgreSQL. And it isn t an SQL guide. (If that s what you need, Michael J. Hernandez has also coauthored the superb SQL Queries for Mere Mortals). But if database design is what you need to learn, this book s worth its weight in gold. Bill CamardaBill Camarda is a consultant, writer, and web/multimedia content developer. His 15 books include Special Edition Using Word 2000 and Upgrading & Fixing Networks for Dummies, Second Edition.
Mining of Massive Datasets
Anand Rajaraman - 2011
This book focuses on practical algorithms that have been used to solve key problems in data mining and which can be used on even the largest datasets. It begins with a discussion of the map-reduce framework, an important tool for parallelizing algorithms automatically. The authors explain the tricks of locality-sensitive hashing and stream processing algorithms for mining data that arrives too fast for exhaustive processing. The PageRank idea and related tricks for organizing the Web are covered next. Other chapters cover the problems of finding frequent itemsets and clustering. The final chapters cover two applications: recommendation systems and Web advertising, each vital in e-commerce. Written by two authorities in database and Web technologies, this book is essential reading for students and practitioners alike.
What Is Data Science?
Mike Loukides - 2011
Five years ago, in What is Web 2.0, Tim O'Reilly said that "data is the next Intel Inside." But what does that statement mean? Why do we suddenly care about statistics and about data? This report examines the many sides of data science -- the technologies, the companies and the unique skill sets.The web is full of "data-driven apps." Almost any e-commerce application is a data-driven application. There's a database behind a web front end, and middleware that talks to a number of other databases and data services (credit card processing companies, banks, and so on). But merely using data isn't really what we mean by "data science." A data application acquires its value from the data itself, and creates more data as a result. It's not just an application with data; it's a data product. Data science enables the creation of data products.
R Graphics Cookbook: Practical Recipes for Visualizing Data
Winston Chang - 2012
Each recipe tackles a specific problem with a solution you can apply to your own project, and includes a discussion of how and why the recipe works.Most of the recipes use the ggplot2 package, a powerful and flexible way to make graphs in R. If you have a basic understanding of the R language, you're ready to get started.Use R's default graphics for quick exploration of dataCreate a variety of bar graphs, line graphs, and scatter plotsSummarize data distributions with histograms, density curves, box plots, and other examplesProvide annotations to help viewers interpret dataControl the overall appearance of graphicsRender data groups alongside each other for easy comparisonUse colors in plotsCreate network graphs, heat maps, and 3D scatter plotsStructure data for graphing
An Introduction to Statistical Learning: With Applications in R
Gareth James - 2013
This book presents some of the most important modeling and prediction techniques, along with relevant applications. Topics include linear regression, classification, resampling methods, shrinkage approaches, tree- based methods, support vector machines, clustering, and more. Color graphics and real-world examples are used to illustrate the methods presented. Since the goal of this textbook is to facilitate the use of these statistical learning techniques by practitioners in science, industry, and other fields, each chapter contains a tutorial on implementing the analyses and methods presented in R, an extremely popular open source statistical software platform. Two of the authors co-wrote The Elements of Statistical Learning (Hastie, Tibshirani and Friedman, 2nd edition 2009), a popular reference book for statistics and machine learning researchers. An Introduction to Statistical Learning covers many of the same topics, but at a level accessible to a much broader audience. This book is targeted at statisticians and non-statisticians alike who wish to use cutting-edge statistical learning techniques to analyze their data. The text assumes only a previous course in linear regression and no knowledge of matrix algebra.
Python Data Science Handbook: Tools and Techniques for Developers
Jake Vanderplas - 2016
Several resources exist for individual pieces of this data science stack, but only with the Python Data Science Handbook do you get them all—IPython, NumPy, Pandas, Matplotlib, Scikit-Learn, and other related tools.Working scientists and data crunchers familiar with reading and writing Python code will find this comprehensive desk reference ideal for tackling day-to-day issues: manipulating, transforming, and cleaning data; visualizing different types of data; and using data to build statistical or machine learning models. Quite simply, this is the must-have reference for scientific computing in Python.With this handbook, you’ll learn how to use: * IPython and Jupyter: provide computational environments for data scientists using Python * NumPy: includes the ndarray for efficient storage and manipulation of dense data arrays in Python * Pandas: features the DataFrame for efficient storage and manipulation of labeled/columnar data in Python * Matplotlib: includes capabilities for a flexible range of data visualizations in Python * Scikit-Learn: for efficient and clean Python implementations of the most important and established machine learning algorithms
The Elements of Data Analytic Style
Jeffrey Leek - 2015
This book is focused on the details of data analysis that sometimes fall through the cracks in traditional statistics classes and textbooks. It is based in part on the authors blog posts, lecture materials, and tutorials. The author is one of the co-developers of the Johns Hopkins Specialization in Data Science the largest data science program in the world that has enrolled more than 1.76 million people. The book is useful as a companion to introductory courses in data science or data analysis. It is also a useful reference tool for people tasked with reading and critiquing data analyses. It is based on the authors popular open-source guides available through his Github account (https://github.com/jtleek). The paper is also available through Leanpub (https://leanpub.com/datastyle), if the book is purchased on that platform you are entitled to lifetime free updates.
Cracking the PM Interview: How to Land a Product Manager Job in Technology
Gayle Laakmann McDowell - 2013
Cracking the PM Interview is a comprehensive book about landing a product management role in a startup or bigger tech company. Learn how the ambiguously-named "PM" (product manager / program manager) role varies across companies, what experience you need, how to make your existing experience translate, what a great PM resume and cover letter look like, and finally, how to master the interview: estimation questions, behavioral questions, case questions, product questions, technical questions, and the super important "pitch."
Thinking with Data
Max Shron - 2014
In this practical guide, data strategy consultant Max Shron shows you how to put the why before the how, through an often-overlooked set of analytical skills.Thinking with Data helps you learn techniques for turning data into knowledge you can use. You’ll learn a framework for defining your project, including the data you want to collect, and how you intend to approach, organize, and analyze the results. You’ll also learn patterns of reasoning that will help you unveil the real problem that needs to be solved.Learn a framework for scoping data projectsUnderstand how to pin down the details of an idea, receive feedback, and begin prototypingUse the tools of arguments to ask good questions, build projects in stages, and communicate resultsExplore data-specific patterns of reasoning and learn how to build more useful argumentsDelve into causal reasoning and learn how it permeates data workPut everything together, using extended examples to see the method of full problem thinking in action
Pattern Recognition and Machine Learning
Christopher M. Bishop - 2006
However, these activities can be viewed as two facets of the same field, and together they have undergone substantial development over the past ten years. In particular, Bayesian methods have grown from a specialist niche to become mainstream, while graphical models have emerged as a general framework for describing and applying probabilistic models. Also, the practical applicability of Bayesian methods has been greatly enhanced through the development of a range of approximate inference algorithms such as variational Bayes and expectation propagation. Similarly, new models based on kernels have had a significant impact on both algorithms and applications. This new textbook reflects these recent developments while providing a comprehensive introduction to the fields of pattern recognition and machine learning. It is aimed at advanced undergraduates or first-year PhD students, as well as researchers and practitioners, and assumes no previous knowledge of pattern recognition or machine learning concepts. Knowledge of multivariate calculus and basic linear algebra is required, and some familiarity with probabilities would be helpful though not essential as the book includes a self-contained introduction to basic probability theory.
Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, or Die
Eric Siegel - 2013
Rather than a "how to" for hands-on techies, the book entices lay-readers and experts alike by covering new case studies and the latest state-of-the-art techniques.You have been predicted — by companies, governments, law enforcement, hospitals, and universities. Their computers say, "I knew you were going to do that!" These institutions are seizing upon the power to predict whether you're going to click, buy, lie, or die.Why? For good reason: predicting human behavior combats financial risk, fortifies healthcare, conquers spam, toughens crime fighting, and boosts sales.How? Prediction is powered by the world's most potent, booming unnatural resource: data. Accumulated in large part as the by-product of routine tasks, data is the unsalted, flavorless residue deposited en masse as organizations churn away. Surprise! This heap of refuse is a gold mine. Big data embodies an extraordinary wealth of experience from which to learn.Predictive analytics unleashes the power of data. With this technology, the computer literally learns from data how to predict the future behavior of individuals. Perfect prediction is not possible, but putting odds on the future — lifting a bit of the fog off our hazy view of tomorrow — means pay dirt.In this rich, entertaining primer, former Columbia University professor and Predictive Analytics World founder Eric Siegel reveals the power and perils of prediction: -What type of mortgage risk Chase Bank predicted before the recession. -Predicting which people will drop out of school, cancel a subscription, or get divorced before they are even aware of it themselves. -Why early retirement decreases life expectancy and vegetarians miss fewer flights. -Five reasons why organizations predict death, including one health insurance company. -How U.S. Bank, European wireless carrier Telenor, and Obama's 2012 campaign calculated the way to most strongly influence each individual. -How IBM's Watson computer used predictive modeling to answer questions and beat the human champs on TV's Jeopardy! -How companies ascertain untold, private truths — how Target figures out you're pregnant and Hewlett-Packard deduces you're about to quit your job. -How judges and parole boards rely on crime-predicting computers to decide who stays in prison and who goes free. -What's predicted by the BBC, Citibank, ConEd, Facebook, Ford, Google, IBM, the IRS, Match.com, MTV, Netflix, Pandora, PayPal, Pfizer, and Wikipedia. A truly omnipresent science, predictive analytics affects everyone, every day. Although largely unseen, it drives millions of decisions, determining whom to call, mail, investigate, incarcerate, set up on a date, or medicate.Predictive analytics transcends human perception. This book's final chapter answers the riddle: What often happens to you that cannot be witnessed, and that you can't even be sure has happened afterward — but that can be predicted in advance?Whether you are a consumer of it — or consumed by it — get a handle on the power of Predictive Analytics.
Tableau Your Data!: Fast and Easy Visual Analysis with Tableau Software
Dan Murray - 2013
It illustrates little-known features and techniques for getting the most from the Tableau toolset, supporting the needs of the business analysts who use the product as well as the data and IT managers who support it.This comprehensive guide covers the core feature set for data analytics, illustrating best practices for creating and sharing specific types of dynamic data visualizations. Featuring a helpful full-color layout, the book covers analyzing data with Tableau Desktop, sharing information with Tableau Server, understanding Tableau functions and calculations, and Use Cases for Tableau Software.Includes little-known, as well as more advanced features and techniques, using detailed, real-world case studies that the author has developed as part of his consulting and training practice Explains why and how Tableau differs from traditional business information analysis tools Shows you how to deploy dashboards and visualizations throughout the enterprise Provides a detailed reference resource that is aimed at users of all skill levels Depicts ways to leverage Tableau across the value chain in the enterprise through case studies that target common business requirements Endorsed by Tableau Software Tableau Your Data shows you how to build dynamic, best-of-breed visualizations using the Tableau Software toolset.
The Linux Command Line
William E. Shotts Jr. - 2012
Available here:readmeaway.com/download?i=1593279523The Linux Command Line, 2nd Edition: A Complete Introduction PDF by William ShottsRead The Linux Command Line, 2nd Edition: A Complete Introduction PDF from No Starch Press,William ShottsDownload William Shotts’s PDF E-book The Linux Command Line, 2nd Edition: A Complete Introduction
Don't Make Me Think, Revisited: A Common Sense Approach to Web Usability
Steve Krug - 2000
And it’s still short, profusely illustrated…and best of all–fun to read.If you’ve read it before, you’ll rediscover what made Don’t Make Me Think so essential to Web designers and developers around the world. If you’ve never read it, you’ll see why so many people have said it should be required reading for anyone working on Web sites.