Practical Statistics for Data Scientists: 50 Essential Concepts


Peter Bruce - 2017
    Courses and books on basic statistics rarely cover the topic from a data science perspective. This practical guide explains how to apply various statistical methods to data science, tells you how to avoid their misuse, and gives you advice on what's important and what's not. Many data science resources incorporate statistical methods but lack a deeper statistical perspective. If you're familiar with the R programming language, and have some exposure to statistics, this quick reference bridges the gap in an accessible, readable format. With this book, you'll learn: why exploratory data analysis is a key preliminary step in data science; how random sampling can reduce bias and yield a higher-quality dataset, even with big data; how the principles of experimental design yield definitive answers to questions; how to use regression to estimate outcomes and detect anomalies; key classification techniques for predicting which categories a record belongs to; statistical machine learning methods that "learn" from data; and unsupervised learning methods for extracting meaning from unlabeled data.

Applied Predictive Modeling


Max Kuhn - 2013
    Non-mathematical readers will appreciate the intuitive explanations of the techniques, while an emphasis on problem-solving with real data across a wide variety of applications will aid practitioners who wish to extend their expertise. Readers should have knowledge of basic statistical ideas, such as correlation and linear regression analysis. While the text is biased against complex equations, a mathematical background is needed for advanced topics. Dr. Kuhn is a Director of Non-Clinical Statistics at Pfizer Global R&D in Groton, Connecticut. He has been applying predictive models in the pharmaceutical and diagnostic industries for over 15 years and is the author of a number of R packages. Dr. Johnson has more than a decade of statistical consulting and predictive modeling experience in pharmaceutical research and development. He is a co-founder of Arbor Analytics, a firm specializing in predictive modeling, and is a former Director of Statistics at Pfizer Global R&D. His scholarly work centers on the application and development of statistical methodology and learning algorithms. Applied Predictive Modeling covers the overall predictive modeling process, beginning with the crucial steps of data preprocessing, data splitting and foundations of model tuning. The text then provides intuitive explanations of numerous common and modern regression and classification techniques, always with an emphasis on illustrating and solving real data problems. Addressing practical concerns extends beyond model fitting to topics such as handling class imbalance, selecting predictors, and pinpointing causes of poor model performance, all of which are problems that occur frequently in practice. The text illustrates all parts of the modeling process through many hands-on, real-life examples. And every chapter contains extensive R code f

Learn Python 3 the Hard Way: A Very Simple Introduction to the Terrifyingly Beautiful World of Computers and Code (Zed Shaw's Hard Way Series)


Zed A. Shaw - 2017
    

HBR's 10 Must Reads on AI, Analytics, and the New Machine Age (with bonus article "Why Every Company Needs an Augmented Reality Strategy" by Michael E. Porter and James E. Heppelmann)


Harvard Business Review - 2018
    Is your company ready? If you read nothing else on how intelligent machines are revolutionizing business, read these 10 articles. We've combed through hundreds of Harvard Business Review articles and selected the most important ones to help you understand how these technologies work together, how to adopt them, and why your strategy can't ignore them. In this book you'll learn how: data science, driven by artificial intelligence and machine learning, is yielding unprecedented business insights; blockchain has the potential to restructure the economy; drones and driverless vehicles are becoming essential tools; 3-D printing is making new business models possible; augmented reality is transforming retail and manufacturing; smart speakers are redefining the rules of marketing; and humans and machines are working together to reach new levels of productivity. This collection of articles includes "Artificial Intelligence for the Real World," by Thomas H. Davenport and Rajeev Ronanki; "Stitch Fix's CEO on Selling Personal Style to the Mass Market," by Katrina Lake; "Algorithms Need Managers, Too," by Michael Luca, Jon Kleinberg, and Sendhil Mullainathan; "Marketing in the Age of Alexa," by Niraj Dawar; "Why Every Organization Needs an Augmented Reality Strategy," by Michael E. Porter and James E. Heppelmann; "Drones Go to Work," by Chris Anderson; "The Truth About Blockchain," by Marco Iansiti and Karim R. Lakhani; "The 3-D Printing Playbook," by Richard A. D’Aveni; "Collaborative Intelligence: Humans and AI Are Joining Forces," by H. James Wilson and Paul R. Daugherty; "When Your Boss Wears Metal Pants," by Walter Frick; and "Managing Our Hub Economy," by Marco Iansiti and Karim R. Lakhani.

Data Mining: Concepts and Techniques (The Morgan Kaufmann Series in Data Management Systems)


Jiawei Han - 2000
    Not only are all of our business, scientific, and government transactions now computerized, but the widespread use of digital cameras, publication tools, and bar codes also generates data. On the collection side, scanned text and image platforms, satellite remote sensing systems, and the World Wide Web have flooded us with a tremendous amount of data. This explosive growth has generated an even more urgent need for new techniques and automated tools that can help us transform this data into useful information and knowledge. Like the first edition, voted the most popular data mining book by KDnuggets readers, this book explores concepts and techniques for the discovery of patterns hidden in large data sets, focusing on issues relating to their feasibility, usefulness, effectiveness, and scalability. However, since the publication of the first edition, great progress has been made in the development of new data mining methods, systems, and applications. This new edition substantially enhances the first edition, and new chapters have been added to address recent developments on mining complex types of data, including stream data, sequence data, graph structured data, social network data, and multi-relational data. The book offers: a comprehensive, practical look at the concepts and techniques you need to know to get the most out of real business data; updates that incorporate input from readers, changes in the field, and more material on statistics and machine learning; dozens of algorithms and implementation examples, all in easily understood pseudo-code and suitable for use in real-world, large-scale data mining projects; and complete classroom support for instructors at the www.mkp.com/datamining2e companion site.

The Art of R Programming: A Tour of Statistical Software Design


Norman Matloff - 2011
    No statistical knowledge is required, and your programming skills can range from hobbyist to pro. Along the way, you'll learn about functional and object-oriented programming, running mathematical simulations, and rearranging complex data into simpler, more useful formats. You'll also learn to: create artful graphs to visualize complex data sets and functions; write more efficient code using parallel R and vectorization; interface R with C/C++ and Python for increased speed or functionality; find new R packages for text analysis, image manipulation, and more; and squash annoying bugs with advanced debugging techniques. Whether you're designing aircraft, forecasting the weather, or you just need to tame your data, The Art of R Programming is your guide to harnessing the power of statistical computing.

Convex Optimization


Stephen Boyd - 2004
    A comprehensive introduction to the subject, this book shows in detail how such problems can be solved numerically with great efficiency. The focus is on recognizing convex optimization problems and then finding the most appropriate technique for solving them. The text contains many worked examples and homework exercises and will appeal to students, researchers and practitioners in fields such as engineering, computer science, mathematics, statistics, finance, and economics.

Numerical Optimization


Jorge Nocedal - 2000
    One can trace its roots to the Calculus of Variations and the work of Euler and Lagrange. This natural and reasonable approach to mathematical programming covers numerical methods for finite-dimensional optimization problems. It begins with very simple ideas progressing through more complicated concepts, concentrating on methods for both unconstrained and constrained optimization.

Data Jujitsu: The Art of Turning Data into Product


D.J. Patil - 2012
    Acclaimed data scientist DJ Patil details a new approach to solving problems in Data Jujitsu. Learn how to use a problem's "weight" against itself to: break down seemingly complex data problems into simplified parts; use alternative data analysis techniques to examine them; and use human input, such as Mechanical Turk, and design tricks that enlist the help of your users to take short cuts around tough problems. Learn more about the problems before starting on the solutions, and use the findings to solve them or determine whether the problems are worth solving at all.

Doing Bayesian Data Analysis: A Tutorial Introduction with R and BUGS


John K. Kruschke - 2010
    Included are step-by-step instructions on how to carry out Bayesian data analyses.

git commit murder


Michael Warren Lucas - 2017
    These few days will validate Dale Whitehead’s work—or expose him as a fraud. When a tragic death devastates the conference, only Dale suspects murder. Computer geeks care about code. But do they care enough… to kill?

Data Science


John D. Kelleher - 2018
    Today data science determines the ads we see online, the books and movies that are recommended to us online, which emails are filtered into our spam folders, and even how much we pay for health insurance. This volume in the MIT Press Essential Knowledge series offers a concise introduction to the emerging field of data science, explaining its evolution, current uses, data infrastructure issues, and ethical challenges. It has never been easier for organizations to gather, store, and process data. Use of data science is driven by the rise of big data and social media, the development of high-performance computing, and the emergence of such powerful methods for data analysis and modeling as deep learning. Data science encompasses a set of principles, problem definitions, algorithms, and processes for extracting non-obvious and useful patterns from large datasets. It is closely related to the fields of data mining and machine learning, but broader in scope. This book offers a brief history of the field, introduces fundamental data concepts, and describes the stages in a data science project. It considers data infrastructure and the challenges posed by integrating data from multiple sources, introduces the basics of machine learning, and discusses how to link machine learning expertise with real-world problems. The book also reviews ethical and legal issues, developments in data regulation, and computational approaches to preserving privacy. Finally, it considers the future impact of data science and offers principles for success in data science projects.

Doing Math with Python


Amit Saha - 2015
    Python is easy to learn, and it's perfect for exploring topics like statistics, geometry, probability, and calculus. You’ll learn to write programs to find derivatives, solve equations graphically, manipulate algebraic expressions, even examine projectile motion. Rather than crank through tedious calculations by hand, you'll learn how to use Python functions and modules to handle the number crunching while you focus on the principles behind the math. Exercises throughout teach fundamental programming concepts, like using functions, handling user input, and reading and manipulating data. As you learn to think computationally, you'll discover new ways to explore and think about math, and gain valuable programming skills that you can use to continue your study of math and computer science. If you’re interested in math but have yet to dip into programming, you’ll find that Python makes it easy to go deeper into the subject: let Python handle the tedious work while you spend more time on the math.

Automating Inequality: How High-Tech Tools Profile, Police, and Punish the Poor


Virginia Eubanks - 2018
    In Pittsburgh, a child welfare agency uses a statistical model to try to predict which children might be future victims of abuse or neglect. Since the dawn of the digital age, decision-making in finance, employment, politics, health and human services has undergone revolutionary change. Today, automated systems, rather than humans, control which neighborhoods get policed, which families attain needed resources, and who is investigated for fraud. While we all live under this new regime of data, the most invasive and punitive systems are aimed at the poor. In Automating Inequality, Virginia Eubanks systematically investigates the impacts of data mining, policy algorithms, and predictive risk models on poor and working-class people in America. The book is full of heart-wrenching and eye-opening stories, from a woman in Indiana whose benefits are literally cut off as she lies dying to a family in Pennsylvania in daily fear of losing their daughter because they fit a certain statistical profile. The U.S. has always used its most cutting-edge science and technology to contain, investigate, discipline and punish the destitute. Like the county poorhouse and scientific charity before them, digital tracking and automated decision-making hide poverty from the middle-class public and give the nation the ethical distance it needs to make inhumane choices: which families get food and which starve, who has housing and who remains homeless, and which families are broken up by the state. In the process, they weaken democracy and betray our most cherished national values. This deeply researched and passionate book could not be more timely. Naomi Klein: "This book is downright scary." Ethan Zuckerman, MIT: "Should be required reading." Dorothy Roberts, author of Killing the Black Body: "A must-read for everyone concerned about modern tools of inequality in America." Astra Taylor, author of The People's Platform: "This is the single most important book about technology you will read this year."

Embedded Android: Porting, Extending, and Customizing


Karim Yaghmour - 2011
    You'll also receive updates when significant changes are made, as well as the final ebook version. Embedded Android is for developers who want to create embedded systems based on Android, port Android to new hardware, or create a custom development environment. Hackers and modders will also find this an indispensable guide to how Android works.