Book picks similar to
Think Stats by Allen B. Downey
data-science
statistics
programming
math
The Mythical Man-Month: Essays on Software Engineering
Frederick P. Brooks Jr. - 1975
With a blend of software engineering facts and thought-provoking opinions, Fred Brooks offers insight for anyone managing complex projects. These essays draw from his experience as project manager for the IBM System/360 computer family and then for OS/360, its massive software system. Now, 45 years after the initial publication of his book, Brooks has revisited his original ideas and added new thoughts and advice, both for readers already familiar with his work and for readers discovering it for the first time.The added chapters contain (1) a crisp condensation of all the propositions asserted in the original book, including Brooks' central argument in The Mythical Man-Month: that large programming projects suffer management problems different from small ones due to the division of labor; that the conceptual integrity of the product is therefore critical; and that it is difficult but possible to achieve this unity; (2) Brooks' view of these propositions a generation later; (3) a reprint of his classic 1986 paper "No Silver Bullet"; and (4) today's thoughts on the 1986 assertion, "There will be no silver bullet within ten years."
Learning Perl
Randal L. Schwartz - 1993
Written by three prominent members of the Perl community who each have several years of experience teaching Perl around the world, this edition has been updated to account for all the recent changes to the language up to Perl 5.8.Perl is the language for people who want to get work done. It started as a tool for Unix system administrators who needed something powerful for small tasks. Since then, Perl has blossomed into a full-featured programming language used for web programming, database manipulation, XML processing, and system administration--on practically all platforms--while remaining the favorite tool for the small daily tasks it was designed for. You might start using Perl because you need it, but you'll continue to use it because you love it.Informed by their years of success at teaching Perl as consultants, the authors have re-engineered the Llama to better match the pace and scope appropriate for readers getting started with Perl, while retaining the detailed discussion, thorough examples, and eclectic wit for which the Llama is famous.The book includes new exercises and solutions so you can practice what you've learned while it's still fresh in your mind. Here are just some of the topics covered:Perl variable typessubroutinesfile operationsregular expressionstext processingstrings and sortingprocess managementusing third party modulesIf you ask Perl programmers today what book they relied on most when they were learning Perl, you'll find that an overwhelming majority will point to the Llama. With good reason. Other books may teach you to program in Perl, but this book will turn you into a Perl programmer.
Probability Theory: The Logic of Science
E.T. Jaynes - 1999
It discusses new results, along with applications of probability theory to a variety of problems. The book contains many exercises and is suitable for use as a textbook on graduate-level courses involving data analysis. Aimed at readers already familiar with applied mathematics at an advanced undergraduate level or higher, it is of interest to scientists concerned with inference from incomplete information.
Kafka: The Definitive Guide: Real-Time Data and Stream Processing at Scale
Neha Narkhede - 2017
And how to move all of this data becomes nearly as important as the data itself. If you� re an application architect, developer, or production engineer new to Apache Kafka, this practical guide shows you how to use this open source streaming platform to handle real-time data feeds.Engineers from Confluent and LinkedIn who are responsible for developing Kafka explain how to deploy production Kafka clusters, write reliable event-driven microservices, and build scalable stream-processing applications with this platform. Through detailed examples, you� ll learn Kafka� s design principles, reliability guarantees, key APIs, and architecture details, including the replication protocol, the controller, and the storage layer.Understand publish-subscribe messaging and how it fits in the big data ecosystem.Explore Kafka producers and consumers for writing and reading messagesUnderstand Kafka patterns and use-case requirements to ensure reliable data deliveryGet best practices for building data pipelines and applications with KafkaManage Kafka in production, and learn to perform monitoring, tuning, and maintenance tasksLearn the most critical metrics among Kafka� s operational measurementsExplore how Kafka� s stream delivery capabilities make it a perfect source for stream processing systems
The Elements of Data Analytic Style
Jeffrey Leek - 2015
This book is focused on the details of data analysis that sometimes fall through the cracks in traditional statistics classes and textbooks. It is based in part on the authors blog posts, lecture materials, and tutorials. The author is one of the co-developers of the Johns Hopkins Specialization in Data Science the largest data science program in the world that has enrolled more than 1.76 million people. The book is useful as a companion to introductory courses in data science or data analysis. It is also a useful reference tool for people tasked with reading and critiquing data analyses. It is based on the authors popular open-source guides available through his Github account (https://github.com/jtleek). The paper is also available through Leanpub (https://leanpub.com/datastyle), if the book is purchased on that platform you are entitled to lifetime free updates.
Machine Learning for Absolute Beginners
Oliver Theobald - 2017
The manner in which computers are now able to mimic human thinking is rapidly exceeding human capabilities in everything from chess to picking the winner of a song contest. In the age of machine learning, computers do not strictly need to receive an ‘input command’ to perform a task, but rather ‘input data’. From the input of data they are able to form their own decisions and take actions virtually as a human would. But as a machine, can consider many more scenarios and execute calculations to solve complex problems. This is the element that excites companies and budding machine learning engineers the most. The ability to solve complex problems never before attempted. This is also perhaps one reason why you are looking at purchasing this book, to gain a beginner's introduction to machine learning. This book provides a plain English introduction to the following topics: - Artificial Intelligence - Big Data - Downloading Free Datasets - Regression - Support Vector Machine Algorithms - Deep Learning/Neural Networks - Data Reduction - Clustering - Association Analysis - Decision Trees - Recommenders - Machine Learning Careers This book has recently been updated following feedback from readers. Version II now includes: - New Chapter: Decision Trees - Cleanup of minor errors
Doing Math with Python
Amit Saha - 2015
Python is easy to learn, and it's perfect for exploring topics like statistics, geometry, probability, and calculus. You’ll learn to write programs to find derivatives, solve equations graphically, manipulate algebraic expressions, even examine projectile motion.Rather than crank through tedious calculations by hand, you'll learn how to use Python functions and modules to handle the number crunching while you focus on the principles behind the math. Exercises throughout teach fundamental programming concepts, like using functions, handling user input, and reading and manipulating data. As you learn to think computationally, you'll discover new ways to explore and think about math, and gain valuable programming skills that you can use to continue your study of math and computer science.If you’re interested in math but have yet to dip into programming, you’ll find that Python makes it easy to go deeper into the subject—let Python handle the tedious work while you spend more time on the math.
Bayesian Data Analysis
Andrew Gelman - 1995
Its world-class authors provide guidance on all aspects of Bayesian data analysis and include examples of real statistical analyses, based on their own research, that demonstrate how to solve complicated problems. Changes in the new edition include:Stronger focus on MCMC Revision of the computational advice in Part III New chapters on nonlinear models and decision analysis Several additional applied examples from the authors' recent research Additional chapters on current models for Bayesian data analysis such as nonlinear models, generalized linear mixed models, and more Reorganization of chapters 6 and 7 on model checking and data collectionBayesian computation is currently at a stage where there are many reasonable ways to compute any given posterior distribution. However, the best approach is not always clear ahead of time. Reflecting this, the new edition offers a more pluralistic presentation, giving advice on performing computations from many perspectives while making clear the importance of being aware that there are different ways to implement any given iterative simulation computation. The new approach, additional examples, and updated information make Bayesian Data Analysis an excellent introductory text and a reference that working scientists will use throughout their professional life.
Algorithm Design
Jon Kleinberg - 2005
The book teaches a range of design and analysis techniques for problems that arise in computing applications. The text encourages an understanding of the algorithm design process and an appreciation of the role of algorithms in the broader field of computer science.
Hello World: Being Human in the Age of Algorithms
Hannah Fry - 2018
It’s time we stand face-to-digital-face with the true powers and limitations of the algorithms that already automate important decisions in healthcare, transportation, crime, and commerce. Hello World is indispensable preparation for the moral quandaries of a world run by code, and with the unfailingly entertaining Hannah Fry as our guide, we’ll be discussing these issues long after the last page is turned.
Information Dashboard Design: The Effective Visual Communication of Data
Stephen Few - 2006
Although dashboards are potentially powerful, this potential is rarely realized. The greatest display technology in the world won't solve this if you fail to use effective visual design. And if a dashboard fails to tell you precisely what you need to know in an instant, you'll never use it, even if it's filled with cute gauges, meters, and traffic lights. Don't let your investment in dashboard technology go to waste.This book will teach you the visual design skills you need to create dashboards that communicate clearly, rapidly, and compellingly. Information Dashboard Design will explain how to:Avoid the thirteen mistakes common to dashboard design Provide viewers with the information they need quickly and clearly Apply what we now know about visual perception to the visual presentation of information Minimize distractions, cliches, and unnecessary embellishments that create confusion Organize business information to support meaning and usability Create an aesthetically pleasing viewing experience Maintain consistency of design to provide accurate interpretation Optimize the power of dashboard technology by pairing it with visual effectiveness Stephen Few has over 20 years of experience as an IT innovator, consultant, and educator. As Principal of the consultancy Perceptual Edge, Stephen focuses on data visualization for analyzing and communicating quantitative business information. He provides consulting and training services, speaks frequently at conferences, and teaches in the MBA program at the University of California in Berkeley. He is also the author of Show Me the Numbers: Designing Tables and Graphs to Enlighten. Visit his website at www.perceptualedge.com.
All of Statistics: A Concise Course in Statistical Inference
Larry Wasserman - 2003
But in spirit, the title is apt, as the book does cover a much broader range of topics than a typical introductory book on mathematical statistics. This book is for people who want to learn probability and statistics quickly. It is suitable for graduate or advanced undergraduate students in computer science, mathematics, statistics, and related disciplines. The book includes modern topics like nonparametric curve estimation, bootstrapping, and clas- sification, topics that are usually relegated to follow-up courses. The reader is presumed to know calculus and a little linear algebra. No previous knowledge of probability and statistics is required. Statistics, data mining, and machine learning are all concerned with collecting and analyzing data. For some time, statistics research was con- ducted in statistics departments while data mining and machine learning re- search was conducted in computer science departments. Statisticians thought that computer scientists were reinventing the wheel. Computer scientists thought that statistical theory didn't apply to their problems. Things are changing. Statisticians now recognize that computer scientists are making novel contributions while computer scientists now recognize the generality of statistical theory and methodology. Clever data mining algo- rithms are more scalable than statisticians ever thought possible. Formal sta- tistical theory is more pervasive than computer scientists had realized.
Interactive Data Visualization for the Web
Scott Murray - 2013
It’s easy and fun with this practical, hands-on introduction. Author Scott Murray teaches you the fundamental concepts and methods of D3, a JavaScript library that lets you express data visually in a web browser. Along the way, you’ll expand your web programming skills, using tools such as HTML and JavaScript.This step-by-step guide is ideal whether you’re a designer or visual artist with no programming experience, a reporter exploring the new frontier of data journalism, or anyone who wants to visualize and share data.Learn HTML, CSS, JavaScript, and SVG basicsDynamically generate web page elements from your data—and choose visual encoding rules to style themCreate bar charts, scatter plots, pie charts, stacked bar charts, and force-directed layoutsUse smooth, animated transitions to show changes in your dataIntroduce interactivity to help users explore data through different viewsCreate customized geographic maps with dataExplore hands-on with downloadable code and over 100 examples
R for Everyone: Advanced Analytics and Graphics
Jared P. Lander - 2013
R has traditionally been difficult for non-statisticians to learn, and most R books assume far too much knowledge to be of help. R for Everyone is the solution. Drawing on his unsurpassed experience teaching new users, professional data scientist Jared P. Lander has written the perfect tutorial for anyone new to statistical programming and modeling. Organized to make learning easy and intuitive, this guide focuses on the 20 percent of R functionality you'll need to accomplish 80 percent of modern data tasks. Lander's self-contained chapters start with the absolute basics, offering extensive hands-on practice and sample code. You'll download and install R; navigate and use the R environment; master basic program control, data import, and manipulation; and walk through several essential tests. Then, building on this foundation, you'll construct several complete models, both linear and nonlinear, and use some data mining techniques. By the time you're done, you won't just know how to write R programs, you'll be ready to tackle the statistical problems you care about most. COVERAGE INCLUDES - Exploring R, RStudio, and R packages - Using R for math: variable types, vectors, calling functions, and more - Exploiting data structures, including data.frames, matrices, and lists - Creating attractive, intuitive statistical graphics - Writing user-defined functions - Controlling program flow with if, ifelse, and complex checks - Improving program efficiency with group manipulations - Combining and reshaping multiple datasets - Manipulating strings using R's facilities and regular expressions - Creating normal, binomial, and Poisson probability distributions - Programming basic statistics: mean, standard deviation, and t-tests - Building linear, generalized linear, and nonlinear models - Assessing the quality of models and variable selection - Preventing overfitting, using the Elastic Net and Bayesian methods - Analyzing univariate and multivariate time series data - Grouping data via K-means and hierarchical clustering - Preparing reports, slideshows, and web pages with knitr - Building reusable R packages with devtools and Rcpp - Getting involved with the R global community