Book picks similar to
Introduction to Data Science by Rafael A. Irizarry
data-science
data-science-ethics-analytics-viz
partially-read
programming
The Fourth Paradigm: Data-Intensive Scientific Discovery
Tony Hey - 2009
Increasingly, scientific breakthroughs will be powered by advanced computing capabilities that help researchers manipulate and explore massive datasets. The speed at which any given scientific discipline advances will depend on how well its researchers collaborate with one another, and with technologists, in areas of eScience such as databases, workflow management, visualization, and cloud-computing technologies. This collection of essays expands on the vision of pioneering computer scientist Jim Gray for a new, fourth paradigm of discovery based on data-intensive science and offers insights into how it can be fully realized.
Learn R in a Day
Steven Murray - 2013
The book assumes no prior knowledge of computer programming and progressively covers all the essential steps needed to become confident and proficient in using R within a day. Topics include how to input, manipulate, format, iterate (loop), query, perform basic statistics on, and plot data, via a step-by-step technique and demonstrations using in-built datasets which the reader is encouraged to replicate on their computer. Each chapter also includes exercises (with solutions) to practice key skills and empower the reader to build on the essentials gained during this introductory course.
Data Smart: Using Data Science to Transform Information into Insight
John W. Foreman - 2013
Major retailers are predicting everything from when their customers are pregnant to when they want a new pair of Chuck Taylors. It's a brave new world where seemingly meaningless data can be transformed into valuable insight to drive smart business decisions.But how does one exactly do data science? Do you have to hire one of these priests of the dark arts, the "data scientist," to extract this gold from your data? Nope.Data science is little more than using straight-forward steps to process raw data into actionable insight. And in Data Smart, author and data scientist John Foreman will show you how that's done within the familiar environment of a spreadsheet. Why a spreadsheet? It's comfortable! You get to look at the data every step of the way, building confidence as you learn the tricks of the trade. Plus, spreadsheets are a vendor-neutral place to learn data science without the hype. But don't let the Excel sheets fool you. This is a book for those serious about learning the analytic techniques, the math and the magic, behind big data.Each chapter will cover a different technique in a spreadsheet so you can follow along: - Mathematical optimization, including non-linear programming and genetic algorithms- Clustering via k-means, spherical k-means, and graph modularity- Data mining in graphs, such as outlier detection- Supervised AI through logistic regression, ensemble models, and bag-of-words models- Forecasting, seasonal adjustments, and prediction intervals through monte carlo simulation- Moving from spreadsheets into the R programming languageYou get your hands dirty as you work alongside John through each technique. But never fear, the topics are readily applicable and the author laces humor throughout. You'll even learn what a dead squirrel has to do with optimization modeling, which you no doubt are dying to know.
Computational Complexity
Sanjeev Arora - 2007
Requiring essentially no background apart from mathematical maturity, the book can be used as a reference for self-study for anyone interested in complexity, including physicists, mathematicians, and other scientists, as well as a textbook for a variety of courses and seminars. More than 300 exercises are included with a selected hint set.
The Algorithm Design Manual
Steven S. Skiena - 1997
Drawing heavily on the author's own real-world experiences, the book stresses design and analysis. Coverage is divided into two parts, the first being a general guide to techniques for the design and analysis of computer algorithms. The second is a reference section, which includes a catalog of the 75 most important algorithmic problems. By browsing this catalog, readers can quickly identify what the problem they have encountered is called, what is known about it, and how they should proceed if they need to solve it. This book is ideal for the working professional who uses algorithms on a daily basis and has need for a handy reference. This work can also readily be used in an upper-division course or as a student reference guide. THE ALGORITHM DESIGN MANUAL comes with a CD-ROM that contains: * a complete hypertext version of the full printed book. * the source code and URLs for all cited implementations. * over 30 hours of audio lectures on the design and analysis of algorithms are provided, all keyed to on-line lecture notes.
Programming Pearls
Jon L. Bentley - 1986
Jon has done a wonderful job of updating the material. I am very impressed at how fresh the new examples seem." - Steve McConnell, author, Code CompleteWhen programmers list their favorite books, Jon Bentley's collection of programming pearls is commonly included among the classics. Just as natural pearls grow from grains of sand that irritate oysters, programming pearls have grown from real problems that have irritated real programmers. With origins beyond solid engineering, in the realm of insight and creativity, Bentley's pearls offer unique and clever solutions to those nagging problems. Illustrated by programs designed as much for fun as for instruction, the book is filled with lucid and witty descriptions of practical programming techniques and fundamental design principles. It is not at all surprising that
Programming Pearls
has been so highly valued by programmers at every level of experience. In this revision, the first in 14 years, Bentley has substantially updated his essays to reflect current programming methods and environments. In addition, there are three new essays on (1) testing, debugging, and timing; (2) set representations; and (3) string problems. All the original programs have been rewritten, and an equal amount of new code has been generated. Implementations of all the programs, in C or C++, are now available on the Web.What remains the same in this new edition is Bentley's focus on the hard core of programming problems and his delivery of workable solutions to those problems. Whether you are new to Bentley's classic or are revisiting his work for some fresh insight, this book is sure to make your own list of favorites.
Bayesian Methods for Hackers: Probabilistic Programming and Bayesian Inference
Cameron Davidson-Pilon - 2014
However, most discussions of Bayesian inference rely on intensely complex mathematical analyses and artificial examples, making it inaccessible to anyone without a strong mathematical background. Now, though, Cameron Davidson-Pilon introduces Bayesian inference from a computational perspective, bridging theory to practice-freeing you to get results using computing power.
Bayesian Methods for Hackers
illuminates Bayesian inference through probabilistic programming with the powerful PyMC language and the closely related Python tools NumPy, SciPy, and Matplotlib. Using this approach, you can reach effective solutions in small increments, without extensive mathematical intervention. Davidson-Pilon begins by introducing the concepts underlying Bayesian inference, comparing it with other techniques and guiding you through building and training your first Bayesian model. Next, he introduces PyMC through a series of detailed examples and intuitive explanations that have been refined after extensive user feedback. You'll learn how to use the Markov Chain Monte Carlo algorithm, choose appropriate sample sizes and priors, work with loss functions, and apply Bayesian inference in domains ranging from finance to marketing. Once you've mastered these techniques, you'll constantly turn to this guide for the working PyMC code you need to jumpstart future projects. Coverage includes - Learning the Bayesian "state of mind" and its practical implications - Understanding how computers perform Bayesian inference - Using the PyMC Python library to program Bayesian analyses - Building and debugging models with PyMC - Testing your model's "goodness of fit" - Opening the "black box" of the Markov Chain Monte Carlo algorithm to see how and why it works - Leveraging the power of the "Law of Large Numbers" - Mastering key concepts, such as clustering, convergence, autocorrelation, and thinning - Using loss functions to measure an estimate's weaknesses based on your goals and desired outcomes - Selecting appropriate priors and understanding how their influence changes with dataset size - Overcoming the "exploration versus exploitation" dilemma: deciding when "pretty good" is good enough - Using Bayesian inference to improve A/B testing - Solving data science problems when only small amounts of data are available Cameron Davidson-Pilon has worked in many areas of applied mathematics, from the evolutionary dynamics of genes and diseases to stochastic modeling of financial prices. His contributions to the open source community include lifelines, an implementation of survival analysis in Python. Educated at the University of Waterloo and at the Independent University of Moscow, he currently works with the online commerce leader Shopify.
Think Stats
Allen B. Downey - 2011
This concise introduction shows you how to perform statistical analysis computationally, rather than mathematically, with programs written in Python.You'll work with a case study throughout the book to help you learn the entire data analysis process—from collecting data and generating statistics to identifying patterns and testing hypotheses. Along the way, you'll become familiar with distributions, the rules of probability, visualization, and many other tools and concepts.Develop your understanding of probability and statistics by writing and testing codeRun experiments to test statistical behavior, such as generating samples from several distributionsUse simulations to understand concepts that are hard to grasp mathematicallyLearn topics not usually covered in an introductory course, such as Bayesian estimationImport data from almost any source using Python, rather than be limited to data that has been cleaned and formatted for statistics toolsUse statistical inference to answer questions about real-world data
97 Things Every Programmer Should Know: Collective Wisdom from the Experts
Kevlin Henney - 2010
With the 97 short and extremely useful tips for programmers in this book, you'll expand your skills by adopting new approaches to old problems, learning appropriate best practices, and honing your craft through sound advice.With contributions from some of the most experienced and respected practitioners in the industry--including Michael Feathers, Pete Goodliffe, Diomidis Spinellis, Cay Horstmann, Verity Stob, and many more--this book contains practical knowledge and principles that you can apply to all kinds of projects.A few of the 97 things you should know:"Code in the Language of the Domain" by Dan North"Write Tests for People" by Gerard Meszaros"Convenience Is Not an -ility" by Gregor Hohpe"Know Your IDE" by Heinz Kabutz"A Message to the Future" by Linda Rising"The Boy Scout Rule" by Robert C. Martin (Uncle Bob)"Beware the Share" by Udi Dahan
Beginning Web Programming with HTML, XHTML and CSS
Jon Duckett - 2004
It follows standards-based principles, but also teaches readers ways around problems they are likely to face using (X)HTML.While XHTML is the "current" standard, the book still covers HTML because many people do not yet understand that XHTML is the official successor to HTML, and many readers will still stick with HTML for backward compatibility and simpler/informal Web pages that don't require XHTML compliance.The book teaches basic principles of usability and accessibility along the way, to get users into the mode of developing Web pages that will be available to as many viewers as possible from the start. The book also covers the most commonly used programming/scripting language -- JavaScript -- and provides readers with a roadmap of other Web technologies to learn after mastering this book to add more functionality to their sites.
Algorithmic Puzzles
Anany V. Levitin - 2011
This logic extends far beyond the realm of computer science and into the wide and entertaining world of puzzles. In Algorithmic Puzzles, Anany and Maria Levitin use many classic brainteasers as well as newer examples from job interviews with major corporations to show readers how to apply analytical thinking to solve puzzles requiring well-defined procedures.The book's unique collection of puzzles is supplemented with carefully developed tutorials on algorithm design strategies and analysis techniques intended to walk the reader step-by-step through the various approaches to algorithmic problem solving. Mastery of these strategies--exhaustive search, backtracking, and divide-and-conquer, among others--will aid the reader in solving not only the puzzles contained in this book, but also others encountered in interviews, puzzle collections, and throughout everyday life. Each of the 150 puzzles contains hints and solutions, along with commentary onthe puzzle's origins and solution methods. The only book of its kind, Algorithmic Puzzles houses puzzles for all skill levels. Readers with only middle school mathematics will develop their algorithmic problem-solving skills through puzzles at the elementary level, while seasoned puzzle solvers will enjoy the challenge of thinking throughmore difficult puzzles.
Power Pivot and Power BI: The Excel User's Guide to DAX, Power Query, Power BI & Power Pivot in Excel 2010-2016
Rob Collie - 2016
Written by the world’s foremost PowerPivot blogger and practitioner, the book’s concepts and approach are introduced in a simple, step-by-step manner tailored to the learning style of Excel users everywhere. The techniques presented allow users to produce, in hours or even minutes, results that formerly would have taken entire teams weeks or months to produce. It includes lessons on the difference between calculated columns and measures; how formulas can be reused across reports of completely different shapes; how to merge disjointed sets of data into unified reports; how to make certain columns in a pivot behave as if the pivot were filtered while other columns do not; and how to create time-intelligent calculations in pivot tables such as “Year over Year” and “Moving Averages” whether they use a standard, fiscal, or a complete custom calendar. The “pattern-like” techniques and best practices contained in this book have been developed and refined over two years of onsite training with Excel users around the world, and the key lessons from those seminars costing thousands of dollars per day are now available to within the pages of this easy-to-follow guide. This updated second edition covers new features introduced with Office 2015.
Fun Stories Greatest Hits: The short story humor book packed with 40 real-life comedy adventures.
R. Scott Murphy - 2019
Flip-flops are optional. All of your favorite hilarious Fun Stories gathered together in one place! Plus, extended versions of many of your laugh out loud favorites. PRAISE FOR FUN STORIES: “The offbeat & heartwarming humor kept me turning pages due to its rollicking adventures in every chapter. Your friends, loved ones, and co-workers will enjoy it too." "It's very funny and on point observational humor” “Has me laughing from beginning to end” "Similar to Dave Barry, Eric Idle, and Stephen Colbert's humor" “This guy's funny and charmingly off the wall” "Always thought coffee could get me through the midweek blues, but this blows it out of the water" "I love Fun Stories. Fantastic!" "Awesome stuff, really enjoyed the horse names. Great read." "Excellent and hilarious material here" INCLUDES ALL OF THESE AND MANY MORE: "Chick-fil-A Makes Me Feel Like Leonardo DiCaprio" "The Least Amount of Fame Possible (Old MacDonald)" "Cub Scout Dropout" "Not the Next Carrie Underwood" "Bigfoot Popcorn" "Gatorade For Your Soul" "Shamelessly Suggestive City Names" "I'm the Freakin' Michael Phelps of Googling" "Alright, Alright, Alright!" "Mind Game of Thrones" "Happy Friday (Mr. Pee Man)" "Clown Commuter Award" "How NASA Thins The Herd" "Crunchy Roads, Take Me Home" "Good Folks, Bad Coaching" "Ultimate Waitress Revenge" "Battle of the Bands" You'll be smiling for hours as the pages turn themselves. Three Fun Stories books have hit #1 on Amazon Humor. Five audio tracks from this book have hit the top 10 on the iTunes comedy chart. Click to download the book and start the fun! Need more info? If you enjoy the humor of these folks, you'll love reading Fun Stories Greatest Hits: Dave Barry, Dilbert, David Sedaris, Stephen Colbert, Carl Hiaasen, Christopher Moore, Jim Gaffigan, Tina Fey, Amy Poehler, Jerry Seinfeld, Eric Idle, Erma Bombeck, Sebastian Maniscalco, Mindy Kaling, Jen Lancaster, Amy Lyle, The Far Side, Kevin Allison, Rainn Wilson, B.J. Novak, Sloane Crosley, Jimmy Fallon, Steve Martin, Judd Apatow, Adam Mansbach, Justin Halpern, Reader’s Digest, Dad Jokes, Garfield, Saturday Night Live, Friends, and The Office. Download Fun Stories Greatest Hits to start your next comedy adventure! Enjoy Fun Stories video posts at www.facebook.com/FunStoriesSeries. Get a free Fun Pack at www.mentalkickball.com. HUMOR, HUMOR & ENTERTAINMENT, PARENTING & FAMILIES, PARODIES, SATIRE, ESSAYS, SHORT STORIES, CELEBRITY & POPULAR CULTURE, COMEDY, CULTURAL, ETHNIC & REGIONAL, CARTOONS, JOKES & RIDDLES, MEN, WOMEN & RELATIONSHIPS, TRAVEL, POP CULTURE, RADIO, TELEVISION, PERFORMING ARTS, FUNNY BOOKS, HUMOR ESSAYS, FUNNY SHORT STORIES, FUNNY MEMOIR, HUMOR MEMOIR, FUNNY BOOK CLUB BOOKS, FUNNY ESSAYS, HUMOR BOOKS, HUMOROUS BOOKS, GIFTS FOR DAD, FUNNY GIFTS, LAUGH, LAUGH OUT LOUD, FUN, FUN, FUN, MORE COWBELL COWBELL
Fluent Python: Clear, Concise, and Effective Programming
Luciano Ramalho - 2015
With this hands-on guide, you'll learn how to write effective, idiomatic Python code by leveraging its best and possibly most neglected features. Author Luciano Ramalho takes you through Python's core language features and libraries, and shows you how to make your code shorter, faster, and more readable at the same time.Many experienced programmers try to bend Python to fit patterns they learned from other languages, and never discover Python features outside of their experience. With this book, those Python programmers will thoroughly learn how to become proficient in Python 3.This book covers:Python data model: understand how special methods are the key to the consistent behavior of objectsData structures: take full advantage of built-in types, and understand the text vs bytes duality in the Unicode ageFunctions as objects: view Python functions as first-class objects, and understand how this affects popular design patternsObject-oriented idioms: build classes by learning about references, mutability, interfaces, operator overloading, and multiple inheritanceControl flow: leverage context managers, generators, coroutines, and concurrency with the concurrent.futures and asyncio packagesMetaprogramming: understand how properties, attribute descriptors, class decorators, and metaclasses work"
Practical Statistics for Data Scientists: 50 Essential Concepts
Peter Bruce - 2017
Courses and books on basic statistics rarely cover the topic from a data science perspective. This practical guide explains how to apply various statistical methods to data science, tells you how to avoid their misuse, and gives you advice on what's important and what's not.Many data science resources incorporate statistical methods but lack a deeper statistical perspective. If you're familiar with the R programming language, and have some exposure to statistics, this quick reference bridges the gap in an accessible, readable format.With this book, you'll learn:Why exploratory data analysis is a key preliminary step in data scienceHow random sampling can reduce bias and yield a higher quality dataset, even with big dataHow the principles of experimental design yield definitive answers to questionsHow to use regression to estimate outcomes and detect anomaliesKey classification techniques for predicting which categories a record belongs toStatistical machine learning methods that "learn" from dataUnsupervised learning methods for extracting meaning from unlabeled data