Book picks similar to
Data Science For Dummies by Lillian Pierson
data-science
data
business
science
JavaScript: The Definitive Guide
David Flanagan - 1996
This book is both an example-driven programmer's guide and a keep-on-your-desk reference, with new chapters that explain everything you need to know to get the most out of JavaScript, including:Scripted HTTP and Ajax XML processing Client-side graphics using the canvas tag Namespaces in JavaScript--essential when writing complex programs Classes, closures, persistence, Flash, and JavaScript embedded in Java applicationsPart I explains the core JavaScript language in detail. If you are new to JavaScript, it will teach you the language. If you are already a JavaScript programmer, Part I will sharpen your skills and deepen your understanding of the language.Part II explains the scripting environment provided by web browsers, with a focus on DOM scripting with unobtrusive JavaScript. The broad and deep coverage of client-side JavaScript is illustrated with many sophisticated examples that demonstrate how to:Generate a table of contents for an HTML document Display DHTML animations Automate form validation Draw dynamic pie charts Make HTML elements draggable Define keyboard shortcuts for web applications Create Ajax-enabled tool tips Use XPath and XSLT on XML documents loaded with Ajax And much morePart III is a complete reference for core JavaScript. It documents every class, object, constructor, method, function, property, and constant defined by JavaScript 1.5 and ECMAScript Version 3.Part IV is a reference for client-side JavaScript, covering legacy web browser APIs, the standard Level 2 DOM API, and emerging standards such as the XMLHttpRequest object and the canvas tag.More than 300,000 JavaScript programmers around the world have made this their indispensable reference book for building JavaScript applications."A must-have reference for expert JavaScript programmers...well-organized and detailed."-- Brendan Eich, creator of JavaScript
The Theory That Would Not Die: How Bayes' Rule Cracked the Enigma Code, Hunted Down Russian Submarines, and Emerged Triumphant from Two Centuries of Controversy
Sharon Bertsch McGrayne - 2011
To its adherents, it is an elegant statement about learning from experience. To its opponents, it is subjectivity run amok.In the first-ever account of Bayes' rule for general readers, Sharon Bertsch McGrayne explores this controversial theorem and the human obsessions surrounding it. She traces its discovery by an amateur mathematician in the 1740s through its development into roughly its modern form by French scientist Pierre Simon Laplace. She reveals why respected statisticians rendered it professionally taboo for 150 years—at the same time that practitioners relied on it to solve crises involving great uncertainty and scanty information (Alan Turing's role in breaking Germany's Enigma code during World War II), and explains how the advent of off-the-shelf computer technology in the 1980s proved to be a game-changer. Today, Bayes' rule is used everywhere from DNA de-coding to Homeland Security.Drawing on primary source material and interviews with statisticians and other scientists, The Theory That Would Not Die is the riveting account of how a seemingly simple theorem ignited one of the greatest controversies of all time.
Smarter Than You Think: How Technology is Changing Our Minds for the Better
Clive Thompson - 2013
But is it for the better? Amid a chorus of doomsayers, Clive Thompson delivers a resounding "yes." The Internet age has produced a radical new style of human intelligence, worthy of both celebration and analysis. We learn more and retain it longer, write and think with global audiences, and even gain an ESP-like awareness of the world around us. Modern technology is making us smarter, better connected, and often deeper—both as individuals and as a society. In Smarter Than You Think Thompson shows that every technological innovation—from the written word to the printing press to the telegraph—has provoked the very same anxieties that plague us today. We panic that life will never be the same, that our attentions are eroding, that culture is being trivialized. But as in the past, we adapt—learning to use the new and retaining what’s good of the old. Thompson introduces us to a cast of extraordinary characters who augment their minds in inventive ways. There's the seventy-six-year old millionaire who digitally records his every waking moment—giving him instant recall of the events and ideas of his life, even going back decades. There's a group of courageous Chinese students who mounted an online movement that shut down a $1.6 billion toxic copper plant. There are experts and there are amateurs, including a global set of gamers who took a puzzle that had baffled HIV scientists for a decade—and solved it collaboratively in only one month. Smarter Than You Think isn't just about pioneers. It's about everyday users of technology and how our digital tools—from Google to Twitter to Facebook and smartphones—are giving us new ways to learn, talk, and share our ideas. Thompson harnesses the latest discoveries in social science to explore how digital technology taps into our long-standing habits of mind—pushing them in powerful new directions. Our thinking will continue to evolve as newer tools enter our lives. Smarter Than You Think embraces and extols this transformation, presenting an exciting vision of the present and the future.
Feature Engineering for Machine Learning
Alice Zheng - 2018
With this practical book, you’ll learn techniques for extracting and transforming features—the numeric representations of raw data—into formats for machine-learning models. Each chapter guides you through a single data problem, such as how to represent text or image data. Together, these examples illustrate the main principles of feature engineering.Rather than simply teach these principles, authors Alice Zheng and Amanda Casari focus on practical application with exercises throughout the book. The closing chapter brings everything together by tackling a real-world, structured dataset with several feature-engineering techniques. Python packages including numpy, Pandas, Scikit-learn, and Matplotlib are used in code examples.
Kingpin: How One Hacker Took Over the Billion-Dollar Cybercrime Underground
Kevin Poulsen - 2011
Max 'Vision' Butler was a white-hat hacker and a celebrity throughout the programming world, even serving as a consultant to the FBI. But there was another side to Max. As the black-hat 'Iceman', he'd seen the fraudsters around him squabble, their ranks riddled with infiltrators, their methods inefficient, and in their dysfunction was the ultimate challenge: he would stage a coup and steal their ill-gotten gains from right under their noses.Through the story of Max Butler's remarkable rise, KINGPIN lays bare the workings of a silent crime wave affecting millions worldwide. It exposes vast online-fraud supermarkets stocked with credit card numbers, counterfeit cheques, hacked bank accounts and fake passports. Thanks to Kevin Poulsen's remarkable access to both cops and criminals, we step inside the quiet,desperate battle that law enforcement fights against these scammers. And learn that the boy next door may not be all he seems.
Beautiful Code: Leading Programmers Explain How They Think
Andy OramLincoln Stein - 2007
You will be able to look over the shoulder of major coding and design experts to see problems through their eyes.This is not simply another design patterns book, or another software engineering treatise on the right and wrong way to do things. The authors think aloud as they work through their project's architecture, the tradeoffs made in its construction, and when it was important to break rules. Beautiful Code is an opportunity for master coders to tell their story. All author royalties will be donated to Amnesty International.
Seven Databases in Seven Weeks: A Guide to Modern Databases and the NoSQL Movement
Eric Redmond - 2012
As a modern application developer you need to understand the emerging field of data management, both RDBMS and NoSQL. Seven Databases in Seven Weeks takes you on a tour of some of the hottest open source databases today. In the tradition of Bruce A. Tate's Seven Languages in Seven Weeks, this book goes beyond your basic tutorial to explore the essential concepts at the core each technology. Redis, Neo4J, CouchDB, MongoDB, HBase, Riak and Postgres. With each database, you'll tackle a real-world data problem that highlights the concepts and features that make it shine. You'll explore the five data models employed by these databases-relational, key/value, columnar, document and graph-and which kinds of problems are best suited to each. You'll learn how MongoDB and CouchDB are strikingly different, and discover the Dynamo heritage at the heart of Riak. Make your applications faster with Redis and more connected with Neo4J. Use MapReduce to solve Big Data problems. Build clusters of servers using scalable services like Amazon's Elastic Compute Cloud (EC2). Discover the CAP theorem and its implications for your distributed data. Understand the tradeoffs between consistency and availability, and when you can use them to your advantage. Use multiple databases in concert to create a platform that's more than the sum of its parts, or find one that meets all your needs at once.Seven Databases in Seven Weeks will take you on a deep dive into each of the databases, their strengths and weaknesses, and how to choose the ones that fit your needs.What You Need: To get the most of of this book you'll have to follow along, and that means you'll need a *nix shell (Mac OSX or Linux preferred, Windows users will need Cygwin), and Java 6 (or greater) and Ruby 1.8.7 (or greater). Each chapter will list the downloads required for that database.
Statistical Rethinking: A Bayesian Course with Examples in R and Stan
Richard McElreath - 2015
Reflecting the need for even minor programming in today's model-based statistics, the book pushes readers to perform step-by-step calculations that are usually automated. This unique computational approach ensures that readers understand enough of the details to make reasonable choices and interpretations in their own modeling work.The text presents generalized linear multilevel models from a Bayesian perspective, relying on a simple logical interpretation of Bayesian probability and maximum entropy. It covers from the basics of regression to multilevel models. The author also discusses measurement error, missing data, and Gaussian process models for spatial and network autocorrelation.By using complete R code examples throughout, this book provides a practical foundation for performing statistical inference. Designed for both PhD students and seasoned professionals in the natural and social sciences, it prepares them for more advanced or specialized statistical modeling.Web ResourceThe book is accompanied by an R package (rethinking) that is available on the author's website and GitHub. The two core functions (map and map2stan) of this package allow a variety of statistical models to be constructed from standard model formulas.
Violent Python: A Cookbook for Hackers, Forensic Analysts, Penetration Testers and Security Engineers
T.J. O'Connor - 2012
Instead of relying on another attacker's tools, this book will teach you to forge your own weapons using the Python programming language. This book demonstrates how to write Python scripts to automate large-scale network attacks, extract metadata, and investigate forensic artifacts. It also shows how to write code to intercept and analyze network traffic using Python, craft and spoof wireless frames to attack wireless and Bluetooth devices, and how to data-mine popular social media websites and evade modern anti-virus.
Bayesian Methods for Hackers: Probabilistic Programming and Bayesian Inference
Cameron Davidson-Pilon - 2014
However, most discussions of Bayesian inference rely on intensely complex mathematical analyses and artificial examples, making it inaccessible to anyone without a strong mathematical background. Now, though, Cameron Davidson-Pilon introduces Bayesian inference from a computational perspective, bridging theory to practice-freeing you to get results using computing power.
Bayesian Methods for Hackers
illuminates Bayesian inference through probabilistic programming with the powerful PyMC language and the closely related Python tools NumPy, SciPy, and Matplotlib. Using this approach, you can reach effective solutions in small increments, without extensive mathematical intervention. Davidson-Pilon begins by introducing the concepts underlying Bayesian inference, comparing it with other techniques and guiding you through building and training your first Bayesian model. Next, he introduces PyMC through a series of detailed examples and intuitive explanations that have been refined after extensive user feedback. You'll learn how to use the Markov Chain Monte Carlo algorithm, choose appropriate sample sizes and priors, work with loss functions, and apply Bayesian inference in domains ranging from finance to marketing. Once you've mastered these techniques, you'll constantly turn to this guide for the working PyMC code you need to jumpstart future projects. Coverage includes - Learning the Bayesian "state of mind" and its practical implications - Understanding how computers perform Bayesian inference - Using the PyMC Python library to program Bayesian analyses - Building and debugging models with PyMC - Testing your model's "goodness of fit" - Opening the "black box" of the Markov Chain Monte Carlo algorithm to see how and why it works - Leveraging the power of the "Law of Large Numbers" - Mastering key concepts, such as clustering, convergence, autocorrelation, and thinning - Using loss functions to measure an estimate's weaknesses based on your goals and desired outcomes - Selecting appropriate priors and understanding how their influence changes with dataset size - Overcoming the "exploration versus exploitation" dilemma: deciding when "pretty good" is good enough - Using Bayesian inference to improve A/B testing - Solving data science problems when only small amounts of data are available Cameron Davidson-Pilon has worked in many areas of applied mathematics, from the evolutionary dynamics of genes and diseases to stochastic modeling of financial prices. His contributions to the open source community include lifelines, an implementation of survival analysis in Python. Educated at the University of Waterloo and at the Independent University of Moscow, he currently works with the online commerce leader Shopify.
Web Scraping with Python: Collecting Data from the Modern Web
Ryan Mitchell - 2015
With this practical guide, you’ll learn how to use Python scripts and web APIs to gather and process data from thousands—or even millions—of web pages at once.
Ideal for programmers, security professionals, and web administrators familiar with Python, this book not only teaches basic web scraping mechanics, but also delves into more advanced topics, such as analyzing raw data or using scrapers for frontend website testing. Code samples are available to help you understand the concepts in practice.
Learn how to parse complicated HTML pages
Traverse multiple pages and sites
Get a general overview of APIs and how they work
Learn several methods for storing the data you scrape
Download, read, and extract data from documents
Use tools and techniques to clean badly formatted data
Read and write natural languages
Crawl through forms and logins
Understand how to scrape JavaScript
Learn image processing and text recognition
Hands-On Programming with R: Write Your Own Functions and Simulations
Garrett Grolemund - 2014
With this book, you'll learn how to load data, assemble and disassemble data objects, navigate R's environment system, write your own functions, and use all of R's programming tools.RStudio Master Instructor Garrett Grolemund not only teaches you how to program, but also shows you how to get more from R than just visualizing and modeling data. You'll gain valuable programming skills and support your work as a data scientist at the same time.Work hands-on with three practical data analysis projects based on casino gamesStore, retrieve, and change data values in your computer's memoryWrite programs and simulations that outperform those written by typical R usersUse R programming tools such as if else statements, for loops, and S3 classesLearn how to write lightning-fast vectorized R codeTake advantage of R's package system and debugging toolsPractice and apply R programming concepts as you learn them
R for Everyone: Advanced Analytics and Graphics
Jared P. Lander - 2013
R has traditionally been difficult for non-statisticians to learn, and most R books assume far too much knowledge to be of help. R for Everyone is the solution. Drawing on his unsurpassed experience teaching new users, professional data scientist Jared P. Lander has written the perfect tutorial for anyone new to statistical programming and modeling. Organized to make learning easy and intuitive, this guide focuses on the 20 percent of R functionality you'll need to accomplish 80 percent of modern data tasks. Lander's self-contained chapters start with the absolute basics, offering extensive hands-on practice and sample code. You'll download and install R; navigate and use the R environment; master basic program control, data import, and manipulation; and walk through several essential tests. Then, building on this foundation, you'll construct several complete models, both linear and nonlinear, and use some data mining techniques. By the time you're done, you won't just know how to write R programs, you'll be ready to tackle the statistical problems you care about most. COVERAGE INCLUDES - Exploring R, RStudio, and R packages - Using R for math: variable types, vectors, calling functions, and more - Exploiting data structures, including data.frames, matrices, and lists - Creating attractive, intuitive statistical graphics - Writing user-defined functions - Controlling program flow with if, ifelse, and complex checks - Improving program efficiency with group manipulations - Combining and reshaping multiple datasets - Manipulating strings using R's facilities and regular expressions - Creating normal, binomial, and Poisson probability distributions - Programming basic statistics: mean, standard deviation, and t-tests - Building linear, generalized linear, and nonlinear models - Assessing the quality of models and variable selection - Preventing overfitting, using the Elastic Net and Bayesian methods - Analyzing univariate and multivariate time series data - Grouping data via K-means and hierarchical clustering - Preparing reports, slideshows, and web pages with knitr - Building reusable R packages with devtools and Rcpp - Getting involved with the R global community
Building Evolutionary Architectures: Support Constant Change
Neal Ford - 2017
Over the past few years, incremental developments in core engineering practices for software development have created the foundations for rethinking how architecture changes over time, along with ways to protect important architectural characteristics as it evolves. This practical guide ties those parts together with a new way to think about architecture and time.
Build a Career in Data Science
Emily Robinson - 2020
Industry experts Jacqueline Nolis and Emily Robinson lay out the soft skills you’ll need alongside your technical know-how in order to succeed in the field. Following their clear and simple instructions you’ll craft a resume that hiring managers will love, learn how to ace your interview, and ensure you hit the ground running in your first months at your new job. Once you’ve gotten your foot in the door, learn to thrive as a data scientist by handling high expectations, dealing with stakeholders, and managing failures. Finally, you’ll look towards the future and learn about how to join the broader data science community, leaving a job gracefully, and plotting your career path. With this book by your side you’ll have everything you need to ensure a rewarding and productive role in data science.