Data Science from Scratch: First Principles with Python


Joel Grus - 2015
    In this book, you’ll learn how many of the most fundamental data science tools and algorithms work by implementing them from scratch. If you have an aptitude for mathematics and some programming skills, author Joel Grus will help you get comfortable with the math and statistics at the core of data science, and with hacking skills you need to get started as a data scientist. Today’s messy glut of data holds answers to questions no one’s even thought to ask. This book provides you with the know-how to dig those answers out. Get a crash course in Python Learn the basics of linear algebra, statistics, and probability—and understand how and when they're used in data science Collect, explore, clean, munge, and manipulate data Dive into the fundamentals of machine learning Implement models such as k-nearest Neighbors, Naive Bayes, linear and logistic regression, decision trees, neural networks, and clustering Explore recommender systems, natural language processing, network analysis, MapReduce, and databases

Learn Python The Hard Way


Zed A. Shaw - 2010
    The title says it is the hard way to learn to writecode but it’s actually not. It’s the “hard” way only in that it’s the way people used to teach things. In this book youwill do something incredibly simple that all programmers actually do to learn a language: 1. Go through each exercise. 2. Type in each sample exactly. 3. Make it run.That’s it. This will be very difficult at first, but stick with it. If you go through this book, and do each exercise for1-2 hours a night, then you’ll have a good foundation for moving on to another book. You might not really learn“programming” from this book, but you will learn the foundation skills you need to start learning the language.This book’s job is to teach you the three most basic essential skills that a beginning programmer needs to know:Reading And Writing, Attention To Detail, Spotting Differences.

All of Statistics: A Concise Course in Statistical Inference


Larry Wasserman - 2003
    But in spirit, the title is apt, as the book does cover a much broader range of topics than a typical introductory book on mathematical statistics. This book is for people who want to learn probability and statistics quickly. It is suitable for graduate or advanced undergraduate students in computer science, mathematics, statistics, and related disciplines. The book includes modern topics like nonparametric curve estimation, bootstrapping, and clas- sification, topics that are usually relegated to follow-up courses. The reader is presumed to know calculus and a little linear algebra. No previous knowledge of probability and statistics is required. Statistics, data mining, and machine learning are all concerned with collecting and analyzing data. For some time, statistics research was con- ducted in statistics departments while data mining and machine learning re- search was conducted in computer science departments. Statisticians thought that computer scientists were reinventing the wheel. Computer scientists thought that statistical theory didn't apply to their problems. Things are changing. Statisticians now recognize that computer scientists are making novel contributions while computer scientists now recognize the generality of statistical theory and methodology. Clever data mining algo- rithms are more scalable than statisticians ever thought possible. Formal sta- tistical theory is more pervasive than computer scientists had realized.

Python for Data Analysis


Wes McKinney - 2011
    It is also a practical, modern introduction to scientific computing in Python, tailored for data-intensive applications. This is a book about the parts of the Python language and libraries you'll need to effectively solve a broad set of data analysis problems. This book is not an exposition on analytical methods using Python as the implementation language.Written by Wes McKinney, the main author of the pandas library, this hands-on book is packed with practical cases studies. It's ideal for analysts new to Python and for Python programmers new to scientific computing.Use the IPython interactive shell as your primary development environmentLearn basic and advanced NumPy (Numerical Python) featuresGet started with data analysis tools in the pandas libraryUse high-performance tools to load, clean, transform, merge, and reshape dataCreate scatter plots and static or interactive visualizations with matplotlibApply the pandas groupby facility to slice, dice, and summarize datasetsMeasure data by points in time, whether it's specific instances, fixed periods, or intervalsLearn how to solve problems in web analytics, social sciences, finance, and economics, through detailed examples

Introductory Statistics


Prem S. Mann - 2006
    The realistic content of its examples and exercises, the clarity and brevity of its presentation, and the soundness of its pedagogical approach have received the highest remarks from both students and instructors. Now this bestseller is available in a new 6th edition.

Elementary Statistics: A Step by Step Approach


Allan G. Bluman - 1992
    The book is non-theoretical, explaining concepts intuitively and teaching problem solving through worked examples and step-by-step instructions. This edition places more emphasis on conceptual understanding and understanding results. This edition also features increased emphasis on Excel, MINITAB, and the TI-83 Plus and TI 84-Plus graphing calculators, computing technologies commonly used in such courses.

The Data Detective: Ten Easy Rules to Make Sense of Statistics


Tim Harford - 2020
    That’s a mistake, Tim Harford says in The Data Detective. We shouldn’t be suspicious of statistics—we need to understand what they mean and how they can improve our lives: they are, at heart, human behavior seen through the prism of numbers and are often “the only way of grasping much of what is going on around us.” If we can toss aside our fears and learn to approach them clearly—understanding how our own preconceptions lead us astray—statistics can point to ways we can live better and work smarter.As “perhaps the best popular economics writer in the world” (New Statesman), Tim Harford is an expert at taking complicated ideas and untangling them for millions of readers. In The Data Detective, he uses new research in science and psychology to set out ten strategies for using statistics to erase our biases and replace them with new ideas that use virtues like patience, curiosity, and good sense to better understand ourselves and the world. As a result, The Data Detective is a big-idea book about statistics and human behavior that is fresh, unexpected, and insightful.

Visualize This: The FlowingData Guide to Design, Visualization, and Statistics


Nathan Yau - 2011
    Wouldn't it be wonderful if we could actually visualize data in such a way that we could maximize its potential and tell a story in a clear, concise manner? Thanks to the creative genius of Nathan Yau, we can. With this full-color book, data visualization guru and author Nathan Yau uses step-by-step tutorials to show you how to visualize and tell stories with data. He explains how to gather, parse, and format data and then design high quality graphics that help you explore and present patterns, outliers, and relationships.Presents a unique approach to visualizing and telling stories with data, from a data visualization expert and the creator of flowingdata.com, Nathan Yau Offers step-by-step tutorials and practical design tips for creating statistical graphics, geographical maps, and information design to find meaning in the numbers Details tools that can be used to visualize data-native graphics for the Web, such as ActionScript, Flash libraries, PHP, and JavaScript and tools to design graphics for print, such as R and Illustrator Contains numerous examples and descriptions of patterns and outliers and explains how to show them Visualize This demonstrates how to explain data visually so that you can present your information in a way that is easy to understand and appealing.

R for Everyone: Advanced Analytics and Graphics


Jared P. Lander - 2013
    R has traditionally been difficult for non-statisticians to learn, and most R books assume far too much knowledge to be of help. R for Everyone is the solution. Drawing on his unsurpassed experience teaching new users, professional data scientist Jared P. Lander has written the perfect tutorial for anyone new to statistical programming and modeling. Organized to make learning easy and intuitive, this guide focuses on the 20 percent of R functionality you'll need to accomplish 80 percent of modern data tasks. Lander's self-contained chapters start with the absolute basics, offering extensive hands-on practice and sample code. You'll download and install R; navigate and use the R environment; master basic program control, data import, and manipulation; and walk through several essential tests. Then, building on this foundation, you'll construct several complete models, both linear and nonlinear, and use some data mining techniques. By the time you're done, you won't just know how to write R programs, you'll be ready to tackle the statistical problems you care about most. COVERAGE INCLUDES - Exploring R, RStudio, and R packages - Using R for math: variable types, vectors, calling functions, and more - Exploiting data structures, including data.frames, matrices, and lists - Creating attractive, intuitive statistical graphics - Writing user-defined functions - Controlling program flow with if, ifelse, and complex checks - Improving program efficiency with group manipulations - Combining and reshaping multiple datasets - Manipulating strings using R's facilities and regular expressions - Creating normal, binomial, and Poisson probability distributions - Programming basic statistics: mean, standard deviation, and t-tests - Building linear, generalized linear, and nonlinear models - Assessing the quality of models and variable selection - Preventing overfitting, using the Elastic Net and Bayesian methods - Analyzing univariate and multivariate time series data - Grouping data via K-means and hierarchical clustering - Preparing reports, slideshows, and web pages with knitr - Building reusable R packages with devtools and Rcpp - Getting involved with the R global community

The Algorithm Design Manual


Steven S. Skiena - 1997
    Drawing heavily on the author's own real-world experiences, the book stresses design and analysis. Coverage is divided into two parts, the first being a general guide to techniques for the design and analysis of computer algorithms. The second is a reference section, which includes a catalog of the 75 most important algorithmic problems. By browsing this catalog, readers can quickly identify what the problem they have encountered is called, what is known about it, and how they should proceed if they need to solve it. This book is ideal for the working professional who uses algorithms on a daily basis and has need for a handy reference. This work can also readily be used in an upper-division course or as a student reference guide. THE ALGORITHM DESIGN MANUAL comes with a CD-ROM that contains: * a complete hypertext version of the full printed book. * the source code and URLs for all cited implementations. * over 30 hours of audio lectures on the design and analysis of algorithms are provided, all keyed to on-line lecture notes.

Introduction to Information Retrieval


Christopher D. Manning - 2008
    Written from a computer science perspective by three leading experts in the field, it gives an up-to-date treatment of all aspects of the design and implementation of systems for gathering, indexing, and searching documents; methods for evaluating systems; and an introduction to the use of machine learning methods on text collections. All the important ideas are explained using examples and figures, making it perfect for introductory courses in information retrieval for advanced undergraduates and graduate students in computer science. Based on feedback from extensive classroom experience, the book has been carefully structured in order to make teaching more natural and effective. Although originally designed as the primary text for a graduate or advanced undergraduate course in information retrieval, the book will also create a buzz for researchers and professionals alike.

Even You Can Learn Statistics: A Guide for Everyone Who Has Ever Been Afraid of Statistics


David M. Levine - 2004
    Each technique is introduced with a simple, jargon-free explanation, practical examples, and hands-on guidance for solving real problems with Excel or a TI-83/84 series calculator, including Plus models. Hate math? No sweat. You'll be amazed how little you need! For those who do have an interest in mathematics, optional "Equation Blackboard" sections review the equations that provide the foundations for important concepts. David M. Levine is a much-honored innovator in statistics education. He is Professor Emeritus of Statistics and Computer Information Systems at Bernard M. Baruch College (CUNY), and co-author of several best-selling books, including Statistics for Managers using Microsoft Excel, Basic Business Statistics, Quality Management, and Six Sigma for Green Belts and Champions. Instructional designer David F. Stephan pioneered the classroom use of personal computers, and is a leader in making Excel more accessible to statistics students. He has co-authored several textbooks with David M. Levine. Here's just some of what you'll learn how to do... Use statistics in your everyday work or study Perform common statistical tasks using a Texas Instruments statistical calculator or Microsoft Excel Build and interpret statistical charts and tables "Test Yourself" at the end of each chapter to review the concepts and methods that you learned in the chapter Work with mean, median, mode, standard deviation, Z scores, skewness, and other descriptive statistics Use probability and probability distributions Work with sampling distributions and confidence intervals Test hypotheses and decision-making risks with Z, t, Chi-Square, ANOVA, and other techniques Perform regression analysis and modeling The easy, practical introduction to statistics--for everyone! Thought you couldn't learn statistics? Think again. You can--and you will!

Social Statistics for a Diverse Society


Chava Frankfort-Nachmias - 1996
    The authors help students learn key sociological concepts through real research examples related to the dynamic interplay of race, class, gender, and other social variables.

Concepts of Modern Mathematics


Ian Stewart - 1975
    Based on the abstract, general style of mathematical exposition favored by research mathematicians, its goal was to teach students not just to manipulate numbers and formulas, but to grasp the underlying mathematical concepts. The result, at least at first, was a great deal of confusion among teachers, students, and parents. Since then, the negative aspects of "new math" have been eliminated and its positive elements assimilated into classroom instruction.In this charming volume, a noted English mathematician uses humor and anecdote to illuminate the concepts underlying "new math": groups, sets, subsets, topology, Boolean algebra, and more. According to Professor Stewart, an understanding of these concepts offers the best route to grasping the true nature of mathematics, in particular the power, beauty, and utility of pure mathematics. No advanced mathematical background is needed (a smattering of algebra, geometry, and trigonometry is helpful) to follow the author's lucid and thought-provoking discussions of such topics as functions, symmetry, axiomatics, counting, topology, hyperspace, linear algebra, real analysis, probability, computers, applications of modern mathematics, and much more.By the time readers have finished this book, they'll have a much clearer grasp of how modern mathematicians look at figures, functions, and formulas and how a firm grasp of the ideas underlying "new math" leads toward a genuine comprehension of the nature of mathematics itself.

Nonlinear Dynamics and Chaos: With Applications to Physics, Biology, Chemistry, and Engineering


Steven H. Strogatz - 1994
    The presentation stresses analytical methods, concrete examples, and geometric intuition. A unique feature of the book is its emphasis on applications. These include mechanical vibrations, lasers, biological rhythms, superconducting circuits, insect outbreaks, chemical oscillators, genetic control systems, chaotic waterwheels, and even a technique for using chaos to send secret messages. In each case, the scientific background is explained at an elementary level and closely integrated with mathematical theory.About the Author:Steven Strogatz is in the Center for Applied Mathematics and the Department of Theoretical and Applied Mathematics at Cornell University. Since receiving his Ph.D. from Harvard university in 1986, Professor Strogatz has been honored with several awards, including the E.M. Baker Award for Excellence, the highest teaching award given by MIT.