Python for Data Analysis


Wes McKinney - 2011
    It is also a practical, modern introduction to scientific computing in Python, tailored for data-intensive applications. This is a book about the parts of the Python language and libraries you'll need to effectively solve a broad set of data analysis problems. This book is not an exposition on analytical methods using Python as the implementation language.Written by Wes McKinney, the main author of the pandas library, this hands-on book is packed with practical cases studies. It's ideal for analysts new to Python and for Python programmers new to scientific computing.Use the IPython interactive shell as your primary development environmentLearn basic and advanced NumPy (Numerical Python) featuresGet started with data analysis tools in the pandas libraryUse high-performance tools to load, clean, transform, merge, and reshape dataCreate scatter plots and static or interactive visualizations with matplotlibApply the pandas groupby facility to slice, dice, and summarize datasetsMeasure data by points in time, whether it's specific instances, fixed periods, or intervalsLearn how to solve problems in web analytics, social sciences, finance, and economics, through detailed examples

The Art of Computer Programming, Volume 2: Seminumerical Algorithms


Donald Ervin Knuth - 1969
    -Byte, September 1995 I can't begin to tell you how many pleasurable hours of study and recreation they have afforded me! I have pored over them in cars, restaurants, at work, at home... and even at a Little League game when my son wasn't in the line-up. -Charles Long If you think you're a really good programmer... read [Knuth's] Art of Computer Programming... You should definitely send me a resume if you can read the whole thing. -Bill Gates It's always a pleasure when a problem is hard enough that you have to get the Knuths off the shelf. I find that merely opening one has a very useful terrorizing effect on computers. -Jonathan Laventhol The second volume offers a complete introduction to the field of seminumerical algorithms, with separate chapters on random numbers and arithmetic. The book summarizes the major paradigms and basic theory of such algorithms, thereby providing a comprehensive interface between computer programming and numerical analysis. Particularly noteworthy in this third edition is Knuth's new treatment of random number generators, and his discussion of calculations with formal power series. Ebook (PDF version) produced by Mathematical Sciences Publishers (MSP), http: //msp.org

Introduction to Algorithms


Thomas H. Cormen - 1989
    Each chapter is relatively self-contained and can be used as a unit of study. The algorithms are described in English and in a pseudocode designed to be readable by anyone who has done a little programming. The explanations have been kept elementary without sacrificing depth of coverage or mathematical rigor.

Practical Statistics for Data Scientists: 50 Essential Concepts


Peter Bruce - 2017
    Courses and books on basic statistics rarely cover the topic from a data science perspective. This practical guide explains how to apply various statistical methods to data science, tells you how to avoid their misuse, and gives you advice on what's important and what's not.Many data science resources incorporate statistical methods but lack a deeper statistical perspective. If you're familiar with the R programming language, and have some exposure to statistics, this quick reference bridges the gap in an accessible, readable format.With this book, you'll learn:Why exploratory data analysis is a key preliminary step in data scienceHow random sampling can reduce bias and yield a higher quality dataset, even with big dataHow the principles of experimental design yield definitive answers to questionsHow to use regression to estimate outcomes and detect anomaliesKey classification techniques for predicting which categories a record belongs toStatistical machine learning methods that "learn" from dataUnsupervised learning methods for extracting meaning from unlabeled data

Getting Started with SQL: A Hands-On Approach for Beginners


Thomas Nield - 2016
    If you're a business or IT professional, this short hands-on guide teaches you how to pull and transform data with SQL in significant ways. You will quickly master the fundamentals of SQL and learn how to create your own databases.Author Thomas Nield provides exercises throughout the book to help you practice your newfound SQL skills at home, without having to use a database server environment. Not only will you learn how to use key SQL statements to find and manipulate your data, but you'll also discover how to efficiently design and manage databases to meet your needs.You'll also learn how to:Explore relational databases, including lightweight and centralized modelsUse SQLite and SQLiteStudio to create lightweight databases in minutesQuery and transform data in meaningful ways by using SELECT, WHERE, GROUP BY, and ORDER BYJoin tables to get a more complete view of your business dataBuild your own tables and centralized databases by using normalized design principlesManage data by learning how to INSERT, DELETE, and UPDATE records

Concrete Mathematics: A Foundation for Computer Science


Ronald L. Graham - 1988
    "More concretely," the authors explain, "it is the controlled manipulation of mathematical formulas, using a collection of techniques for solving problems."

Real World OCaml: Functional programming for the masses


Yaron Minsky - 2013
    Through the book’s many examples, you’ll quickly learn how OCaml stands out as a tool for writing fast, succinct, and readable systems code.Real World OCaml takes you through the concepts of the language at a brisk pace, and then helps you explore the tools and techniques that make OCaml an effective and practical tool. In the book’s third section, you’ll delve deep into the details of the compiler toolchain and OCaml’s simple and efficient runtime system.Learn the foundations of the language, such as higher-order functions, algebraic data types, and modulesExplore advanced features such as functors, first-class modules, and objectsLeverage Core, a comprehensive general-purpose standard library for OCamlDesign effective and reusable libraries, making the most of OCaml’s approach to abstraction and modularityTackle practical programming problems from command-line parsing to asynchronous network programmingExamine profiling and interactive debugging techniques with tools such as GNU gdb

Machine Learning: A Probabilistic Perspective


Kevin P. Murphy - 2012
    Machine learning provides these, developing methods that can automatically detect patterns in data and then use the uncovered patterns to predict future data. This textbook offers a comprehensive and self-contained introduction to the field of machine learning, based on a unified, probabilistic approach.The coverage combines breadth and depth, offering necessary background material on such topics as probability, optimization, and linear algebra as well as discussion of recent developments in the field, including conditional random fields, L1 regularization, and deep learning. The book is written in an informal, accessible style, complete with pseudo-code for the most important algorithms. All topics are copiously illustrated with color images and worked examples drawn from such application domains as biology, text processing, computer vision, and robotics. Rather than providing a cookbook of different heuristic methods, the book stresses a principled model-based approach, often using the language of graphical models to specify models in a concise and intuitive way. Almost all the models described have been implemented in a MATLAB software package—PMTK (probabilistic modeling toolkit)—that is freely available online. The book is suitable for upper-level undergraduates with an introductory-level college math background and beginning graduate students.

Linear Algebra Done Right


Sheldon Axler - 1995
    The novel approach taken here banishes determinants to the end of the book and focuses on the central goal of linear algebra: understanding the structure of linear operators on vector spaces. The author has taken unusual care to motivate concepts and to simplify proofs. For example, the book presents - without having defined determinants - a clean proof that every linear operator on a finite-dimensional complex vector space (or an odd-dimensional real vector space) has an eigenvalue. A variety of interesting exercises in each chapter helps students understand and manipulate the objects of linear algebra. This second edition includes a new section on orthogonal projections and minimization problems. The sections on self-adjoint operators, normal operators, and the spectral theorem have been rewritten. New examples and new exercises have been added, several proofs have been simplified, and hundreds of minor improvements have been made throughout the text.

Introduction to Computation and Programming Using Python


John V. Guttag - 2013
    It provides students with skills that will enable them to make productive use of computational techniques, including some of the tools and techniques of "data science" for using computation to model and interpret data. The book is based on an MIT course (which became the most popular course offered through MIT's OpenCourseWare) and was developed for use not only in a conventional classroom but in in a massive open online course (or MOOC) offered by the pioneering MIT--Harvard collaboration edX.Students are introduced to Python and the basics of programming in the context of such computational concepts and techniques as exhaustive enumeration, bisection search, and efficient approximation algorithms. The book does not require knowledge of mathematics beyond high school algebra, but does assume that readers are comfortable with rigorous thinking and not intimidated by mathematical concepts. Although it covers such traditional topics as computational complexity and simple algorithms, the book focuses on a wide range of topics not found in most introductory texts, including information visualization, simulations to model randomness, computational techniques to understand data, and statistical techniques that inform (and misinform) as well as two related but relatively advanced topics: optimization problems and dynamic programming.Introduction to Computation and Programming Using Python can serve as a stepping-stone to more advanced computer science courses, or as a basic grounding in computational problem solving for students in other disciplines.

How to Create a Mind: The Secret of Human Thought Revealed


Ray Kurzweil - 2012
    In How to Create a Mind, Kurzweil presents a provocative exploration of the most important project in human-machine civilization—reverse engineering the brain to understand precisely how it works and using that knowledge to create even more intelligent machines.Kurzweil discusses how the brain functions, how the mind emerges from the brain, and the implications of vastly increasing the powers of our intelligence in addressing the world’s problems. He thoughtfully examines emotional and moral intelligence and the origins of consciousness and envisions the radical possibilities of our merging with the intelligent technology we are creating.Certain to be one of the most widely discussed and debated science books of the year, How to Create a Mind is sure to take its place alongside Kurzweil’s previous classics which include Fantastic Voyage: Live Long Enough to Live Forever and The Age of Spiritual Machines.

R for Everyone: Advanced Analytics and Graphics


Jared P. Lander - 2013
    R has traditionally been difficult for non-statisticians to learn, and most R books assume far too much knowledge to be of help. R for Everyone is the solution. Drawing on his unsurpassed experience teaching new users, professional data scientist Jared P. Lander has written the perfect tutorial for anyone new to statistical programming and modeling. Organized to make learning easy and intuitive, this guide focuses on the 20 percent of R functionality you'll need to accomplish 80 percent of modern data tasks. Lander's self-contained chapters start with the absolute basics, offering extensive hands-on practice and sample code. You'll download and install R; navigate and use the R environment; master basic program control, data import, and manipulation; and walk through several essential tests. Then, building on this foundation, you'll construct several complete models, both linear and nonlinear, and use some data mining techniques. By the time you're done, you won't just know how to write R programs, you'll be ready to tackle the statistical problems you care about most. COVERAGE INCLUDES - Exploring R, RStudio, and R packages - Using R for math: variable types, vectors, calling functions, and more - Exploiting data structures, including data.frames, matrices, and lists - Creating attractive, intuitive statistical graphics - Writing user-defined functions - Controlling program flow with if, ifelse, and complex checks - Improving program efficiency with group manipulations - Combining and reshaping multiple datasets - Manipulating strings using R's facilities and regular expressions - Creating normal, binomial, and Poisson probability distributions - Programming basic statistics: mean, standard deviation, and t-tests - Building linear, generalized linear, and nonlinear models - Assessing the quality of models and variable selection - Preventing overfitting, using the Elastic Net and Bayesian methods - Analyzing univariate and multivariate time series data - Grouping data via K-means and hierarchical clustering - Preparing reports, slideshows, and web pages with knitr - Building reusable R packages with devtools and Rcpp - Getting involved with the R global community

The Art of R Programming: A Tour of Statistical Software Design


Norman Matloff - 2011
    No statistical knowledge is required, and your programming skills can range from hobbyist to pro.Along the way, you'll learn about functional and object-oriented programming, running mathematical simulations, and rearranging complex data into simpler, more useful formats. You'll also learn to: Create artful graphs to visualize complex data sets and functions Write more efficient code using parallel R and vectorization Interface R with C/C++ and Python for increased speed or functionality Find new R packages for text analysis, image manipulation, and more Squash annoying bugs with advanced debugging techniques Whether you're designing aircraft, forecasting the weather, or you just need to tame your data, The Art of R Programming is your guide to harnessing the power of statistical computing.

Probabilistic Graphical Models: Principles and Techniques


Daphne Koller - 2009
    The framework of probabilistic graphical models, presented in this book, provides a general approach for this task. The approach is model-based, allowing interpretable models to be constructed and then manipulated by reasoning algorithms. These models can also be learned automatically from data, allowing the approach to be used in cases where manually constructing a model is difficult or even impossible. Because uncertainty is an inescapable aspect of most real-world applications, the book focuses on probabilistic models, which make the uncertainty explicit and provide models that are more faithful to reality.Probabilistic Graphical Models discusses a variety of models, spanning Bayesian networks, undirected Markov networks, discrete and continuous models, and extensions to deal with dynamical systems and relational data. For each class of models, the text describes the three fundamental cornerstones: representation, inference, and learning, presenting both basic concepts and advanced techniques. Finally, the book considers the use of the proposed framework for causal reasoning and decision making under uncertainty. The main text in each chapter provides the detailed technical development of the key ideas. Most chapters also include boxes with additional material: skill boxes, which describe techniques; case study boxes, which discuss empirical cases related to the approach described in the text, including applications in computer vision, robotics, natural language understanding, and computational biology; and concept boxes, which present significant concepts drawn from the material in the chapter. Instructors (and readers) can group chapters in various combinations, from core topics to more technically advanced material, to suit their particular needs.

Making Games with Python & Pygame


Al Sweigart - 2012
    Each chapter gives you the complete source code for a new game and teaches the programming concepts from these examples. The book is available under a Creative Commons license and can be downloaded in full for free from http: //inventwithpython.com/pygame This book was written to be understandable by kids as young as 10 to 12 years old, although it is great for anyone of any age who has some familiarity with Python.