Data Science from Scratch: First Principles with Python


Joel Grus - 2015
    In this book, you’ll learn how many of the most fundamental data science tools and algorithms work by implementing them from scratch. If you have an aptitude for mathematics and some programming skills, author Joel Grus will help you get comfortable with the math and statistics at the core of data science, and with hacking skills you need to get started as a data scientist. Today’s messy glut of data holds answers to questions no one’s even thought to ask. This book provides you with the know-how to dig those answers out. Get a crash course in Python Learn the basics of linear algebra, statistics, and probability—and understand how and when they're used in data science Collect, explore, clean, munge, and manipulate data Dive into the fundamentals of machine learning Implement models such as k-nearest Neighbors, Naive Bayes, linear and logistic regression, decision trees, neural networks, and clustering Explore recommender systems, natural language processing, network analysis, MapReduce, and databases

The Elements of Statistical Learning: Data Mining, Inference, and Prediction


Trevor Hastie - 2001
    With it has come vast amounts of data in a variety of fields such as medicine, biology, finance, and marketing. The challenge of understanding these data has led to the development of new tools in the field of statistics, and spawned new areas such as data mining, machine learning, and bioinformatics. Many of these tools have common underpinnings but are often expressed with different terminology. This book describes the important ideas in these areas in a common conceptual framework. While the approach is statistical, the emphasis is on concepts rather than mathematics. Many examples are given, with a liberal use of color graphics. It should be a valuable resource for statisticians and anyone interested in data mining in science or industry. The book's coverage is broad, from supervised learning (prediction) to unsupervised learning. The many topics include neural networks, support vector machines, classification trees and boosting—the first comprehensive treatment of this topic in any book. Trevor Hastie, Robert Tibshirani, and Jerome Friedman are professors of statistics at Stanford University. They are prominent researchers in this area: Hastie and Tibshirani developed generalized additive models and wrote a popular book of that title. Hastie wrote much of the statistical modeling software in S-PLUS and invented principal curves and surfaces. Tibshirani proposed the Lasso and is co-author of the very successful An Introduction to the Bootstrap. Friedman is the co-inventor of many data-mining tools including CART, MARS, and projection pursuit.

Python for Data Analysis


Wes McKinney - 2011
    It is also a practical, modern introduction to scientific computing in Python, tailored for data-intensive applications. This is a book about the parts of the Python language and libraries you'll need to effectively solve a broad set of data analysis problems. This book is not an exposition on analytical methods using Python as the implementation language.Written by Wes McKinney, the main author of the pandas library, this hands-on book is packed with practical cases studies. It's ideal for analysts new to Python and for Python programmers new to scientific computing.Use the IPython interactive shell as your primary development environmentLearn basic and advanced NumPy (Numerical Python) featuresGet started with data analysis tools in the pandas libraryUse high-performance tools to load, clean, transform, merge, and reshape dataCreate scatter plots and static or interactive visualizations with matplotlibApply the pandas groupby facility to slice, dice, and summarize datasetsMeasure data by points in time, whether it's specific instances, fixed periods, or intervalsLearn how to solve problems in web analytics, social sciences, finance, and economics, through detailed examples

Computer Science Illuminated


Nell B. Dale - 2002
    Written By Two Of Today'S Most Respected Computer Science Educators, Nell Dale And John Lewis, The Text Provides A Broad Overview Of The Many Aspects Of The Discipline From A Generic View Point. Separate Program Language Chapters Are Available As Bundle Items For Those Instructors Who Would Like To Explore A Particular Programming Language With Their Students. The Many Layers Of Computing Are Thoroughly Explained Beginning With The Information Layer, Working Through The Hardware, Programming, Operating Systems, Application, And Communication Layers, And Ending With A Discussion On The Limitations Of Computing. Perfect For Introductory Computing And Computer Science Courses, Computer Science Illuminated, Third Edition's Thorough Presentation Of Computing Systems Provides Computer Science Majors With A Solid Foundation For Further Study, And Offers Non-Majors A Comprehensive And Complete Introduction To Computing.

Machine Learning: A Probabilistic Perspective


Kevin P. Murphy - 2012
    Machine learning provides these, developing methods that can automatically detect patterns in data and then use the uncovered patterns to predict future data. This textbook offers a comprehensive and self-contained introduction to the field of machine learning, based on a unified, probabilistic approach.The coverage combines breadth and depth, offering necessary background material on such topics as probability, optimization, and linear algebra as well as discussion of recent developments in the field, including conditional random fields, L1 regularization, and deep learning. The book is written in an informal, accessible style, complete with pseudo-code for the most important algorithms. All topics are copiously illustrated with color images and worked examples drawn from such application domains as biology, text processing, computer vision, and robotics. Rather than providing a cookbook of different heuristic methods, the book stresses a principled model-based approach, often using the language of graphical models to specify models in a concise and intuitive way. Almost all the models described have been implemented in a MATLAB software package—PMTK (probabilistic modeling toolkit)—that is freely available online. The book is suitable for upper-level undergraduates with an introductory-level college math background and beginning graduate students.

An Introduction to Statistical Learning: With Applications in R


Gareth James - 2013
    This book presents some of the most important modeling and prediction techniques, along with relevant applications. Topics include linear regression, classification, resampling methods, shrinkage approaches, tree- based methods, support vector machines, clustering, and more. Color graphics and real-world examples are used to illustrate the methods presented. Since the goal of this textbook is to facilitate the use of these statistical learning techniques by practitioners in science, industry, and other fields, each chapter contains a tutorial on implementing the analyses and methods presented in R, an extremely popular open source statistical software platform. Two of the authors co-wrote The Elements of Statistical Learning (Hastie, Tibshirani and Friedman, 2nd edition 2009), a popular reference book for statistics and machine learning researchers. An Introduction to Statistical Learning covers many of the same topics, but at a level accessible to a much broader audience. This book is targeted at statisticians and non-statisticians alike who wish to use cutting-edge statistical learning techniques to analyze their data. The text assumes only a previous course in linear regression and no knowledge of matrix algebra.

Statistics Done Wrong: The Woefully Complete Guide


Alex Reinhart - 2013
    Politicians and marketers present shoddy evidence for dubious claims all the time. But smart people make mistakes too, and when it comes to statistics, plenty of otherwise great scientists--yes, even those published in peer-reviewed journals--are doing statistics wrong."Statistics Done Wrong" comes to the rescue with cautionary tales of all-too-common statistical fallacies. It'll help you see where and why researchers often go wrong and teach you the best practices for avoiding their mistakes.In this book, you'll learn: - Why "statistically significant" doesn't necessarily imply practical significance- Ideas behind hypothesis testing and regression analysis, and common misinterpretations of those ideas- How and how not to ask questions, design experiments, and work with data- Why many studies have too little data to detect what they're looking for-and, surprisingly, why this means published results are often overestimates- Why false positives are much more common than "significant at the 5% level" would suggestBy walking through colorful examples of statistics gone awry, the book offers approachable lessons on proper methodology, and each chapter ends with pro tips for practicing scientists and statisticians. No matter what your level of experience, "Statistics Done Wrong" will teach you how to be a better analyst, data scientist, or researcher.

Hands-On Machine Learning with Scikit-Learn and TensorFlow


Aurélien Géron - 2017
    Now that machine learning is thriving, even programmers who know close to nothing about this technology can use simple, efficient tools to implement programs capable of learning from data. This practical book shows you how.By using concrete examples, minimal theory, and two production-ready Python frameworks—Scikit-Learn and TensorFlow—author Aurélien Géron helps you gain an intuitive understanding of the concepts and tools for building intelligent systems. You’ll learn how to use a range of techniques, starting with simple Linear Regression and progressing to Deep Neural Networks. If you have some programming experience and you’re ready to code a machine learning project, this guide is for you.This hands-on book shows you how to use:Scikit-Learn, an accessible framework that implements many algorithms efficiently and serves as a great machine learning entry pointTensorFlow, a more complex library for distributed numerical computation, ideal for training and running very large neural networksPractical code examples that you can apply without learning excessive machine learning theory or algorithm details

How to Solve It: A New Aspect of Mathematical Method


George Pólya - 1944
    Polya, How to Solve It will show anyone in any field how to think straight. In lucid and appealing prose, Polya reveals how the mathematical method of demonstrating a proof or finding an unknown can be of help in attacking any problem that can be reasoned out--from building a bridge to winning a game of anagrams. Generations of readers have relished Polya's deft--indeed, brilliant--instructions on stripping away irrelevancies and going straight to the heart of the problem.

Python Tricks: A Buffet of Awesome Python Features


Dan Bader - 2017
    Discover the “hidden gold” in Python’s standard library and start writing clean and Pythonic code today. Who Should Read This Book: If you’re wondering which lesser known parts in Python you should know about, you’ll get a roadmap with this book. Discover cool (yet practical!) Python tricks and blow your coworkers’ minds in your next code review. If you’ve got experience with legacy versions of Python, the book will get you up to speed with modern patterns and features introduced in Python 3 and backported to Python 2. If you’ve worked with other programming languages and you want to get up to speed with Python, you’ll pick up the idioms and practical tips you need to become a confident and effective Pythonista. If you want to make Python your own and learn how to write clean and Pythonic code, you’ll discover best practices and little-known tricks to round out your knowledge. What Python Developers Say About The Book: "I kept thinking that I wished I had access to a book like this when I started learning Python many years ago." — Mariatta Wijaya, Python Core Developer"This book makes you write better Python code!" — Bob Belderbos, Software Developer at Oracle"Far from being just a shallow collection of snippets, this book will leave the attentive reader with a deeper understanding of the inner workings of Python as well as an appreciation for its beauty." — Ben Felder, Pythonista"It's like having a seasoned tutor explaining, well, tricks!" — Daniel Meyer, Sr. Desktop Administrator at Tesla Inc.

Python Machine Learning


Sebastian Raschka - 2015
    We are living in an age where data comes in abundance, and thanks to the self-learning algorithms from the field of machine learning, we can turn this data into knowledge. Automated speech recognition on our smart phones, web search engines, e-mail spam filters, the recommendation systems of our favorite movie streaming services – machine learning makes it all possible.Thanks to the many powerful open-source libraries that have been developed in recent years, machine learning is now right at our fingertips. Python provides the perfect environment to build machine learning systems productively.This book will teach you the fundamentals of machine learning and how to utilize these in real-world applications using Python. Step-by-step, you will expand your skill set with the best practices for transforming raw data into useful information, developing learning algorithms efficiently, and evaluating results.You will discover the different problem categories that machine learning can solve and explore how to classify objects, predict continuous outcomes with regression analysis, and find hidden structures in data via clustering. You will build your own machine learning system for sentiment analysis and finally, learn how to embed your model into a web app to share with the world

OS X 10.10 Yosemite: The Ars Technica Review


John Siracusa - 2014
    Siracusa's overview, wrap-up, and critique of everything new in OS X 10.10 Yosemite.

Deep Learning with Python


François Chollet - 2017
    It is the technology behind photo tagging systems at Facebook and Google, self-driving cars, speech recognition systems on your smartphone, and much more.In particular, Deep learning excels at solving machine perception problems: understanding the content of image data, video data, or sound data. Here's a simple example: say you have a large collection of images, and that you want tags associated with each image, for example, "dog," "cat," etc. Deep learning can allow you to create a system that understands how to map such tags to images, learning only from examples. This system can then be applied to new images, automating the task of photo tagging. A deep learning model only has to be fed examples of a task to start generating useful results on new data.

Python Crash Course: A Hands-On, Project-Based Introduction to Programming


Eric Matthes - 2015
    You'll also learn how to make your programs interactive and how to test your code safely before adding it to a project. In the second half of the book, you'll put your new knowledge into practice with three substantial projects: a Space Invaders-inspired arcade game, data visualizations with Python's super-handy libraries, and a simple web app you can deploy online.As you work through Python Crash Course, you'll learn how to: Use powerful Python libraries and tools, including matplotlib, NumPy, and PygalMake 2D games that respond to keypresses and mouse clicks, and that grow more difficult as the game progressesWork with data to generate interactive visualizationsCreate and customize simple web apps and deploy them safely onlineDeal with mistakes and errors so you can solve your own programming problemsIf you've been thinking seriously about digging into programming, Python Crash Course will get you up to speed and have you writing real programs fast. Why wait any longer? Start your engines and code!

Introduction to Machine Learning with Python: A Guide for Data Scientists


Andreas C. Müller - 2015
    If you use Python, even as a beginner, this book will teach you practical ways to build your own machine learning solutions. With all the data available today, machine learning applications are limited only by your imagination.You'll learn the steps necessary to create a successful machine-learning application with Python and the scikit-learn library. Authors Andreas Muller and Sarah Guido focus on the practical aspects of using machine learning algorithms, rather than the math behind them. Familiarity with the NumPy and matplotlib libraries will help you get even more from this book.With this book, you'll learn:Fundamental concepts and applications of machine learningAdvantages and shortcomings of widely used machine learning algorithmsHow to represent data processed by machine learning, including which data aspects to focus onAdvanced methods for model evaluation and parameter tuningThe concept of pipelines for chaining models and encapsulating your workflowMethods for working with text data, including text-specific processing techniquesSuggestions for improving your machine learning and data science skills