Doing Data Science


Cathy O'Neil - 2013
    But how can you get started working in a wide-ranging, interdisciplinary field that’s so clouded in hype? This insightful book, based on Columbia University’s Introduction to Data Science class, tells you what you need to know.In many of these chapter-long lectures, data scientists from companies such as Google, Microsoft, and eBay share new algorithms, methods, and models by presenting case studies and the code they use. If you’re familiar with linear algebra, probability, and statistics, and have programming experience, this book is an ideal introduction to data science.Topics include:Statistical inference, exploratory data analysis, and the data science processAlgorithmsSpam filters, Naive Bayes, and data wranglingLogistic regressionFinancial modelingRecommendation engines and causalityData visualizationSocial networks and data journalismData engineering, MapReduce, Pregel, and HadoopDoing Data Science is collaboration between course instructor Rachel Schutt, Senior VP of Data Science at News Corp, and data science consultant Cathy O’Neil, a senior data scientist at Johnson Research Labs, who attended and blogged about the course.

Cuda by Example: An Introduction to General-Purpose Gpu Programming


Jason Sanders - 2010
    " From the Foreword by Jack Dongarra, University of Tennessee and Oak Ridge National Laboratory CUDA is a computing architecture designed to facilitate the development of parallel programs. In conjunction with a comprehensive software platform, the CUDA Architecture enables programmers to draw on the immense power of graphics processing units (GPUs) when building high-performance applications. GPUs, of course, have long been available for demanding graphics and game applications. CUDA now brings this valuable resource to programmers working on applications in other domains, including science, engineering, and finance. No knowledge of graphics programming is required just the ability to program in a modestly extended version of C. " CUDA by Example, " written by two senior members of the CUDA software platform team, shows programmers how to employ this new technology. The authors introduce each area of CUDA development through working examples. After a concise introduction to the CUDA platform and architecture, as well as a quick-start guide to CUDA C, the book details the techniques and trade-offs associated with each key CUDA feature. You ll discover when to use each CUDA C extension and how to write CUDA software that delivers truly outstanding performance. Major topics covered includeParallel programmingThread cooperationConstant memory and eventsTexture memoryGraphics interoperabilityAtomicsStreamsCUDA C on multiple GPUsAdvanced atomicsAdditional CUDA resources All the CUDA software tools you ll need are freely available for download from NVIDIA.http: //developer.nvidia.com/object/cuda-by-e...

Python for Data Analysis


Wes McKinney - 2011
    It is also a practical, modern introduction to scientific computing in Python, tailored for data-intensive applications. This is a book about the parts of the Python language and libraries you'll need to effectively solve a broad set of data analysis problems. This book is not an exposition on analytical methods using Python as the implementation language.Written by Wes McKinney, the main author of the pandas library, this hands-on book is packed with practical cases studies. It's ideal for analysts new to Python and for Python programmers new to scientific computing.Use the IPython interactive shell as your primary development environmentLearn basic and advanced NumPy (Numerical Python) featuresGet started with data analysis tools in the pandas libraryUse high-performance tools to load, clean, transform, merge, and reshape dataCreate scatter plots and static or interactive visualizations with matplotlibApply the pandas groupby facility to slice, dice, and summarize datasetsMeasure data by points in time, whether it's specific instances, fixed periods, or intervalsLearn how to solve problems in web analytics, social sciences, finance, and economics, through detailed examples

All of Statistics: A Concise Course in Statistical Inference


Larry Wasserman - 2003
    But in spirit, the title is apt, as the book does cover a much broader range of topics than a typical introductory book on mathematical statistics. This book is for people who want to learn probability and statistics quickly. It is suitable for graduate or advanced undergraduate students in computer science, mathematics, statistics, and related disciplines. The book includes modern topics like nonparametric curve estimation, bootstrapping, and clas- sification, topics that are usually relegated to follow-up courses. The reader is presumed to know calculus and a little linear algebra. No previous knowledge of probability and statistics is required. Statistics, data mining, and machine learning are all concerned with collecting and analyzing data. For some time, statistics research was con- ducted in statistics departments while data mining and machine learning re- search was conducted in computer science departments. Statisticians thought that computer scientists were reinventing the wheel. Computer scientists thought that statistical theory didn't apply to their problems. Things are changing. Statisticians now recognize that computer scientists are making novel contributions while computer scientists now recognize the generality of statistical theory and methodology. Clever data mining algo- rithms are more scalable than statisticians ever thought possible. Formal sta- tistical theory is more pervasive than computer scientists had realized.

MATLAB: An Introduction with Applications


Amos Gilat - 2003
    The first chapter describes basic features of the program and shows how to use it in simple arithmetic operations with scalars. The next two chapters focus on the topic of arrays (the basis of MATLAB), while the remaining text covers a wide range of other applications. Computer screens, tutorials, samples, and homework questions in math, science, and engineering, provide the student with the practical hands-on experience needed for total proficiency.

Digital Gold: Bitcoin and the Inside Story of the Misfits and Millionaires Trying to Reinvent Money


Nathaniel Popper - 2015
    Believers from Beijing to Buenos Aires see the potential for a financial system free from banks and governments. More than just a tech industry fad, Bitcoin has threatened to decentralize some of society’s most basic institutions.An unusual tale of group invention, Digital Gold charts the rise of the Bitcoin technology through the eyes of the movement’s colorful central characters, including an Argentinian millionaire, a Chinese entrepreneur, Tyler and Cameron Winklevoss, and Bitcoin’s elusive creator, Satoshi Nakamoto. Already, Bitcoin has led to untold riches for some, and prison terms for others.

Numerical Optimization


Jorge Nocedal - 2000
    One can trace its roots to the Calculus of Variations and the work of Euler and Lagrange. This natural and reasonable approach to mathematical programming covers numerical methods for finite-dimensional optimization problems. It begins with very simple ideas progressing through more complicated concepts, concentrating on methods for both unconstrained and constrained optimization.

The Master Algorithm: How the Quest for the Ultimate Learning Machine Will Remake Our World


Pedro Domingos - 2015
    In The Master Algorithm, Pedro Domingos lifts the veil to give us a peek inside the learning machines that power Google, Amazon, and your smartphone. He assembles a blueprint for the future universal learner--the Master Algorithm--and discusses what it will mean for business, science, and society. If data-ism is today's philosophy, this book is its bible.

Data Analysis with Open Source Tools: A Hands-On Guide for Programmers and Data Scientists


Philipp K. Janert - 2010
    With this insightful book, intermediate to experienced programmers interested in data analysis will learn techniques for working with data in a business environment. You'll learn how to look at data to discover what it contains, how to capture those ideas in conceptual models, and then feed your understanding back into the organization through business plans, metrics dashboards, and other applications.Along the way, you'll experiment with concepts through hands-on workshops at the end of each chapter. Above all, you'll learn how to think about the results you want to achieve -- rather than rely on tools to think for you.Use graphics to describe data with one, two, or dozens of variablesDevelop conceptual models using back-of-the-envelope calculations, as well asscaling and probability argumentsMine data with computationally intensive methods such as simulation and clusteringMake your conclusions understandable through reports, dashboards, and other metrics programsUnderstand financial calculations, including the time-value of moneyUse dimensionality reduction techniques or predictive analytics to conquer challenging data analysis situationsBecome familiar with different open source programming environments for data analysisFinally, a concise reference for understanding how to conquer piles of data.--Austin King, Senior Web Developer, MozillaAn indispensable text for aspiring data scientists.--Michael E. Driscoll, CEO/Founder, Dataspora

Advanced Engineering Mathematics


Dennis G. Zill - 1992
    A Key Strength Of This Text Is Zill'S Emphasis On Differential Equations As Mathematical Models, Discussing The Constructs And Pitfalls Of Each. The Third Edition Is Comprehensive, Yet Flexible, To Meet The Unique Needs Of Various Course Offerings Ranging From Ordinary Differential Equations To Vector Calculus. Numerous New Projects Contributed By Esteemed Mathematicians Have Been Added. Key Features O The Entire Text Has Been Modernized To Prepare Engineers And Scientists With The Mathematical Skills Required To Meet Current Technological Challenges. O The New Larger Trim Size And 2-Color Design Make The Text A Pleasure To Read And Learn From. O Numerous NEW Engineering And Science Projects Contributed By Top Mathematicians Have Been Added, And Are Tied To Key Mathematical Topics In The Text. O Divided Into Five Major Parts, The Text'S Flexibility Allows Instructors To Customize The Text To Fit Their Needs. The First Eight Chapters Are Ideal For A Complete Short Course In Ordinary Differential Equations. O The Gram-Schmidt Orthogonalization Process Has Been Added In Chapter 7 And Is Used In Subsequent Chapters. O All Figures Now Have Explanatory Captions. Supplements O Complete Instructor'S Solutions: Includes All Solutions To The Exercises Found In The Text. Powerpoint Lecture Slides And Additional Instructor'S Resources Are Available Online. O Student Solutions To Accompany Advanced Engineering Mathematics, Third Edition: This Student Supplement Contains The Answers To Every Third Problem In The Textbook, Allowing Students To Assess Their Progress And Review Key Ideas And Concepts Discussed Throughout The Text. ISBN: 0-7637-4095-0

Raspberry Pi Cookbook


Simon Monk - 2013
    In this cookbook, prolific hacker and author Simon Monk provides more than 200 practical recipes for running this tiny low-cost computer with Linux, programming it with Python, and hooking up sensors, motors, and other hardware—including Arduino.You’ll also learn basic principles to help you use new technologies with Raspberry Pi as its ecosystem develops. Python and other code examples from the book are available on GitHub. This cookbook is ideal for programmers and hobbyists familiar with the Pi through resources such as Getting Started with Raspberry Pi (O’Reilly).Set up and manage your Raspberry PiConnect the Pi to a networkWork with its Linux-based operating systemUse the Pi’s ready-made softwareProgram Raspberry Pi with PythonControl hardware through the GPIO connectorUse Raspberry Pi to run different types of motorsWork with switches, keypads, and other digital inputsHook up sensors for taking various measurementsAttach different displays, such as an LED matrixCreate dynamic projects with Raspberry Pi and Arduino Make sure to check out 10 of the over 60 video recipes for this book at: http://razzpisampler.oreilly.com/ You can purchase all recipes at:

SSH, The Secure Shell: The Definitive Guide


Daniel J. Barrett - 2001
    It supports secure remote logins, secure file transfer between computers, and a unique "tunneling" capability that adds encryption to otherwise insecure network applications. Best of all, SSH is free, with feature-filled commercial versions available as well.SSH: The Secure Shell: The Definitive Guide covers the Secure Shell in detail for both system administrators and end users. It demystifies the SSH man pages and includes thorough coverage of:SSH1, SSH2, OpenSSH, and F-Secure SSH for Unix, plus Windows and Macintosh products: the basics, the internals, and complex applications.Configuring SSH servers and clients, both system-wide and per user, with recommended settings to maximize security.Advanced key management using agents, agent forwarding, and forced commands.Forwarding (tunneling) of TCP and X11 applications in depth, even in the presence of firewalls and network address translation (NAT).Undocumented behaviors of popular SSH implementations.Installing and maintaining SSH systems.Whether you're communicating on a small LAN or across the Internet, SSH can ship your data from "here" to "there" efficiently and securely. So throw away those insecure .rhosts and hosts.equiv files, move up to SSH, and make your network a safe place to live and work.

Computers and Intractability: A Guide to the Theory of NP-Completeness


Michael R. Garey - 1979
    Johnson. It was the first book exclusively on the theory of NP-completeness and computational intractability. The book features an appendix providing a thorough compendium of NP-complete problems (which was updated in later printings of the book). The book is now outdated in some respects as it does not cover more recent development such as the PCP theorem. It is nevertheless still in print and is regarded as a classic: in a 2006 study, the CiteSeer search engine listed the book as the most cited reference in computer science literature.

Learning From Data: A Short Course


Yaser S. Abu-Mostafa - 2012
    Its techniques are widely applied in engineering, science, finance, and commerce. This book is designed for a short course on machine learning. It is a short course, not a hurried course. From over a decade of teaching this material, we have distilled what we believe to be the core topics that every student of the subject should know. We chose the title `learning from data' that faithfully describes what the subject is about, and made it a point to cover the topics in a story-like fashion. Our hope is that the reader can learn all the fundamentals of the subject by reading the book cover to cover. ---- Learning from data has distinct theoretical and practical tracks. In this book, we balance the theoretical and the practical, the mathematical and the heuristic. Our criterion for inclusion is relevance. Theory that establishes the conceptual framework for learning is included, and so are heuristics that impact the performance of real learning systems. ---- Learning from data is a very dynamic field. Some of the hot techniques and theories at times become just fads, and others gain traction and become part of the field. What we have emphasized in this book are the necessary fundamentals that give any student of learning from data a solid foundation, and enable him or her to venture out and explore further techniques and theories, or perhaps to contribute their own. ---- The authors are professors at California Institute of Technology (Caltech), Rensselaer Polytechnic Institute (RPI), and National Taiwan University (NTU), where this book is the main text for their popular courses on machine learning. The authors also consult extensively with financial and commercial companies on machine learning applications, and have led winning teams in machine learning competitions.

Probabilistic Graphical Models: Principles and Techniques


Daphne Koller - 2009
    The framework of probabilistic graphical models, presented in this book, provides a general approach for this task. The approach is model-based, allowing interpretable models to be constructed and then manipulated by reasoning algorithms. These models can also be learned automatically from data, allowing the approach to be used in cases where manually constructing a model is difficult or even impossible. Because uncertainty is an inescapable aspect of most real-world applications, the book focuses on probabilistic models, which make the uncertainty explicit and provide models that are more faithful to reality.Probabilistic Graphical Models discusses a variety of models, spanning Bayesian networks, undirected Markov networks, discrete and continuous models, and extensions to deal with dynamical systems and relational data. For each class of models, the text describes the three fundamental cornerstones: representation, inference, and learning, presenting both basic concepts and advanced techniques. Finally, the book considers the use of the proposed framework for causal reasoning and decision making under uncertainty. The main text in each chapter provides the detailed technical development of the key ideas. Most chapters also include boxes with additional material: skill boxes, which describe techniques; case study boxes, which discuss empirical cases related to the approach described in the text, including applications in computer vision, robotics, natural language understanding, and computational biology; and concept boxes, which present significant concepts drawn from the material in the chapter. Instructors (and readers) can group chapters in various combinations, from core topics to more technically advanced material, to suit their particular needs.