Book picks similar to
Pandas Cookbook by Ted Petrou


data-science
python
programming
mathematics

R Programming for Data Science


Roger D. Peng - 2015
    

Structure and Interpretation of Computer Programs


Harold Abelson - 1984
    This long-awaited revision contains changes throughout the text. There are new implementations of most of the major programming systems in the book, including the interpreters and compilers, and the authors have incorporated many small changes that reflect their experience teaching the course at MIT since the first edition was published. A new theme has been introduced that emphasizes the central role played by different approaches to dealing with time in computational models: objects with state, concurrent programming, functional programming and lazy evaluation, and nondeterministic programming. There are new example sections on higher-order procedures in graphics and on applications of stream processing in numerical programming, and many new exercises. In addition, all the programs have been reworked to run in any Scheme implementation that adheres to the IEEE standard.

Data Analysis with Open Source Tools: A Hands-On Guide for Programmers and Data Scientists


Philipp K. Janert - 2010
    With this insightful book, intermediate to experienced programmers interested in data analysis will learn techniques for working with data in a business environment. You'll learn how to look at data to discover what it contains, how to capture those ideas in conceptual models, and then feed your understanding back into the organization through business plans, metrics dashboards, and other applications.Along the way, you'll experiment with concepts through hands-on workshops at the end of each chapter. Above all, you'll learn how to think about the results you want to achieve -- rather than rely on tools to think for you.Use graphics to describe data with one, two, or dozens of variablesDevelop conceptual models using back-of-the-envelope calculations, as well asscaling and probability argumentsMine data with computationally intensive methods such as simulation and clusteringMake your conclusions understandable through reports, dashboards, and other metrics programsUnderstand financial calculations, including the time-value of moneyUse dimensionality reduction techniques or predictive analytics to conquer challenging data analysis situationsBecome familiar with different open source programming environments for data analysisFinally, a concise reference for understanding how to conquer piles of data.--Austin King, Senior Web Developer, MozillaAn indispensable text for aspiring data scientists.--Michael E. Driscoll, CEO/Founder, Dataspora

OS X 10.10 Yosemite: The Ars Technica Review


John Siracusa - 2014
    Siracusa's overview, wrap-up, and critique of everything new in OS X 10.10 Yosemite.

Service-Oriented Design with Ruby and Rails


Paul Dix - 2010
    Today, Rails developers and architects need better ways to interface with legacy systems, move into the cloud, and scale to handle higher volumes and greater complexity. In Service-Oriented Design with Ruby and Rails Paul Dix introduces a powerful, services-based design approach geared toward overcoming all these challenges. Using Dix's techniques, readers can leverage the full benefits of both Ruby and Rails, while overcoming the difficulties of working with larger codebases and teams. Dix demonstrates how to integrate multiple components within an enterprise application stack; create services that can easily grow and connect; and design systems that are easier to maintain and upgrade. Key concepts are explained with detailed Ruby code built using open source libraries such as ActiveRecord, Sinatra, Nokogiri, and Typhoeus. The book concludes with coverage of security, scaling, messaging, and interfacing with third-party services. Service-Oriented Design with Ruby and Rails will help you Build highly scalable, Ruby-based service architectures that operate smoothly in the cloud or with legacy systems Scale Rails systems to handle more requests, larger development teams, and more complex code bases Master new best practices for designing and creating services in Ruby Use Ruby to glue together services written in any language Use Ruby libraries to build and consume RESTful Web services Use Ruby JSON parsers to quickly represent resources from HTTP services Write lightweight, well-designed API wrappers around internal or external services Discover powerful non-Rails frameworks that simplify Ruby service implementation Implement standards-based enterprise messaging with Advanced Message Queuing Protocol (AMQP) Optimize performance with load balancing and caching Provide for security and authentication

Introduction to Algorithms


Thomas H. Cormen - 1989
    Each chapter is relatively self-contained and can be used as a unit of study. The algorithms are described in English and in a pseudocode designed to be readable by anyone who has done a little programming. The explanations have been kept elementary without sacrificing depth of coverage or mathematical rigor.

Python Programming: An Introduction to Computer Science


John Zelle - 2003
    It takes a fairly traditional approach, emphasizing problem solving, design, and programming as the core skills of computer science. However, these ideas are illustrated using a non-traditional language, namely Python. Although I use Python as the language, teaching Python is not the main point of this book. Rather, Python is used to illustrate fundamental principles of design and programming that apply in any language or computing environment. In some places, I have purposely avoided certain Python features and idioms that are not generally found in other languages. There are already many good books about Python on the market; this book is intended as an introduction to computing. Features include the following: *Extensive use of computer graphics. *Interesting examples. *Readable prose. *Flexible spiral coverage. *Just-in-time object coverage. *Extensive end-of-chapter problems.

Artificial Intelligence: A Modern Approach


Stuart Russell - 1994
    The long-anticipated revision of this best-selling text offers the most comprehensive, up-to-date introduction to the theory and practice of artificial intelligence. *NEW-Nontechnical learning material-Accompanies each part of the book. *NEW-The Internet as a sample application for intelligent systems-Added in several places including logical agents, planning, and natural language. *NEW-Increased coverage of material - Includes expanded coverage of: default reasoning and truth maintenance systems, including multi-agent/distributed AI and game theory; probabilistic approaches to learning including EM; more detailed descriptions of probabilistic inference algorithms. *NEW-Updated and expanded exercises-75% of the exercises are revised, with 100 new exercises. *NEW-On-line Java software. *Makes it easy for students to do projects on the web using intelligent agents. *A unified, agent-based approach to AI-Organizes the material around the task of building intelligent agents. *Comprehensive, up-to-date coverage-Includes a unified view of the field organized around the rational decision making pa

Machine Learning: A Probabilistic Perspective


Kevin P. Murphy - 2012
    Machine learning provides these, developing methods that can automatically detect patterns in data and then use the uncovered patterns to predict future data. This textbook offers a comprehensive and self-contained introduction to the field of machine learning, based on a unified, probabilistic approach.The coverage combines breadth and depth, offering necessary background material on such topics as probability, optimization, and linear algebra as well as discussion of recent developments in the field, including conditional random fields, L1 regularization, and deep learning. The book is written in an informal, accessible style, complete with pseudo-code for the most important algorithms. All topics are copiously illustrated with color images and worked examples drawn from such application domains as biology, text processing, computer vision, and robotics. Rather than providing a cookbook of different heuristic methods, the book stresses a principled model-based approach, often using the language of graphical models to specify models in a concise and intuitive way. Almost all the models described have been implemented in a MATLAB software package—PMTK (probabilistic modeling toolkit)—that is freely available online. The book is suitable for upper-level undergraduates with an introductory-level college math background and beginning graduate students.

Amazon Elastic Compute Cloud (EC2) User Guide


Amazon Web Services - 2012
    This is official Amazon Web Services (AWS) documentation for Amazon Compute Cloud (Amazon EC2).This guide explains the infrastructure provided by the Amazon EC2 web service, and steps you through how to configure and manage your virtual servers using the AWS Management Console (an easy-to-use graphical interface), the Amazon EC2 API, or web tools and utilities.Amazon EC2 provides resizable computing capacity—literally, server instances in Amazon's data centers—that you use to build and host your software systems.

Doing Data Science


Cathy O'Neil - 2013
    But how can you get started working in a wide-ranging, interdisciplinary field that’s so clouded in hype? This insightful book, based on Columbia University’s Introduction to Data Science class, tells you what you need to know.In many of these chapter-long lectures, data scientists from companies such as Google, Microsoft, and eBay share new algorithms, methods, and models by presenting case studies and the code they use. If you’re familiar with linear algebra, probability, and statistics, and have programming experience, this book is an ideal introduction to data science.Topics include:Statistical inference, exploratory data analysis, and the data science processAlgorithmsSpam filters, Naive Bayes, and data wranglingLogistic regressionFinancial modelingRecommendation engines and causalityData visualizationSocial networks and data journalismData engineering, MapReduce, Pregel, and HadoopDoing Data Science is collaboration between course instructor Rachel Schutt, Senior VP of Data Science at News Corp, and data science consultant Cathy O’Neil, a senior data scientist at Johnson Research Labs, who attended and blogged about the course.

High Performance Spark: Best Practices for Scaling and Optimizing Apache Spark


Holden Karau - 2017
    But if you haven't seen the performance improvements you expected, or still don't feel confident enough to use Spark in production, this practical book is for you. Authors Holden Karau and Rachel Warren demonstrate performance optimizations to help your Spark queries run faster and handle larger data sizes, while using fewer resources.Ideal for software engineers, data engineers, developers, and system administrators working with large-scale data applications, this book describes techniques that can reduce data infrastructure costs and developer hours. Not only will you gain a more comprehensive understanding of Spark, you'll also learn how to make it sing.With this book, you'll explore:How Spark SQL's new interfaces improve performance over SQL's RDD data structureThe choice between data joins in Core Spark and Spark SQLTechniques for getting the most out of standard RDD transformationsHow to work around performance issues in Spark's key/value pair paradigmWriting high-performance Spark code without Scala or the JVMHow to test for functionality and performance when applying suggested improvementsUsing Spark MLlib and Spark ML machine learning librariesSpark's Streaming components and external community packages

Learn Python 3 the Hard Way: A Very Simple Introduction to the Terrifyingly Beautiful World of Computers and Code (Zed Shaw's Hard Way Series)


Zed A. Shaw - 2017
    

Introduction to Machine Learning with Python: A Guide for Data Scientists


Andreas C. Müller - 2015
    If you use Python, even as a beginner, this book will teach you practical ways to build your own machine learning solutions. With all the data available today, machine learning applications are limited only by your imagination.You'll learn the steps necessary to create a successful machine-learning application with Python and the scikit-learn library. Authors Andreas Muller and Sarah Guido focus on the practical aspects of using machine learning algorithms, rather than the math behind them. Familiarity with the NumPy and matplotlib libraries will help you get even more from this book.With this book, you'll learn:Fundamental concepts and applications of machine learningAdvantages and shortcomings of widely used machine learning algorithmsHow to represent data processed by machine learning, including which data aspects to focus onAdvanced methods for model evaluation and parameter tuningThe concept of pipelines for chaining models and encapsulating your workflowMethods for working with text data, including text-specific processing techniquesSuggestions for improving your machine learning and data science skills

Doing Math with Python


Amit Saha - 2015
    Python is easy to learn, and it's perfect for exploring topics like statistics, geometry, probability, and calculus. You’ll learn to write programs to find derivatives, solve equations graphically, manipulate algebraic expressions, even examine projectile motion.Rather than crank through tedious calculations by hand, you'll learn how to use Python functions and modules to handle the number crunching while you focus on the principles behind the math. Exercises throughout teach fundamental programming concepts, like using functions, handling user input, and reading and manipulating data. As you learn to think computationally, you'll discover new ways to explore and think about math, and gain valuable programming skills that you can use to continue your study of math and computer science.If you’re interested in math but have yet to dip into programming, you’ll find that Python makes it easy to go deeper into the subject—let Python handle the tedious work while you spend more time on the math.