Book picks similar to
Pandas Cookbook by Ted Petrou


data-science
python
programming
long-list

Seven Databases in Seven Weeks: A Guide to Modern Databases and the NoSQL Movement


Eric Redmond - 2012
    As a modern application developer you need to understand the emerging field of data management, both RDBMS and NoSQL. Seven Databases in Seven Weeks takes you on a tour of some of the hottest open source databases today. In the tradition of Bruce A. Tate's Seven Languages in Seven Weeks, this book goes beyond your basic tutorial to explore the essential concepts at the core each technology. Redis, Neo4J, CouchDB, MongoDB, HBase, Riak and Postgres. With each database, you'll tackle a real-world data problem that highlights the concepts and features that make it shine. You'll explore the five data models employed by these databases-relational, key/value, columnar, document and graph-and which kinds of problems are best suited to each. You'll learn how MongoDB and CouchDB are strikingly different, and discover the Dynamo heritage at the heart of Riak. Make your applications faster with Redis and more connected with Neo4J. Use MapReduce to solve Big Data problems. Build clusters of servers using scalable services like Amazon's Elastic Compute Cloud (EC2). Discover the CAP theorem and its implications for your distributed data. Understand the tradeoffs between consistency and availability, and when you can use them to your advantage. Use multiple databases in concert to create a platform that's more than the sum of its parts, or find one that meets all your needs at once.Seven Databases in Seven Weeks will take you on a deep dive into each of the databases, their strengths and weaknesses, and how to choose the ones that fit your needs.What You Need: To get the most of of this book you'll have to follow along, and that means you'll need a *nix shell (Mac OSX or Linux preferred, Windows users will need Cygwin), and Java 6 (or greater) and Ruby 1.8.7 (or greater). Each chapter will list the downloads required for that database.

Common LISP: A Gentle Introduction to Symbolic Computation


David S. Touretzky - 1989
    A LISP "toolkit" in each chapter explains how to use Common LISP programming and debugging tools such as DESCRIBE, INSPECT, TRACE and STEP.

Bayesian Methods for Hackers: Probabilistic Programming and Bayesian Inference


Cameron Davidson-Pilon - 2014
    However, most discussions of Bayesian inference rely on intensely complex mathematical analyses and artificial examples, making it inaccessible to anyone without a strong mathematical background. Now, though, Cameron Davidson-Pilon introduces Bayesian inference from a computational perspective, bridging theory to practice-freeing you to get results using computing power. Bayesian Methods for Hackers illuminates Bayesian inference through probabilistic programming with the powerful PyMC language and the closely related Python tools NumPy, SciPy, and Matplotlib. Using this approach, you can reach effective solutions in small increments, without extensive mathematical intervention. Davidson-Pilon begins by introducing the concepts underlying Bayesian inference, comparing it with other techniques and guiding you through building and training your first Bayesian model. Next, he introduces PyMC through a series of detailed examples and intuitive explanations that have been refined after extensive user feedback. You'll learn how to use the Markov Chain Monte Carlo algorithm, choose appropriate sample sizes and priors, work with loss functions, and apply Bayesian inference in domains ranging from finance to marketing. Once you've mastered these techniques, you'll constantly turn to this guide for the working PyMC code you need to jumpstart future projects. Coverage includes - Learning the Bayesian "state of mind" and its practical implications - Understanding how computers perform Bayesian inference - Using the PyMC Python library to program Bayesian analyses - Building and debugging models with PyMC - Testing your model's "goodness of fit" - Opening the "black box" of the Markov Chain Monte Carlo algorithm to see how and why it works - Leveraging the power of the "Law of Large Numbers" - Mastering key concepts, such as clustering, convergence, autocorrelation, and thinning - Using loss functions to measure an estimate's weaknesses based on your goals and desired outcomes - Selecting appropriate priors and understanding how their influence changes with dataset size - Overcoming the "exploration versus exploitation" dilemma: deciding when "pretty good" is good enough - Using Bayesian inference to improve A/B testing - Solving data science problems when only small amounts of data are available Cameron Davidson-Pilon has worked in many areas of applied mathematics, from the evolutionary dynamics of genes and diseases to stochastic modeling of financial prices. His contributions to the open source community include lifelines, an implementation of survival analysis in Python. Educated at the University of Waterloo and at the Independent University of Moscow, he currently works with the online commerce leader Shopify.

Becoming a Better Programmer


Pete Goodliffe - 2014
    Code Craft author Pete Goodliffe presents a collection of useful techniques and approaches to the art and craft of programming that will help boost your career and your well-being.Goodliffe presents sound advice that he's learned in 15 years of professional programming. The book's standalone chapters span the range of a software developer's life--dealing with code, learning the trade, and improving performance--with no language or industry bias. Whether you're a seasoned developer, a neophyte professional, or a hobbyist, you'll find valuable tips in five independent categories:Code-level techniques for crafting lines of code, testing, debugging, and coping with complexityPractices, approaches, and attitudes: keep it simple, collaborate well, reuse, and create malleable codeTactics for learning effectively, behaving ethically, finding challenges, and avoiding stagnationPractical ways to complete things: use the right tools, know what "done" looks like, and seek help from colleaguesHabits for working well with others, and pursuing development as a social activity

Big Data: A Revolution That Will Transform How We Live, Work, and Think


Viktor Mayer-Schönberger - 2013
    “Big data” refers to our burgeoning ability to crunch vast collections of information, analyze it instantly, and draw sometimes profoundly surprising conclusions from it. This emerging science can translate myriad phenomena—from the price of airline tickets to the text of millions of books—into searchable form, and uses our increasing computing power to unearth epiphanies that we never could have seen before. A revolution on par with the Internet or perhaps even the printing press, big data will change the way we think about business, health, politics, education, and innovation in the years to come. It also poses fresh threats, from the inevitable end of privacy as we know it to the prospect of being penalized for things we haven’t even done yet, based on big data’s ability to predict our future behavior.In this brilliantly clear, often surprising work, two leading experts explain what big data is, how it will change our lives, and what we can do to protect ourselves from its hazards. Big Data is the first big book about the next big thing.www.big-data-book.com

The Haskell Road to Logic, Maths and Programming


Kees Doets - 2004
    Haskell emerged in the last decade as a standard for lazy functional programming, a programming style where arguments are evaluated only when the value is actually needed. Haskell is a marvellous demonstration tool for logic and maths because its functional character allows implementations to remain very close to the concepts that get implemented, while the laziness permits smooth handling of infinite data structures.This book does not assume the reader to have previous experience with either programming or construction of formal proofs, but acquaintance with mathematical notation, at the level of secondary school mathematics is presumed. Everything one needs to know about mathematical reasoning or programming is explained as we go along. After proper digestion of the material in this book the reader will be able to write interesting programs, reason about their correctness, and document them in a clear fashion. The reader will also have learned how to set up mathematical proofs in a structured way, and how to read and digest mathematical proofs written by others.

Introducing Elixir: Getting Started in Functional Programming


Simon St.Laurent - 2013
    If you're new to Elixir, its functional style can seem difficult, but with help from this hands-on introduction, you'll scale the learning curve and discover how enjoyable, powerful, and fun this language can be. Elixir combines the robust functional programming of Erlang with an approach that looks more like Ruby and reaches toward metaprogramming with powerful macro features.Authors Simon St. Laurent and J. David Eisenberg show you how to write simple Elixir programs by teaching you one skill at a time. You’ll learn about pattern matching, recursion, message passing, process-oriented programming, and establishing pathways for data rather than telling it where to go. By the end of your journey, you’ll understand why Elixir is ideal for concurrency and resilience.* Get comfortable with IEx, Elixir's command line interface* Become familiar with Elixir’s basic structures by working with numbers* Discover atoms, pattern matching, and guards: the foundations of your program structure* Delve into the heart of Elixir processing with recursion, strings, lists, and higher-order functions* Create processes, send messages among them, and apply pattern matching to incoming messages* Store and manipulate structured data with Erlang Term * Storage (ETS) and the Mnesia database* Build resilient applications with the Open Telecom Platform (OTP)* Define macros with Elixir's meta-programming tools.

HBase: The Definitive Guide


Lars George - 2011
    As the open source implementation of Google's BigTable architecture, HBase scales to billions of rows and millions of columns, while ensuring that write and read performance remain constant. Many IT executives are asking pointed questions about HBase. This book provides meaningful answers, whether you’re evaluating this non-relational database or planning to put it into practice right away. Discover how tight integration with Hadoop makes scalability with HBase easier Distribute large datasets across an inexpensive cluster of commodity servers Access HBase with native Java clients, or with gateway servers providing REST, Avro, or Thrift APIs Get details on HBase’s architecture, including the storage format, write-ahead log, background processes, and more Integrate HBase with Hadoop's MapReduce framework for massively parallelized data processing jobs Learn how to tune clusters, design schemas, copy tables, import bulk data, decommission nodes, and many other tasks

Web Scraping with Python: Collecting Data from the Modern Web


Ryan Mitchell - 2015
    With this practical guide, you’ll learn how to use Python scripts and web APIs to gather and process data from thousands—or even millions—of web pages at once. Ideal for programmers, security professionals, and web administrators familiar with Python, this book not only teaches basic web scraping mechanics, but also delves into more advanced topics, such as analyzing raw data or using scrapers for frontend website testing. Code samples are available to help you understand the concepts in practice. Learn how to parse complicated HTML pages Traverse multiple pages and sites Get a general overview of APIs and how they work Learn several methods for storing the data you scrape Download, read, and extract data from documents Use tools and techniques to clean badly formatted data Read and write natural languages Crawl through forms and logins Understand how to scrape JavaScript Learn image processing and text recognition

Algorithms of the Intelligent Web


Haralambos Marmanis - 2009
    They use powerful techniques to process information intelligently and offer features based on patterns and relationships in data. Algorithms of the Intelligent Web shows readers how to use the same techniques employed by household names like Google Ad Sense, Netflix, and Amazon to transform raw data into actionable information.Algorithms of the Intelligent Web is an example-driven blueprint for creating applications that collect, analyze, and act on the massive quantities of data users leave in their wake as they use the web. Readers learn to build Netflix-style recommendation engines, and how to apply the same techniques to social-networking sites. See how click-trace analysis can result in smarter ad rotations. All the examples are designed both to be reused and to illustrate a general technique- an algorithm-that applies to a broad range of scenarios.As they work through the book's many examples, readers learn about recommendation systems, search and ranking, automatic grouping of similar objects, classification of objects, forecasting models, and autonomous agents. They also become familiar with a large number of open-source libraries and SDKs, and freely available APIs from the hottest sites on the internet, such as Facebook, Google, eBay, and Yahoo.Purchase of the print book comes with an offer of a free PDF, ePub, and Kindle eBook from Manning. Also available is all code from the book.

Programming Groovy


Venkat Subramaniam - 2008
    But recently, the industry has turned to dynamic languages for increased productivity and speed to market.Groovy is one of a new breed of dynamic languages that run on the Java platform. You can use these new languages on the JVM and intermix them with your existing Java code. You can leverage your Java investments while benefiting from advanced features including true Closures, Meta Programming, the ability to create internal DSLs, and a higher level of abstraction.If you're an experienced Java developer, Programming Groovy will help you learn the necessary fundamentals of programming in Groovy. You'll see how to use Groovy to do advanced programming including using Meta Programming, Builders, Unit Testing with Mock objects, processing XML, working with Databases and creating your own Domain-Specific Languages (DSLs).

Pro Git


Scott Chacon - 2009
    It took the open source world by storm since its inception in 2005, and is used by small development shops and giants like Google, Red Hat, and IBM, and of course many open source projects.A book by Git experts to turn you into a Git expert. Introduces the world of distributed version control Shows how to build a Git development workflow.

Bayesian Statistics the Fun Way: Understanding Statistics and Probability with Star Wars, Lego, and Rubber Ducks


Will Kurt - 2019
    But many people use data in ways they don't even understand, meaning they aren't getting the most from it. Bayesian Statistics the Fun Way will change that.This book will give you a complete understanding of Bayesian statistics through simple explanations and un-boring examples. Find out the probability of UFOs landing in your garden, how likely Han Solo is to survive a flight through an asteroid shower, how to win an argument about conspiracy theories, and whether a burglary really was a burglary, to name a few examples.By using these off-the-beaten-track examples, the author actually makes learning statistics fun. And you'll learn real skills, like how to:- How to measure your own level of uncertainty in a conclusion or belief- Calculate Bayes theorem and understand what it's useful for- Find the posterior, likelihood, and prior to check the accuracy of your conclusions- Calculate distributions to see the range of your data- Compare hypotheses and draw reliable conclusions from themNext time you find yourself with a sheaf of survey results and no idea what to do with them, turn to Bayesian Statistics the Fun Way to get the most value from your data.

Data Analysis Using SQL and Excel


Gordon S. Linoff - 2007
    This book helps you use SQL and Excel to extract business information from relational databases and use that data to define business dimensions, store transactions about customers, produce results, and more. Each chapter explains when and why to perform a particular type of business analysis in order to obtain useful results, how to design and perform the analysis using SQL and Excel, and what the results should look like.

Mac OS X Snow Leopard: The Missing Manual


David Pogue - 2009
    Fortunately, David Pogue is back, with the humor and expertise that have made this the #1 bestselling Mac book for eight years straight. You get all the answers with jargon-free introductions to:Big-ticket changes. A 64-bit overhaul. Faster everything. A rewritten Finder. Microsoft Exchange compatibility. All-new QuickTime Player. If Apple wrote it, this book covers it.Snow Leopard Spots. This book demystifies the hundreds of smaller enhancements, too, in all 50 programs that come with the Mac: Safari, Mail, iChat, Preview, Time Machine.Shortcuts. This must be the tippiest, trickiest Mac book ever written. Undocumented surprises await on every page.Power usage. Security, networking, build-your-own Services, file sharing with Windows, even Mac OS X's Unix chassis-this one witty, expert guide makes it all crystal clear.