Programming Collective Intelligence: Building Smart Web 2.0 Applications


Toby Segaran - 2002
    With the sophisticated algorithms in this book, you can write smart programs to access interesting datasets from other web sites, collect data from users of your own applications, and analyze and understand the data once you've found it.Programming Collective Intelligence takes you into the world of machine learning and statistics, and explains how to draw conclusions about user experience, marketing, personal tastes, and human behavior in general -- all from information that you and others collect every day. Each algorithm is described clearly and concisely with code that can immediately be used on your web site, blog, Wiki, or specialized application. This book explains:Collaborative filtering techniques that enable online retailers to recommend products or media Methods of clustering to detect groups of similar items in a large dataset Search engine features -- crawlers, indexers, query engines, and the PageRank algorithm Optimization algorithms that search millions of possible solutions to a problem and choose the best one Bayesian filtering, used in spam filters for classifying documents based on word types and other features Using decision trees not only to make predictions, but to model the way decisions are made Predicting numerical values rather than classifications to build price models Support vector machines to match people in online dating sites Non-negative matrix factorization to find the independent features in a dataset Evolving intelligence for problem solving -- how a computer develops its skill by improving its own code the more it plays a game Each chapter includes exercises for extending the algorithms to make them more powerful. Go beyond simple database-backed applications and put the wealth of Internet data to work for you. "Bravo! I cannot think of a better way for a developer to first learn these algorithms and methods, nor can I think of a better way for me (an old AI dog) to reinvigorate my knowledge of the details."-- Dan Russell, Google "Toby's book does a great job of breaking down the complex subject matter of machine-learning algorithms into practical, easy-to-understand examples that can be directly applied to analysis of social interaction across the Web today. If I had this book two years ago, it would have saved precious time going down some fruitless paths."-- Tim Wolters, CTO, Collective Intellect

Fundamentals of Data Structures in C++


Ellis Horowitz - 1995
    Fundamentals of Data Structures in C++ offers a complete rendering of basic data structure implementations, enhanced by superior pedagogy and astute analyses.

I Heart Logs: Event Data, Stream Processing, and Data Integration


Jay Kreps - 2014
    Even though most engineers don't think much about them, this short book shows you why logs are worthy of your attention.Based on his popular blog posts, LinkedIn principal engineer Jay Kreps shows you how logs work in distributed systems, and then delivers practical applications of these concepts in a variety of common uses--data integration, enterprise architecture, real-time stream processing, data system design, and abstract computing models.Go ahead and take the plunge with logs; you're going love them.Learn how logs are used for programmatic access in databases and distributed systemsDiscover solutions to the huge data integration problem when more data of more varieties meet more systemsUnderstand why logs are at the heart of real-time stream processingLearn the role of a log in the internals of online data systemsExplore how Jay Kreps applies these ideas to his own work on data infrastructure systems at LinkedIn

Web Scraping with Python: Collecting Data from the Modern Web


Ryan Mitchell - 2015
    With this practical guide, you’ll learn how to use Python scripts and web APIs to gather and process data from thousands—or even millions—of web pages at once. Ideal for programmers, security professionals, and web administrators familiar with Python, this book not only teaches basic web scraping mechanics, but also delves into more advanced topics, such as analyzing raw data or using scrapers for frontend website testing. Code samples are available to help you understand the concepts in practice. Learn how to parse complicated HTML pages Traverse multiple pages and sites Get a general overview of APIs and how they work Learn several methods for storing the data you scrape Download, read, and extract data from documents Use tools and techniques to clean badly formatted data Read and write natural languages Crawl through forms and logins Understand how to scrape JavaScript Learn image processing and text recognition

SQL Performance Explained


Markus Winand - 2011
    The focus is on SQL-it covers all major SQL databases without getting lost in the details of any one specific product. Starting with the basics of indexing and the WHERE clause, SQL Performance Explained guides developers through all parts of an SQL statement and explains the pitfalls of object-relational mapping (ORM) tools like Hibernate. Topics covered include: Using multi-column indexes; Correctly applying SQL functions; Efficient use of LIKE queries; Optimizing join operations; Clustering data to improve performance; Pipelined execution of ORDER BY and GROUP BY; Getting the best performance for pagination queries; Understanding the scalability of databases. Its systematic structure makes SQL Performance Explained both a textbook and a reference manual that should be on every developer's bookshelf.

Data Science at the Command Line: Facing the Future with Time-Tested Tools


Jeroen Janssens - 2014
    You'll learn how to combine small, yet powerful, command-line tools to quickly obtain, scrub, explore, and model your data.To get you started--whether you're on Windows, OS X, or Linux--author Jeroen Janssens introduces the Data Science Toolbox, an easy-to-install virtual environment packed with over 80 command-line tools.Discover why the command line is an agile, scalable, and extensible technology. Even if you're already comfortable processing data with, say, Python or R, you'll greatly improve your data science workflow by also leveraging the power of the command line.Obtain data from websites, APIs, databases, and spreadsheetsPerform scrub operations on plain text, CSV, HTML/XML, and JSONExplore data, compute descriptive statistics, and create visualizationsManage your data science workflow using DrakeCreate reusable tools from one-liners and existing Python or R codeParallelize and distribute data-intensive pipelines using GNU ParallelModel data with dimensionality reduction, clustering, regression, and classification algorithms

Networking for Systems Administrators (IT Mastery Book 5)


Michael W. Lucas - 2015
    Servers give sysadmins a incredible visibility into the network—once they know how to unlock it. Most sysadmins don’t need to understand window scaling, or the differences between IPv4 and IPv6 echo requests, or other intricacies of the TCP/IP protocols. You need only enough to deploy your own applications and get easy support from the network team.This book teaches you:•How modern networks really work•The essentials of TCP/IP•The next-generation protocol, IPv6•The right tools to diagnose network problems, and how to use them•Troubleshooting everything from the physical wire to DNS•How to see the traffic you send and receive•Connectivity testing•How to communicate with your network team to quickly resolve problemsA systems administrator doesn’t need to know the innards of TCP/IP, but knowing enough to diagnose your own network issues transforms a good sysadmin into a great one.

The Book of Why: The New Science of Cause and Effect


Judea Pearl - 2018
    Today, that taboo is dead. The causal revolution, instigated by Judea Pearl and his colleagues, has cut through a century of confusion and established causality -- the study of cause and effect -- on a firm scientific basis. His work explains how we can know easy things, like whether it was rain or a sprinkler that made a sidewalk wet; and how to answer hard questions, like whether a drug cured an illness. Pearl's work enables us to know not just whether one thing causes another: it lets us explore the world that is and the worlds that could have been. It shows us the essence of human thought and key to artificial intelligence. Anyone who wants to understand either needs The Book of Why.

Data Science For Dummies


Lillian Pierson - 2014
    Data Science For Dummies is the perfect starting point for IT professionals and students interested in making sense of their organization’s massive data sets and applying their findings to real-world business scenarios. From uncovering rich data sources to managing large amounts of data within hardware and software limitations, ensuring consistency in reporting, merging various data sources, and beyond, you’ll develop the know-how you need to effectively interpret data and tell a story that can be understood by anyone in your organization. Provides a background in data science fundamentals before moving on to working with relational databases and unstructured data and preparing your data for analysis Details different data visualization techniques that can be used to showcase and summarize your data Explains both supervised and unsupervised machine learning, including regression, model validation, and clustering techniques Includes coverage of big data processing tools like MapReduce, Hadoop, Dremel, Storm, and Spark It’s a big, big data world out there – let Data Science For Dummies help you harness its power and gain a competitive edge for your organization.

Python for Data Analysis


Wes McKinney - 2011
    It is also a practical, modern introduction to scientific computing in Python, tailored for data-intensive applications. This is a book about the parts of the Python language and libraries you'll need to effectively solve a broad set of data analysis problems. This book is not an exposition on analytical methods using Python as the implementation language.Written by Wes McKinney, the main author of the pandas library, this hands-on book is packed with practical cases studies. It's ideal for analysts new to Python and for Python programmers new to scientific computing.Use the IPython interactive shell as your primary development environmentLearn basic and advanced NumPy (Numerical Python) featuresGet started with data analysis tools in the pandas libraryUse high-performance tools to load, clean, transform, merge, and reshape dataCreate scatter plots and static or interactive visualizations with matplotlibApply the pandas groupby facility to slice, dice, and summarize datasetsMeasure data by points in time, whether it's specific instances, fixed periods, or intervalsLearn how to solve problems in web analytics, social sciences, finance, and economics, through detailed examples

Automate the Boring Stuff with Python: Practical Programming for Total Beginners


Al Sweigart - 2014
    But what if you could have your computer do them for you?In "Automate the Boring Stuff with Python," you'll learn how to use Python to write programs that do in minutes what would take you hours to do by hand no prior programming experience required. Once you've mastered the basics of programming, you'll create Python programs that effortlessly perform useful and impressive feats of automation to: Search for text in a file or across multiple filesCreate, update, move, and rename files and foldersSearch the Web and download online contentUpdate and format data in Excel spreadsheets of any sizeSplit, merge, watermark, and encrypt PDFsSend reminder emails and text notificationsFill out online formsStep-by-step instructions walk you through each program, and practice projects at the end of each chapter challenge you to improve those programs and use your newfound skills to automate similar tasks.Don't spend your time doing work a well-trained monkey could do. Even if you've never written a line of code, you can make your computer do the grunt work. Learn how in "Automate the Boring Stuff with Python.""

Python Testing with Pytest: Simple, Rapid, Effective, and Scalable


Brian Okken - 2017
    The pytest testing framework helps you write tests quickly and keep them readable and maintainable - with no boilerplate code. Using a robust yet simple fixture model, it's just as easy to write small tests with pytest as it is to scale up to complex functional testing for applications, packages, and libraries. This book shows you how.For Python-based projects, pytest is the undeniable choice to test your code if you're looking for a full-featured, API-independent, flexible, and extensible testing framework. With a full-bodied fixture model that is unmatched in any other tool, the pytest framework gives you powerful features such as assert rewriting and plug-in capability - with no boilerplate code.With simple step-by-step instructions and sample code, this book gets you up to speed quickly on this easy-to-learn and robust tool. Write short, maintainable tests that elegantly express what you're testing. Add powerful testing features and still speed up test times by distributing tests across multiple processors and running tests in parallel. Use the built-in assert statements to reduce false test failures by separating setup and test failures. Test error conditions and corner cases with expected exception testing, and use one test to run many test cases with parameterized testing. Extend pytest with plugins, connect it to continuous integration systems, and use it in tandem with tox, mock, coverage, unittest, and doctest.Write simple, maintainable tests that elegantly express what you're testing and why.What You Need: The examples in this book are written using Python 3.6 and pytest 3.0. However, pytest 3.0 supports Python 2.6, 2.7, and Python 3.3-3.6.

Good Math: A Geek's Guide to the Beauty of Numbers, Logic, and Computation


Mark C. Chu-Carroll - 2013
    There is joy and beauty in mathematics, and in more than two dozen essays drawn from his popular “Good Math” blog, you’ll find concepts, proofs, and examples that are often surprising, counterintuitive, or just plain weird.Mark begins his journey with the basics of numbers, with an entertaining trip through the integers and the natural, rational, irrational, and transcendental numbers. The voyage continues with a look at some of the oddest numbers in mathematics, including zero, the golden ratio, imaginary numbers, Roman numerals, and Egyptian and continuing fractions. After a deep dive into modern logic, including an introduction to linear logic and the logic-savvy Prolog language, the trip concludes with a tour of modern set theory and the advances and paradoxes of modern mechanical computing.If your high school or college math courses left you grasping for the inner meaning behind the numbers, Mark’s book will both entertain and enlighten you.

Introducing Python: Modern Computing in Simple Packages


Bill Lubanovic - 2013
    In addition to giving a strong foundation in the language itself, Lubanovic shows how to use it for a range of applications in business, science, and the arts, drawing on the rich collection of open source packages developed by Python fans.It's impressive how many commercial and production-critical programs are written now in Python. Developed to be easy to read and maintain, it has proven a boon to anyone who wants applications that are quick to write but robust and able to remain in production for the long haul.This book focuses on the current version of Python, 3.x, while including sidebars about important differences with 2.x for readers who may have to deal with programs in that version.

Advanced Swift


Chris Eidhof - 2016
    If you have read the Swift Programming Guide, and want to explore more, this book is for you.Swift is a great language for systems programming, but also lends itself for very high-level programming. We'll explore both high-level topics (for example, programming with generics and protocols), as well as low-level topics (for example, wrapping a C library and string internals).