Feature Engineering for Machine Learning


Alice Zheng - 2018
    With this practical book, you’ll learn techniques for extracting and transforming features—the numeric representations of raw data—into formats for machine-learning models. Each chapter guides you through a single data problem, such as how to represent text or image data. Together, these examples illustrate the main principles of feature engineering.Rather than simply teach these principles, authors Alice Zheng and Amanda Casari focus on practical application with exercises throughout the book. The closing chapter brings everything together by tackling a real-world, structured dataset with several feature-engineering techniques. Python packages including numpy, Pandas, Scikit-learn, and Matplotlib are used in code examples.

The Art of Readable Code


Dustin Boswell - 2010
    Over the past five years, authors Dustin Boswell and Trevor Foucher have analyzed hundreds of examples of "bad code" (much of it their own) to determine why they’re bad and how they could be improved. Their conclusion? You need to write code that minimizes the time it would take someone else to understand it—even if that someone else is you.This book focuses on basic principles and practical techniques you can apply every time you write code. Using easy-to-digest code examples from different languages, each chapter dives into a different aspect of coding, and demonstrates how you can make your code easy to understand.Simplify naming, commenting, and formatting with tips that apply to every line of codeRefine your program’s loops, logic, and variables to reduce complexity and confusionAttack problems at the function level, such as reorganizing blocks of code to do one task at a timeWrite effective test code that is thorough and concise—as well as readable"Being aware of how the code you create affects those who look at it later is an important part of developing software. The authors did a great job in taking you through the different aspects of this challenge, explaining the details with instructive examples." —Michael Hunger, passionate Software Developer

Data Smart: Using Data Science to Transform Information into Insight


John W. Foreman - 2013
    Major retailers are predicting everything from when their customers are pregnant to when they want a new pair of Chuck Taylors. It's a brave new world where seemingly meaningless data can be transformed into valuable insight to drive smart business decisions.But how does one exactly do data science? Do you have to hire one of these priests of the dark arts, the "data scientist," to extract this gold from your data? Nope.Data science is little more than using straight-forward steps to process raw data into actionable insight. And in Data Smart, author and data scientist John Foreman will show you how that's done within the familiar environment of a spreadsheet. Why a spreadsheet? It's comfortable! You get to look at the data every step of the way, building confidence as you learn the tricks of the trade. Plus, spreadsheets are a vendor-neutral place to learn data science without the hype. But don't let the Excel sheets fool you. This is a book for those serious about learning the analytic techniques, the math and the magic, behind big data.Each chapter will cover a different technique in a spreadsheet so you can follow along: - Mathematical optimization, including non-linear programming and genetic algorithms- Clustering via k-means, spherical k-means, and graph modularity- Data mining in graphs, such as outlier detection- Supervised AI through logistic regression, ensemble models, and bag-of-words models- Forecasting, seasonal adjustments, and prediction intervals through monte carlo simulation- Moving from spreadsheets into the R programming languageYou get your hands dirty as you work alongside John through each technique. But never fear, the topics are readily applicable and the author laces humor throughout. You'll even learn what a dead squirrel has to do with optimization modeling, which you no doubt are dying to know.

Kubernetes: Up & Running


Kelsey Hightower - 2016
    How's that possible? Google revealed the secret through a project called Kubernetes, an open source cluster orchestrator (based on its internal Borg system) that radically simplifies the task of building, deploying, and maintaining scalable distributed systems in the cloud. This practical guide shows you how Kubernetes and container technology can help you achieve new levels of velocity, agility, reliability, and efficiency.Authors Kelsey Hightower, Brendan Burns, and Joe Beda--who've worked on Kubernetes at Google--explain how this system fits into the lifecycle of a distributed application. You will learn how to use tools and APIs to automate scalable distributed systems, whether it is for online services, machine-learning applications, or a cluster of Raspberry Pi computers.Explore the distributed system challenges that Kubernetes addressesDive into containerized application development, using containers such as DockerCreate and run containers on Kubernetes, using Docker's Image format and container runtimeExplore specialized objects essential for running applications in productionReliably roll out new software versions without downtime or errorsGet examples of how to develop and deploy real-world applications in Kubernetes

Building Evolutionary Architectures: Support Constant Change


Neal Ford - 2017
    Over the past few years, incremental developments in core engineering practices for software development have created the foundations for rethinking how architecture changes over time, along with ways to protect important architectural characteristics as it evolves. This practical guide ties those parts together with a new way to think about architecture and time.

Grokking Algorithms An Illustrated Guide For Programmers and Other Curious People


Aditya Y. Bhargava - 2015
    The algorithms you'll use most often as a programmer have already been discovered, tested, and proven. If you want to take a hard pass on Knuth's brilliant but impenetrable theories and the dense multi-page proofs you'll find in most textbooks, this is the book for you. This fully-illustrated and engaging guide makes it easy for you to learn how to use algorithms effectively in your own programs.Grokking Algorithms is a disarming take on a core computer science topic. In it, you'll learn how to apply common algorithms to the practical problems you face in day-to-day life as a programmer. You'll start with problems like sorting and searching. As you build up your skills in thinking algorithmically, you'll tackle more complex concerns such as data compression or artificial intelligence. Whether you're writing business software, video games, mobile apps, or system utilities, you'll learn algorithmic techniques for solving problems that you thought were out of your grasp. For example, you'll be able to:Write a spell checker using graph algorithmsUnderstand how data compression works using Huffman codingIdentify problems that take too long to solve with naive algorithms, and attack them with algorithms that give you an approximate answer insteadEach carefully-presented example includes helpful diagrams and fully-annotated code samples in Python. By the end of this book, you will know some of the most widely applicable algorithms as well as how and when to use them.

The Soul of a New Machine


Tracy Kidder - 1981
    Tracy Kidder got a preview of this world in the late 1970s when he observed the engineers of Data General design and build a new 32-bit minicomputer in just one year. His thoughtful, prescient book, The Soul of a New Machine, tells stories of 35-year-old "veteran" engineers hiring recent college graduates and encouraging them to work harder and faster on complex and difficult projects, exploiting the youngsters' ignorance of normal scheduling processes while engendering a new kind of work ethic.These days, we are used to the "total commitment" philosophy of managing technical creation, but Kidder was surprised and even a little alarmed at the obsessions and compulsions he found. From in-house political struggles to workers being permitted to tease management to marathon 24-hour work sessions, The Soul of a New Machine explores concepts that already seem familiar, even old-hat, less than 20 years later. Kidder plainly admires his subjects; while he admits to hopeless confusion about their work, he finds their dedication heroic. The reader wonders, though, what will become of it all, now and in the future. —Rob Lightner

Head First Data Analysis: A Learner's Guide to Big Numbers, Statistics, and Good Decisions


Michael G. Milton - 2009
    If your job requires you to manage and analyze all kinds of data, turn to Head First Data Analysis, where you'll quickly learn how to collect and organize data, sort the distractions from the truth, find meaningful patterns, draw conclusions, predict the future, and present your findings to others. Whether you're a product developer researching the market viability of a new product or service, a marketing manager gauging or predicting the effectiveness of a campaign, a salesperson who needs data to support product presentations, or a lone entrepreneur responsible for all of these data-intensive functions and more, the unique approach in Head First Data Analysis is by far the most efficient way to learn what you need to know to convert raw data into a vital business tool. You'll learn how to:Determine which data sources to use for collecting information Assess data quality and distinguish signal from noise Build basic data models to illuminate patterns, and assimilate new information into the models Cope with ambiguous information Design experiments to test hypotheses and draw conclusions Use segmentation to organize your data within discrete market groups Visualize data distributions to reveal new relationships and persuade others Predict the future with sampling and probability models Clean your data to make it useful Communicate the results of your analysis to your audience Using the latest research in cognitive science and learning theory to craft a multi-sensory learning experience, Head First Data Analysis uses a visually rich format designed for the way your brain works, not a text-heavy approach that puts you to sleep.

The Art of Statistics: How to Learn from Data


David Spiegelhalter - 2019
      Statistics are everywhere, as integral to science as they are to business, and in the popular media hundreds of times a day. In this age of big data, a basic grasp of statistical literacy is more important than ever if we want to separate the fact from the fiction, the ostentatious embellishments from the raw evidence -- and even more so if we hope to participate in the future, rather than being simple bystanders. In The Art of Statistics, world-renowned statistician David Spiegelhalter shows readers how to derive knowledge from raw data by focusing on the concepts and connections behind the math. Drawing on real world examples to introduce complex issues, he shows us how statistics can help us determine the luckiest passenger on the Titanic, whether a notorious serial killer could have been caught earlier, and if screening for ovarian cancer is beneficial. The Art of Statistics not only shows us how mathematicians have used statistical science to solve these problems -- it teaches us how we too can think like statisticians. We learn how to clarify our questions, assumptions, and expectations when approaching a problem, and -- perhaps even more importantly -- we learn how to responsibly interpret the answers we receive. Combining the incomparable insight of an expert with the playful enthusiasm of an aficionado, The Art of Statistics is the definitive guide to stats that every modern person needs.

Python Crash Course: A Hands-On, Project-Based Introduction to Programming


Eric Matthes - 2015
    You'll also learn how to make your programs interactive and how to test your code safely before adding it to a project. In the second half of the book, you'll put your new knowledge into practice with three substantial projects: a Space Invaders-inspired arcade game, data visualizations with Python's super-handy libraries, and a simple web app you can deploy online.As you work through Python Crash Course, you'll learn how to: Use powerful Python libraries and tools, including matplotlib, NumPy, and PygalMake 2D games that respond to keypresses and mouse clicks, and that grow more difficult as the game progressesWork with data to generate interactive visualizationsCreate and customize simple web apps and deploy them safely onlineDeal with mistakes and errors so you can solve your own programming problemsIf you've been thinking seriously about digging into programming, Python Crash Course will get you up to speed and have you writing real programs fast. Why wait any longer? Start your engines and code!

R Cookbook: Proven Recipes for Data Analysis, Statistics, and Graphics


Paul Teetor - 2011
    The R language provides everything you need to do statistical work, but its structure can be difficult to master. This collection of concise, task-oriented recipes makes you productive with R immediately, with solutions ranging from basic tasks to input and output, general statistics, graphics, and linear regression.Each recipe addresses a specific problem, with a discussion that explains the solution and offers insight into how it works. If you're a beginner, R Cookbook will help get you started. If you're an experienced data programmer, it will jog your memory and expand your horizons. You'll get the job done faster and learn more about R in the process.Create vectors, handle variables, and perform other basic functionsInput and output dataTackle data structures such as matrices, lists, factors, and data framesWork with probability, probability distributions, and random variablesCalculate statistics and confidence intervals, and perform statistical testsCreate a variety of graphic displaysBuild statistical models with linear regressions and analysis of variance (ANOVA)Explore advanced statistical techniques, such as finding clusters in your dataWonderfully readable, R Cookbook serves not only as a solutions manual of sorts, but as a truly enjoyable way to explore the R language--one practical example at a time.--Jeffrey Ryan, software consultant and R package author

Hadoop: The Definitive Guide


Tom White - 2009
    Ideal for processing large datasets, the Apache Hadoop framework is an open source implementation of the MapReduce algorithm on which Google built its empire. This comprehensive resource demonstrates how to use Hadoop to build reliable, scalable, distributed systems: programmers will find details for analyzing large datasets, and administrators will learn how to set up and run Hadoop clusters. Complete with case studies that illustrate how Hadoop solves specific problems, this book helps you:Use the Hadoop Distributed File System (HDFS) for storing large datasets, and run distributed computations over those datasets using MapReduce Become familiar with Hadoop's data and I/O building blocks for compression, data integrity, serialization, and persistence Discover common pitfalls and advanced features for writing real-world MapReduce programs Design, build, and administer a dedicated Hadoop cluster, or run Hadoop in the cloud Use Pig, a high-level query language for large-scale data processing Take advantage of HBase, Hadoop's database for structured and semi-structured data Learn ZooKeeper, a toolkit of coordination primitives for building distributed systems If you have lots of data -- whether it's gigabytes or petabytes -- Hadoop is the perfect solution. Hadoop: The Definitive Guide is the most thorough book available on the subject. "Now you have the opportunity to learn about Hadoop from a master-not only of the technology, but also of common sense and plain talk." -- Doug Cutting, Hadoop Founder, Yahoo!

The Annotated Turing: A Guided Tour Through Alan Turing's Historic Paper on Computability and the Turing Machine


Charles Petzold - 2008
    Turing Mathematician Alan Turing invented an imaginary computer known as the Turing Machine; in an age before computers, he explored the concept of what it meant to be "computable," creating the field of computability theory in the process, a foundation of present-day computer programming.The book expands Turing's original 36-page paper with additional background chapters and extensive annotations; the author elaborates on and clarifies many of Turing's statements, making the original difficult-to-read document accessible to present day programmers, computer science majors, math geeks, and others.Interwoven into the narrative are the highlights of Turing's own life: his years at Cambridge and Princeton, his secret work in cryptanalysis during World War II, his involvement in seminal computer projects, his speculations about artificial intelligence, his arrest and prosecution for the crime of "gross indecency," and his early death by apparent suicide at the age of 41.

You Don't Know JS: Up & Going


Kyle Simpson - 2015
    With the "You Don’t Know JS" book series, you’ll get a more complete understanding of JavaScript, including trickier parts of the language that many experienced JavaScript programmers simply avoid.The series’ first book, Up & Going, provides the necessary background for those of you with limited programming experience. By learning the basic building blocks of programming, as well as JavaScript’s core mechanisms, you’ll be prepared to dive into the other, more in-depth books in the series—and be well on your way toward true JavaScript.With this book you will: Learn the essential programming building blocks, including operators, types, variables, conditionals, loops, and functions Become familiar with JavaScript's core mechanisms such as values, function closures, this, and prototypes Get an overview of other books in the series—and learn why it’s important to understand all parts of JavaScript

REST in Practice: Hypermedia and Systems Architecture


Jim Webber - 2010
    You'll learn techniques for implementing specific Web technologies and patterns to solve the needs of a typical company as it grows from modest beginnings to become a global enterprise.Learn basic Web techniques for application integrationUse HTTP and the Web’s infrastructure to build scalable, fault-tolerant enterprise applicationsDiscover the Create, Read, Update, Delete (CRUD) pattern for manipulating resourcesBuild RESTful services that use hypermedia to model state transitions and describe business protocolsLearn how to make Web-based solutions secure and interoperableExtend integration patterns for event-driven computing with the Atom Syndication Format and implement multi-party interactions in AtomPubUnderstand how the Semantic Web will impact systems design