The Elements of Statistical Learning: Data Mining, Inference, and Prediction


Trevor Hastie - 2001
    With it has come vast amounts of data in a variety of fields such as medicine, biology, finance, and marketing. The challenge of understanding these data has led to the development of new tools in the field of statistics, and spawned new areas such as data mining, machine learning, and bioinformatics. Many of these tools have common underpinnings but are often expressed with different terminology. This book describes the important ideas in these areas in a common conceptual framework. While the approach is statistical, the emphasis is on concepts rather than mathematics. Many examples are given, with a liberal use of color graphics. It should be a valuable resource for statisticians and anyone interested in data mining in science or industry. The book's coverage is broad, from supervised learning (prediction) to unsupervised learning. The many topics include neural networks, support vector machines, classification trees and boosting—the first comprehensive treatment of this topic in any book. Trevor Hastie, Robert Tibshirani, and Jerome Friedman are professors of statistics at Stanford University. They are prominent researchers in this area: Hastie and Tibshirani developed generalized additive models and wrote a popular book of that title. Hastie wrote much of the statistical modeling software in S-PLUS and invented principal curves and surfaces. Tibshirani proposed the Lasso and is co-author of the very successful An Introduction to the Bootstrap. Friedman is the co-inventor of many data-mining tools including CART, MARS, and projection pursuit.

What's New in Java 7?


Madhusudhan Konda - 2011
    Madhusudhan Konda provides an overview of these, including strings in switch statements, multi-catch exception handling, try-with-resource statements, the new File System API, extensions of the JVM, support for dynamically-typed languages, and the fork and join framework for task parallelism.

Accelerate: Building and Scaling High-Performing Technology Organizations


Nicole Forsgren - 2018
    Through four years of groundbreaking research, Dr. Nicole Forsgren, Jez Humble, and Gene Kim set out to find a way to measure software delivery performance—and what drives it—using rigorous statistical methods. This book presents both the findings and the science behind that research. Readers will discover how to measure the performance of their teams, and what capabilities they should invest in to drive higher performance.

The Master Algorithm: How the Quest for the Ultimate Learning Machine Will Remake Our World


Pedro Domingos - 2015
    In The Master Algorithm, Pedro Domingos lifts the veil to give us a peek inside the learning machines that power Google, Amazon, and your smartphone. He assembles a blueprint for the future universal learner--the Master Algorithm--and discusses what it will mean for business, science, and society. If data-ism is today's philosophy, this book is its bible.

Fundamentals of Software Architecture: An Engineering Approach


Mark Richards - 2020
    Until now. This practical guide provides the first comprehensive overview of software architecture's many aspects. You'll examine architectural characteristics, architectural patterns, component determination, diagramming and presenting architecture, evolutionary architecture, and many other topics.Authors Neal Ford and Mark Richards help you learn through examples in a variety of popular programming languages, such as Java, C#, JavaScript, and others. You'll focus on architecture principles with examples that apply across all technology stacks.

97 Things Every Programmer Should Know: Collective Wisdom from the Experts


Kevlin Henney - 2010
    With the 97 short and extremely useful tips for programmers in this book, you'll expand your skills by adopting new approaches to old problems, learning appropriate best practices, and honing your craft through sound advice.With contributions from some of the most experienced and respected practitioners in the industry--including Michael Feathers, Pete Goodliffe, Diomidis Spinellis, Cay Horstmann, Verity Stob, and many more--this book contains practical knowledge and principles that you can apply to all kinds of projects.A few of the 97 things you should know:"Code in the Language of the Domain" by Dan North"Write Tests for People" by Gerard Meszaros"Convenience Is Not an -ility" by Gregor Hohpe"Know Your IDE" by Heinz Kabutz"A Message to the Future" by Linda Rising"The Boy Scout Rule" by Robert C. Martin (Uncle Bob)"Beware the Share" by Udi Dahan

Programming Pearls


Jon L. Bentley - 1986
    Jon has done a wonderful job of updating the material. I am very impressed at how fresh the new examples seem." - Steve McConnell, author, Code CompleteWhen programmers list their favorite books, Jon Bentley's collection of programming pearls is commonly included among the classics. Just as natural pearls grow from grains of sand that irritate oysters, programming pearls have grown from real problems that have irritated real programmers. With origins beyond solid engineering, in the realm of insight and creativity, Bentley's pearls offer unique and clever solutions to those nagging problems. Illustrated by programs designed as much for fun as for instruction, the book is filled with lucid and witty descriptions of practical programming techniques and fundamental design principles. It is not at all surprising that Programming Pearls has been so highly valued by programmers at every level of experience. In this revision, the first in 14 years, Bentley has substantially updated his essays to reflect current programming methods and environments. In addition, there are three new essays on (1) testing, debugging, and timing; (2) set representations; and (3) string problems. All the original programs have been rewritten, and an equal amount of new code has been generated. Implementations of all the programs, in C or C++, are now available on the Web.What remains the same in this new edition is Bentley's focus on the hard core of programming problems and his delivery of workable solutions to those problems. Whether you are new to Bentley's classic or are revisiting his work for some fresh insight, this book is sure to make your own list of favorites.

Hadoop: The Definitive Guide


Tom White - 2009
    Ideal for processing large datasets, the Apache Hadoop framework is an open source implementation of the MapReduce algorithm on which Google built its empire. This comprehensive resource demonstrates how to use Hadoop to build reliable, scalable, distributed systems: programmers will find details for analyzing large datasets, and administrators will learn how to set up and run Hadoop clusters. Complete with case studies that illustrate how Hadoop solves specific problems, this book helps you:Use the Hadoop Distributed File System (HDFS) for storing large datasets, and run distributed computations over those datasets using MapReduce Become familiar with Hadoop's data and I/O building blocks for compression, data integrity, serialization, and persistence Discover common pitfalls and advanced features for writing real-world MapReduce programs Design, build, and administer a dedicated Hadoop cluster, or run Hadoop in the cloud Use Pig, a high-level query language for large-scale data processing Take advantage of HBase, Hadoop's database for structured and semi-structured data Learn ZooKeeper, a toolkit of coordination primitives for building distributed systems If you have lots of data -- whether it's gigabytes or petabytes -- Hadoop is the perfect solution. Hadoop: The Definitive Guide is the most thorough book available on the subject. "Now you have the opportunity to learn about Hadoop from a master-not only of the technology, but also of common sense and plain talk." -- Doug Cutting, Hadoop Founder, Yahoo!

Machine Learning


Tom M. Mitchell - 1986
    Mitchell covers the field of machine learning, the study of algorithms that allow computer programs to automatically improve through experience and that automatically infer general laws from specific data.

Bad Data Handbook: Cleaning Up The Data So You Can Get Back To Work


Q. Ethan McCallum - 2012
    In this handbook, data expert Q. Ethan McCallum has gathered 19 colleagues from every corner of the data arena to reveal how they’ve recovered from nasty data problems.From cranky storage to poor representation to misguided policy, there are many paths to bad data. Bottom line? Bad data is data that gets in the way. This book explains effective ways to get around it.Among the many topics covered, you’ll discover how to:Test drive your data to see if it’s ready for analysisWork spreadsheet data into a usable formHandle encoding problems that lurk in text dataDevelop a successful web-scraping effortUse NLP tools to reveal the real sentiment of online reviewsAddress cloud computing issues that can impact your analysis effortAvoid policies that create data analysis roadblocksTake a systematic approach to data quality analysis

Storytelling with Data: A Data Visualization Guide for Business Professionals


Cole Nussbaumer Knaflic - 2015
    You'll discover the power of storytelling and the way to make data a pivotal point in your story. The lessons in this illuminative text are grounded in theory, but made accessible through numerous real-world examples--ready for immediate application to your next graph or presentation.Storytelling is not an inherent skill, especially when it comes to data visualization, and the tools at our disposal don't make it any easier. This book demonstrates how to go beyond conventional tools to reach the root of your data, and how to use your data to create an engaging, informative, compelling story. Specifically, you'll learn how to:Understand the importance of context and audience Determine the appropriate type of graph for your situation Recognize and eliminate the clutter clouding your information Direct your audience's attention to the most important parts of your data Think like a designer and utilize concepts of design in data visualization Leverage the power of storytelling to help your message resonate with your audience Together, the lessons in this book will help you turn your data into high impact visual stories that stick with your audience. Rid your world of ineffective graphs, one exploding 3D pie chart at a time. There is a story in your data--Storytelling with Data will give you the skills and power to tell it!

Site Reliability Engineering: How Google Runs Production Systems


Betsy Beyer - 2016
    So, why does conventional wisdom insist that software engineers focus primarily on the design and development of large-scale computing systems?In this collection of essays and articles, key members of Google's Site Reliability Team explain how and why their commitment to the entire lifecycle has enabled the company to successfully build, deploy, monitor, and maintain some of the largest software systems in the world. You'll learn the principles and practices that enable Google engineers to make systems more scalable, reliable, and efficient--lessons directly applicable to your organization.This book is divided into four sections: Introduction--Learn what site reliability engineering is and why it differs from conventional IT industry practicesPrinciples--Examine the patterns, behaviors, and areas of concern that influence the work of a site reliability engineer (SRE)Practices--Understand the theory and practice of an SRE's day-to-day work: building and operating large distributed computing systemsManagement--Explore Google's best practices for training, communication, and meetings that your organization can use

The Mythical Man-Month: Essays on Software Engineering


Frederick P. Brooks Jr. - 1975
    With a blend of software engineering facts and thought-provoking opinions, Fred Brooks offers insight for anyone managing complex projects. These essays draw from his experience as project manager for the IBM System/360 computer family and then for OS/360, its massive software system. Now, 45 years after the initial publication of his book, Brooks has revisited his original ideas and added new thoughts and advice, both for readers already familiar with his work and for readers discovering it for the first time.The added chapters contain (1) a crisp condensation of all the propositions asserted in the original book, including Brooks' central argument in The Mythical Man-Month: that large programming projects suffer management problems different from small ones due to the division of labor; that the conceptual integrity of the product is therefore critical; and that it is difficult but possible to achieve this unity; (2) Brooks' view of these propositions a generation later; (3) a reprint of his classic 1986 paper "No Silver Bullet"; and (4) today's thoughts on the 1986 assertion, "There will be no silver bullet within ten years."

Machine Learning: A Probabilistic Perspective


Kevin P. Murphy - 2012
    Machine learning provides these, developing methods that can automatically detect patterns in data and then use the uncovered patterns to predict future data. This textbook offers a comprehensive and self-contained introduction to the field of machine learning, based on a unified, probabilistic approach.The coverage combines breadth and depth, offering necessary background material on such topics as probability, optimization, and linear algebra as well as discussion of recent developments in the field, including conditional random fields, L1 regularization, and deep learning. The book is written in an informal, accessible style, complete with pseudo-code for the most important algorithms. All topics are copiously illustrated with color images and worked examples drawn from such application domains as biology, text processing, computer vision, and robotics. Rather than providing a cookbook of different heuristic methods, the book stresses a principled model-based approach, often using the language of graphical models to specify models in a concise and intuitive way. Almost all the models described have been implemented in a MATLAB software package—PMTK (probabilistic modeling toolkit)—that is freely available online. The book is suitable for upper-level undergraduates with an introductory-level college math background and beginning graduate students.

Continuous Delivery: Reliable Software Releases Through Build, Test, and Deployment Automation


Jez Humble - 2010
    This groundbreaking new book sets out the principles and technical practices that enable rapid, incremental delivery of high quality, valuable new functionality to users. Through automation of the build, deployment, and testing process, and improved collaboration between developers, testers, and operations, delivery teams can get changes released in a matter of hours-- sometimes even minutes-no matter what the size of a project or the complexity of its code base. Jez Humble and David Farley begin by presenting the foundations of a rapid, reliable, low-risk delivery process. Next, they introduce the "deployment pipeline," an automated process for managing all changes, from check-in to release. Finally, they discuss the "ecosystem" needed to support continuous delivery, from infrastructure, data and configuration management to governance. The authors introduce state-of-the-art techniques, including automated infrastructure management and data migration, and the use of virtualization. For each, they review key issues, identify best practices, and demonstrate how to mitigate risks. Coverage includes - Automating all facets of building, integrating, testing, and deploying software - Implementing deployment pipelines at team and organizational levels - Improving collaboration between developers, testers, and operations - Developing features incrementally on large and distributed teams - Implementing an effective configuration management strategy - Automating acceptance testing, from analysis to implementation - Testing capacity and other non-functional requirements - Implementing continuous deployment and zero-downtime releases - Managing infrastructure, data, components and dependencies - Navigating risk management, compliance, and auditing Whether you're a developer, systems administrator, tester, or manager, this book will help your organization move from idea to release faster than ever--so you can deliver value to your business rapidly and reliably.