Book picks similar to
Creating a Data-Driven Organization: Practical Advice from the Trenches by Carl Anderson
business
data
non-fiction
data-science
Succeeding with Agile: Software Development Using Scrum
Mike Cohn - 2009
Leading agile consultant and practitioner Mike Cohn presents detailed recommendations, powerful tips, and real-world case studies drawn from his unparalleled experience helping hundreds of software organizations make Scrum and agile work. "Succeeding with Agile" is for pragmatic software professionals who want real answers to the most difficult challenges they face in implementing Scrum. Cohn covers every facet of the transition: getting started, helping individuals transition to new roles, structuring teams, scaling up, working with a distributed team, and finally, implementing effective metrics and continuous improvement.Throughout, Cohn presents “Things to Try Now” sections based on his most successful advice. Complementary “Objection” sections reproduce typical conversations with those resisting change and offer practical guidance for addressing their concerns. Coverage includes: - Practical ways to get started immediately–and “get good” fast - Overcoming individual resistance to the changes Scrum requires - Staffing Scrum projects and building effective teams - Establishing “improvement communities” of people who are passionate about driving change - Choosing which agile technical practices to use or experiment with - Leading self-organizing teams - Making the most of Scrum sprints, planning, and quality techniques - Scaling Scrum to distributed, multiteam projects - Using Scrum on projects with complex sequential processes or challenging compliance and governance requirements - Understanding Scrum’s impact on HR, facilities, and project managementWhether you've completed a few sprints or multiple agile projects and whatever your role–manager, developer, coach, ScrumMaster, product owner, analyst, team lead, or project lead–this book will help you succeed with your very next project. Then, it will help you go much further: It will help you transform your entire development organization.
Accelerate: Building and Scaling High-Performing Technology Organizations
Nicole Forsgren - 2018
Through four years of groundbreaking research, Dr. Nicole Forsgren, Jez Humble, and Gene Kim set out to find a way to measure software delivery performance—and what drives it—using rigorous statistical methods. This book presents both the findings and the science behind that research. Readers will discover how to measure the performance of their teams, and what capabilities they should invest in to drive higher performance.
The Year Without Pants: WordPress.com and the Future of Work
Scott Berkun - 2013
The force behind WordPress.com is a convention-defying company called Automattic, Inc., whose 120 employees work from anywhere in the world they wish, barely use email, and launch improvements to their products dozens of times a day. With a fraction of the resources of Google, Amazon, or Facebook, they have a similar impact on the future of the Internet. How is this possible? What's different about how they work, and what can other companies learn from their methods?To find out, former Microsoft veteran Scott Berkun worked as a manager at WordPress.com, leading a team of young programmers developing new ideas. "The Year Without Pants" shares the secrets of WordPress.com's phenomenal success from the inside. Berkun's story reveals insights on creativity, productivity, and leadership from the kind of workplace that might be in everyone's future.Offers a fast-paced and entertaining insider's account of how an amazing, powerful organization achieves impressive resultsIncludes vital lessons about work culture and managing creativityWritten by author and popular blogger Scott Berkun (scottberkun.com)"The Year Without Pants" shares what every organization can learn from the world-changing ideas for the future of work at the heart of Automattic's success.
The Principles of Product Development Flow: Second Generation Lean Product Development
Donald G. Reinertsen - 2009
He explains why invisible and unmanaged queues are the underlying root cause of poor product development performance. He shows why these queues form and how they undermine the speed, quality, and efficiency in product development.
Slack: Getting Past Burnout, Busywork, and the Myth of Total Efficiency
Tom DeMarco - 2001
That principle is the value of slack, the degree of freedom in a company that allows it to change. Implementing slack could be as simple as adding an assistant to a department and letting high-priced talent spend less time at the photocopier and more time making key decisions, or it could mean designing workloads that allow people room to think, innovate, and reinvent themselves. It means embracing risk, eliminating fear, and knowing when to go slow. Slack allows for change, fosters creativity, promotes quality, and, above all, produces growth. With an approach that works for new- and old-economy companies alike, this revolutionary handbook debunks commonly held assumptions about real-world management, and gives you and your company a brand-new model for achieving and maintaining true effectiveness.
Exploring CQRS and Event Sourcing
Dominic Betts - 2012
It presents a learning journey, not definitive guidance. It describes the experiences of a development team with no prior CQRS proficiency in building, deploying (to Windows Azure), and maintaining a sample real-world, complex, enterprise system to showcase various CQRS and ES concepts, challenges, and techniques.The development team did not work in isolation; we actively sought input from industry experts and from a wide group of advisors to ensure that the guidance is both detailed and practical.The CQRS pattern and event sourcing are not mere simplistic solutions to the problems associated with large-scale, distributed systems. By providing you with both a working application and written guidance, we expect you’ll be well prepared to embark on your own CQRS journey.
What Is Data Science?
Mike Loukides - 2011
Five years ago, in What is Web 2.0, Tim O'Reilly said that "data is the next Intel Inside." But what does that statement mean? Why do we suddenly care about statistics and about data? This report examines the many sides of data science -- the technologies, the companies and the unique skill sets.The web is full of "data-driven apps." Almost any e-commerce application is a data-driven application. There's a database behind a web front end, and middleware that talks to a number of other databases and data services (credit card processing companies, banks, and so on). But merely using data isn't really what we mean by "data science." A data application acquires its value from the data itself, and creates more data as a result. It's not just an application with data; it's a data product. Data science enables the creation of data products.
Hadoop: The Definitive Guide
Tom White - 2009
Ideal for processing large datasets, the Apache Hadoop framework is an open source implementation of the MapReduce algorithm on which Google built its empire. This comprehensive resource demonstrates how to use Hadoop to build reliable, scalable, distributed systems: programmers will find details for analyzing large datasets, and administrators will learn how to set up and run Hadoop clusters. Complete with case studies that illustrate how Hadoop solves specific problems, this book helps you:Use the Hadoop Distributed File System (HDFS) for storing large datasets, and run distributed computations over those datasets using MapReduce Become familiar with Hadoop's data and I/O building blocks for compression, data integrity, serialization, and persistence Discover common pitfalls and advanced features for writing real-world MapReduce programs Design, build, and administer a dedicated Hadoop cluster, or run Hadoop in the cloud Use Pig, a high-level query language for large-scale data processing Take advantage of HBase, Hadoop's database for structured and semi-structured data Learn ZooKeeper, a toolkit of coordination primitives for building distributed systems If you have lots of data -- whether it's gigabytes or petabytes -- Hadoop is the perfect solution. Hadoop: The Definitive Guide is the most thorough book available on the subject. "Now you have the opportunity to learn about Hadoop from a master-not only of the technology, but also of common sense and plain talk." -- Doug Cutting, Hadoop Founder, Yahoo!
Extreme Programming Explained: Embrace Change (The XP Series)
Kent Beck - 1999
If you are seriously interested in understanding how you and your team can start down the path of improvement with XP, you must read this book."-- Francesco Cirillo, Chief Executive Officer, XPLabs S.R.L. "The first edition of this book told us what XP was--it changed the way many of us think about software development. This second edition takes it farther and gives us a lot more of the 'why' of XP, the motivations and the principles behind the practices. This is great stuff. Armed with the 'what' and the 'why, ' we can now all set out to confidently work on the 'how' how to run our projects better, and how to get agile techniques adopted in our organizations."-- Dave Thomas, The Pragmatic Programmers LLC "This book is dynamite! It was revolutionary when it first appeared a few years ago, and this new edition is equally profound. For those who insist on cookbook checklists, there's an excellent chapter on 'primary practices, ' but I urge you to begin by truly contemplating the meaning of the opening sentence in the first chapter of Kent Beck's book: 'XP is about social change.' You should do whatever it takes to ensure that every IT professional and every IT manager--all the way up to the CIO--has a copy of Extreme Programming Explained on his or her desk."-- Ed Yourdon, author and consultant "XP is a powerful set of concepts for simplifying the process of software design, development, and testing. It is about minimalism and incrementalism, which are especially useful principles when tackling complex problems that require a balance of creativity and discipline."-- Michael A. Cusumano, Professor, MIT Sloan School of Management, and author of The Business of Software " Extreme Programming Explained is the work of a talented and passionate craftsman. Kent Beck has brought together a compelling collection of ideas about programming and management that deserves your full attention. My only beef is that our profession has gotten to a point where such common-sense ideas are labeled 'extreme.'..."-- Lou Mazzucchelli, Fellow, Cutter Business Technology Council "If your organization is ready for a change in the way it develops software, there's the slow incremental approach, fixing things one by one, or the fast track, jumping feet first into Extreme Programming. Do not be frightened by the name, it is not that extreme at all. It is mostly good old recipes and common sense, nicely integrated together, getting rid of all the fat that has accumulated over the years."-- Philippe Kruchten, UBC, Vancouver, British Columbia "Sometimes revolutionaries get left behind as the movement they started takes on a life of its own. In this book, Kent Beck shows that he remains ahead of the curve, leading XP to its next level. Incorporating five years of feedback, this book takes a fresh look at what it takes to develop better software in less time and for less money. There are no silver bullets here, just a set of practical principles that, when used wisely, can lead to dramatic improvements in software development productivity."-- Mary Poppendieck, author of Lean Software Development: An Agile Toolkit "Kent Beck has revised his classic book based on five more years of applying and teaching XP. He shows how the path to XP is both
Practical Statistics for Data Scientists: 50 Essential Concepts
Peter Bruce - 2017
Courses and books on basic statistics rarely cover the topic from a data science perspective. This practical guide explains how to apply various statistical methods to data science, tells you how to avoid their misuse, and gives you advice on what's important and what's not.Many data science resources incorporate statistical methods but lack a deeper statistical perspective. If you're familiar with the R programming language, and have some exposure to statistics, this quick reference bridges the gap in an accessible, readable format.With this book, you'll learn:Why exploratory data analysis is a key preliminary step in data scienceHow random sampling can reduce bias and yield a higher quality dataset, even with big dataHow the principles of experimental design yield definitive answers to questionsHow to use regression to estimate outcomes and detect anomaliesKey classification techniques for predicting which categories a record belongs toStatistical machine learning methods that "learn" from dataUnsupervised learning methods for extracting meaning from unlabeled data
The Data Warehouse Toolkit: The Complete Guide to Dimensional Modeling
Ralph Kimball - 1996
Here is a complete library of dimensional modeling techniques-- the most comprehensive collection ever written. Greatly expanded to cover both basic and advanced techniques for optimizing data warehouse design, this second edition to Ralph Kimball's classic guide is more than sixty percent updated.The authors begin with fundamental design recommendations and gradually progress step-by-step through increasingly complex scenarios. Clear-cut guidelines for designing dimensional models are illustrated using real-world data warehouse case studies drawn from a variety of business application areas and industries, including:* Retail sales and e-commerce* Inventory management* Procurement* Order management* Customer relationship management (CRM)* Human resources management* Accounting* Financial services* Telecommunications and utilities* Education* Transportation* Health care and insuranceBy the end of the book, you will have mastered the full range of powerful techniques for designing dimensional databases that are easy to understand and provide fast query response. You will also learn how to create an architected framework that integrates the distributed data warehouse using standardized dimensions and facts.This book is also available as part of the Kimball's Data Warehouse Toolkit Classics Box Set (ISBN: 9780470479575) with the following 3 books:The Data Warehouse Toolkit, 2nd Edition (9780471200246)The Data Warehouse Lifecycle Toolkit, 2nd Edition (9780470149775)The Data Warehouse ETL Toolkit (9780764567575)
Machine Learning: A Probabilistic Perspective
Kevin P. Murphy - 2012
Machine learning provides these, developing methods that can automatically detect patterns in data and then use the uncovered patterns to predict future data. This textbook offers a comprehensive and self-contained introduction to the field of machine learning, based on a unified, probabilistic approach.The coverage combines breadth and depth, offering necessary background material on such topics as probability, optimization, and linear algebra as well as discussion of recent developments in the field, including conditional random fields, L1 regularization, and deep learning. The book is written in an informal, accessible style, complete with pseudo-code for the most important algorithms. All topics are copiously illustrated with color images and worked examples drawn from such application domains as biology, text processing, computer vision, and robotics. Rather than providing a cookbook of different heuristic methods, the book stresses a principled model-based approach, often using the language of graphical models to specify models in a concise and intuitive way. Almost all the models described have been implemented in a MATLAB software package—PMTK (probabilistic modeling toolkit)—that is freely available online. The book is suitable for upper-level undergraduates with an introductory-level college math background and beginning graduate students.
Doing Data Science
Cathy O'Neil - 2013
But how can you get started working in a wide-ranging, interdisciplinary field that’s so clouded in hype? This insightful book, based on Columbia University’s Introduction to Data Science class, tells you what you need to know.In many of these chapter-long lectures, data scientists from companies such as Google, Microsoft, and eBay share new algorithms, methods, and models by presenting case studies and the code they use. If you’re familiar with linear algebra, probability, and statistics, and have programming experience, this book is an ideal introduction to data science.Topics include:Statistical inference, exploratory data analysis, and the data science processAlgorithmsSpam filters, Naive Bayes, and data wranglingLogistic regressionFinancial modelingRecommendation engines and causalityData visualizationSocial networks and data journalismData engineering, MapReduce, Pregel, and HadoopDoing Data Science is collaboration between course instructor Rachel Schutt, Senior VP of Data Science at News Corp, and data science consultant Cathy O’Neil, a senior data scientist at Johnson Research Labs, who attended and blogged about the course.
Mining of Massive Datasets
Anand Rajaraman - 2011
This book focuses on practical algorithms that have been used to solve key problems in data mining and which can be used on even the largest datasets. It begins with a discussion of the map-reduce framework, an important tool for parallelizing algorithms automatically. The authors explain the tricks of locality-sensitive hashing and stream processing algorithms for mining data that arrives too fast for exhaustive processing. The PageRank idea and related tricks for organizing the Web are covered next. Other chapters cover the problems of finding frequent itemsets and clustering. The final chapters cover two applications: recommendation systems and Web advertising, each vital in e-commerce. Written by two authorities in database and Web technologies, this book is essential reading for students and practitioners alike.