Site Reliability Engineering: How Google Runs Production Systems


Betsy Beyer - 2016
    So, why does conventional wisdom insist that software engineers focus primarily on the design and development of large-scale computing systems?In this collection of essays and articles, key members of Google's Site Reliability Team explain how and why their commitment to the entire lifecycle has enabled the company to successfully build, deploy, monitor, and maintain some of the largest software systems in the world. You'll learn the principles and practices that enable Google engineers to make systems more scalable, reliable, and efficient--lessons directly applicable to your organization.This book is divided into four sections: Introduction--Learn what site reliability engineering is and why it differs from conventional IT industry practicesPrinciples--Examine the patterns, behaviors, and areas of concern that influence the work of a site reliability engineer (SRE)Practices--Understand the theory and practice of an SRE's day-to-day work: building and operating large distributed computing systemsManagement--Explore Google's best practices for training, communication, and meetings that your organization can use

Field Guide to Understanding Human Error


Sidney Dekker - 2002
    You think you can solve your human error problem by telling people to be more careful, by reprimanding the miscreants, by issuing a new rule or procedure. These are all expressions of 'The Bad Apple Theory', where you believe your system is basically safe if it were not for those few unreliable people in it. This old view of human error is increasingly outdated and will lead you nowhere. The new view, in contrast, understands that a human error problem is actually an organizational problem. Finding a 'human error' by any other name, or by any other human, is only the beginning of your journey, not a convenient conclusion. The new view recognizes that systems are inherent trade-offs between safety and other pressures (for example: production). People need to create safety through practice, at all levels of an organization. Breaking new ground beyond its successful predecessor, The Field Guide to Understanding Human Error guides you through the traps and misconceptions of the old view. It explains how to avoid the hindsight bias, to zoom out from the people closest in time and place to the mishap, and resist the temptation of counterfactual reasoning and judgmental language. But it also helps you look forward. It suggests how to apply the new view in building your safety department, handling questions about accountability, and constructing meaningful countermeasures. It even helps you in getting your organization to adopt the new view and improve its learning from failure. So if you are faced by a human error problem, abandon the fallacy of a quick fix. Read this book.

Extreme Programming Explained: Embrace Change (The XP Series)


Kent Beck - 1999
    If you are seriously interested in understanding how you and your team can start down the path of improvement with XP, you must read this book."-- Francesco Cirillo, Chief Executive Officer, XPLabs S.R.L. "The first edition of this book told us what XP was--it changed the way many of us think about software development. This second edition takes it farther and gives us a lot more of the 'why' of XP, the motivations and the principles behind the practices. This is great stuff. Armed with the 'what' and the 'why, ' we can now all set out to confidently work on the 'how' how to run our projects better, and how to get agile techniques adopted in our organizations."-- Dave Thomas, The Pragmatic Programmers LLC "This book is dynamite! It was revolutionary when it first appeared a few years ago, and this new edition is equally profound. For those who insist on cookbook checklists, there's an excellent chapter on 'primary practices, ' but I urge you to begin by truly contemplating the meaning of the opening sentence in the first chapter of Kent Beck's book: 'XP is about social change.' You should do whatever it takes to ensure that every IT professional and every IT manager--all the way up to the CIO--has a copy of Extreme Programming Explained on his or her desk."-- Ed Yourdon, author and consultant "XP is a powerful set of concepts for simplifying the process of software design, development, and testing. It is about minimalism and incrementalism, which are especially useful principles when tackling complex problems that require a balance of creativity and discipline."-- Michael A. Cusumano, Professor, MIT Sloan School of Management, and author of The Business of Software " Extreme Programming Explained is the work of a talented and passionate craftsman. Kent Beck has brought together a compelling collection of ideas about programming and management that deserves your full attention. My only beef is that our profession has gotten to a point where such common-sense ideas are labeled 'extreme.'..."-- Lou Mazzucchelli, Fellow, Cutter Business Technology Council "If your organization is ready for a change in the way it develops software, there's the slow incremental approach, fixing things one by one, or the fast track, jumping feet first into Extreme Programming. Do not be frightened by the name, it is not that extreme at all. It is mostly good old recipes and common sense, nicely integrated together, getting rid of all the fat that has accumulated over the years."-- Philippe Kruchten, UBC, Vancouver, British Columbia "Sometimes revolutionaries get left behind as the movement they started takes on a life of its own. In this book, Kent Beck shows that he remains ahead of the curve, leading XP to its next level. Incorporating five years of feedback, this book takes a fresh look at what it takes to develop better software in less time and for less money. There are no silver bullets here, just a set of practical principles that, when used wisely, can lead to dramatic improvements in software development productivity."-- Mary Poppendieck, author of Lean Software Development: An Agile Toolkit "Kent Beck has revised his classic book based on five more years of applying and teaching XP. He shows how the path to XP is both

The DevOps Handbook: How to Create World-Class Agility, Reliability, and Security in Technology Organizations


Gene Kim - 2015
    For decades, technology leaders have struggled to balance agility, reliability, and security. The consequences of failure have never been greater whether it's the healthcare.gov debacle, cardholder data breaches, or missing the boat with Big Data in the cloud.And yet, high performers using DevOps principles, such as Google, Amazon, Facebook, Etsy, and Netflix, are routinely and reliably deploying code into production hundreds, or even thousands, of times per day.Following in the footsteps of The Phoenix Project, The DevOps Handbook shows leaders how to replicate these incredible outcomes, by showing how to integrate Product Management, Development, QA, IT Operations, and Information Security to elevate your company and win in the marketplace."Table of contentsPrefaceSpreading the Aha! MomentIntroductionPART I: THE THREE WAYS1. Agile, continuous delivery and the three ways2. The First Way: The Principles of Flow3. The Second Way: The Principle of Feedback4. The Third Way: The Principles of Continual LearningPART II: WHERE TO START5. Selecting which value stream to start with6. Understanding the work in our value stream…7. How to design our organization and architecture8. How to get great outcomes by integrating operations into the daily work for developmentPART III: THE FIRST WAY: THE TECHNICAL PRACTICES OF FLOW9. Create the foundations of our deployment pipeline10. Enable fast and reliable automated testing11. Enable and practice continuous integration12. Automate and enable low-risk releases13. Architect for low-risk releasesPART IV: THE SECOND WAY: THE TECHNICAL PRACTICES OF FEEDBACK14*. Create telemetry to enable seeing abd solving problems15. Analyze telemetry to better anticipate problems16. Enable feedbackso development and operation can safely deploy code17. Integrate hypothesis-driven development and A/B testing into our daily work18. Create review and coordination processes to increase quality of our current workPART V: THE THRID WAY: THE TECHNICAL PRACTICES OF CONTINUAL LEARNING19. Enable and inject learning into daily work20. Convert local discoveries into global improvements21. Reserve time to create organizational learning22. Information security as everyone’s job, every day23. Protecting the deployment pipelinePART VI: CONCLUSIONA call to actionConclusion to the DevOps HandbookAPPENDICES1. The convergence of Devops2. The theory of constraints and core chronic conflicts3. Tabular form of downward spiral4. The dangers of handoffs and queues5. Myths of industrial safety6. The Toyota Andon Cord7. COTS Software8. Post-mortem meetings9. The Simian Army10. Transparent uptimeAdditional ResourcesEndnotes

Designing Data-Intensive Applications


Martin Kleppmann - 2015
    Difficult issues need to be figured out, such as scalability, consistency, reliability, efficiency, and maintainability. In addition, we have an overwhelming variety of tools, including relational databases, NoSQL datastores, stream or batch processors, and message brokers. What are the right choices for your application? How do you make sense of all these buzzwords?In this practical and comprehensive guide, author Martin Kleppmann helps you navigate this diverse landscape by examining the pros and cons of various technologies for processing and storing data. Software keeps changing, but the fundamental principles remain the same. With this book, software engineers and architects will learn how to apply those ideas in practice, and how to make full use of data in modern applications. Peer under the hood of the systems you already use, and learn how to use and operate them more effectively Make informed decisions by identifying the strengths and weaknesses of different tools Navigate the trade-offs around consistency, scalability, fault tolerance, and complexity Understand the distributed systems research upon which modern databases are built Peek behind the scenes of major online services, and learn from their architectures

An Introduction to General Systems Thinking


Gerald M. Weinberg - 1975
    Used in university courses and professional seminars all over the world, the text has proven its ability to open minds and sharpen thinking.Originally published in 1975 and reprinted more than twenty times over a quarter century -- and now available for the first time from Dorset House Publishing -- the text uses clear writing and basic algebraic principles to explore new approaches to projects, products, organizations, and virtually any kind of system.Scientists, engineers, organization leaders, managers, doctors, students, and thinkers of all disciplines can use this book to dispel the mental fog that clouds problem-solving. As author Gerald M. Weinberg writes in the new preface to the Silver Anniversary Edition, "I haven’t changed my conviction that most people don’t think nearly as well as they could had they been taught some principles of thinking.”Now an award-winning author of nearly forty books spanning the entire software development life cycle, Weinberg had already acquired extensive experience as a programmer, manager, university professor, and consultant when this book was originally published.With helpful illustrations, numerous end-of-chapter exercises, and an appendix on a mathematical notation used in problem-solving, An Introduction to General Systems Thinking may be your most powerful tool in working with problems, systems, and solutions.

Agile Estimating and Planning


Mike Cohn - 2005
    In this book, Agile Alliance cofounder Mike Cohn discusses the philosophy of agile estimating and planning and shows you exactly how to get the job done, with real-world examples and case studies.Concepts are clearly illustrated and readers are guided, step by step, toward how to answer the following questions: What will we build? How big will it be? When must it be done? How much can I really complete by then? You will first learn what makes a good plan-and then what makes it agile.Using the techniques in Agile Estimating and Planning , you can stay agile from start to finish, saving time, conserving resources, and accomplishing more. Highlights include:Why conventional prescriptive planning fails and why agile planning works How to estimate feature size using story points and ideal days--and when to use each How and when to re-estimate How to prioritize features using both financial and nonfinancial approaches How to split large features into smaller, more manageable ones How to plan iterations and predict your team's initial rate of progress How to schedule projects that have unusually high uncertainty or schedule-related risk How to estimate projects that will be worked on by multiple teams Agile Estimating and Planning supports any agile, semiagile, or iterative process, including Scrum, XP, Feature-Driven Development, Crystal, Adaptive Software Development, DSDM, Unified Process, and many more. It will be an indispensable resource for every development manager, team leader, and team member.

Time Management for System Administrators: Stop Working Late and Start Working Smart


Thomas A. Limoncelli - 2005
    No other job pulls people in so many directions at once. Users interrupt you constantly with requests, preventing you from getting anything done. Your managers want you to get long-term projects done but flood you with reques ... Available here:readmeaway.com/download?i=0596007833Time Management for System Administrators: Stop Working Late and Start Working Smart PDF by Thomas A. LimoncelliRead Time Management for System Administrators: Stop Working Late and Start Working Smart PDF from O'Reilly Media,Thomas A. LimoncelliDownload Thomas A. Limoncelli’s PDF E-book Time Management for System Administrators: Stop Working Late and Start Working Smart

The Principles of Product Development Flow: Second Generation Lean Product Development


Donald G. Reinertsen - 2009
    He explains why invisible and unmanaged queues are the underlying root cause of poor product development performance. He shows why these queues form and how they undermine the speed, quality, and efficiency in product development.

Release It!: Design and Deploy Production-Ready Software (Pragmatic Programmers)


Michael T. Nygard - 2007
    Did you design your system to survivef a sudden rush of visitors from Digg or Slashdot? Or an influx of real world customers from 100 different countries? Are you ready for a world filled with flakey networks, tangled databases, and impatient users?If you're a developer and don't want to be on call for 3AM for the rest of your life, this book will help.In Release It!, Michael T. Nygard shows you how to design and architect your application for the harsh realities it will face. You'll learn how to design your application for maximum uptime, performance, and return on investment.Mike explains that many problems with systems today start with the design.

Architecting for the AWS Cloud: Best Practices (AWS Whitepaper)


Amazon We Services - 2016
    It discusses cloud concepts and highlights various design patterns and best practices. This documentation is offered for free here as a Kindle book, or you can read it in PDF format at https://aws.amazon.com/whitepapers/.

Software Engineering at Google: Lessons Learned from Programming Over Time


Titus Winters - 2020
    With this book, you'll get a candid and insightful look at how software is constructed and maintained by some of the world's leading practitioners.Titus Winters, Tom Manshreck, and Hyrum K. Wright, software engineers and a technical writer at Google, reframe how software engineering is practiced and taught: from an emphasis on programming to an emphasis on software engineering, which roughly translates to programming over time.You'll learn:Fundamental differences between software engineering and programmingHow an organization effectively manages a living codebase and efficiently responds to inevitable changeWhy culture (and recognizing it) is important, and how processes, practices, and tools come into play

An Elegant Puzzle: Systems of Engineering Management


Will Larson - 2019
    Management is a key part of any organization, yet the discipline is often self-taught and unstructured. Getting to the good solutions of complex management challenges can make the difference between fulfillment and frustration for teams, and, ultimately, the success or failure of companies. Will Larson's An Elegant Puzzle orients around the particular challenges of engineering management--from sizing teams to technical debt to succession planning--and provides a path to the good solutions. Drawing from his experience at Digg, Uber, and Stripe, Will Larson has developed a thoughtful approach to engineering management that leaders of all levels at companies of all sizes can apply. An Elegant Puzzle balances structured principles and human-centric thinking to help any leader create more effective and rewarding organizations for engineers to thrive in.

Building Microservices: Designing Fine-Grained Systems


Sam Newman - 2014
    But developing these systems brings its own set of headaches. With lots of examples and practical advice, this book takes a holistic view of the topics that system architects and administrators must consider when building, managing, and evolving microservice architectures.Microservice technologies are moving quickly. Author Sam Newman provides you with a firm grounding in the concepts while diving into current solutions for modeling, integrating, testing, deploying, and monitoring your own autonomous services. You'll follow a fictional company throughout the book to learn how building a microservice architecture affects a single domain.Discover how microservices allow you to align your system design with your organization's goalsLearn options for integrating a service with the rest of your systemTake an incremental approach when splitting monolithic codebasesDeploy individual microservices through continuous integrationExamine the complexities of testing and monitoring distributed servicesManage security with user-to-service and service-to-service modelsUnderstand the challenges of scaling microservice architectures

Team Topologies: Organizing Business and Technology Teams for Fast Flow


Matthew Skelton - 2019
    But how do you build the best team organization for your specific goals, culture, and needs? Team Topologies is a practical, step-by-step, adaptive model for organizational design and team interaction based on four fundamental team types and three team interaction patterns. It is a model that treats teams as the fundamental means of delivery, where team structures and communication pathways are able to evolve with technological and organizational maturity.In Team Topologies, IT consultants Matthew Skelton and Manuel Pais share secrets of successful team patterns and interactions to help readers choose and evolve the right team patterns for their organization, making sure to keep the software healthy and optimize value streams.Team Topologies is a major step forward in organizational design for software, presenting a well-defined way for teams to interact and interrelate that helps make the resulting software architecture clearer and more sustainable, turning inter-team problems into valuable signals for the self-steering organization.