Book picks similar to
Bad Data Handbook: Cleaning Up The Data So You Can Get Back To Work by Q. Ethan McCallum
data-science
big-data
programming
non-fiction
Agile Data Warehouse Design: Collaborative Dimensional Modeling, from Whiteboard to Star Schema
Lawrence Corr - 2011
This book describes BEAM✲, an agile approach to dimensional modeling, for improving communication between data warehouse designers, BI stakeholders and the whole DW/BI development team. BEAM✲ provides tools and techniques that will encourage DW/BI designers and developers to move away from their keyboards and entity relationship based tools and model interactively with their colleagues. The result is everyone thinks dimensionally from the outset! Developers understand how to efficiently implement dimensional modeling solutions. Business stakeholders feel ownership of the data warehouse they have created, and can already imagine how they will use it to answer their business questions. Within this book, you will learn: ✲ Agile dimensional modeling using Business Event Analysis & Modeling (BEAM✲) ✲ Modelstorming: data modeling that is quicker, more inclusive, more productive, and frankly more fun! ✲ Telling dimensional data stories using the 7Ws (who, what, when, where, how many, why and how) ✲ Modeling by example not abstraction; using data story themes, not crow's feet, to describe detail ✲ Storyboarding the data warehouse to discover conformed dimensions and plan iterative development ✲ Visual modeling: sketching timelines, charts and grids to model complex process measurement - simply ✲ Agile design documentation: enhancing star schemas with BEAM✲ dimensional shorthand notation ✲ Solving difficult DW/BI performance and usability problems with proven dimensional design patterns Lawrence Corr is a data warehouse designer and educator. As Principal of DecisionOne Consulting, he helps clients to review and simplify their data warehouse designs, and advises vendors on visual data modeling techniques. He regularly teaches agile dimensional modeling courses worldwide and has taught dimensional DW/BI skills to thousands of students. Jim Stagnitto is a data warehouse and master data management architect specializing in the healthcare, financial services, and information service industries. He is the founder of the data warehousing and data mining consulting firm Llumino.
Ambient Findability: What We Find Changes Who We Become
Peter Morville - 2005
Written by Peter Morville, author of the groundbreaking Information Architecture for the World Wide Web, the book defines our current age as a state of unlimited findability. In other words, anyone can find anything at any time. Complete navigability.Morville discusses the Internet, GIS, and other network technologies that are coming together to make unlimited findability possible. He explores how the melding of these innovations impacts society, since Web access is now a standard requirement for successful people and businesses. But before he does that, Morville looks back at the history of wayfinding and human evolution, suggesting that our fear of being lost has driven us to create maps, charts, and now, the mobile Internet.The book's central thesis is that information literacy, information architecture, and usability are all critical components of this new world order. Hand in hand with that is the contention that only by planning and designing the best possible software, devices, and Internet, will we be able to maintain this connectivity in the future. Morville's book is highlighted with full color illustrations and rich examples that bring his prose to life.Ambient Findability doesn't preach or pretend to know all the answers. Instead, it presents research, stories, and examples in support of its novel ideas. Are we truly at a critical point in our evolution where the quality of our digital networks will dictate how we behave as a species? Is findability indeed the primary key to a successful global marketplace in the 21st century and beyond. Peter Morville takes you on a thought-provoking tour of these memes and more -- ideas that will not only fascinate but will stir your creativity in practical ways that you can apply to your work immediately.
Fundamentals of Software Architecture: An Engineering Approach
Mark Richards - 2020
Until now. This practical guide provides the first comprehensive overview of software architecture's many aspects. You'll examine architectural characteristics, architectural patterns, component determination, diagramming and presenting architecture, evolutionary architecture, and many other topics.Authors Neal Ford and Mark Richards help you learn through examples in a variety of popular programming languages, such as Java, C#, JavaScript, and others. You'll focus on architecture principles with examples that apply across all technology stacks.
97 Things Every Programmer Should Know: Collective Wisdom from the Experts
Kevlin Henney - 2010
With the 97 short and extremely useful tips for programmers in this book, you'll expand your skills by adopting new approaches to old problems, learning appropriate best practices, and honing your craft through sound advice.With contributions from some of the most experienced and respected practitioners in the industry--including Michael Feathers, Pete Goodliffe, Diomidis Spinellis, Cay Horstmann, Verity Stob, and many more--this book contains practical knowledge and principles that you can apply to all kinds of projects.A few of the 97 things you should know:"Code in the Language of the Domain" by Dan North"Write Tests for People" by Gerard Meszaros"Convenience Is Not an -ility" by Gregor Hohpe"Know Your IDE" by Heinz Kabutz"A Message to the Future" by Linda Rising"The Boy Scout Rule" by Robert C. Martin (Uncle Bob)"Beware the Share" by Udi Dahan
User Story Mapping: Discover the Whole Story, Build the Right Product
Jeff Patton - 2012
With this practical book, you'll explore the often-misunderstood practice of user story mapping, and learn how it can help keep your team stay focused on users and their experience throughout the development process.You and your team will learn that user stories aren't a way to write better specifications, but a way to organize and have better conversations. This book will help you understand what kinds of conversations you should be having, when to have them, and what to keep track of when you do. Learn the key concepts used to create a great story map. Understand how user stories really work, and how to make good use of them in agile and lean projects. Examine the nuts and bolts of managing stories through the development cycle. Use strategies that help you continue to learn before and after the product's release to customers and usersUser Story Mapping is ideal for agile and lean software development team members, product managers and UX practitioners in commercial product companies, and business analysts and project managers in IT organizations—whether you're new to this approach or want to understand more about it.
Growing Object-Oriented Software, Guided by Tests
Steve Freeman - 2009
This one's a keeper." --Robert C. Martin "If you want to be an expert in the state of the art in TDD, you need to understand the ideas in this book."--Michael Feathers Test-Driven Development (TDD) is now an established technique for delivering better software faster. TDD is based on a simple idea: Write tests for your code before you write the code itself. However, this simple idea takes skill and judgment to do well. Now there's a practical guide to TDD that takes you beyond the basic concepts. Drawing on a decade of experience building real-world systems, two TDD pioneers show how to let tests guide your development and "grow" software that is coherent, reliable, and maintainable. Steve Freeman and Nat Pryce describe the processes they use, the design principles they strive to achieve, and some of the tools that help them get the job done. Through an extended worked example, you'll learn how TDD works at multiple levels, using tests to drive the features and the object-oriented structure of the code, and using Mock Objects to discover and then describe relationships between objects. Along the way, the book systematically addresses challenges that development teams encounter with TDD--from integrating TDD into your processes to testing your most difficult features. Coverage includes - Implementing TDD effectively: getting started, and maintaining your momentum throughout the project - Creating cleaner, more expressive, more sustainable code - Using tests to stay relentlessly focused on sustaining quality - Understanding how TDD, Mock Objects, and Object-Oriented Design come together in the context of a real software development project - Using Mock Objects to guide object-oriented designs - Succeeding where TDD is difficult: managing complex test data, and testing persistence and concurrency
An Introduction to Statistical Learning: With Applications in R
Gareth James - 2013
This book presents some of the most important modeling and prediction techniques, along with relevant applications. Topics include linear regression, classification, resampling methods, shrinkage approaches, tree- based methods, support vector machines, clustering, and more. Color graphics and real-world examples are used to illustrate the methods presented. Since the goal of this textbook is to facilitate the use of these statistical learning techniques by practitioners in science, industry, and other fields, each chapter contains a tutorial on implementing the analyses and methods presented in R, an extremely popular open source statistical software platform. Two of the authors co-wrote The Elements of Statistical Learning (Hastie, Tibshirani and Friedman, 2nd edition 2009), a popular reference book for statistics and machine learning researchers. An Introduction to Statistical Learning covers many of the same topics, but at a level accessible to a much broader audience. This book is targeted at statisticians and non-statisticians alike who wish to use cutting-edge statistical learning techniques to analyze their data. The text assumes only a previous course in linear regression and no knowledge of matrix algebra.
Cassandra: The Definitive Guide
Eben Hewitt - 2010
Cassandra: The Definitive Guide provides the technical details and practical examples you need to assess this database management system and put it to work in a production environment.Author Eben Hewitt demonstrates the advantages of Cassandra's nonrelational design, and pays special attention to data modeling. If you're a developer, DBA, application architect, or manager looking to solve a database scaling issue or future-proof your application, this guide shows you how to harness Cassandra's speed and flexibility.Understand the tenets of Cassandra's column-oriented structureLearn how to write, update, and read Cassandra dataDiscover how to add or remove nodes from the cluster as your application requiresExamine a working application that translates from a relational model to Cassandra's data modelUse examples for writing clients in Java, Python, and C#Use the JMX interface to monitor a cluster's usage, memory patterns, and moreTune memory settings, data storage, and caching for better performance
Data Analysis Using SQL and Excel
Gordon S. Linoff - 2007
This book helps you use SQL and Excel to extract business information from relational databases and use that data to define business dimensions, store transactions about customers, produce results, and more. Each chapter explains when and why to perform a particular type of business analysis in order to obtain useful results, how to design and perform the analysis using SQL and Excel, and what the results should look like.
Python Data Science Handbook: Tools and Techniques for Developers
Jake Vanderplas - 2016
Several resources exist for individual pieces of this data science stack, but only with the Python Data Science Handbook do you get them all—IPython, NumPy, Pandas, Matplotlib, Scikit-Learn, and other related tools.Working scientists and data crunchers familiar with reading and writing Python code will find this comprehensive desk reference ideal for tackling day-to-day issues: manipulating, transforming, and cleaning data; visualizing different types of data; and using data to build statistical or machine learning models. Quite simply, this is the must-have reference for scientific computing in Python.With this handbook, you’ll learn how to use: * IPython and Jupyter: provide computational environments for data scientists using Python * NumPy: includes the ndarray for efficient storage and manipulation of dense data arrays in Python * Pandas: features the DataFrame for efficient storage and manipulation of labeled/columnar data in Python * Matplotlib: includes capabilities for a flexible range of data visualizations in Python * Scikit-Learn: for efficient and clean Python implementations of the most important and established machine learning algorithms
Site Reliability Engineering: How Google Runs Production Systems
Betsy Beyer - 2016
So, why does conventional wisdom insist that software engineers focus primarily on the design and development of large-scale computing systems?In this collection of essays and articles, key members of Google's Site Reliability Team explain how and why their commitment to the entire lifecycle has enabled the company to successfully build, deploy, monitor, and maintain some of the largest software systems in the world. You'll learn the principles and practices that enable Google engineers to make systems more scalable, reliable, and efficient--lessons directly applicable to your organization.This book is divided into four sections: Introduction--Learn what site reliability engineering is and why it differs from conventional IT industry practicesPrinciples--Examine the patterns, behaviors, and areas of concern that influence the work of a site reliability engineer (SRE)Practices--Understand the theory and practice of an SRE's day-to-day work: building and operating large distributed computing systemsManagement--Explore Google's best practices for training, communication, and meetings that your organization can use
Coders at Work: Reflections on the Craft of Programming
Peter Seibel - 2009
As the words "at work" suggest, Peter Seibel focuses on how his interviewees tackle the day–to–day work of programming, while revealing much more, like how they became great programmers, how they recognize programming talent in others, and what kinds of problems they find most interesting. Hundreds of people have suggested names of programmers to interview on the Coders at Work web site: http://www.codersatwork.com. The complete list was 284 names. Having digested everyone’s feedback, we selected 16 folks who’ve been kind enough to agree to be interviewed:- Frances Allen: Pioneer in optimizing compilers, first woman to win the Turing Award (2006) and first female IBM fellow- Joe Armstrong: Inventor of Erlang- Joshua Bloch: Author of the Java collections framework, now at Google- Bernie Cosell: One of the main software guys behind the original ARPANET IMPs and a master debugger- Douglas Crockford: JSON founder, JavaScript architect at Yahoo!- L. Peter Deutsch: Author of Ghostscript, implementer of Smalltalk-80 at Xerox PARC and Lisp 1.5 on PDP-1- Brendan Eich: Inventor of JavaScript, CTO of the Mozilla Corporation - Brad Fitzpatrick: Writer of LiveJournal, OpenID, memcached, and Perlbal - Dan Ingalls: Smalltalk implementor and designer- Simon Peyton Jones: Coinventor of Haskell and lead designer of Glasgow Haskell Compiler- Donald Knuth: Author of The Art of Computer Programming and creator of TeX- Peter Norvig: Director of Research at Google and author of the standard text on AI- Guy Steele: Coinventor of Scheme and part of the Common Lisp Gang of Five, currently working on Fortress- Ken Thompson: Inventor of UNIX- Jamie Zawinski: Author of XEmacs and early Netscape/Mozilla hackerWhat you’ll learn:How the best programmers in the world do their jobWho is this book for?Programmers interested in the point of view of leaders in the field. Programmers looking for approaches that work for some of these outstanding programmers.
MongoDB: The Definitive Guide
Kristina Chodorow - 2010
Learn how easy it is to handle data as self-contained JSON-style documents, rather than as records in a relational database.Explore ways that document-oriented storage will work for your projectLearn how MongoDB’s schema-free data model handles documents, collections, and multiple databasesExecute basic write operations, and create complex queries to find data with any criteriaUse indexes, aggregation tools, and other advanced query techniquesLearn about monitoring, security and authentication, backup and repair, and moreSet up master-slave and automatic failover replication in MongoDBUse sharding to scale MongoDB horizontally, and learn how it impacts applicationsGet example applications written in Java, PHP, Python, and Ruby
The Passionate Programmer
Chad Fowler - 2009
In this book, you'll learn how to become an entrepreneur, driving your career in the direction of your choosing. You'll learn how to build your software development career step by step, following the same path that you would follow if you were building, marketing, and selling a product. After all, your skills themselves are a product. The choices you make about which technologies to focus on and which business domains to master have at least as much impact on your success as your technical knowledge itself--don't let those choices be accidental. We'll walk through all aspects of the decision-making process, so you can ensure that you're investing your time and energy in the right areas. You'll develop a structured plan for keeping your mind engaged and your skills fresh. You'll learn how to assess your skills in terms of where they fit on the value chain, driving you away from commodity skills and toward those that are in high demand. Through a mix of high-level, thought-provoking essays and tactical "Act on It" sections, you will come away with concrete plans you can put into action immediately. You'll also get a chance to read the perspectives of several highly successful members of our industry from a variety of career paths. As with any product or service, if nobody knows what you're selling, nobody will buy. We'll walk through the often-neglected world of marketing, and you'll create a plan to market yourself both inside your company and to the industry in general. Above all, you'll see how you can set the direction of your career, leading to a more fulfilling and remarkable professional life.
The Mythical Man-Month: Essays on Software Engineering
Frederick P. Brooks Jr. - 1975
With a blend of software engineering facts and thought-provoking opinions, Fred Brooks offers insight for anyone managing complex projects. These essays draw from his experience as project manager for the IBM System/360 computer family and then for OS/360, its massive software system. Now, 45 years after the initial publication of his book, Brooks has revisited his original ideas and added new thoughts and advice, both for readers already familiar with his work and for readers discovering it for the first time.The added chapters contain (1) a crisp condensation of all the propositions asserted in the original book, including Brooks' central argument in The Mythical Man-Month: that large programming projects suffer management problems different from small ones due to the division of labor; that the conceptual integrity of the product is therefore critical; and that it is difficult but possible to achieve this unity; (2) Brooks' view of these propositions a generation later; (3) a reprint of his classic 1986 paper "No Silver Bullet"; and (4) today's thoughts on the 1986 assertion, "There will be no silver bullet within ten years."