Book picks similar to
Data Strategy by Sid Adelman
data
physical
tech
april-10
Beautiful Visualization: Looking at Data through the Eyes of Experts
Julie Steele - 2010
Think of the familiar map of the New York City subway system, or a diagram of the human brain. Successful visualizations are beautiful not only for their aesthetic design, but also for elegant layers of detail that efficiently generate insight and new understanding.This book examines the methods of two dozen visualization experts who approach their projects from a variety of perspectives -- as artists, designers, commentators, scientists, analysts, statisticians, and more. Together they demonstrate how visualization can help us make sense of the world.Explore the importance of storytelling with a simple visualization exerciseLearn how color conveys information that our brains recognize before we're fully aware of itDiscover how the books we buy and the people we associate with reveal clues to our deeper selvesRecognize a method to the madness of air travel with a visualization of civilian air trafficFind out how researchers investigate unknown phenomena, from initial sketches to published papers Contributors include:Nick Bilton, Michael E. Driscoll, Jonathan Feinberg, Danyel Fisher, Jessica Hagy, Gregor Hochmuth, Todd Holloway, Noah Iliinsky, Eddie Jabbour, Valdean Klump, Aaron Koblin, Robert Kosara, Valdis Krebs, JoAnn Kuchera-Morin et al., Andrew Odewahn, Adam Perer, Anders Persson, Maximilian Schich, Matthias Shapiro, Julie Steele, Moritz Stefaner, Jer Thorp, Fernanda Viegas, Martin Wattenberg, and Michael Young.
Streaming Systems
Tyler Akidau - 2018
As more and more businesses seek to tame the massive unbounded data sets that pervade our world, streaming systems have finally reached a level of maturity sufficient for mainstream adoption. With this practical guide, data engineers, data scientists, and developers will learn how to work with streaming data in a conceptual and platform-agnostic way.Expanded from Tyler Akidau's popular blog posts Streaming 101 and Streaming 102, this book takes you from an introductory level to a nuanced understanding of the what, where, when, and how of processing real-time data streams. You'll also dive deep into watermarks and exactly-once processing with co-authors Slava Chernyak and Reuven Lax.You'll explore:How streaming and batch data processing patterns compareThe core principles and concepts behind robust out-of-order data processingHow watermarks track progress and completeness in infinite datasetsHow exactly-once data processing techniques ensure correctnessHow the concepts of streams and tables form the foundations of both batch and streaming data processingThe practical motivations behind a powerful persistent state mechanism, driven by a real-world exampleHow time-varying relations provide a link between stream processing and the world of SQL and relational algebra
Pattern Recognition and Machine Learning
Christopher M. Bishop - 2006
However, these activities can be viewed as two facets of the same field, and together they have undergone substantial development over the past ten years. In particular, Bayesian methods have grown from a specialist niche to become mainstream, while graphical models have emerged as a general framework for describing and applying probabilistic models. Also, the practical applicability of Bayesian methods has been greatly enhanced through the development of a range of approximate inference algorithms such as variational Bayes and expectation propagation. Similarly, new models based on kernels have had a significant impact on both algorithms and applications. This new textbook reflects these recent developments while providing a comprehensive introduction to the fields of pattern recognition and machine learning. It is aimed at advanced undergraduates or first-year PhD students, as well as researchers and practitioners, and assumes no previous knowledge of pattern recognition or machine learning concepts. Knowledge of multivariate calculus and basic linear algebra is required, and some familiarity with probabilities would be helpful though not essential as the book includes a self-contained introduction to basic probability theory.
SQL Antipatterns
Bill Karwin - 2010
Now he's sharing his collection of antipatterns--the most common errors he's identified in those thousands of requests for help. Most developers aren't SQL experts, and most of the SQL that gets used is inefficient, hard to maintain, and sometimes just plain wrong. This book shows you all the common mistakes, and then leads you through the best fixes. What's more, it shows you what's behind these fixes, so you'll learn a lot about relational databases along the way. Each chapter in this book helps you identify, explain, and correct a unique and dangerous antipattern. The four parts of the book group the antipatterns in terms of logical database design, physical database design, queries, and application development. The chances are good that your application's database layer already contains problems such as Index Shotgun, Keyless Entry, Fear of the Unknown, and Spaghetti Query. This book will help you and your team find them. Even better, it will also show you how to fix them, and how to avoid these and other problems in the future. SQL Antipatterns gives you a rare glimpse into an SQL expert's playbook. Now you can stamp out these common database errors once and for all. Whatever platform or programming language you use, whether you're a junior programmer or a Ph.D., SQL Antipatterns will show you how to design and build databases, how to write better database queries, and how to integrate SQL programming with your application like an expert. You'll also learn the best and most current technology for full-text search, how to design code that is resistant to SQL injection attacks, and other techniques for success.
Agile IT Organization Design: For Digital Transformation and Continuous Delivery
Sriram Narayan - 2015
Now, pioneering ThoughtWorks software engineering expert Sriram Narayan shows how to do just that. Drawing on 15+ years working with leaders in telecommunications, finance, energy, retail, and beyond, he introduces a comprehensive agile approach to "Business-IT Effectiveness" that is as practical as it is valuable. Narayan demonstrates how to integrate agility throughout sales, marketing, product development, engineering, and operations, helping each function deliver more value individually and through its linkages with the rest of the business. Addressing people, process, and technology, he guides you in improving both the dynamic and static aspects of organization design, addressing team structure, accountability structures, organizational norms and culture, knowledge management, and more. Using real examples, Narayan helps you evaluate and improve organization designs to enhance autonomy, mastery, and purpose. You'll learn how to eliminate the specific organizational silos that cause the most problems... improve communication in organizations that claim to be (but aren't really) non-hierarchical... optimize the way you build teams, design office space, and even choose tools. Simply put, Agile IT Organization Design will help you improve improving the performance of any software organization by propagating agile wherever it makes sense and offers value.
The Product Marketing Manager: Responsibilities and Best Practices in a Technology Company
Lucas Weber - 2017
This involves taking detailed and technical product information and distilling it into key marketing and sales messages as well as working among several teams in an organization to plan and execute product releases and launches. This book is a must-have for anyone who works as, or with, a Product Marketing Manager. It not only explains the role but focuses on practical applications of the information presented and ties everything together with entertaining life lessons and anecdotes collected through years of experience by the author as well as interviews with his colleagues and other industry experts. If you are considering a career as a Product Marketing Manager, are new to the profession and looking for guidance and clarification, already have many years of experience in the role and are looking for new inspiration and ideas, or are interested in learning what a Product Marketing Manager colleague of yours is responsible for within your organization, this book is for you.
Invisible Women: Data Bias in a World Designed for Men
Caroline Criado Pérez - 2019
From economic development, to healthcare, to education and public policy, we rely on numbers to allocate resources and make crucial decisions. But because so much data fails to take into account gender, because it treats men as the default and women as atypical, bias and discrimination are baked into our systems. And women pay tremendous costs for this bias, in time, money, and often with their lives.Celebrated feminist advocate Caroline Criado Perez investigates the shocking root cause of gender inequality and research in Invisible Women, diving into women’s lives at home, the workplace, the public square, the doctor’s office, and more. Built on hundreds of studies in the US, the UK, and around the world, and written with energy, wit, and sparkling intelligence, this is a groundbreaking, unforgettable exposé that will change the way you look at the world.
Micro-Isv: From Vision to Reality
Bob Walsh - 2006
As for the latter, are you a programmer and curious about being your own boss? Where do you turn for information? Until now, online and traditional literature havent caught up with the reality of the post-dot com bust.Micro-ISV: From Vision to Reality explains what works and why in today's emerging micro-ISV sector. Currently, thousands of programmers build and deliver great solutions ISV-style, earning success and revenues much larger than you might guess. Written by and for micro-ISVs, with help from some of the leaders of the field, this book takes you beyond just daydreaming to running your own business. It thoroughly explores how it is indeed possible to launch and maintain a small and successful ISV business, and is an ideal read if you're interested in getting started.
Rise of the Data Cloud
Frank Slootman - 2020
Introduction to Computation and Programming Using Python
John V. Guttag - 2013
It provides students with skills that will enable them to make productive use of computational techniques, including some of the tools and techniques of "data science" for using computation to model and interpret data. The book is based on an MIT course (which became the most popular course offered through MIT's OpenCourseWare) and was developed for use not only in a conventional classroom but in in a massive open online course (or MOOC) offered by the pioneering MIT--Harvard collaboration edX.Students are introduced to Python and the basics of programming in the context of such computational concepts and techniques as exhaustive enumeration, bisection search, and efficient approximation algorithms. The book does not require knowledge of mathematics beyond high school algebra, but does assume that readers are comfortable with rigorous thinking and not intimidated by mathematical concepts. Although it covers such traditional topics as computational complexity and simple algorithms, the book focuses on a wide range of topics not found in most introductory texts, including information visualization, simulations to model randomness, computational techniques to understand data, and statistical techniques that inform (and misinform) as well as two related but relatively advanced topics: optimization problems and dynamic programming.Introduction to Computation and Programming Using Python can serve as a stepping-stone to more advanced computer science courses, or as a basic grounding in computational problem solving for students in other disciplines.
Learning Spark: Lightning-Fast Big Data Analysis
Holden Karau - 2013
How can you work with it efficiently? Recently updated for Spark 1.3, this book introduces Apache Spark, the open source cluster computing system that makes data analytics fast to write and fast to run. With Spark, you can tackle big datasets quickly through simple APIs in Python, Java, and Scala. This edition includes new information on Spark SQL, Spark Streaming, setup, and Maven coordinates.
Written by the developers of Spark, this book will have data scientists and engineers up and running in no time. You’ll learn how to express parallel jobs with just a few lines of code, and cover applications from simple batch jobs to stream processing and machine learning.
Quickly dive into Spark capabilities such as distributed datasets, in-memory caching, and the interactive shell
Leverage Spark’s powerful built-in libraries, including Spark SQL, Spark Streaming, and MLlib
Use one programming paradigm instead of mixing and matching tools like Hive, Hadoop, Mahout, and Storm
Learn how to deploy interactive, batch, and streaming applications
Connect to data sources including HDFS, Hive, JSON, and S3
Master advanced topics like data partitioning and shared variables
Get Your Hands Dirty on Clean Architecture: A hands-on guide to creating clean web applications with code examples in Java
Tom Hombergs - 2019
Applied Predictive Modeling
Max Kuhn - 2013
Non- mathematical readers will appreciate the intuitive explanations of the techniques while an emphasis on problem-solving with real data across a wide variety of applications will aid practitioners who wish to extend their expertise. Readers should have knowledge of basic statistical ideas, such as correlation and linear regression analysis. While the text is biased against complex equations, a mathematical background is needed for advanced topics. Dr. Kuhn is a Director of Non-Clinical Statistics at Pfizer Global R&D in Groton Connecticut. He has been applying predictive models in the pharmaceutical and diagnostic industries for over 15 years and is the author of a number of R packages. Dr. Johnson has more than a decade of statistical consulting and predictive modeling experience in pharmaceutical research and development. He is a co-founder of Arbor Analytics, a firm specializing in predictive modeling and is a former Director of Statistics at Pfizer Global R&D. His scholarly work centers on the application and development of statistical methodology and learning algorithms. Applied Predictive Modeling covers the overall predictive modeling process, beginning with the crucial steps of data preprocessing, data splitting and foundations of model tuning. The text then provides intuitive explanations of numerous common and modern regression and classification techniques, always with an emphasis on illustrating and solving real data problems. Addressing practical concerns extends beyond model fitting to topics such as handling class imbalance, selecting predictors, and pinpointing causes of poor model performance-all of which are problems that occur frequently in practice. The text illustrates all parts of the modeling process through many hands-on, real-life examples. And every chapter contains extensive R code f
Forecasting: Principles and Practice
Rob J. Hyndman - 2013
Deciding whether to build another power generation plant in the next five years requires forecasts of future demand. Scheduling staff in a call centre next week requires forecasts of call volumes. Stocking an inventory requires forecasts of stock requirements. Telecommunication routing requires traffic forecasts a few minutes ahead. Whatever the circumstances or time horizons involved, forecasting is an important aid in effective and efficient planning. This textbook provides a comprehensive introduction to forecasting methods and presents enough information about each method for readers to use them sensibly. Examples use R with many data sets taken from the authors' own consulting experience.