Book picks similar to
Big data @ work : dispelling the myths, uncovering the opportunities by Thomas H. Davenport
business
technology
data-science
non-fiction
Doing Data Science
Cathy O'Neil - 2013
But how can you get started working in a wide-ranging, interdisciplinary field that’s so clouded in hype? This insightful book, based on Columbia University’s Introduction to Data Science class, tells you what you need to know.In many of these chapter-long lectures, data scientists from companies such as Google, Microsoft, and eBay share new algorithms, methods, and models by presenting case studies and the code they use. If you’re familiar with linear algebra, probability, and statistics, and have programming experience, this book is an ideal introduction to data science.Topics include:Statistical inference, exploratory data analysis, and the data science processAlgorithmsSpam filters, Naive Bayes, and data wranglingLogistic regressionFinancial modelingRecommendation engines and causalityData visualizationSocial networks and data journalismData engineering, MapReduce, Pregel, and HadoopDoing Data Science is collaboration between course instructor Rachel Schutt, Senior VP of Data Science at News Corp, and data science consultant Cathy O’Neil, a senior data scientist at Johnson Research Labs, who attended and blogged about the course.
Big Data: A Revolution That Will Transform How We Live, Work, and Think
Viktor Mayer-Schönberger - 2013
“Big data” refers to our burgeoning ability to crunch vast collections of information, analyze it instantly, and draw sometimes profoundly surprising conclusions from it. This emerging science can translate myriad phenomena—from the price of airline tickets to the text of millions of books—into searchable form, and uses our increasing computing power to unearth epiphanies that we never could have seen before. A revolution on par with the Internet or perhaps even the printing press, big data will change the way we think about business, health, politics, education, and innovation in the years to come. It also poses fresh threats, from the inevitable end of privacy as we know it to the prospect of being penalized for things we haven’t even done yet, based on big data’s ability to predict our future behavior.In this brilliantly clear, often surprising work, two leading experts explain what big data is, how it will change our lives, and what we can do to protect ourselves from its hazards. Big Data is the first big book about the next big thing.www.big-data-book.com
The Data Warehouse Toolkit: The Complete Guide to Dimensional Modeling
Ralph Kimball - 1996
Here is a complete library of dimensional modeling techniques-- the most comprehensive collection ever written. Greatly expanded to cover both basic and advanced techniques for optimizing data warehouse design, this second edition to Ralph Kimball's classic guide is more than sixty percent updated.The authors begin with fundamental design recommendations and gradually progress step-by-step through increasingly complex scenarios. Clear-cut guidelines for designing dimensional models are illustrated using real-world data warehouse case studies drawn from a variety of business application areas and industries, including:* Retail sales and e-commerce* Inventory management* Procurement* Order management* Customer relationship management (CRM)* Human resources management* Accounting* Financial services* Telecommunications and utilities* Education* Transportation* Health care and insuranceBy the end of the book, you will have mastered the full range of powerful techniques for designing dimensional databases that are easy to understand and provide fast query response. You will also learn how to create an architected framework that integrates the distributed data warehouse using standardized dimensions and facts.This book is also available as part of the Kimball's Data Warehouse Toolkit Classics Box Set (ISBN: 9780470479575) with the following 3 books:The Data Warehouse Toolkit, 2nd Edition (9780471200246)The Data Warehouse Lifecycle Toolkit, 2nd Edition (9780470149775)The Data Warehouse ETL Toolkit (9780764567575)
The Filter Bubble: What the Internet is Hiding From You
Eli Pariser - 2011
Instead of giving you the most broadly popular result, Google now tries to predict what you are most likely to click on. According to MoveOn.org board president Eli Pariser, Google's change in policy is symptomatic of the most significant shift to take place on the Web in recent years - the rise of personalization. In this groundbreaking investigation of the new hidden Web, Pariser uncovers how this growing trend threatens to control how we consume and share information as a society-and reveals what we can do about it.Though the phenomenon has gone largely undetected until now, personalized filters are sweeping the Web, creating individual universes of information for each of us. Facebook - the primary news source for an increasing number of Americans - prioritizes the links it believes will appeal to you so that if you are a liberal, you can expect to see only progressive links. Even an old-media bastion like "The Washington Post" devotes the top of its home page to a news feed with the links your Facebook friends are sharing. Behind the scenes a burgeoning industry of data companies is tracking your personal information to sell to advertisers, from your political leanings to the color you painted your living room to the hiking boots you just browsed on Zappos.In a personalized world, we will increasingly be typed and fed only news that is pleasant, familiar, and confirms our beliefs - and because these filters are invisible, we won't know what is being hidden from us. Our past interests will determine what we are exposed to in the future, leaving less room for the unexpected encounters that spark creativity, innovation, and the democratic exchange of ideas.While we all worry that the Internet is eroding privacy or shrinking our attention spans, Pariser uncovers a more pernicious and far-reaching trend on the Internet and shows how we can - and must - change course. With vivid detail and remarkable scope, The Filter Bubble reveals how personalization undermines the Internet's original purpose as an open platform for the spread of ideas and could leave us all in an isolated, echoing world.
Hit Refresh: The Quest to Rediscover Microsoft's Soul and Imagine a Better Future for Everyone
Satya Nadella - 2017
It’s about how people, organizations and societies can and must hit refresh—transform—in their persistent quest for new energy, new ideas, relevance and renewal. At the core, it’s about us humans and our unique qualities, like empathy, which will become ever more valuable in a world where the torrent of technology will disrupt like never before. As much a humanist as a technologist, Nadella defines his mission and that of the company he leads as empowering every person and every organization on the planet to achieve more.
Hands-On Machine Learning with Scikit-Learn and TensorFlow
Aurélien Géron - 2017
Now that machine learning is thriving, even programmers who know close to nothing about this technology can use simple, efficient tools to implement programs capable of learning from data. This practical book shows you how.By using concrete examples, minimal theory, and two production-ready Python frameworks—Scikit-Learn and TensorFlow—author Aurélien Géron helps you gain an intuitive understanding of the concepts and tools for building intelligent systems. You’ll learn how to use a range of techniques, starting with simple Linear Regression and progressing to Deep Neural Networks. If you have some programming experience and you’re ready to code a machine learning project, this guide is for you.This hands-on book shows you how to use:Scikit-Learn, an accessible framework that implements many algorithms efficiently and serves as a great machine learning entry pointTensorFlow, a more complex library for distributed numerical computation, ideal for training and running very large neural networksPractical code examples that you can apply without learning excessive machine learning theory or algorithm details
Introduction to Machine Learning with Python: A Guide for Data Scientists
Andreas C. Müller - 2015
If you use Python, even as a beginner, this book will teach you practical ways to build your own machine learning solutions. With all the data available today, machine learning applications are limited only by your imagination.You'll learn the steps necessary to create a successful machine-learning application with Python and the scikit-learn library. Authors Andreas Muller and Sarah Guido focus on the practical aspects of using machine learning algorithms, rather than the math behind them. Familiarity with the NumPy and matplotlib libraries will help you get even more from this book.With this book, you'll learn:Fundamental concepts and applications of machine learningAdvantages and shortcomings of widely used machine learning algorithmsHow to represent data processed by machine learning, including which data aspects to focus onAdvanced methods for model evaluation and parameter tuningThe concept of pipelines for chaining models and encapsulating your workflowMethods for working with text data, including text-specific processing techniquesSuggestions for improving your machine learning and data science skills
Thinking in Systems: A Primer
Donella H. Meadows - 2008
Edited by the Sustainability Institute’s Diana Wright, this essential primer brings systems thinking out of the realm of computers and equations and into the tangible world, showing readers how to develop the systems-thinking skills that thought leaders across the globe consider critical for 21st-century life.Some of the biggest problems facing the world—war, hunger, poverty, and environmental degradation—are essentially system failures. They cannot be solved by fixing one piece in isolation from the others, because even seemingly minor details have enormous power to undermine the best efforts of too-narrow thinking.While readers will learn the conceptual tools and methods of systems thinking, the heart of the book is grander than methodology. Donella Meadows was known as much for nurturing positive outcomes as she was for delving into the science behind global dilemmas. She reminds readers to pay attention to what is important, not just what is quantifiable, to stay humble, and to stay a learner.In a world growing ever more complicated, crowded, and interdependent, Thinking in Systems helps readers avoid confusion and helplessness, the first step toward finding proactive and effective solutions.
Think Like a Programmer: An Introduction to Creative Problem Solving
V. Anton Spraul - 2012
In this one-of-a-kind text, author V. Anton Spraul breaks down the ways that programmers solve problems and teaches you what other introductory books often ignore: how to Think Like a Programmer. Each chapter tackles a single programming concept, like classes, pointers, and recursion, and open-ended exercises throughout challenge you to apply your knowledge. You'll also learn how to:Split problems into discrete components to make them easier to solve Make the most of code reuse with functions, classes, and libraries Pick the perfect data structure for a particular job Master more advanced programming tools like recursion and dynamic memory Organize your thoughts and develop strategies to tackle particular types of problems Although the book's examples are written in C++, the creative problem-solving concepts they illustrate go beyond any particular language; in fact, they often reach outside the realm of computer science. As the most skillful programmers know, writing great code is a creative art—and the first step in creating your masterpiece is learning to Think Like a Programmer.
Designing Data-Intensive Applications
Martin Kleppmann - 2015
Difficult issues need to be figured out, such as scalability, consistency, reliability, efficiency, and maintainability. In addition, we have an overwhelming variety of tools, including relational databases, NoSQL datastores, stream or batch processors, and message brokers. What are the right choices for your application? How do you make sense of all these buzzwords?In this practical and comprehensive guide, author Martin Kleppmann helps you navigate this diverse landscape by examining the pros and cons of various technologies for processing and storing data. Software keeps changing, but the fundamental principles remain the same. With this book, software engineers and architects will learn how to apply those ideas in practice, and how to make full use of data in modern applications. Peer under the hood of the systems you already use, and learn how to use and operate them more effectively Make informed decisions by identifying the strengths and weaknesses of different tools Navigate the trade-offs around consistency, scalability, fault tolerance, and complexity Understand the distributed systems research upon which modern databases are built Peek behind the scenes of major online services, and learn from their architectures
Data Analytics Made Accessible
Anil Maheshwari - 2014
It is a conversational book that feels easy and informative. This short and lucid book covers everything important, with concrete examples, and invites the reader to join this field. The chapters in the book are organized for a typical one-semester course. The book contains case-lets from real-world stories at the beginning of every chapter. There is a running case study across the chapters as exercises. This book is designed to provide a student with the intuition behind this evolving area, along with a solid toolset of the major data mining techniques and platforms. Students across a variety of academic disciplines, including business, computer science, statistics, engineering, and others are attracted to the idea of discovering new insights and ideas from data. This book can also be gainfully used by executives, managers, analysts, professors, doctors, accountants, and other professionals to learn how to make sense of the data coming their way. This is a lucid flowing book that one can finish in one sitting, or can return to it again and again for insights and techniques. Table of Contents Chapter 1: Wholeness of Business Intelligence and Data Mining Chapter 2: Business Intelligence Concepts & Applications Chapter 3: Data Warehousing Chapter 4: Data Mining Chapter 5: Decision Trees Chapter 6: Regression Models Chapter 7: Artificial Neural Networks Chapter 8: Cluster Analysis Chapter 9: Association Rule Mining Chapter 10: Text Mining Chapter 11: Web Mining Chapter 12: Big Data Chapter 13: Data Modeling Primer Appendix: Data Mining Tutorial using Weka
Web Analytics 2.0: The Art of Online Accountability & Science of Customer Centricity [With CDROM]
Avinash Kaushik - 2009
"Web Analytics 2.0" presents a new framework that will permanently change how you think about analytics. It provides specific recommendations for creating an actionable strategy, applying analytical techniques correctly, solving challenges such as measuring social media and multichannel campaigns, achieving optimal success by leveraging experimentation, and employing tactics for truly listening to your customers. The book will help your organization become more data driven while you become a super analysis ninja Note: CD-ROM/DVD and other supplementary materials are not included as part of eBook file.
Information Dashboard Design: The Effective Visual Communication of Data
Stephen Few - 2006
Although dashboards are potentially powerful, this potential is rarely realized. The greatest display technology in the world won't solve this if you fail to use effective visual design. And if a dashboard fails to tell you precisely what you need to know in an instant, you'll never use it, even if it's filled with cute gauges, meters, and traffic lights. Don't let your investment in dashboard technology go to waste.This book will teach you the visual design skills you need to create dashboards that communicate clearly, rapidly, and compellingly. Information Dashboard Design will explain how to:Avoid the thirteen mistakes common to dashboard design Provide viewers with the information they need quickly and clearly Apply what we now know about visual perception to the visual presentation of information Minimize distractions, cliches, and unnecessary embellishments that create confusion Organize business information to support meaning and usability Create an aesthetically pleasing viewing experience Maintain consistency of design to provide accurate interpretation Optimize the power of dashboard technology by pairing it with visual effectiveness Stephen Few has over 20 years of experience as an IT innovator, consultant, and educator. As Principal of the consultancy Perceptual Edge, Stephen focuses on data visualization for analyzing and communicating quantitative business information. He provides consulting and training services, speaks frequently at conferences, and teaches in the MBA program at the University of California in Berkeley. He is also the author of Show Me the Numbers: Designing Tables and Graphs to Enlighten. Visit his website at www.perceptualedge.com.
The Mythical Man-Month: Essays on Software Engineering
Frederick P. Brooks Jr. - 1975
With a blend of software engineering facts and thought-provoking opinions, Fred Brooks offers insight for anyone managing complex projects. These essays draw from his experience as project manager for the IBM System/360 computer family and then for OS/360, its massive software system. Now, 45 years after the initial publication of his book, Brooks has revisited his original ideas and added new thoughts and advice, both for readers already familiar with his work and for readers discovering it for the first time.The added chapters contain (1) a crisp condensation of all the propositions asserted in the original book, including Brooks' central argument in The Mythical Man-Month: that large programming projects suffer management problems different from small ones due to the division of labor; that the conceptual integrity of the product is therefore critical; and that it is difficult but possible to achieve this unity; (2) Brooks' view of these propositions a generation later; (3) a reprint of his classic 1986 paper "No Silver Bullet"; and (4) today's thoughts on the 1986 assertion, "There will be no silver bullet within ten years."
The Manager's Path: A Guide for Tech Leaders Navigating Growth and Change
Camille Fournier - 2017
Tech companies in general lack the experience, tools, texts, and frameworks to do it well. And the handful of books that share tips and tricks of engineering management don t explain how to supervise employees in the face of growth and change.In this book, author Camille Fournier takes you through the stages of technical management, from mentoring interns to working with the senior staff. You ll get actionable advice for approaching various obstacles in your path, whether you re a new manager, a mentor, or a more experienced leader looking for fresh advice. Pick up this book and learn how to become a better manager and leader in your organization. * Discover how to manage small teams and large/multi-level teams * Understand how to build and bootstrap a unifying culture in teams * Deal with people problems and learn how to mentor other managers and new leaders * Learn how to manage yourself: avoid common pitfalls that challenge many leaders * Obtain several practices that you can incorporate and practice along the way