Book picks similar to
Creating a Data-Driven Enterprise with DataOps by Ashish Thusoo
big-data
cs-or-tech-related
data-science
data-science-engineering
The Wall Street Journal Guide to Information Graphics: The Dos and Don'ts of Presenting Data, Facts, and Figures
Dona M. Wong - 2009
Yet information graphics is rarely taught in schools or made the focus of on-the-job training. Now, for the first time, Dona M. Wong, a student of the information graphics pioneer Edward Tufte, makes this material available for all of us. In this book, you will learn: how to choose the best chart that fits your data; the most effective way to communicate with decision makers when you have five minutes of their time; how to chart currency fluctuations that affect global business; how to use color effectively; how to make a graphic “colorful” even if only black and white are available. The book is organized in a series of mini-workshops backed up with illustrated examples, so not only will you learn what works and what doesn’t, but you can also see the dos and don’ts for yourself. This is an invaluable reference work for students and professionals in all fields.
Everybody Lies: Big Data, New Data, and What the Internet Can Tell Us About Who We Really Are
Seth Stephens-Davidowitz - 2017
This staggering amount of information, unprecedented in history, can tell us a great deal about who we are: the fears, desires, and behaviors that drive us, and the conscious and unconscious decisions we make. From the profound to the mundane, we can gain astonishing knowledge about the human psyche that, less than twenty years ago, seemed unfathomable. Everybody Lies offers fascinating, surprising, and sometimes laugh-out-loud insights into everything from economics to ethics to sports to race to sex, gender and more, all drawn from the world of big data. What percentage of white voters didn’t vote for Barack Obama because he’s black? Does where you go to school affect how successful you are in life? Do parents secretly favor boy children over girls? Do violent films affect the crime rate? Can you beat the stock market? How regularly do we lie about our sex lives, and who’s more self-conscious about sex, men or women? Investigating these questions and a host of others, Seth Stephens-Davidowitz offers revelations that can help us understand ourselves and our lives better. Drawing on studies and experiments on how we really live and think, he demonstrates in fascinating and often funny ways the extent to which all the world is indeed a lab. With conclusions ranging from strange-but-true to thought-provoking to disturbing, he explores the power of this digital truth serum and its deeper potential: revealing biases deeply embedded within us, information we can use to change our culture, and the questions we’re afraid to ask that might be essential to our health, both emotional and physical. All of us are touched by big data every day, and its influence is multiplying. Everybody Lies challenges us to think differently about how we see it and the world.
Reinforcement Learning: An Introduction
Richard S. Sutton - 1998
Reinforcement learning, one of the most active research areas in artificial intelligence, is a computational approach to learning whereby an agent tries to maximize the total amount of reward it receives when interacting with a complex, uncertain environment. In Reinforcement Learning, Richard Sutton and Andrew Barto provide a clear and simple account of the key ideas and algorithms of reinforcement learning. Their discussion ranges from the history of the field's intellectual foundations to the most recent developments and applications. The only necessary mathematical background is familiarity with elementary concepts of probability. The book is divided into three parts. Part I defines the reinforcement learning problem in terms of Markov decision processes. Part II provides basic solution methods: dynamic programming, Monte Carlo methods, and temporal-difference learning. Part III presents a unified view of the solution methods and incorporates artificial neural networks, eligibility traces, and planning; the two final chapters present case studies and consider the future of reinforcement learning.
Data Mining: Concepts and Techniques (The Morgan Kaufmann Series in Data Management Systems)
Jiawei Han - 2000
Not only are all of our business, scientific, and government transactions now computerized, but the widespread use of digital cameras, publication tools, and bar codes also generates data. On the collection side, scanned text and image platforms, satellite remote sensing systems, and the World Wide Web have flooded us with a tremendous amount of data. This explosive growth has generated an even more urgent need for new techniques and automated tools that can help us transform this data into useful information and knowledge. Like the first edition, voted the most popular data mining book by KDnuggets readers, this book explores concepts and techniques for the discovery of patterns hidden in large data sets, focusing on issues relating to their feasibility, usefulness, effectiveness, and scalability. However, since the publication of the first edition, great progress has been made in the development of new data mining methods, systems, and applications. This new edition substantially enhances the first edition, and new chapters have been added to address recent developments on mining complex types of data, including stream data, sequence data, graph structured data, social network data, and multi-relational data.
A comprehensive, practical look at the concepts and techniques you need to know to get the most out of real business data
Updates that incorporate input from readers, changes in the field, and more material on statistics and machine learning
Dozens of algorithms and implementation examples, all in easily understood pseudo-code and suitable for use in real-world, large-scale data mining projects
Complete classroom support for instructors at www.mkp.com/datamining2e companion site
Data Analytics Made Accessible
Anil Maheshwari - 2014
It is a conversational book that feels easy and informative. This short and lucid book covers everything important, with concrete examples, and invites the reader to join this field. The chapters in the book are organized for a typical one-semester course. The book contains case-lets from real-world stories at the beginning of every chapter. There is a running case study across the chapters as exercises. This book is designed to provide a student with the intuition behind this evolving area, along with a solid toolset of the major data mining techniques and platforms. Students across a variety of academic disciplines, including business, computer science, statistics, engineering, and others, are attracted to the idea of discovering new insights and ideas from data. This book can also be gainfully used by executives, managers, analysts, professors, doctors, accountants, and other professionals to learn how to make sense of the data coming their way. This is a lucid, flowing book that one can finish in one sitting or return to again and again for insights and techniques.
Table of Contents
Chapter 1: Wholeness of Business Intelligence and Data Mining
Chapter 2: Business Intelligence Concepts & Applications
Chapter 3: Data Warehousing
Chapter 4: Data Mining
Chapter 5: Decision Trees
Chapter 6: Regression Models
Chapter 7: Artificial Neural Networks
Chapter 8: Cluster Analysis
Chapter 9: Association Rule Mining
Chapter 10: Text Mining
Chapter 11: Web Mining
Chapter 12: Big Data
Chapter 13: Data Modeling Primer
Appendix: Data Mining Tutorial using Weka
Advances in Financial Machine Learning
Marcos López de Prado - 2018
Today, ML algorithms accomplish tasks that, until recently, only expert humans could perform. And finance is ripe for disruptive innovations that will transform how the following generations understand money and invest.
In the book, readers will learn how to:
Structure big data in a way that is amenable to ML algorithms
Conduct research with ML algorithms on big data
Use supercomputing methods and backtest their discoveries while avoiding false positives
Advances in Financial Machine Learning addresses real-life problems faced by practitioners every day, and explains scientifically sound solutions using math, supported by code and examples. Readers become active users who can test the proposed solutions in their individual setting. Written by a recognized expert and portfolio manager, this book will equip investment professionals with the groundbreaking tools needed to succeed in modern finance.
Agile Data Warehouse Design: Collaborative Dimensional Modeling, from Whiteboard to Star Schema
Lawrence Corr - 2011
This book describes BEAM✲, an agile approach to dimensional modeling, for improving communication between data warehouse designers, BI stakeholders and the whole DW/BI development team. BEAM✲ provides tools and techniques that will encourage DW/BI designers and developers to move away from their keyboards and entity relationship based tools and model interactively with their colleagues. The result is everyone thinks dimensionally from the outset! Developers understand how to efficiently implement dimensional modeling solutions. Business stakeholders feel ownership of the data warehouse they have created, and can already imagine how they will use it to answer their business questions. Within this book, you will learn:
✲ Agile dimensional modeling using Business Event Analysis & Modeling (BEAM✲)
✲ Modelstorming: data modeling that is quicker, more inclusive, more productive, and frankly more fun!
✲ Telling dimensional data stories using the 7Ws (who, what, when, where, how many, why and how)
✲ Modeling by example not abstraction; using data story themes, not crow's feet, to describe detail
✲ Storyboarding the data warehouse to discover conformed dimensions and plan iterative development
✲ Visual modeling: sketching timelines, charts and grids to model complex process measurement - simply
✲ Agile design documentation: enhancing star schemas with BEAM✲ dimensional shorthand notation
✲ Solving difficult DW/BI performance and usability problems with proven dimensional design patterns
Lawrence Corr is a data warehouse designer and educator. As Principal of DecisionOne Consulting, he helps clients to review and simplify their data warehouse designs, and advises vendors on visual data modeling techniques. He regularly teaches agile dimensional modeling courses worldwide and has taught dimensional DW/BI skills to thousands of students.
Jim Stagnitto is a data warehouse and master data management architect specializing in the healthcare, financial services, and information service industries. He is the founder of the data warehousing and data mining consulting firm Llumino.
Forecasting: Principles and Practice
Rob J. Hyndman - 2013
Deciding whether to build another power generation plant in the next five years requires forecasts of future demand. Scheduling staff in a call centre next week requires forecasts of call volumes. Stocking an inventory requires forecasts of stock requirements. Telecommunication routing requires traffic forecasts a few minutes ahead. Whatever the circumstances or time horizons involved, forecasting is an important aid in effective and efficient planning. This textbook provides a comprehensive introduction to forecasting methods and presents enough information about each method for readers to use them sensibly. Examples use R with many data sets taken from the authors' own consulting experience.
Requirements Engineering Fundamentals: A Study Guide for the Certified Professional for Requirements Engineering Exam - Foundation Level - IREB compliant
Klaus Pohl - 2009
In order to ensure a high level of knowledge and training, the International Requirements Engineering Board (IREB) worked out the training concept “Certified Professional for Requirements Engineering”, which defines a requirements engineer’s practical skills on different training levels. The book covers the different subjects of the curriculum for the “Certified Professional for Requirements Engineering” (CPRE) defined by the International Requirements Engineering Board (IREB). It supports its readers in preparing for the test to achieve the “Foundation Level” of the CPRE.
Deep Learning with Python
François Chollet - 2017
It is the technology behind photo tagging systems at Facebook and Google, self-driving cars, speech recognition systems on your smartphone, and much more. In particular, deep learning excels at solving machine perception problems: understanding the content of image data, video data, or sound data. Here's a simple example: say you have a large collection of images, and that you want tags associated with each image, for example, "dog," "cat," etc. Deep learning can allow you to create a system that understands how to map such tags to images, learning only from examples. This system can then be applied to new images, automating the task of photo tagging. A deep learning model only has to be fed examples of a task to start generating useful results on new data.
R for Data Science: Import, Tidy, Transform, Visualize, and Model Data
Hadley Wickham - 2016
This book introduces you to R, RStudio, and the tidyverse, a collection of R packages designed to work together to make data science fast, fluent, and fun. Suitable for readers with no previous programming experience, R for Data Science is designed to get you doing data science as quickly as possible.
Authors Hadley Wickham and Garrett Grolemund guide you through the steps of importing, wrangling, exploring, and modeling your data and communicating the results. You’ll get a complete, big-picture understanding of the data science cycle, along with basic tools you need to manage the details. Each section of the book is paired with exercises to help you practice what you’ve learned along the way.
You’ll learn how to:
Wrangle—transform your datasets into a form convenient for analysis
Program—learn powerful R tools for solving data problems with greater clarity and ease
Explore—examine your data, generate hypotheses, and quickly test them
Model—provide a low-dimensional summary that captures true "signals" in your dataset
Communicate—learn R Markdown for integrating prose, code, and results
Windows 8.1 For Dummies
Andy Rathbone - 2013
Parts cover:
Windows 8.1 Stuff Everybody Thinks You Already Know - an introduction to the dual interfaces, basic mechanics, file storage, and instruction on how to get the free upgrade to Windows 8.1.
Working with Programs, Apps and Files - the basics of finding and launching apps, getting help, and printing.
Getting Things Done on the Internet - instructions for connecting a Windows 8.1 device, using web and social apps, and maintaining privacy.
Customizing and Upgrading Windows 8.1 - Windows 8.1 offers big changes to what a user can customize on the OS. This section shows how to manipulate app tiles, give Windows the look you want, set up boot-to-desktop capabilities, connect to a network, and create user accounts.
Music, Photos and Movies - Windows 8.1 offers new apps and capabilities for working with onboard and online media, all covered in this chapter.
Help! - includes guidance on how to fix common problems, interpret strange messages, move files to a new PC, and use the built-in help system.
The Part of Tens - quick tips for avoiding common annoyances and working with Windows 8.1 on a touch device.
Visualizing Data: Exploring and Explaining Data with the Processing Environment
Ben Fry - 2007
Using a downloadable programming environment developed by the author, Visualizing Data demonstrates methods for representing data accurately on the Web and elsewhere, complete with user interaction, animation, and more. How do the 3.1 billion A, C, G and T letters of the human genome compare to those of a chimp or a mouse? What do the paths that millions of visitors take through a web site look like? With Visualizing Data, you learn how to answer complex questions like these with thoroughly interactive displays. We're not talking about cookie-cutter charts and graphs. This book teaches you how to design entire interfaces around large, complex data sets with the help of a powerful new design and prototyping tool called "Processing". Used by many researchers and companies to convey specific data in a clear and understandable manner, the Processing beta is available free. With this tool and Visualizing Data as a guide, you'll learn basic visualization principles, how to choose the right kind of display for your purposes, and how to provide interactive features that will bring users to your site over and over. This book teaches you:
The seven stages of visualizing data -- acquire, parse, filter, mine, represent, refine, and interact
How all data problems begin with a question and end with a narrative construct that provides a clear answer without extraneous details
Several example projects with the code to make them work
Positive and negative points of each representation discussed. The focus is on customization so that each one best suits what you want to convey about your data set
The book does not provide ready-made "visualizations" that can be plugged into any data set. Instead, with chapters divided by types of data rather than types of display, you'll learn how each visualization conveys the unique properties of the data it represents -- why the data was collected, what's interesting about it, and what stories it can tell.
Visualizing Data teaches you how to answer questions, not simply display information.
Text Mining with R: A Tidy Approach
Julia Silge - 2017
With this practical book, you'll explore text-mining techniques with tidytext, a package that authors Julia Silge and David Robinson developed using the tidy principles behind R packages like ggraph and dplyr. You'll learn how tidytext and other tidy tools in R can make text analysis easier and more effective. The authors demonstrate how treating text as data frames enables you to manipulate, summarize, and visualize characteristics of text. You'll also learn how to integrate natural language processing (NLP) into effective workflows. Practical code examples and data explorations will help you generate real insights from literature, news, and social media.
Learn how to apply the tidy text format to NLP
Use sentiment analysis to mine the emotional content of text
Identify a document's most important terms with frequency measurements
Explore relationships and connections between words with the ggraph and widyr packages
Convert back and forth between R's tidy and non-tidy text formats
Use topic modeling to classify document collections into natural groups
Examine case studies that compare Twitter archives, dig into NASA metadata, and analyze thousands of Usenet messages
Java SE 6: The Complete Reference
Herbert Schildt - 2006
He includes information on Java Platform Standard Edition 6 (Java SE 6) and offers complete coverage of the Java language, its syntax, keywords, and fundamental programming principles.