Bad Data Handbook: Cleaning Up The Data So You Can Get Back To Work


Q. Ethan McCallum - 2012
    In this handbook, data expert Q. Ethan McCallum has gathered 19 colleagues from every corner of the data arena to reveal how they’ve recovered from nasty data problems.From cranky storage to poor representation to misguided policy, there are many paths to bad data. Bottom line? Bad data is data that gets in the way. This book explains effective ways to get around it.Among the many topics covered, you’ll discover how to:Test drive your data to see if it’s ready for analysisWork spreadsheet data into a usable formHandle encoding problems that lurk in text dataDevelop a successful web-scraping effortUse NLP tools to reveal the real sentiment of online reviewsAddress cloud computing issues that can impact your analysis effortAvoid policies that create data analysis roadblocksTake a systematic approach to data quality analysis

Deep Learning


Ian Goodfellow - 2016
    Because the computer gathers knowledge from experience, there is no need for a human computer operator to formally specify all the knowledge that the computer needs. The hierarchy of concepts allows the computer to learn complicated concepts by building them out of simpler ones; a graph of these hierarchies would be many layers deep. This book introduces a broad range of topics in deep learning.The text offers mathematical and conceptual background, covering relevant concepts in linear algebra, probability theory and information theory, numerical computation, and machine learning. It describes deep learning techniques used by practitioners in industry, including deep feedforward networks, regularization, optimization algorithms, convolutional networks, sequence modeling, and practical methodology; and it surveys such applications as natural language processing, speech recognition, computer vision, online recommendation systems, bioinformatics, and videogames. Finally, the book offers research perspectives, covering such theoretical topics as linear factor models, autoencoders, representation learning, structured probabilistic models, Monte Carlo methods, the partition function, approximate inference, and deep generative models.Deep Learning can be used by undergraduate or graduate students planning careers in either industry or research, and by software engineers who want to begin using deep learning in their products or platforms. A website offers supplementary material for both readers and instructors.

Nonlinear Dynamics and Chaos: With Applications to Physics, Biology, Chemistry, and Engineering


Steven H. Strogatz - 1994
    The presentation stresses analytical methods, concrete examples, and geometric intuition. A unique feature of the book is its emphasis on applications. These include mechanical vibrations, lasers, biological rhythms, superconducting circuits, insect outbreaks, chemical oscillators, genetic control systems, chaotic waterwheels, and even a technique for using chaos to send secret messages. In each case, the scientific background is explained at an elementary level and closely integrated with mathematical theory.About the Author:Steven Strogatz is in the Center for Applied Mathematics and the Department of Theoretical and Applied Mathematics at Cornell University. Since receiving his Ph.D. from Harvard university in 1986, Professor Strogatz has been honored with several awards, including the E.M. Baker Award for Excellence, the highest teaching award given by MIT.

The Elements of User Experience: User-Centered Design for the Web


Jesse James Garrett - 2002
    This book aims to minimize the complexity of user-centered design for the Web with explanations and illustrations that focus on ideas rather than tools or techniques.

The Fourth Transformation: How Augmented Reality and Artificial Intelligence Change Everything


Robert Scoble - 2016
    

Computers and Intractability: A Guide to the Theory of NP-Completeness


Michael R. Garey - 1979
    Johnson. It was the first book exclusively on the theory of NP-completeness and computational intractability. The book features an appendix providing a thorough compendium of NP-complete problems (which was updated in later printings of the book). The book is now outdated in some respects as it does not cover more recent development such as the PCP theorem. It is nevertheless still in print and is regarded as a classic: in a 2006 study, the CiteSeer search engine listed the book as the most cited reference in computer science literature.

The Big Book of Dashboards: Visualizing Your Data Using Real-World Business Scenarios


Steve Wexler - 2017
    It's great to have theory and evidenced-based research at your disposal, but what will you do when somebody asks you to make your dashboard 'cooler' by adding packed bubbles and donut charts?The expert authors have a combined 30-plus years of hands-on experience helping people in hundreds of organizations build effective visualizations. They have fought many 'best practices' battles and having endured bring an uncommon empathy to help you, the reader of this book, survive and thrive in the data visualization world.A well-designed dashboard can point out risks, opportunities, and more; but common challenges and misconceptions can make your dashboard useless at best, and misleading at worst. The Big Book of Dashboards gives you the tools, guidance, and models you need to produce great dashboards that inform, enlighten, and engage.

The Society of Mind


Marvin Minsky - 1985
    Mirroring his theory, Minsky boldly casts The Society of Mind as an intellectual puzzle whose pieces are assembled along the way. Each chapter -- on a self-contained page -- corresponds to a piece in the puzzle. As the pages turn, a unified theory of the mind emerges, like a mosaic. Ingenious, amusing, and easy to read, The Society of Mind is an adventure in imagination.

Behind Every Good Decision: How Anyone Can Use Business Analytics to Turn Data into Profitable Insight


Piyanka Jain - 2014
    Nothing could be further from the truth. In Behind Every Good Decision, authors and analytics experts Piyanka Jain and Puneet Sharma demonstrate how professionals at any level can take the information at their disposal and leverage it to make better decisions. The authors’ streamlined frame work demystifies the process of business analytics and helps anyone move from data to decisions in just five steps…using only Excel as a tool. Readers will learn how to: Clarify the business question • Lay out a hypothesis-driven plan • Pull relevant data • Convert it to insights • Make decisions that make an impact Packed with examples and exercises, this refreshingly accessible book explains the four fundamental analytic techniques that can help solve a surprising 80% of all business problems. Business analytics isn’t rocket science—it’s a simple problem-solving tool that can help companies increase revenue, decrease costs, improve products, and delight customers. And who doesn’t want to do that?

Computer Architecture: A Quantitative Approach


John L. Hennessy - 2006
    Today, Intel and other semiconductor firms are abandoning the single fast processor model in favor of multi-core microprocessors--chips that combine two or more processors in a single package. In the fourth edition of "Computer Architecture," the authors focus on this historic shift, increasing their coverage of multiprocessors and exploring the most effective ways of achieving parallelism as the key to unlocking the power of multiple processor architectures. Additionally, the new edition has expanded and updated coverage of design topics beyond processor performance, including power, reliability, availability, and dependability. CD System Requirements"PDF Viewer"The CD material includes PDF documents that you can read with a PDF viewer such as Adobe, Acrobat or Adobe Reader. Recent versions of Adobe Reader for some platforms are included on the CD. "HTML Browser"The navigation framework on this CD is delivered in HTML and JavaScript. It is recommended that you install the latest version of your favorite HTML browser to view this CD. The content has been verified under Windows XP with the following browsers: Internet Explorer 6.0, Firefox 1.5; under Mac OS X (Panther) with the following browsers: Internet Explorer 5.2, Firefox 1.0.6, Safari 1.3; and under Mandriva Linux 2006 with the following browsers: Firefox 1.0.6, Konqueror 3.4.2, Mozilla 1.7.11. The content is designed to be viewed in a browser window that is at least 720 pixels wide. You may find the content does not display well if your display is not set to at least 1024x768 pixel resolution. "Operating System"This CD can be used under any operating system that includes an HTML browser and a PDF viewer. This includes Windows, Mac OS, and most Linux and Unix systems. Increased coverage on achieving parallelism with multiprocessors. Case studies of latest technology from industry including the Sun Niagara Multiprocessor, AMD Opteron, and Pentium 4. Three review appendices, included in the printed volume, review the basic and intermediate principles the main text relies upon. Eight reference appendices, collected on the CD, cover a range of topics including specific architectures, embedded systems, application specific processors--some guest authored by subject experts.

The Nostalgia Nerd's Retro Tech: Computer, Consoles & Games


Peter Leigh - 2018
    Remember what a wild frontier the early days of home gaming were? Manufacturers releasing new consoles at a breakneck pace; developers creating games that kept us up all night, then going bankrupt the next day; and what self-respecting kid didn't beg their parents for an Atari or a Nintendo? This explosion of computers, consoles, and games was genuinely unlike anything the tech world has seen before or since.This thoroughly researched and geeky trip down memory lane pulls together the most entertaining stories from this dynamic era, and brings you the classic tech that should never be forgotten.

Planning for Big Data


Edd Wilder-James - 2004
    From creating new data-driven products through to increasing operational efficiency, big data has the potential to makeyour organization both more competitive and more innovative.As this emerging field transitions from the bleeding edge to enterprise infrastructure, it's vital to understand not only the technologies involved, but the organizational and cultural demands of being data-driven.Written by O'Reilly Radar's experts on big data, this anthology describes:- The broad industry changes heralded by the big data era- What big data is, what it means to your business, and how to start solving data problems- The software that makes up the Hadoop big data stack, and the major enterprise vendors' Hadoop solutions- The landscape of NoSQL databases and their relative merits- How visualization plays an important part in data work

An Introduction to Statistical Learning: With Applications in R


Gareth James - 2013
    This book presents some of the most important modeling and prediction techniques, along with relevant applications. Topics include linear regression, classification, resampling methods, shrinkage approaches, tree- based methods, support vector machines, clustering, and more. Color graphics and real-world examples are used to illustrate the methods presented. Since the goal of this textbook is to facilitate the use of these statistical learning techniques by practitioners in science, industry, and other fields, each chapter contains a tutorial on implementing the analyses and methods presented in R, an extremely popular open source statistical software platform. Two of the authors co-wrote The Elements of Statistical Learning (Hastie, Tibshirani and Friedman, 2nd edition 2009), a popular reference book for statistics and machine learning researchers. An Introduction to Statistical Learning covers many of the same topics, but at a level accessible to a much broader audience. This book is targeted at statisticians and non-statisticians alike who wish to use cutting-edge statistical learning techniques to analyze their data. The text assumes only a previous course in linear regression and no knowledge of matrix algebra.

Computational Complexity


Christos H. Papadimitriou - 1993
    It offers a comprehensive and accessible treatment of the theory of algorithms and complexity—the elegant body of concepts and methods developed by computer scientists over the past 30 years for studying the performance and limitations of computer algorithms. The book is self-contained in that it develops all necessary mathematical prerequisites from such diverse fields such as computability, logic, number theory and probability.

Data Analysis with Open Source Tools: A Hands-On Guide for Programmers and Data Scientists


Philipp K. Janert - 2010
    With this insightful book, intermediate to experienced programmers interested in data analysis will learn techniques for working with data in a business environment. You'll learn how to look at data to discover what it contains, how to capture those ideas in conceptual models, and then feed your understanding back into the organization through business plans, metrics dashboards, and other applications.Along the way, you'll experiment with concepts through hands-on workshops at the end of each chapter. Above all, you'll learn how to think about the results you want to achieve -- rather than rely on tools to think for you.Use graphics to describe data with one, two, or dozens of variablesDevelop conceptual models using back-of-the-envelope calculations, as well asscaling and probability argumentsMine data with computationally intensive methods such as simulation and clusteringMake your conclusions understandable through reports, dashboards, and other metrics programsUnderstand financial calculations, including the time-value of moneyUse dimensionality reduction techniques or predictive analytics to conquer challenging data analysis situationsBecome familiar with different open source programming environments for data analysisFinally, a concise reference for understanding how to conquer piles of data.--Austin King, Senior Web Developer, MozillaAn indispensable text for aspiring data scientists.--Michael E. Driscoll, CEO/Founder, Dataspora