The Art of Data Science: A Guide for Anyone Who Works with Data


Roger D. Peng - 2015
    The authors have extensive experience both managing data analysts and conducting their own data analyses, and have carefully observed what produces coherent results and what fails to produce useful insights into data. This book is a distillation of their experience in a format that is applicable to both practitioners and managers in data science.

Mental Models: Aligning Design Strategy with Human Behavior


Indi Young - 2008
    One of the best ways is to understand users' reasons for doing things. Mental Models gives you the tools to help you grasp, and design for, those reasons. Adaptive Path co-founder Indi Young has written a roll-up-your-sleeves book for designers, managers, and anyone else interested in making design strategic, and successful.

The Information Design Handbook


Jenn Visocky O'Grady - 2008
    The Information Design Handbook celebrates graphics that are exemplars of communication and esthetics, and reveals the thought processes and design skills behind them. This comprehensive guide to creating information graphics is packed with essential design principles, case studies, color palettes, trouble-shooting tips, and much more. Designers will learn to achieve graphics that are visually striking yet concise and supremely functional with this must-have resource.

How We Decide


Jonah Lehrer - 2009
    But as scientists break open the mind’s black box with the latest tools of neuroscience, they’re discovering that this is not how the mind works. Our best decisions are a finely tuned blend of both feeling and reason, and the precise mix depends on the situation. When buying a house, for example, it’s best to let our unconscious mull over the many variables. But when we’re picking a stock, intuition often leads us astray. The trick is to determine when to use the different parts of the brain, and to do this, we need to think harder (and smarter) about how we think. Jonah Lehrer arms us with the tools we need, drawing on cutting-edge research as well as the real-world experiences of a wide range of “deciders,” from airplane pilots and hedge fund investors to serial killers and poker players. Lehrer shows how people are taking advantage of the new science to make better television shows, win more football games, and improve military intelligence. His goal is to answer two questions that are of interest to just about anyone, from CEOs to firefighters: How does the human mind make decisions? And how can we make those decisions better?

Bad Data Handbook: Cleaning Up The Data So You Can Get Back To Work


Q. Ethan McCallum - 2012
    In this handbook, data expert Q. Ethan McCallum has gathered 19 colleagues from every corner of the data arena to reveal how they’ve recovered from nasty data problems. From cranky storage to poor representation to misguided policy, there are many paths to bad data. Bottom line? Bad data is data that gets in the way. This book explains effective ways to get around it. Among the many topics covered, you’ll discover how to: test drive your data to see if it’s ready for analysis; work spreadsheet data into a usable form; handle encoding problems that lurk in text data; develop a successful web-scraping effort; use NLP tools to reveal the real sentiment of online reviews; address cloud computing issues that can impact your analysis effort; avoid policies that create data analysis roadblocks; and take a systematic approach to data quality analysis.

R Graphics Cookbook: Practical Recipes for Visualizing Data


Winston Chang - 2012
    Each recipe tackles a specific problem with a solution you can apply to your own project, and includes a discussion of how and why the recipe works. Most of the recipes use the ggplot2 package, a powerful and flexible way to make graphs in R. If you have a basic understanding of the R language, you're ready to get started. Use R's default graphics for quick exploration of data; create a variety of bar graphs, line graphs, and scatter plots; summarize data distributions with histograms, density curves, box plots, and other examples; provide annotations to help viewers interpret data; control the overall appearance of graphics; render data groups alongside each other for easy comparison; use colors in plots; create network graphs, heat maps, and 3D scatter plots; and structure data for graphing.

Yes!: 50 Scientifically Proven Ways to Be Persuasive


Noah J. Goldstein - 2008
    But what makes people say yes to our requests? Persuasion is not only an art, it is also a science, and researchers who study it have uncovered a series of hidden rules for moving people in your direction. Based on more than sixty years of research into the psychology of persuasion, Yes! reveals fifty simple but remarkably effective strategies that will make you much more persuasive at work and in your personal life, too. Cowritten by the world's most quoted expert on influence, Professor Robert Cialdini, Yes! presents dozens of surprising discoveries from the science of persuasion in short, enjoyable, and insightful chapters that you can apply immediately to become a more effective persuader. Why did a sign pointing out the problem of vandalism in the Petrified Forest National Park actually increase the theft of pieces of petrified wood? Why did sales of jam multiply tenfold when consumers were offered many fewer flavors? Why did people prefer a Mercedes immediately after giving reasons why they prefer a BMW? What simple message on cards left in hotel rooms greatly increased the number of people who behaved in environmentally friendly ways? Often counterintuitive, the findings presented in Yes! will steer you away from common pitfalls while empowering you with little-known but proven wisdom. Whether you are in advertising, marketing, management, or sales, or just curious about how to be more influential in everyday life, Yes! shows how making small, scientifically proven changes to your approach can have a dramatic effect on your persuasive powers.

Mining of Massive Datasets


Anand Rajaraman - 2011
    This book focuses on practical algorithms that have been used to solve key problems in data mining and which can be used on even the largest datasets. It begins with a discussion of the map-reduce framework, an important tool for parallelizing algorithms automatically. The authors explain the tricks of locality-sensitive hashing and stream processing algorithms for mining data that arrives too fast for exhaustive processing. The PageRank idea and related tricks for organizing the Web are covered next. Other chapters cover the problems of finding frequent itemsets and clustering. The final chapters cover two applications: recommendation systems and Web advertising, each vital in e-commerce. Written by two authorities in database and Web technologies, this book is essential reading for students and practitioners alike.

Steal Like an Artist: 10 Things Nobody Told You About Being Creative


Austin Kleon - 2012
    That’s the message from Austin Kleon, a young writer and artist who knows that creativity is everywhere, creativity is for everyone. A manifesto for the digital age, Steal Like an Artist is a guide whose positive message, graphic look and illustrations, exercises, and examples will put readers directly in touch with their artistic side.

User Friendly: How the Hidden Rules of Design Are Changing the Way We Live, Work, and Play


Cliff Kuang - 2019
    Spanning over a century of sweeping changes, from women's rights to the Great Depression to World War II to the rise of the digital era, this book unpacks the ways in which the world has been--and continues to be--remade according to the principles of the once-obscure discipline of user-experience design. In this essential text, Kuang and Fabricant map the hidden rules of the designed world and shed light on how those rules have caused our world to change--an underappreciated but essential history that's pieced together for the first time. Combining the expertise and insight of a leading journalist and a pioneering designer, User Friendly provides a definitive, thoughtful, and practical perspective on a topic that has rapidly gone from arcane to urgent to inescapable. In User Friendly, Kuang and Fabricant tell the whole story for the first time--and you'll never interact with technology the same way again.

Natural Language Processing with Python


Steven Bird - 2009
    With it, you'll learn how to write Python programs that work with large collections of unstructured text. You'll access richly annotated datasets using a comprehensive range of linguistic data structures, and you'll understand the main algorithms for analyzing the content and structure of written communication. Packed with examples and exercises, Natural Language Processing with Python will help you: extract information from unstructured text, either to guess the topic or identify "named entities"; analyze linguistic structure in text, including parsing and semantic analysis; access popular linguistic databases, including WordNet and treebanks; and integrate techniques drawn from fields as diverse as linguistics and artificial intelligence. This book will help you gain practical skills in natural language processing using the Python programming language and the Natural Language Toolkit (NLTK) open source library. If you're interested in developing web applications, analyzing multilingual news sources, or documenting endangered languages -- or if you're simply curious to have a programmer's perspective on how human language works -- you'll find Natural Language Processing with Python both fascinating and immensely useful.

Learning Python


Mark Lutz - 2003
    Python is considered easy to learn, but there's no quicker way to mastery of the language than learning from an expert teacher. This edition of "Learning Python" puts you in the hands of two expert teachers, Mark Lutz and David Ascher, whose friendly, well-structured prose has guided many a programmer to proficiency with the language. "Learning Python," Second Edition, offers programmers a comprehensive learning tool for Python and object-oriented programming. Thoroughly updated for the numerous language and class presentation changes that have taken place since the release of the first edition in 1999, this guide introduces the basic elements of the latest release of Python 2.3 and covers new features, such as list comprehensions, nested scopes, and iterators/generators. Beyond language features, this edition of "Learning Python" also includes new context for less-experienced programmers, including fresh overviews of object-oriented programming and dynamic typing, new discussions of program launch and configuration options, new coverage of documentation sources, and more. There are also new use cases throughout to make the application of language features more concrete. The first part of "Learning Python" gives programmers all the information they'll need to understand and construct programs in the Python language, including types, operators, statements, classes, functions, modules and exceptions. The authors then present more advanced material, showing how Python performs common tasks by offering real applications and the libraries available for those applications. Each chapter ends with a series of exercises that will test your Python skills and measure your understanding. "Learning Python," Second Edition is a self-paced book that allows readers to focus on the core Python language in depth. As you work through the book, you'll gain a deep and complete understanding of the Python language that will help you to understand the larger application-level examples that you'll encounter on your own. If you're interested in learning Python--and want to do so quickly and efficiently--then "Learning Python," Second Edition is your best choice.

High Performance Spark: Best Practices for Scaling and Optimizing Apache Spark


Holden Karau - 2017
    But if you haven't seen the performance improvements you expected, or still don't feel confident enough to use Spark in production, this practical book is for you. Authors Holden Karau and Rachel Warren demonstrate performance optimizations to help your Spark queries run faster and handle larger data sizes, while using fewer resources. Ideal for software engineers, data engineers, developers, and system administrators working with large-scale data applications, this book describes techniques that can reduce data infrastructure costs and developer hours. Not only will you gain a more comprehensive understanding of Spark, you'll also learn how to make it sing. With this book, you'll explore: how Spark SQL's new interfaces improve performance over Spark's RDD data structure; the choice between data joins in Core Spark and Spark SQL; techniques for getting the most out of standard RDD transformations; how to work around performance issues in Spark's key/value pair paradigm; writing high-performance Spark code without Scala or the JVM; how to test for functionality and performance when applying suggested improvements; using Spark MLlib and Spark ML machine learning libraries; and Spark's Streaming components and external community packages.

Introduction to Information Retrieval


Christopher D. Manning - 2008
    Written from a computer science perspective by three leading experts in the field, it gives an up-to-date treatment of all aspects of the design and implementation of systems for gathering, indexing, and searching documents; methods for evaluating systems; and an introduction to the use of machine learning methods on text collections. All the important ideas are explained using examples and figures, making it perfect for introductory courses in information retrieval for advanced undergraduates and graduate students in computer science. Based on feedback from extensive classroom experience, the book has been carefully structured in order to make teaching more natural and effective. Although originally designed as the primary text for a graduate or advanced undergraduate course in information retrieval, the book will also create a buzz for researchers and professionals alike.

Web ReDesign 2.0: Workflow that Works


Kelly Goto - 2001
    So much so, in fact, that the 12-month design cycles cited in the last edition have shrunk to 6 or even 3 months today. Which is why, more than ever, you need a smart, practical guide that demonstrates how to plan, budget, organize, and manage your Web redesign - or even your initial design - projects from conceptualization to launch. This volume delivers! In these pages Web designer extraordinaire Kelly Goto and coauthor Emily Cotler have distilled their real-world experience into a sound approach to Web redesign workflow that is as much about business priorities as it is about good design. By focusing on where these priorities intersect, Kelly and Emily get straight to the heart of the matter. Each chapter includes a case study that illustrates a key step in the process, and you'll find a plethora of forms, checklists, and worksheets that help you put knowledge into action. This is an AIGA Design Press book published under Peachpit's New Riders imprint in partnership with AIGA.