Practical Statistics for Data Scientists: 50 Essential Concepts


Peter Bruce - 2017
    Courses and books on basic statistics rarely cover the topic from a data science perspective. This practical guide explains how to apply various statistical methods to data science, tells you how to avoid their misuse, and gives you advice on what's important and what's not.Many data science resources incorporate statistical methods but lack a deeper statistical perspective. If you're familiar with the R programming language, and have some exposure to statistics, this quick reference bridges the gap in an accessible, readable format.With this book, you'll learn:Why exploratory data analysis is a key preliminary step in data scienceHow random sampling can reduce bias and yield a higher quality dataset, even with big dataHow the principles of experimental design yield definitive answers to questionsHow to use regression to estimate outcomes and detect anomaliesKey classification techniques for predicting which categories a record belongs toStatistical machine learning methods that "learn" from dataUnsupervised learning methods for extracting meaning from unlabeled data

Cybersecurity and Cyberwar: What Everyone Needs to Know(r)


P.W. Singer - 2013
    Today, our entire modern way of life, from communication to commerce to conflict, fundamentally depends on the Internet. And the cybersecurity issues that result challenge literally everyone: politicians wrestling with everything from cybercrime to online freedom; generals protecting the nation from new forms of attack, while planning new cyberwars; business executives defending firms from once unimaginable threats, and looking to make money off of them; lawyers and ethicists building new frameworks for right and wrong. Most of all, cybersecurity issues affect us as individuals. We face new questions in everything from our rights and responsibilities as citizens of both the online and real world to simply how to protect ourselves and our families from a new type of danger. And yet, there is perhaps no issue that has grown so important, so quickly, and that touches so many, that remains so poorly understood.In Cybersecurity and CyberWar: What Everyone Needs to Know�, New York Times best-selling author P. W. Singer and noted cyber expert Allan Friedman team up to provide the kind of easy-to-read, yet deeply informative resource book that has been missing on this crucial issue of 21st century life. Written in a lively, accessible style, filled with engaging stories and illustrative anecdotes, the book is structured around the key question areas of cyberspace and its security: how it all works, why it all matters, and what can we do? Along the way, they take readers on a tour of the important (and entertaining) issues and characters of cybersecurity, from the "Anonymous" hacker group and the Stuxnet computer virus to the new cyber units of the Chinese and U.S. militaries. Cybersecurity and CyberWar: What Everyone Needs to Know� is the definitive account on the subject for us all, which comes not a moment too soon.What Everyone Needs to Know� is a registered trademark of Oxford University Press.

Coders at Work: Reflections on the Craft of Programming


Peter Seibel - 2009
    As the words "at work" suggest, Peter Seibel focuses on how his interviewees tackle the day–to–day work of programming, while revealing much more, like how they became great programmers, how they recognize programming talent in others, and what kinds of problems they find most interesting. Hundreds of people have suggested names of programmers to interview on the Coders at Work web site: http://www.codersatwork.com. The complete list was 284 names. Having digested everyone’s feedback, we selected 16 folks who’ve been kind enough to agree to be interviewed:- Frances Allen: Pioneer in optimizing compilers, first woman to win the Turing Award (2006) and first female IBM fellow- Joe Armstrong: Inventor of Erlang- Joshua Bloch: Author of the Java collections framework, now at Google- Bernie Cosell: One of the main software guys behind the original ARPANET IMPs and a master debugger- Douglas Crockford: JSON founder, JavaScript architect at Yahoo!- L. Peter Deutsch: Author of Ghostscript, implementer of Smalltalk-80 at Xerox PARC and Lisp 1.5 on PDP-1- Brendan Eich: Inventor of JavaScript, CTO of the Mozilla Corporation - Brad Fitzpatrick: Writer of LiveJournal, OpenID, memcached, and Perlbal - Dan Ingalls: Smalltalk implementor and designer- Simon Peyton Jones: Coinventor of Haskell and lead designer of Glasgow Haskell Compiler- Donald Knuth: Author of The Art of Computer Programming and creator of TeX- Peter Norvig: Director of Research at Google and author of the standard text on AI- Guy Steele: Coinventor of Scheme and part of the Common Lisp Gang of Five, currently working on Fortress- Ken Thompson: Inventor of UNIX- Jamie Zawinski: Author of XEmacs and early Netscape/Mozilla hackerWhat you’ll learn:How the best programmers in the world do their jobWho is this book for?Programmers interested in the point of view of leaders in the field. Programmers looking for approaches that work for some of these outstanding programmers.

Probabilistic Graphical Models: Principles and Techniques


Daphne Koller - 2009
    The framework of probabilistic graphical models, presented in this book, provides a general approach for this task. The approach is model-based, allowing interpretable models to be constructed and then manipulated by reasoning algorithms. These models can also be learned automatically from data, allowing the approach to be used in cases where manually constructing a model is difficult or even impossible. Because uncertainty is an inescapable aspect of most real-world applications, the book focuses on probabilistic models, which make the uncertainty explicit and provide models that are more faithful to reality.Probabilistic Graphical Models discusses a variety of models, spanning Bayesian networks, undirected Markov networks, discrete and continuous models, and extensions to deal with dynamical systems and relational data. For each class of models, the text describes the three fundamental cornerstones: representation, inference, and learning, presenting both basic concepts and advanced techniques. Finally, the book considers the use of the proposed framework for causal reasoning and decision making under uncertainty. The main text in each chapter provides the detailed technical development of the key ideas. Most chapters also include boxes with additional material: skill boxes, which describe techniques; case study boxes, which discuss empirical cases related to the approach described in the text, including applications in computer vision, robotics, natural language understanding, and computational biology; and concept boxes, which present significant concepts drawn from the material in the chapter. Instructors (and readers) can group chapters in various combinations, from core topics to more technically advanced material, to suit their particular needs.

Nabokov's Favorite Word Is Mauve: What the Numbers Reveal About the Classics, Bestsellers, and Our Own Writing


Ben Blatt - 2017
    There’s a famous piece of writing advice—offered by Ernest Hemingway, Stephen King, and myriad writers in between—not to use -ly adverbs like “quickly” or “fitfully.” It sounds like solid advice, but can we actually test it? If we were to count all the -ly adverbs these authors used in their careers, do they follow their own advice compared to other celebrated authors? What’s more, do great books in general—the classics and the bestsellers—share this trait?In Nabokov’s Favorite Word Is Mauve, statistician and journalist Ben Blatt brings big data to the literary canon, exploring the wealth of fun findings that remain hidden in the works of the world’s greatest writers. He assembles a database of thousands of books and hundreds of millions of words, and starts asking the questions that have intrigued curious word nerds and book lovers for generations: What are our favorite authors’ favorite words? Do men and women write differently? Are bestsellers getting dumber over time? Which bestselling writer uses the most clichés? What makes a great opening sentence? How can we judge a book by its cover? And which writerly advice is worth following or ignoring?

Statistics Done Wrong: The Woefully Complete Guide


Alex Reinhart - 2013
    Politicians and marketers present shoddy evidence for dubious claims all the time. But smart people make mistakes too, and when it comes to statistics, plenty of otherwise great scientists--yes, even those published in peer-reviewed journals--are doing statistics wrong."Statistics Done Wrong" comes to the rescue with cautionary tales of all-too-common statistical fallacies. It'll help you see where and why researchers often go wrong and teach you the best practices for avoiding their mistakes.In this book, you'll learn: - Why "statistically significant" doesn't necessarily imply practical significance- Ideas behind hypothesis testing and regression analysis, and common misinterpretations of those ideas- How and how not to ask questions, design experiments, and work with data- Why many studies have too little data to detect what they're looking for-and, surprisingly, why this means published results are often overestimates- Why false positives are much more common than "significant at the 5% level" would suggestBy walking through colorful examples of statistics gone awry, the book offers approachable lessons on proper methodology, and each chapter ends with pro tips for practicing scientists and statisticians. No matter what your level of experience, "Statistics Done Wrong" will teach you how to be a better analyst, data scientist, or researcher.

Python Data Science Handbook: Tools and Techniques for Developers


Jake Vanderplas - 2016
    Several resources exist for individual pieces of this data science stack, but only with the Python Data Science Handbook do you get them all—IPython, NumPy, Pandas, Matplotlib, Scikit-Learn, and other related tools.Working scientists and data crunchers familiar with reading and writing Python code will find this comprehensive desk reference ideal for tackling day-to-day issues: manipulating, transforming, and cleaning data; visualizing different types of data; and using data to build statistical or machine learning models. Quite simply, this is the must-have reference for scientific computing in Python.With this handbook, you’ll learn how to use: * IPython and Jupyter: provide computational environments for data scientists using Python * NumPy: includes the ndarray for efficient storage and manipulation of dense data arrays in Python * Pandas: features the DataFrame for efficient storage and manipulation of labeled/columnar data in Python * Matplotlib: includes capabilities for a flexible range of data visualizations in Python * Scikit-Learn: for efficient and clean Python implementations of the most important and established machine learning algorithms

Thinking In Numbers: On Life, Love, Meaning, and Math


Daniel Tammet - 2012
    In Tammet's world, numbers are beautiful and mathematics illuminates our lives and minds. Using anecdotes, everyday examples, and ruminations on history, literature, and more, Tammet allows us to share his unique insights and delight in the way numbers, fractions, and equations underpin all our lives. Inspired by the complexity of snowflakes, Anne Boleyn's eleven fingers, or his many siblings, Tammet explores questions such as why time seems to speed up as we age, whether there is such a thing as an average person, and how we can make sense of those we love. Thinking In Numbers will change the way you think about math and fire your imagination to see the world with fresh eyes.

Pattern Classification


David G. Stork - 1973
    Now with the second edition, readers will find information on key new topics such as neural networks and statistical pattern recognition, the theory of machine learning, and the theory of invariances. Also included are worked examples, comparisons between different methods, extensive graphics, expanded exercises and computer project topics.An Instructor's Manual presenting detailed solutions to all the problems in the book is available from the Wiley editorial department.

The Storytelling Animal: How Stories Make Us Human


Jonathan Gottschall - 2012
    We spin fantasies. We devour novels, films, and plays. Even sporting events and criminal trials unfold as narratives. Yet the world of story has long remained an undiscovered and unmapped country. It’s easy to say that humans are “wired” for story, but why?In this delightful and original book, Jonathan Gottschall offers the first unified theory of storytelling. He argues that stories help us navigate life’s complex social problems—just as flight simulators prepare pilots for difficult situations. Storytelling has evolved, like other behaviors, to ensure our survival.Drawing on the latest research in neuroscience, psychology, and evolutionary biology, Gottschall tells us what it means to be a storytelling animal. Did you know that the more absorbed you are in a story, the more it changes your behavior? That all children act out the same kinds of stories, whether they grow up in a slum or a suburb? That people who read more fiction are more empathetic?Of course, our story instinct has a darker side. It makes us vulnerable to conspiracy theories, advertisements, and narratives about ourselves that are more “truthy” than true. National myths can also be terribly dangerous: Hitler’s ambitions were partly fueled by a story.But as Gottschall shows in this remarkable book, stories can also change the world for the better. Most successful stories are moral—they teach us how to live, whether explicitly or implicitly, and bind us together around common values. We know we are master shapers of story. The Storytelling Animal finally reveals how stories shape us.

Metaphors We Live By


George Lakoff - 1980
    Metaphor, the authors explain, is a fundamental mechanism of mind, one that allows us to use what we know about our physical and social experience to provide understanding of countless other subjects. Because such metaphors structure our most basic understandings of our experience, they are "metaphors we live by", metaphors that can shape our perceptions and actions without our ever noticing them.In this updated edition of Lakoff and Johnson's influential book, the authors supply an afterword surveying how their theory of metaphor has developed within the cognitive sciences to become central to the contemporary understanding of how we think and how we express our thoughts in language.

The Psychology of Computer Programming


Gerald M. Weinberg - 1971
    Weinberg adds new insights and highlights the similarities and differences between now and then. Using a conversational style that invites the reader to join him, Weinberg reunites with some of his most insightful writings on the human side of software engineering.Topics include egoless programming, intelligence, psychological measurement, personality factors, motivation, training, social problems on large projects, problem-solving ability, programming language design, team formation, the programming environment, and much more.Dorset House Publishing is proud to make this important text available to new generations of programmers -- and to encourage readers of the first edition to return to its valuable lessons.

Natural Language Processing with Python


Steven Bird - 2009
    With it, you'll learn how to write Python programs that work with large collections of unstructured text. You'll access richly annotated datasets using a comprehensive range of linguistic data structures, and you'll understand the main algorithms for analyzing the content and structure of written communication.Packed with examples and exercises, Natural Language Processing with Python will help you: Extract information from unstructured text, either to guess the topic or identify "named entities" Analyze linguistic structure in text, including parsing and semantic analysis Access popular linguistic databases, including WordNet and treebanks Integrate techniques drawn from fields as diverse as linguistics and artificial intelligenceThis book will help you gain practical skills in natural language processing using the Python programming language and the Natural Language Toolkit (NLTK) open source library. If you're interested in developing web applications, analyzing multilingual news sources, or documenting endangered languages -- or if you're simply curious to have a programmer's perspective on how human language works -- you'll find Natural Language Processing with Python both fascinating and immensely useful.

Whiplash: How to Survive Our Faster Future


Joichi Ito - 2016
    The world is more complex and volatile today than at any other time in our history. The tools of our modern existence are getting faster, cheaper, and smaller at an exponential rate, transforming every aspect of society, from business to culture and from the public sphere to our most private moments. The people who succeed will be the ones who learn to think differently. In Whiplash, Joi Ito and Jeff Howe distill that logic into nine organizing principles for navigating and surviving this tumultuous period: Emergence over AuthorityPull over PushCompasses over MapsRisk over SafetyDisobedience over CompliancePractice over TheoryDiversity over AbilityResilience over StrengthSystems over Objects Filled with incredible case studies and cutting-edge research and philosophies from the MIT Media Lab and beyond, Whiplash will help you adapt and succeed in this unpredictable world.

The Road Ahead


Bill Gates - 1995
    Includes a compact disc which is playable on CD-ROM and audio CD players.