Numsense! Data Science for the Layman: No Math Added


Annalyn Ng - 2017
    Sold in over 85 countries and translated into more than 5 languages.---------------Want to get started on data science?Our promise: no math added.This book has been written in layman's terms as a gentle introduction to data science and its algorithms. Each algorithm has its own dedicated chapter that explains how it works, and shows an example of a real-world application. To help you grasp key concepts, we stick to intuitive explanations and visuals.Popular concepts covered include:- A/B Testing- Anomaly Detection- Association Rules- Clustering- Decision Trees and Random Forests- Regression Analysis- Social Network Analysis- Neural NetworksFeatures:- Intuitive explanations and visuals- Real-world applications to illustrate each algorithm- Point summaries at the end of each chapter- Reference sheets comparing the pros and cons of algorithms- Glossary list of commonly-used termsWith this book, we hope to give you a practical understanding of data science, so that you, too, can leverage its strengths in making better decisions.

Mastering Regular Expressions


Jeffrey E.F. Friedl - 1997
    They are now standard features in a wide range of languages and popular tools, including Perl, Python, Ruby, Java, VB.NET and C# (and any language using the .NET Framework), PHP, and MySQL.If you don't use regular expressions yet, you will discover in this book a whole new world of mastery over your data. If you already use them, you'll appreciate this book's unprecedented detail and breadth of coverage. If you think you know all you need to know about regularexpressions, this book is a stunning eye-opener.As this book shows, a command of regular expressions is an invaluable skill. Regular expressions allow you to code complex and subtle text processing that you never imagined could be automated. Regular expressions can save you time and aggravation. They can be used to craft elegant solutions to a wide range of problems. Once you've mastered regular expressions, they'll become an invaluable part of your toolkit. You will wonder how you ever got by without them.Yet despite their wide availability, flexibility, and unparalleled power, regular expressions are frequently underutilized. Yet what is power in the hands of an expert can be fraught with peril for the unwary. Mastering Regular Expressions will help you navigate the minefield to becoming an expert and help you optimize your use of regular expressions.Mastering Regular Expressions, Third Edition, now includes a full chapter devoted to PHP and its powerful and expressive suite of regular expression functions, in addition to enhanced PHP coverage in the central "core" chapters. Furthermore, this edition has been updated throughout to reflect advances in other languages, including expanded in-depth coverage of Sun's java.util.regex package, which has emerged as the standard Java regex implementation.Topics include:A comparison of features among different versions of many languages and toolsHow the regular expression engine worksOptimization (major savings available here!)Matching just what you want, but not what you don't wantSections and chapters on individual languagesWritten in the lucid, entertaining tone that makes a complex, dry topic become crystal-clear to programmers, and sprinkled with solutions to complex real-world problems, Mastering Regular Expressions, Third Edition offers a wealth information that you can put to immediateuse.Reviews of this new edition and the second edition: "There isn't a better (or more useful) book available on regular expressions."--Zak Greant, Managing Director, eZ Systems"A real tour-de-force of a book which not only covers the mechanics of regexes in extraordinary detail but also talks about efficiency and the use of regexes in Perl, Java, and .NET...If you use regular expressions as part of your professional work (even if you already have a good book on whatever language you're programming in) I would strongly recommend this book to you."--Dr. Chris Brown, Linux Format"The author does an outstanding job leading the reader from regexnovice to master. The book is extremely easy to read and chock full ofuseful and relevant examples...Regular expressions are valuable toolsthat every developer should have in their toolbox. Mastering RegularExpressions is the definitive guide to the subject, and an outstandingresource that belongs on every programmer's bookshelf. Ten out of TenHorseshoes."--Jason Menard, Java Ranch

All of Statistics: A Concise Course in Statistical Inference


Larry Wasserman - 2003
    But in spirit, the title is apt, as the book does cover a much broader range of topics than a typical introductory book on mathematical statistics. This book is for people who want to learn probability and statistics quickly. It is suitable for graduate or advanced undergraduate students in computer science, mathematics, statistics, and related disciplines. The book includes modern topics like nonparametric curve estimation, bootstrapping, and clas- sification, topics that are usually relegated to follow-up courses. The reader is presumed to know calculus and a little linear algebra. No previous knowledge of probability and statistics is required. Statistics, data mining, and machine learning are all concerned with collecting and analyzing data. For some time, statistics research was con- ducted in statistics departments while data mining and machine learning re- search was conducted in computer science departments. Statisticians thought that computer scientists were reinventing the wheel. Computer scientists thought that statistical theory didn't apply to their problems. Things are changing. Statisticians now recognize that computer scientists are making novel contributions while computer scientists now recognize the generality of statistical theory and methodology. Clever data mining algo- rithms are more scalable than statisticians ever thought possible. Formal sta- tistical theory is more pervasive than computer scientists had realized.

The Way to Go: A Thorough Introduction to the Go Programming Language


Ivo Balbaert - 2012
    "

Machine Learning: A Probabilistic Perspective


Kevin P. Murphy - 2012
    Machine learning provides these, developing methods that can automatically detect patterns in data and then use the uncovered patterns to predict future data. This textbook offers a comprehensive and self-contained introduction to the field of machine learning, based on a unified, probabilistic approach.The coverage combines breadth and depth, offering necessary background material on such topics as probability, optimization, and linear algebra as well as discussion of recent developments in the field, including conditional random fields, L1 regularization, and deep learning. The book is written in an informal, accessible style, complete with pseudo-code for the most important algorithms. All topics are copiously illustrated with color images and worked examples drawn from such application domains as biology, text processing, computer vision, and robotics. Rather than providing a cookbook of different heuristic methods, the book stresses a principled model-based approach, often using the language of graphical models to specify models in a concise and intuitive way. Almost all the models described have been implemented in a MATLAB software package—PMTK (probabilistic modeling toolkit)—that is freely available online. The book is suitable for upper-level undergraduates with an introductory-level college math background and beginning graduate students.

Machine Learning With Random Forests And Decision Trees: A Mostly Intuitive Guide, But Also Some Python


Scott Hartshorn - 2016
    They are typically used to categorize something based on other data that you have. The purpose of this book is to help you understand how Random Forests work, as well as the different options that you have when using them to analyze a problem. Additionally, since Decision Trees are a fundamental part of Random Forests, this book explains how they work. This book is focused on understanding Random Forests at the conceptual level. Knowing how they work, why they work the way that they do, and what options are available to improve results. This book covers how Random Forests work in an intuitive way, and also explains the equations behind many of the functions, but it only has a small amount of actual code (in python). This book is focused on giving examples and providing analogies for the most fundamental aspects of how random forests and decision trees work. The reason is that those are easy to understand and they stick with you. There are also some really interesting aspects of random forests, such as information gain, feature importances, or out of bag error, that simply cannot be well covered without diving into the equations of how they work. For those the focus is providing the information in a straight forward and easy to understand way.

Nine Algorithms That Changed the Future: The Ingenious Ideas That Drive Today's Computers


John MacCormick - 2012
    A simple web search picks out a handful of relevant needles from the world's biggest haystack: the billions of pages on the World Wide Web. Uploading a photo to Facebook transmits millions of pieces of information over numerous error-prone network links, yet somehow a perfect copy of the photo arrives intact. Without even knowing it, we use public-key cryptography to transmit secret information like credit card numbers; and we use digital signatures to verify the identity of the websites we visit. How do our computers perform these tasks with such ease? This is the first book to answer that question in language anyone can understand, revealing the extraordinary ideas that power our PCs, laptops, and smartphones. Using vivid examples, John MacCormick explains the fundamental "tricks" behind nine types of computer algorithms, including artificial intelligence (where we learn about the "nearest neighbor trick" and "twenty questions trick"), Google's famous PageRank algorithm (which uses the "random surfer trick"), data compression, error correction, and much more. These revolutionary algorithms have changed our world: this book unlocks their secrets, and lays bare the incredible ideas that our computers use every day.

Life After Google: The Fall of Big Data and the Rise of the Blockchain Economy


George Gilder - 2018
    Gilder says or writes is ever delivered at anything less than the fullest philosophical decibel... Mr. Gilder sounds less like a tech guru than a poet, and his words tumble out in a romantic cascade." “Google’s algorithms assume the world’s future is nothing more than the next moment in a random process. George Gilder shows how deep this assumption goes, what motivates people to make it, and why it’s wrong: the future depends on human action.” — Peter Thiel, founder of PayPal and Palantir Technologies and author of Zero to One: Notes on Startups, or How to Build the Future The Age of Google, built on big data and machine intelligence, has been an awesome era. But it’s coming to an end. In Life after Google, George Gilder—the peerless visionary of technology and culture—explains why Silicon Valley is suffering a nervous breakdown and what to expect as the post-Google age dawns. Google’s astonishing ability to “search and sort” attracts the entire world to its search engine and countless other goodies—videos, maps, email, calendars….And everything it offers is free, or so it seems. Instead of paying directly, users submit to advertising. The system of “aggregate and advertise” works—for a while—if you control an empire of data centers, but a market without prices strangles entrepreneurship and turns the Internet into a wasteland of ads. The crisis is not just economic. Even as advances in artificial intelligence induce delusions of omnipotence and transcendence, Silicon Valley has pretty much given up on security. The Internet firewalls supposedly protecting all those passwords and personal information have proved hopelessly permeable. The crisis cannot be solved within the current computer and network architecture. The future lies with the “cryptocosm”—the new architecture of the blockchain and its derivatives. Enabling cryptocurrencies such as bitcoin and ether, NEO and Hashgraph, it will provide the Internet a secure global payments system, ending the aggregate-and-advertise Age of Google. Silicon Valley, long dominated by a few giants, faces a “great unbundling,” which will disperse computer power and commerce and transform the economy and the Internet. Life after Google is almost here.   For fans of "Wealth and Poverty," "Knowledge and Power," and "The Scandal of Money."

Mining the Social Web: Analyzing Data from Facebook, Twitter, LinkedIn, and Other Social Media Sites


Matthew A. Russell - 2011
    You’ll learn how to combine social web data, analysis techniques, and visualization to find what you’ve been looking for in the social haystack—as well as useful information you didn’t know existed.Each standalone chapter introduces techniques for mining data in different areas of the social Web, including blogs and email. All you need to get started is a programming background and a willingness to learn basic Python tools.Get a straightforward synopsis of the social web landscapeUse adaptable scripts on GitHub to harvest data from social network APIs such as Twitter, Facebook, LinkedIn, and Google+Learn how to employ easy-to-use Python tools to slice and dice the data you collectExplore social connections in microformats with the XHTML Friends NetworkApply advanced mining techniques such as TF-IDF, cosine similarity, collocation analysis, document summarization, and clique detectionBuild interactive visualizations with web technologies based upon HTML5 and JavaScript toolkits"A rich, compact, useful, practical introduction to a galaxy of tools, techniques, and theories for exploring structured and unstructured data." --Alex Martelli, Senior Staff Engineer, Google

The Fourth Paradigm: Data-Intensive Scientific Discovery


Tony Hey - 2009
    Increasingly, scientific breakthroughs will be powered by advanced computing capabilities that help researchers manipulate and explore massive datasets. The speed at which any given scientific discipline advances will depend on how well its researchers collaborate with one another, and with technologists, in areas of eScience such as databases, workflow management, visualization, and cloud-computing technologies. This collection of essays expands on the vision of pioneering computer scientist Jim Gray for a new, fourth paradigm of discovery based on data-intensive science and offers insights into how it can be fully realized.

Data Analysis with Open Source Tools: A Hands-On Guide for Programmers and Data Scientists


Philipp K. Janert - 2010
    With this insightful book, intermediate to experienced programmers interested in data analysis will learn techniques for working with data in a business environment. You'll learn how to look at data to discover what it contains, how to capture those ideas in conceptual models, and then feed your understanding back into the organization through business plans, metrics dashboards, and other applications.Along the way, you'll experiment with concepts through hands-on workshops at the end of each chapter. Above all, you'll learn how to think about the results you want to achieve -- rather than rely on tools to think for you.Use graphics to describe data with one, two, or dozens of variablesDevelop conceptual models using back-of-the-envelope calculations, as well asscaling and probability argumentsMine data with computationally intensive methods such as simulation and clusteringMake your conclusions understandable through reports, dashboards, and other metrics programsUnderstand financial calculations, including the time-value of moneyUse dimensionality reduction techniques or predictive analytics to conquer challenging data analysis situationsBecome familiar with different open source programming environments for data analysisFinally, a concise reference for understanding how to conquer piles of data.--Austin King, Senior Web Developer, MozillaAn indispensable text for aspiring data scientists.--Michael E. Driscoll, CEO/Founder, Dataspora

Star Schema the Complete Reference


Christopher Adamson - 2010
    Star Schema: The Complete Reference offers in-depth coverage of design principles and their underlying rationales. Organized around design concepts and illustrated with detailed examples, this is a step-by-step guidebook for beginners and a comprehensive resource for experts.This all-inclusive volume begins with dimensional design fundamentals and shows how they fit into diverse data warehouse architectures, including those of W.H. Inmon and Ralph Kimball. The book progresses through a series of advanced techniques that help you address real-world complexity, maximize performance, and adapt to the requirements of BI and ETL software products. You are furnished with design tasks and deliverables that can be incorporated into any project, regardless of architecture or methodology.Master the fundamentals of star schema design and slow change processingIdentify situations that call for multiple stars or cubesEnsure compatibility across subject areas as your data warehouse growsAccommodate repeating attributes, recursive hierarchies, and poor data qualitySupport conflicting requirements for historic dataHandle variation within a business process and correlation of disparate activitiesBoost performance using derived schemas and aggregatesLearn when it's appropriate to adjust designs for BI and ETL tools

OS X 10.10 Yosemite: The Ars Technica Review


John Siracusa - 2014
    Siracusa's overview, wrap-up, and critique of everything new in OS X 10.10 Yosemite.

Our Final Invention: Artificial Intelligence and the End of the Human Era


James Barrat - 2013
    Corporations & government agencies around the world are pouring billions into achieving AI’s Holy Grail—human-level intelligence. Once AI has attained it, scientists argue, it will have survival drives much like our own. We may be forced to compete with a rival more cunning, more powerful & more alien than we can imagine. Thru profiles of tech visionaries, industry watchdogs & groundbreaking AI systems, James Barrat's Our Final Invention explores the perils of the heedless pursuit of advanced AI. Until now, human intelligence has had no rival. Can we coexist with beings whose intelligence dwarfs our own? Will they allow us to?

R for Data Science: Import, Tidy, Transform, Visualize, and Model Data


Hadley Wickham - 2016
    This book introduces you to R, RStudio, and the tidyverse, a collection of R packages designed to work together to make data science fast, fluent, and fun. Suitable for readers with no previous programming experience, R for Data Science is designed to get you doing data science as quickly as possible. Authors Hadley Wickham and Garrett Grolemund guide you through the steps of importing, wrangling, exploring, and modeling your data and communicating the results. You’ll get a complete, big-picture understanding of the data science cycle, along with basic tools you need to manage the details. Each section of the book is paired with exercises to help you practice what you’ve learned along the way. You’ll learn how to: Wrangle—transform your datasets into a form convenient for analysis Program—learn powerful R tools for solving data problems with greater clarity and ease Explore—examine your data, generate hypotheses, and quickly test them Model—provide a low-dimensional summary that captures true "signals" in your dataset Communicate—learn R Markdown for integrating prose, code, and results