Big Data Now: Current Perspectives from O'Reilly Radar


O'Reilly Radar Team - 2011
    Mike Loukides kicked things off in June 2010 with “What is data science?” and from there we’ve pursued the various threads and themes that naturally emerged. Now, roughly a year later, we can look back over all we’ve covered and identify a number of core data areas: Data issues -- The opportunities and ambiguities of the data space are evident in discussions around privacy, the implications of data-centric industries, and the debate about the phrase “data science” itself. The application of data: products and processes – A “data product” can emerge from virtually any domain, including everything from data startups to established enterprises to media/journalism to education and research. Data science and data tools -- The tools and technologies that drive data science are of course essential to this space, but the varied techniques being applied are also key to understanding the big data arena.The business of data – Take a closer look at the actions connected to data -- the finding, organizing, and analyzing that provide organizations of all sizes with the information they need to compete.

Artificial Intelligence: A Modern Approach


Stuart Russell - 1994
    The long-anticipated revision of this best-selling text offers the most comprehensive, up-to-date introduction to the theory and practice of artificial intelligence. *NEW-Nontechnical learning material-Accompanies each part of the book. *NEW-The Internet as a sample application for intelligent systems-Added in several places including logical agents, planning, and natural language. *NEW-Increased coverage of material - Includes expanded coverage of: default reasoning and truth maintenance systems, including multi-agent/distributed AI and game theory; probabilistic approaches to learning including EM; more detailed descriptions of probabilistic inference algorithms. *NEW-Updated and expanded exercises-75% of the exercises are revised, with 100 new exercises. *NEW-On-line Java software. *Makes it easy for students to do projects on the web using intelligent agents. *A unified, agent-based approach to AI-Organizes the material around the task of building intelligent agents. *Comprehensive, up-to-date coverage-Includes a unified view of the field organized around the rational decision making pa

Numsense! Data Science for the Layman: No Math Added


Annalyn Ng - 2017
    Sold in over 85 countries and translated into more than 5 languages.---------------Want to get started on data science?Our promise: no math added.This book has been written in layman's terms as a gentle introduction to data science and its algorithms. Each algorithm has its own dedicated chapter that explains how it works, and shows an example of a real-world application. To help you grasp key concepts, we stick to intuitive explanations and visuals.Popular concepts covered include:- A/B Testing- Anomaly Detection- Association Rules- Clustering- Decision Trees and Random Forests- Regression Analysis- Social Network Analysis- Neural NetworksFeatures:- Intuitive explanations and visuals- Real-world applications to illustrate each algorithm- Point summaries at the end of each chapter- Reference sheets comparing the pros and cons of algorithms- Glossary list of commonly-used termsWith this book, we hope to give you a practical understanding of data science, so that you, too, can leverage its strengths in making better decisions.

Doing Data Science


Cathy O'Neil - 2013
    But how can you get started working in a wide-ranging, interdisciplinary field that’s so clouded in hype? This insightful book, based on Columbia University’s Introduction to Data Science class, tells you what you need to know.In many of these chapter-long lectures, data scientists from companies such as Google, Microsoft, and eBay share new algorithms, methods, and models by presenting case studies and the code they use. If you’re familiar with linear algebra, probability, and statistics, and have programming experience, this book is an ideal introduction to data science.Topics include:Statistical inference, exploratory data analysis, and the data science processAlgorithmsSpam filters, Naive Bayes, and data wranglingLogistic regressionFinancial modelingRecommendation engines and causalityData visualizationSocial networks and data journalismData engineering, MapReduce, Pregel, and HadoopDoing Data Science is collaboration between course instructor Rachel Schutt, Senior VP of Data Science at News Corp, and data science consultant Cathy O’Neil, a senior data scientist at Johnson Research Labs, who attended and blogged about the course.

Calculus Made Easy


Silvanus Phillips Thompson - 1910
    With a new introduction, three new chapters, modernized language and methods throughout, and an appendix of challenging and enjoyable practice problems, Calculus Made Easy has been thoroughly updated for the modern reader.

Universal Principles of Design: 100 Ways to Enhance Usability, Influence Perception, Increase Appeal, Make Better Design Decisions, and Teach Through Design


William Lidwell - 2003
    Because no one can be an expert on everything, designers have always had to scramble to find the information and know-how required to make a design work - until now. Universal Principles of Design is the first cross-disciplinary reference of design. Richly illustrated and easy to navigate, this book pairs clear explanations of the design concepts featured with visual examples of those concepts applied in practice. From the 80/20 rule to chunking, from baby-face bias to Ockham's razor, and from self-similarity to storytelling, 100 design concepts are defined and illustrated for readers to expand their knowledge.This landmark reference will become the standard for designers, engineers, architects, and students who seek to broaden and improve their design expertise.

Deep Learning


Ian Goodfellow - 2016
    Because the computer gathers knowledge from experience, there is no need for a human computer operator to formally specify all the knowledge that the computer needs. The hierarchy of concepts allows the computer to learn complicated concepts by building them out of simpler ones; a graph of these hierarchies would be many layers deep. This book introduces a broad range of topics in deep learning.The text offers mathematical and conceptual background, covering relevant concepts in linear algebra, probability theory and information theory, numerical computation, and machine learning. It describes deep learning techniques used by practitioners in industry, including deep feedforward networks, regularization, optimization algorithms, convolutional networks, sequence modeling, and practical methodology; and it surveys such applications as natural language processing, speech recognition, computer vision, online recommendation systems, bioinformatics, and videogames. Finally, the book offers research perspectives, covering such theoretical topics as linear factor models, autoencoders, representation learning, structured probabilistic models, Monte Carlo methods, the partition function, approximate inference, and deep generative models.Deep Learning can be used by undergraduate or graduate students planning careers in either industry or research, and by software engineers who want to begin using deep learning in their products or platforms. A website offers supplementary material for both readers and instructors.

Hiding from the Internet: Eliminating Personal Online Information


Michael Bazzell - 2012
    Author Michael Bazzell has been well known in government circles for his ability to locate personal information about anyone through the internet. In Hiding from the Internet: Eliminating Personal Online Information, he exposes the resources that broadcast your personal details to public view. He has researched each source and identified the best method to have your private details removed from the databases that store profiles on all of us. This book will serve as a reference guide for anyone that values privacy. Each technique is explained in simple steps. It is written in a hands-on style that encourages the reader to execute the tutorials as they go. The author provides personal experiences from his journey to disappear from public view. Much of the content of this book has never been discussed in any publication. Always thinking like a hacker, the author has identified new ways to force companies to remove you from their data collection systems. This book exposes loopholes that create unique opportunities for privacy seekers. Among other techniques, you will learn to: Remove your personal information from public databases and people search sites Create free anonymous mail addresses, email addresses, and telephone numbers Control your privacy settings on social networks and remove sensitive data Provide disinformation to conceal true private details Force data brokers to stop sharing your information with both private and public organizations Prevent marketing companies from monitoring your browsing, searching, and shopping habits Remove your landline and cellular telephone numbers from online websites Use a credit freeze to eliminate the worry of financial identity theft and fraud Change your future habits to promote complete privacy and anonymity Conduct a complete background check to verify proper information removalConfigure a home firewall with VPN Kill-SwitchPurchase a completely invisible home or vehicle

Bad Data Handbook: Cleaning Up The Data So You Can Get Back To Work


Q. Ethan McCallum - 2012
    In this handbook, data expert Q. Ethan McCallum has gathered 19 colleagues from every corner of the data arena to reveal how they’ve recovered from nasty data problems.From cranky storage to poor representation to misguided policy, there are many paths to bad data. Bottom line? Bad data is data that gets in the way. This book explains effective ways to get around it.Among the many topics covered, you’ll discover how to:Test drive your data to see if it’s ready for analysisWork spreadsheet data into a usable formHandle encoding problems that lurk in text dataDevelop a successful web-scraping effortUse NLP tools to reveal the real sentiment of online reviewsAddress cloud computing issues that can impact your analysis effortAvoid policies that create data analysis roadblocksTake a systematic approach to data quality analysis

The Systems Thinker: Essential Thinking Skills


Albert Rutherford - 2018
    Gain a deep understanding of the “what, why, how, when, how much” questions of your life. Become a Systems Thinker and discover how to approach your life from a completely new perspective. What is systems thinking? Put it simply, thinking about how things interact with one another. Why should this matter to you? Because you are a system. You are a part of smaller and larger systems – your community, your country, your species. Understanding your role within these systems and how these systems affect, hinder, or aid the fulfillment of your life can lead you to better answers about yourself and the world. Information is the most precious asset these days. Evaluating that information correctly is almost priceless. Systems thinkers are some of the bests in collecting and assessing information, as well as creating impactful solutions in any context. The Systems Thinker will help you to implement systems thinking at your workplace, human relations, and everyday thinking habits. Boost your observation and analytical skills to find the real triggers and influencing forces behind contemporary politics, economics, health, and education changes. Systems thinking clears your vision by teaching you not only to find the differences between the elements but also the similarities. This bi-directional analyzing ability will give you a more complex worldview, deeper understanding of problems, and thus better solutions. The car stopped because its tank is empty – so it needs gas. Easy problem, easy solution, right? But could you explain just as easily why did the price of gas raise with 5% the past month? After becoming a systems thinker, you’ll be able to answer that question just as easily. Change your thoughts, change your results. •What are the main elements, questions and methods of thinking in systems? •The most widely used systems archetypes, maps, models, and analytical methods. •Learn to identify and provide solutions even the most complex system problems. •Deepen your understanding about human motivation with systems thinking. The past fifty years brought so many changes in our lives. The world has become more interconnected than ever. Old rules can’t explain the new world anymore. But systems thinking can. Embrace systems thinking and become a master of analytical, critical, and creative thinking.

Machine Learning in Action


Peter Harrington - 2011
    "Machine learning," the process of automating tasks once considered the domain of highly-trained analysts and mathematicians, is the key to efficiently extracting useful information from this sea of raw data. Machine Learning in Action is a unique book that blends the foundational theories of machine learning with the practical realities of building tools for everyday data analysis. In it, the author uses the flexible Python programming language to show how to build programs that implement algorithms for data classification, forecasting, recommendations, and higher-level features like summarization and simplification.

Elasticsearch: The Definitive Guide: A Distributed Real-Time Search and Analytics Engine


Clinton Gormley - 2014
    This practical guide not only shows you how to search, analyze, and explore data with Elasticsearch, but also helps you deal with the complexities of human language, geolocation, and relationships.If you're a newcomer to both search and distributed systems, you'll quickly learn how to integrate Elasticsearch into your application. More experienced users will pick up lots of advanced techniques. Throughout the book, you'll follow a problem-based approach to learn why, when, and how to use Elasticsearch features.Understand how Elasticsearch interprets data in your documentsIndex and query your data to take advantage of search concepts such as relevance and word proximityHandle human language through the effective use of analyzers and queriesSummarize and group data to show overall trends, with aggregations and analyticsUse geo-points and geo-shapes--Elasticsearch's approaches to geolocationModel your data to take advantage of Elasticsearch's horizontal scalabilityLearn how to configure and monitor your cluster in production

Data Structures and Algorithms in Python


Michael T. Goodrich - 2012
     Data Structures and Algorithms in Python is the first mainstream object-oriented book available for the Python data structures course. Designed to provide a comprehensive introduction to data structures and algorithms, including their design, analysis, and implementation, the text will maintain the same general structure as Data Structures and Algorithms in Java and Data Structures and Algorithms in C++.

High Performance MySQL: Optimization, Backups, and Replication


Baron Schwartz - 2008
    This guide also teaches you safe and practical ways to scale applications through replication, load balancing, high availability, and failover. Updated to reflect recent advances in MySQL and InnoDB performance, features, and tools, this third edition not only offers specific examples of how MySQL works, it also teaches you why this system works as it does, with illustrative stories and case studies that demonstrate MySQL’s principles in action. With this book, you’ll learn how to think in MySQL. Learn the effects of new features in MySQL 5.5, including stored procedures, partitioned databases, triggers, and views Implement improvements in replication, high availability, and clustering Achieve high performance when running MySQL in the cloud Optimize advanced querying features, such as full-text searches Take advantage of modern multi-core CPUs and solid-state disks Explore backup and recovery strategies—including new tools for hot online backups

Machine Learning for Hackers


Drew Conway - 2012
    Authors Drew Conway and John Myles White help you understand machine learning and statistics tools through a series of hands-on case studies, instead of a traditional math-heavy presentation.Each chapter focuses on a specific problem in machine learning, such as classification, prediction, optimization, and recommendation. Using the R programming language, you'll learn how to analyze sample datasets and write simple machine learning algorithms. "Machine Learning for Hackers" is ideal for programmers from any background, including business, government, and academic research.Develop a naive Bayesian classifier to determine if an email is spam, based only on its textUse linear regression to predict the number of page views for the top 1,000 websitesLearn optimization techniques by attempting to break a simple letter cipherCompare and contrast U.S. Senators statistically, based on their voting recordsBuild a "whom to follow" recommendation system from Twitter data