Small Data: The Tiny Clues that Uncover Huge Trends


Martin Lindstrom - 2016
    You’ll learn…
    • How a noise reduction headset at 35,000 feet led to the creation of Pepsi’s new trademarked signature sound.
    • How a worn down sneaker discovered in the home of an 11-year-old German boy led to LEGO’s incredible turnaround.
    • How a magnet found on a fridge in Siberia resulted in a U.S. supermarket revolution.
    • How a toy stuffed bear in a girl’s bedroom helped revolutionize a fashion retailer’s 1,000 stores in 20 different countries.
    • How an ordinary bracelet helped Jenny Craig increase customer loyalty by 159% in less than a year.
    • How the ergonomic layout of a car dashboard led to the redesign of the Roomba vacuum.

How to Lie with Statistics


Darrell Huff - 1954
    Darrell Huff runs the gamut of every popularly used type of statistic, probes such things as the sample study, the tabulation method, the interview technique, or the way the results are derived from the figures, and points up the countless number of dodges which are used to fool rather than to inform.

Practical Statistics for Data Scientists: 50 Essential Concepts


Peter Bruce - 2017
    Courses and books on basic statistics rarely cover the topic from a data science perspective. This practical guide explains how to apply various statistical methods to data science, tells you how to avoid their misuse, and gives you advice on what's important and what's not.
    Many data science resources incorporate statistical methods but lack a deeper statistical perspective. If you're familiar with the R programming language, and have some exposure to statistics, this quick reference bridges the gap in an accessible, readable format.
    With this book, you'll learn:
    • Why exploratory data analysis is a key preliminary step in data science
    • How random sampling can reduce bias and yield a higher quality dataset, even with big data
    • How the principles of experimental design yield definitive answers to questions
    • How to use regression to estimate outcomes and detect anomalies
    • Key classification techniques for predicting which categories a record belongs to
    • Statistical machine learning methods that "learn" from data
    • Unsupervised learning methods for extracting meaning from unlabeled data

Life After Google: The Fall of Big Data and the Rise of the Blockchain Economy


George Gilder - 2018
    Gilder says or writes is ever delivered at anything less than the fullest philosophical decibel... Mr. Gilder sounds less like a tech guru than a poet, and his words tumble out in a romantic cascade."
    “Google’s algorithms assume the world’s future is nothing more than the next moment in a random process. George Gilder shows how deep this assumption goes, what motivates people to make it, and why it’s wrong: the future depends on human action.” - Peter Thiel, founder of PayPal and Palantir Technologies and author of Zero to One: Notes on Startups, or How to Build the Future
    The Age of Google, built on big data and machine intelligence, has been an awesome era. But it’s coming to an end. In Life after Google, George Gilder, the peerless visionary of technology and culture, explains why Silicon Valley is suffering a nervous breakdown and what to expect as the post-Google age dawns. Google’s astonishing ability to “search and sort” attracts the entire world to its search engine and countless other goodies: videos, maps, email, calendars…. And everything it offers is free, or so it seems. Instead of paying directly, users submit to advertising. The system of “aggregate and advertise” works, for a while, if you control an empire of data centers, but a market without prices strangles entrepreneurship and turns the Internet into a wasteland of ads. The crisis is not just economic. Even as advances in artificial intelligence induce delusions of omnipotence and transcendence, Silicon Valley has pretty much given up on security. The Internet firewalls supposedly protecting all those passwords and personal information have proved hopelessly permeable. The crisis cannot be solved within the current computer and network architecture. The future lies with the “cryptocosm,” the new architecture of the blockchain and its derivatives.
    Enabling cryptocurrencies such as bitcoin and ether, NEO and Hashgraph, it will provide the Internet a secure global payments system, ending the aggregate-and-advertise Age of Google. Silicon Valley, long dominated by a few giants, faces a “great unbundling,” which will disperse computer power and commerce and transform the economy and the Internet. Life after Google is almost here.
    For fans of "Wealth and Poverty," "Knowledge and Power," and "The Scandal of Money."

Learning OpenCV: Computer Vision with the OpenCV Library


Gary Bradski - 2008
    Freeman, Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology
    Learning OpenCV puts you in the middle of the rapidly expanding field of computer vision. Written by the creators of the free open source OpenCV library, this book introduces you to computer vision and demonstrates how you can quickly build applications that enable computers to "see" and make decisions based on that data. Computer vision is everywhere: in security systems, manufacturing inspection systems, medical image analysis, Unmanned Aerial Vehicles, and more. It stitches Google maps and Google Earth together, checks the pixels on LCD screens, and makes sure the stitches in your shirt are sewn properly. OpenCV provides an easy-to-use computer vision framework and a comprehensive library with more than 500 functions that can run vision code in real time.
    Learning OpenCV will teach any developer or hobbyist to use the framework quickly with the help of hands-on exercises in each chapter. This book includes:
    • A thorough introduction to OpenCV
    • Getting input from cameras
    • Transforming images
    • Segmenting images and shape matching
    • Pattern recognition, including face detection
    • Tracking and motion in 2 and 3 dimensions
    • 3D reconstruction from stereo vision
    • Machine learning algorithms
    Getting machines to see is a challenging but entertaining goal. Whether you want to build simple or sophisticated vision applications, Learning OpenCV is the book you need to get started.

Data-ism: The Revolution Transforming Decision Making, Consumer Behavior, and Almost Everything Else


Steve Lohr - 2015
    Today, data is the vital raw material of the information economy. The explosive abundance of this digital asset, more than doubling every two years, is creating a new world of opportunity and challenge.
    Data-ism is about this next phase, in which vast, Internet-scale data sets are used for discovery and prediction in virtually every field. It is a journey across this emerging world with people, illuminating narrative examples, and insights. It shows that, if exploited, this new revolution will change the way decisions are made, relying more on data and analysis and less on intuition and experience, and transform the nature of leadership and management.
    Lohr explains how individuals and institutions will need to exploit, protect, and manage their data to stay competitive in the coming years. Filled with rich examples and anecdotes of the various ways in which the rise of Big Data is affecting everyday life, it raises provocative questions about policy and practice that have wide implications for all of our lives.

Calling Bullshit: The Art of Skepticism in a Data-Driven World


Carl T. Bergstrom - 2020
    Now, two science professors give us the tools to dismantle misinformation and think clearly in a world of fake news and bad data.
    It's increasingly difficult to know what's true. Misinformation, disinformation, and fake news abound. Our media environment has become hyperpartisan. Science is conducted by press release. Startup culture elevates bullshit to high art. We are fairly well equipped to spot the sort of old-school bullshit that is based in fancy rhetoric and weasel words, but most of us don't feel qualified to challenge the avalanche of new-school bullshit presented in the language of math, science, or statistics. In Calling Bullshit, Professors Carl Bergstrom and Jevin West give us a set of powerful tools to cut through the most intimidating data.
    You don't need a lot of technical expertise to call out problems with data. Are the numbers or results too good or too dramatic to be true? Is the claim comparing like with like? Is it confirming your personal bias? Drawing on a deep well of expertise in statistics and computational biology, Bergstrom and West exuberantly unpack examples of selection bias and muddled data visualization, distinguish between correlation and causation, and examine the susceptibility of science to modern bullshit.
    We have always needed people who call bullshit when necessary, whether within a circle of friends, a community of scholars, or the citizenry of a nation. Now that bullshit has evolved, we need to relearn the art of skepticism.

Python Machine Learning


Sebastian Raschka - 2015
    We are living in an age where data comes in abundance, and thanks to the self-learning algorithms from the field of machine learning, we can turn this data into knowledge. Automated speech recognition on our smart phones, web search engines, e-mail spam filters, the recommendation systems of our favorite movie streaming services – machine learning makes it all possible.
    Thanks to the many powerful open-source libraries that have been developed in recent years, machine learning is now right at our fingertips. Python provides the perfect environment to build machine learning systems productively.
    This book will teach you the fundamentals of machine learning and how to utilize these in real-world applications using Python. Step-by-step, you will expand your skill set with the best practices for transforming raw data into useful information, developing learning algorithms efficiently, and evaluating results.
    You will discover the different problem categories that machine learning can solve and explore how to classify objects, predict continuous outcomes with regression analysis, and find hidden structures in data via clustering. You will build your own machine learning system for sentiment analysis and finally, learn how to embed your model into a web app to share with the world.

alchemy of Money: THINK RICH INITIATIVES


Anand S - 2016
    It is important for everyone to save for retirement: one can expect to live for twenty years after retiring, as the life expectancy of Indians is rising steadily due to lower infant mortality and better medical care. Most Indians today have no social security safety net at all; even for those working in the government sector, no inflation-adjusted pension is available anymore. I have tried to simplify the advantages and disadvantages of investing your savings in various asset classes. I have deliberately left out two of the most popular forms of investment among middle-class Indians: 1) life insurance and 2) real estate. Let us consider life insurance first. Most of us mistake insurance for an instrument of savings; it is not. We hold this mistaken view because of the tax breaks the Central Government gives to income tax assessees. Insurance is a product that mitigates risk; it is sold by the rich to the middle class and is always skewed in favour of the insurer rather than the insured. A substantial portion of the money you pay goes towards the agent’s commission and the premium for covering your mortality risk. The balance is invested in government securities and other securities. Hence the amount actually invested is less than half of the total premium paid by the insured. The policyholder’s return is less than half of what he would have earned in either bonds or fixed deposits. A person who needs insurance is a person whose family will need support in the event of his untimely death. Alternatively, insurance is required for a person who has debt in the form of a mortgage and does not want to burden his family in the event of his passing. The product that covers these risks is called term insurance. One should not buy insurance to avoid taxes, as there are better tax-saving tools available.
    Several retail investors also consider real estate a good investment, but nothing could be further from the truth. Nobody makes money by buying plots in the middle of nowhere. The easy availability of mortgages since the nineties and the tax breaks given by the Central Government on housing loans have created an unparalleled boom in the residential market. There is now a painful correction process under way in that sector. The price of land is reflexively connected to the availability of money. The lower the cost of money, the greater the returns in real estate. Buying plots in the middle of nowhere is similar to buying lottery tickets as an investment. Land cannot be converted into cash at short notice to meet urgent requirements. Maintaining real estate and protecting it from illegal occupation is prohibitively expensive and time-consuming. Verification of title deeds to the property is a complex process and needs sound legal advice. You should have one house to live in and another to collect rent from, as rent is the equivalent of an inflation-adjusted pension. Over 25 years, the returns on investment generated by the three asset classes would rank in the following order: 1) equities, 2) gold, and finally 3) debt instruments. I enjoyed writing this book as a companion volume to my first book. It is my fond hope that you enjoy reading this book.

The Spatial Web: How Web 3.0 Will Connect Humans, Machines, and AI to Transform the World


Gabriel Rene - 2019
    Blade Runner, The Matrix, Star Wars, Avatar, Star Trek, Ready Player One and Avengers show us futuristic worlds where holograms, intelligent robots, smart devices, virtual avatars, digital transactions, and universe-scale teleportation work together perfectly, somehow seamlessly combining the virtual and the physical with the mechanical and the biological. Science fiction has done an excellent job describing a vision of the future where the digital and physical merge naturally into one, in a way that just works everywhere, for everyone. However, none of these visionary fictional works go so far as to describe exactly how this would actually be accomplished, though they have inspired many of us to ask the question: how do we enable science fantasy to become science fact? The Spatial Web addresses this question by first describing how exponentially powerful computing technologies are creating a great “Convergence”: how Augmented and Virtual Reality will enable us to overlay our information and imaginations onto the world; how Artificial Intelligence will infuse the environments and objects around us with adaptive intelligence; how the Internet of Things and Robotics will enable our vehicles, appliances, clothing, furniture, and homes to become connected and embodied with the power to see, feel, hear, smell, touch and move things in the world; and how Blockchain and Cryptocurrencies will secure our data and enable real-time transactions between the human, machine and virtual economies of the future. The book then dives deeply into the challenges and shortcomings of the World Wide Web, the rise of fake news and surveillance capitalism in Web 2.0, and the risk of algorithmic terrorism, biological hacking and “fake-reality” in Web 3.0. It raises concerns about the threat that emerging technologies pose in the hands of rogue actors, whether human, algorithmic, corporate or state-sponsored, and calls for common sense governance and global cooperation.
    It calls for business leaders, organizations and governments to support interoperable standards not only for software code but, critically, for ethical and social codes as well. Authors Gabriel René and Dan Mapes describe in vivid detail how a new “spatial” protocol is required in order to connect the various exponential technologies of the 21st century into an integrated network capable of tracking and managing the real-time activities of our cities, monitoring and adjusting the supply chains that feed them, optimizing our farms and natural resources, automating our manufacturing and distribution, transforming marketing and commerce, accelerating our global economies, running advanced planet-scale simulations and predictions, and even bridging the gap between our interior individual reality and our exterior collective one. They explain how this would enable humans, machines and AI to communicate, collaborate and coordinate activities in the world at a global scale, and how the thoughtful application of these technologies could lead to an unprecedented opportunity to create a truly global “networked” civilization or “Smart World.” The book artfully shifts between cyberpunk futurism, cautionary tale-telling, and life-affirming call-to-arms. It challenges us to consider the importance of today’s technological choices as individuals, organizations, and as a species, as we face the historic opportunity we have to transform the web, the world, and our very definition of reality.

Understanding Variation: The Key to Managing Chaos


Donald J. Wheeler - 1993
    But before numerical information can be useful it must be analyzed, interpreted, and assimilated. Unfortunately, teaching the techniques for making sense of data has been neglected at all levels of our educational system. As a result, throughout our culture there is little appreciation of how to effectively use the volumes of data generated by both business and government. This book can remedy that situation. Readers report that this book has changed both the way they look at data and the very form of their monthly reports. It has turned arguments about the numbers into a common understanding of what needs to be done about them. These techniques and benefits have been thoroughly proven in a wide variety of settings. Read this book and use the techniques to gain the benefits for your company.

Reinventing Capitalism in the Age of Big Data


Viktor Mayer-Schönberger - 2018
    That's all going to change thanks to the Big Data revolution. As Viktor Mayer-Schönberger, bestselling author of Big Data, and Thomas Ramge, who writes for The Economist, show, data is replacing money as the driver of market behavior. Big finance and big companies will be replaced by small groups and individual actors who make markets instead of making things: think Uber instead of Ford, or Airbnb instead of Hyatt. This is the dawn of the era of data capitalism. Will it be an age of prosperity or of calamity? This book provides the indispensable roadmap for securing a better future.

The Elements of Statistical Learning: Data Mining, Inference, and Prediction


Trevor Hastie - 2001
    With it has come vast amounts of data in a variety of fields such as medicine, biology, finance, and marketing. The challenge of understanding these data has led to the development of new tools in the field of statistics, and spawned new areas such as data mining, machine learning, and bioinformatics. Many of these tools have common underpinnings but are often expressed with different terminology. This book describes the important ideas in these areas in a common conceptual framework. While the approach is statistical, the emphasis is on concepts rather than mathematics. Many examples are given, with a liberal use of color graphics. It should be a valuable resource for statisticians and anyone interested in data mining in science or industry. The book's coverage is broad, from supervised learning (prediction) to unsupervised learning. The many topics include neural networks, support vector machines, classification trees and boosting—the first comprehensive treatment of this topic in any book. Trevor Hastie, Robert Tibshirani, and Jerome Friedman are professors of statistics at Stanford University. They are prominent researchers in this area: Hastie and Tibshirani developed generalized additive models and wrote a popular book of that title. Hastie wrote much of the statistical modeling software in S-PLUS and invented principal curves and surfaces. Tibshirani proposed the Lasso and is co-author of the very successful An Introduction to the Bootstrap. Friedman is the co-inventor of many data-mining tools including CART, MARS, and projection pursuit.

How Charts Lie: Getting Smarter about Visual Information


Alberto Cairo - 2019
    While such visualizations can better inform us, they can also deceive by displaying incomplete or inaccurate data, suggesting misleading patterns, or simply misinform us by being poorly designed, such as the confusing “eye of the storm” maps shown on TV every hurricane season.
    Many of us are ill equipped to interpret the visuals that politicians, journalists, advertisers, and even employers present each day, enabling bad actors to easily manipulate visuals to promote their own agendas. Public conversations are increasingly driven by numbers, and to make sense of them we must be able to decode and use visual information. By examining contemporary examples ranging from election-result infographics to global GDP maps and box-office record charts, How Charts Lie teaches us how to do just that.

Python Data Science Handbook: Tools and Techniques for Developers


Jake Vanderplas - 2016
    Several resources exist for individual pieces of this data science stack, but only with the Python Data Science Handbook do you get them all: IPython, NumPy, Pandas, Matplotlib, Scikit-Learn, and other related tools.
    Working scientists and data crunchers familiar with reading and writing Python code will find this comprehensive desk reference ideal for tackling day-to-day issues: manipulating, transforming, and cleaning data; visualizing different types of data; and using data to build statistical or machine learning models. Quite simply, this is the must-have reference for scientific computing in Python.
    With this handbook, you’ll learn how to use:
    * IPython and Jupyter: provide computational environments for data scientists using Python
    * NumPy: includes the ndarray for efficient storage and manipulation of dense data arrays in Python
    * Pandas: features the DataFrame for efficient storage and manipulation of labeled/columnar data in Python
    * Matplotlib: includes capabilities for a flexible range of data visualizations in Python
    * Scikit-Learn: for efficient and clean Python implementations of the most important and established machine learning algorithms