Hadoop in Action


Chuck Lam - 2010
    The intended readers are programmers, architects, and project managers who have to process large amounts of data offline. Hadoop in Action will lead the reader from obtaining a copy of Hadoop to setting it up in a cluster and writing data analytic programs.The book begins by making the basic idea of Hadoop and MapReduce easier to grasp by applying the default Hadoop installation to a few easy-to-follow tasks, such as analyzing changes in word frequency across a body of documents. The book continues through the basic concepts of MapReduce applications developed using Hadoop, including a close look at framework components, use of Hadoop for a variety of data analysis tasks, and numerous examples of Hadoop in action.Hadoop in Action will explain how to use Hadoop and present design patterns and practices of programming MapReduce. MapReduce is a complex idea both conceptually and in its implementation, and Hadoop users are challenged to learn all the knobs and levers for running Hadoop. This book takes you beyond the mechanics of running Hadoop, teaching you to write meaningful programs in a MapReduce framework.This book assumes the reader will have a basic familiarity with Java, as most code examples will be written in Java. Familiarity with basic statistical concepts (e.g. histogram, correlation) will help the reader appreciate the more advanced data processing examples. Purchase of the print book comes with an offer of a free PDF, ePub, and Kindle eBook from Manning. Also available is all code from the book.

Machine Learning in Action


Peter Harrington - 2011
    "Machine learning," the process of automating tasks once considered the domain of highly-trained analysts and mathematicians, is the key to efficiently extracting useful information from this sea of raw data. Machine Learning in Action is a unique book that blends the foundational theories of machine learning with the practical realities of building tools for everyday data analysis. In it, the author uses the flexible Python programming language to show how to build programs that implement algorithms for data classification, forecasting, recommendations, and higher-level features like summarization and simplification.

Pattern Classification


David G. Stork - 1973
    Now with the second edition, readers will find information on key new topics such as neural networks and statistical pattern recognition, the theory of machine learning, and the theory of invariances. Also included are worked examples, comparisons between different methods, extensive graphics, expanded exercises and computer project topics.An Instructor's Manual presenting detailed solutions to all the problems in the book is available from the Wiley editorial department.

10 PRINT CHR$(205.5+RND(1)); : GOTO 10


Nick MontfortMark Sample - 2012
    The authors of this collaboratively written book treat code not as merely functional but as a text—in the case of 10 PRINT, a text that appeared in many different printed sources—that yields a story about its making, its purpose, its assumptions, and more. They consider randomness and regularity in computing and art, the maze in culture, the popular BASIC programming language, and the highly influential Commodore 64 computer.

Reinforcement Learning: An Introduction


Richard S. Sutton - 1998
    Their discussion ranges from the history of the field's intellectual foundations to the most recent developments and applications.Reinforcement learning, one of the most active research areas in artificial intelligence, is a computational approach to learning whereby an agent tries to maximize the total amount of reward it receives when interacting with a complex, uncertain environment. In Reinforcement Learning, Richard Sutton and Andrew Barto provide a clear and simple account of the key ideas and algorithms of reinforcement learning. Their discussion ranges from the history of the field's intellectual foundations to the most recent developments and applications. The only necessary mathematical background is familiarity with elementary concepts of probability.The book is divided into three parts. Part I defines the reinforcement learning problem in terms of Markov decision processes. Part II provides basic solution methods: dynamic programming, Monte Carlo methods, and temporal-difference learning. Part III presents a unified view of the solution methods and incorporates artificial neural networks, eligibility traces, and planning; the two final chapters present case studies and consider the future of reinforcement learning.

Writing for Computer Science


Justin Zobel - 1997
    For the most part the book is a discussion of good writing style and effective research strategies. Some of the material is accepted wisdom, some is controversial, and some is my opinions. Although the book is brief, it is designed to be comprehensive: some readers may be interested in exploring topics further, but for most readers this book should be suf?cient. The ?rst edition of this book was almost entirely about writing. This e- tion, partly in response to reader feedback and partly in response to issues that arose in my ownexperiences as an advisor, researcher, and referee, is also about research methods. Indeed, the two topics writing about and doing research are not clearly separated. It is a small step from asking how do I write? to askingwhatisitthatIwriteabout? As previously, the guidance on writing focuses on research, but much of the material is applicable to general technical and professional communication. Likewise, the guidance on the practice of research has broader lessons. A pr- titioner trying a new algorithm or explaining to colleagues why one solution is preferable to another should be con?dent that the arguments are built on robust foundations. And, while this edition has a stronger emphasis on research than did the ?rst, nothing has been deleted; there is additional material on research, but the guidance on writing has not been taken away."

Data Mining: Concepts and Techniques (The Morgan Kaufmann Series in Data Management Systems)


Jiawei Han - 2000
    Not only are all of our business, scientific, and government transactions now computerized, but the widespread use of digital cameras, publication tools, and bar codes also generate data. On the collection side, scanned text and image platforms, satellite remote sensing systems, and the World Wide Web have flooded us with a tremendous amount of data. This explosive growth has generated an even more urgent need for new techniques and automated tools that can help us transform this data into useful information and knowledge.Like the first edition, voted the most popular data mining book by KD Nuggets readers, this book explores concepts and techniques for the discovery of patterns hidden in large data sets, focusing on issues relating to their feasibility, usefulness, effectiveness, and scalability. However, since the publication of the first edition, great progress has been made in the development of new data mining methods, systems, and applications. This new edition substantially enhances the first edition, and new chapters have been added to address recent developments on mining complex types of data- including stream data, sequence data, graph structured data, social network data, and multi-relational data.A comprehensive, practical look at the concepts and techniques you need to know to get the most out of real business dataUpdates that incorporate input from readers, changes in the field, and more material on statistics and machine learningDozens of algorithms and implementation examples, all in easily understood pseudo-code and suitable for use in real-world, large-scale data mining projectsComplete classroom support for instructors at www.mkp.com/datamining2e companion site

Getting Started with OAuth 2.0


Ryan Boyd - 2011
    This concise introduction shows you how OAuth provides a single authorization technology across numerous APIs on the Web, so you can securely access users’ data—such as user profiles, photos, videos, and contact lists—to improve their experience of your application.Through code examples, step-by-step instructions, and use-case examples, you’ll learn how to apply OAuth 2.0 to your server-side web application, client-side app, or mobile app. Find out what it takes to access social graphs, store data in a user’s online filesystem, and perform many other tasks.Understand OAuth 2.0’s role in authentication and authorizationLearn how OAuth’s Authorization Code flow helps you integrate data from different business applicationsDiscover why native mobile apps use OAuth differently than mobile web appsUse OpenID Connect and eliminate the need to build your own authentication system

Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, or Die


Eric Siegel - 2013
    Rather than a "how to" for hands-on techies, the book entices lay-readers and experts alike by covering new case studies and the latest state-of-the-art techniques.You have been predicted — by companies, governments, law enforcement, hospitals, and universities. Their computers say, "I knew you were going to do that!" These institutions are seizing upon the power to predict whether you're going to click, buy, lie, or die.Why? For good reason: predicting human behavior combats financial risk, fortifies healthcare, conquers spam, toughens crime fighting, and boosts sales.How? Prediction is powered by the world's most potent, booming unnatural resource: data. Accumulated in large part as the by-product of routine tasks, data is the unsalted, flavorless residue deposited en masse as organizations churn away. Surprise! This heap of refuse is a gold mine. Big data embodies an extraordinary wealth of experience from which to learn.Predictive analytics unleashes the power of data. With this technology, the computer literally learns from data how to predict the future behavior of individuals. Perfect prediction is not possible, but putting odds on the future — lifting a bit of the fog off our hazy view of tomorrow — means pay dirt.In this rich, entertaining primer, former Columbia University professor and Predictive Analytics World founder Eric Siegel reveals the power and perils of prediction: -What type of mortgage risk Chase Bank predicted before the recession. -Predicting which people will drop out of school, cancel a subscription, or get divorced before they are even aware of it themselves. -Why early retirement decreases life expectancy and vegetarians miss fewer flights. -Five reasons why organizations predict death, including one health insurance company. -How U.S. Bank, European wireless carrier Telenor, and Obama's 2012 campaign calculated the way to most strongly influence each individual. -How IBM's Watson computer used predictive modeling to answer questions and beat the human champs on TV's Jeopardy! -How companies ascertain untold, private truths — how Target figures out you're pregnant and Hewlett-Packard deduces you're about to quit your job. -How judges and parole boards rely on crime-predicting computers to decide who stays in prison and who goes free. -What's predicted by the BBC, Citibank, ConEd, Facebook, Ford, Google, IBM, the IRS, Match.com, MTV, Netflix, Pandora, PayPal, Pfizer, and Wikipedia. A truly omnipresent science, predictive analytics affects everyone, every day. Although largely unseen, it drives millions of decisions, determining whom to call, mail, investigate, incarcerate, set up on a date, or medicate.Predictive analytics transcends human perception. This book's final chapter answers the riddle: What often happens to you that cannot be witnessed, and that you can't even be sure has happened afterward — but that can be predicted in advance?Whether you are a consumer of it — or consumed by it — get a handle on the power of Predictive Analytics.

Prediction Machines: The Simple Economics of Artificial Intelligence


Ajay Agrawal - 2018
    But facing the sea change that AI will bring can be paralyzing. How should companies set strategies, governments design policies, and people plan their lives for a world so different from what we know? In the face of such uncertainty, many analysts either cower in fear or predict an impossibly sunny future.But in Prediction Machines, three eminent economists recast the rise of AI as a drop in the cost of prediction. With this single, masterful stroke, they lift the curtain on the AI-is-magic hype and show how basic tools from economics provide clarity about the AI revolution and a basis for action by CEOs, managers, policy makers, investors, and entrepreneurs.When AI is framed as cheap prediction, its extraordinary potential becomes clear: Prediction is at the heart of making decisions under uncertainty. Our businesses and personal lives are riddled with such decisions. Prediction tools increase productivity--operating machines, handling documents, communicating with customers. Uncertainty constrains strategy. Better prediction creates opportunities for new business structures and strategies to compete. Penetrating, fun, and always insightful and practical, Prediction Machines follows its inescapable logic to explain how to navigate the changes on the horizon. The impact of AI will be profound, but the economic framework for understanding it is surprisingly simple.

Darwin Among The Machines: The Evolution Of Global Intelligence


George Dyson - 1997
    Dyson traces the course of the information revolution, illuminating the lives and work of visionaries - from the time of Thomas Hobbes to the time of John von Neumann - who foresaw the development of artificial intelligence, artificial life, and artificial mind. This book derives both its title and its outlook from Samuel Butler's 1863 essay "Darwin Among the Machines." Observing the beginnings of miniaturization, self-reproduction, and telecommunication among machines, Butler predicted that nature's intelligence, only temporarily subservient to technology, would resurface to claim our creations as her own. Weaving a cohesive narrative among his brilliant predecessors, Dyson constructs a straightforward, convincing, and occasionally frightening view of the evolution of mind in the global network, on a level transcending our own. Dyson concludes that we are in the midst of an experiment that echoes the prehistory of human intelligence and the origins of life. Just as the exchange of coded molecular instructions brought life as we know it to the early earth's primordial soup, and as language and mind combined to form the culture in which we live, so, in the digital universe, are computer programs and worldwide networks combining to produce an evolutionary theater in which the distinctions between nature and technology are increasingly obscured. Nature, believes Dyson, is on the side of the machines.

The Mikado Method


Ola Ellnestam - 2014
    The Mikado Method is a process for surfacing the dependencies in a codebase, so that you can systematically eliminate technical debt and get things done.It gets its name from a simple game commonly known as "pick-up sticks." You start with a jumbled pile of sticks. The goal is to remove the Mikado, or Emperor, stick without disturbing the others. Players carefully remove sticks one at a time, leaving the rest of the heap intact, slowly exposing the Mikado. The game is a great metaphor for eliminating technical debt—carefully extracting each intertwined dependency until you're able to successfully resolve the central issue and move on.The Mikado Method is a book by the creators of this process. It describes a pragmatic, straightforward, and empirical method to plan and perform non-trivial technical improvements on an existing software system. The method has simple rules, but the applicability is vast. As you read, you'll practice a step-by-step system for identifying the scope and nature of your technical debt, mapping the key dependencies, and determining the safest way to approach the "Mikado"-your goal. A natural byproduct of this process is the Mikado Graph, a minimalistic, relevant, just-in-time roadmap and information radiator that reflects deep understanding of how your system works.

Genetic Algorithms in Search, Optimization, and Machine Learning


David Edward Goldberg - 1989
    Major concepts are illustrated with running examples, and major algorithms are illustrated by Pascal computer programs. No prior knowledge of GAs or genetics is assumed, and only a minimum of computer programming and mathematics background is required. 0201157675B07092001

The Problem with Software: Why Smart Engineers Write Bad Code


Adam Barr - 2018
    As the size and complexity of commercial software have grown, the gap between academic computer science and industry has widened. It's an open secret that there is little engineering in software engineering, which continues to rely not on codified scientific knowledge but on intuition and experience.Barr, who worked as a programmer for more than twenty years, describes how the industry has evolved, from the era of mainframes and Fortran to today's embrace of the cloud. He explains bugs and why software has so many of them, and why today's interconnected computers offer fertile ground for viruses and worms. The difference between good and bad software can be a single line of code, and Barr includes code to illustrate the consequences of seemingly inconsequential choices by programmers. Looking to the future, Barr writes that the best prospect for improving software engineering is the move to the cloud. When software is a service and not a product, companies will have more incentive to make it good rather than "good enough to ship."

Big Data: A Revolution That Will Transform How We Live, Work, and Think


Viktor Mayer-Schönberger - 2013
    “Big data” refers to our burgeoning ability to crunch vast collections of information, analyze it instantly, and draw sometimes profoundly surprising conclusions from it. This emerging science can translate myriad phenomena—from the price of airline tickets to the text of millions of books—into searchable form, and uses our increasing computing power to unearth epiphanies that we never could have seen before. A revolution on par with the Internet or perhaps even the printing press, big data will change the way we think about business, health, politics, education, and innovation in the years to come. It also poses fresh threats, from the inevitable end of privacy as we know it to the prospect of being penalized for things we haven’t even done yet, based on big data’s ability to predict our future behavior.In this brilliantly clear, often surprising work, two leading experts explain what big data is, how it will change our lives, and what we can do to protect ourselves from its hazards. Big Data is the first big book about the next big thing.www.big-data-book.com