Hadoop in Action


Chuck Lam - 2010
    The intended readers are programmers, architects, and project managers who have to process large amounts of data offline. Hadoop in Action will lead the reader from obtaining a copy of Hadoop to setting it up in a cluster and writing data analytic programs.The book begins by making the basic idea of Hadoop and MapReduce easier to grasp by applying the default Hadoop installation to a few easy-to-follow tasks, such as analyzing changes in word frequency across a body of documents. The book continues through the basic concepts of MapReduce applications developed using Hadoop, including a close look at framework components, use of Hadoop for a variety of data analysis tasks, and numerous examples of Hadoop in action.Hadoop in Action will explain how to use Hadoop and present design patterns and practices of programming MapReduce. MapReduce is a complex idea both conceptually and in its implementation, and Hadoop users are challenged to learn all the knobs and levers for running Hadoop. This book takes you beyond the mechanics of running Hadoop, teaching you to write meaningful programs in a MapReduce framework.This book assumes the reader will have a basic familiarity with Java, as most code examples will be written in Java. Familiarity with basic statistical concepts (e.g. histogram, correlation) will help the reader appreciate the more advanced data processing examples. Purchase of the print book comes with an offer of a free PDF, ePub, and Kindle eBook from Manning. Also available is all code from the book.

The Quants: How a New Breed of Math Whizzes Conquered Wall Street and Nearly Destroyed It


Scott Patterson - 2010
     They were preparing to compete in a poker tournament with million-dollar stakes, but those numbers meant nothing to them.  They were accustomed to risking billions.     At the card table that night was Peter Muller, an eccentric, whip-smart whiz kid who’d studied theoretical mathematics at Princeton and now managed a fabulously successful hedge fund called PDT…when he wasn’t playing his keyboard for morning commuters on the New York subway.  With him was Ken Griffin, who as an undergraduate trading convertible bonds out of his Harvard dorm room had outsmarted the Wall Street pros and made money in one of the worst bear markets of all time.  Now he was the tough-as-nails head of Citadel Investment Group, one of the most powerful money machines on earth. There too were Cliff Asness, the sharp-tongued, mercurial founder of the hedge fund AQR, a man as famous for his computer-smashing rages as for his brilliance, and Boaz Weinstein, chess life-master and king of the credit default swap, who while juggling $30 billion worth of positions for Deutsche Bank found time for frequent visits to Las Vegas with the famed MIT card-counting team.     On that night in 2006, these four men and their cohorts were the new kings of Wall Street.  Muller, Griffin, Asness, and Weinstein were among the best and brightest of a  new breed, the quants.  Over the prior twenty years, this species of math whiz --technocrats who make billions not with gut calls or fundamental analysis but with formulas and high-speed computers-- had usurped the testosterone-fueled, kill-or-be-killed risk-takers who’d long been the alpha males the world’s largest casino.  The quants believed that a dizzying, indecipherable-to-mere-mortals cocktail of differential calculus, quantum physics, and advanced geometry held the key to reaping riches from the financial markets.  And they helped create a digitized money-trading machine that could shift billions around the globe with the click of a mouse.     Few realized that night, though, that in creating this unprecedented machine, men like Muller, Griffin, Asness and Weinstein had sowed the seeds for history’s greatest financial disaster.     Drawing on unprecedented access to these four number-crunching titans, The Quants tells the inside story of what they thought and felt in the days and weeks when they helplessly watched much of their net worth vaporize – and wondered just how their mind-bending formulas and genius-level IQ’s had led them so wrong, so fast.  Had their years of success been dumb luck, fool’s gold, a good run that could come to an end on any given day?  What if The Truth they sought -- the secret of the markets -- wasn’t knowable? Worse, what if there wasn’t any Truth?   In The Quants, Scott Patterson tells the story not just of these men, but of Jim Simons, the reclusive founder of the most successful hedge fund in history; Aaron Brown, the quant who used his math skills to humiliate Wall Street’s old guard at their trademark game of Liar’s Poker, and years later found himself with a front-row seat to the rapid emergence of mortgage-backed securities; and gadflies and dissenters such as Paul Wilmott, Nassim Taleb, and Benoit Mandelbrot.     With the immediacy of today’s NASDAQ close and the timeless power of a Greek tragedy, The Quants is at once a masterpiece of explanatory journalism, a gripping tale of ambition and hubris…and an ominous warning about Wall Street’s future.

Super Crunchers: Why Thinking-By-Numbers Is the New Way to Be Smart


Ian Ayres - 2007
    In this lively and groundbreaking new book, economist Ian Ayres shows how today's best and brightest organizations are analyzing massive databases at lightening speed to provide greater insights into human behavior. They are the Super Crunchers. From internet sites like Google and Amazon that know your tastes better than you do, to a physician's diagnosis and your child's education, to boardrooms and government agencies, this new breed of decision makers are calling the shots. And they are delivering staggeringly accurate results. How can a football coach evaluate a player without ever seeing him play? Want to know whether the price of an airline ticket will go up or down before you buy? How can a formula outpredict wine experts in determining the best vintages? Super crunchers have the answers. In this brave new world of equation versus expertise, Ayres shows us the benefits and risks, who loses and who wins, and how super crunching can be used to help, not manipulate us.Gone are the days of solely relying on intuition to make decisions. No businessperson, consumer, or student who wants to stay ahead of the curve should make another keystroke without reading Super Crunchers.

The Fourth Paradigm: Data-Intensive Scientific Discovery


Tony Hey - 2009
    Increasingly, scientific breakthroughs will be powered by advanced computing capabilities that help researchers manipulate and explore massive datasets. The speed at which any given scientific discipline advances will depend on how well its researchers collaborate with one another, and with technologists, in areas of eScience such as databases, workflow management, visualization, and cloud-computing technologies. This collection of essays expands on the vision of pioneering computer scientist Jim Gray for a new, fourth paradigm of discovery based on data-intensive science and offers insights into how it can be fully realized.

Statistics Done Wrong: The Woefully Complete Guide


Alex Reinhart - 2013
    Politicians and marketers present shoddy evidence for dubious claims all the time. But smart people make mistakes too, and when it comes to statistics, plenty of otherwise great scientists--yes, even those published in peer-reviewed journals--are doing statistics wrong."Statistics Done Wrong" comes to the rescue with cautionary tales of all-too-common statistical fallacies. It'll help you see where and why researchers often go wrong and teach you the best practices for avoiding their mistakes.In this book, you'll learn: - Why "statistically significant" doesn't necessarily imply practical significance- Ideas behind hypothesis testing and regression analysis, and common misinterpretations of those ideas- How and how not to ask questions, design experiments, and work with data- Why many studies have too little data to detect what they're looking for-and, surprisingly, why this means published results are often overestimates- Why false positives are much more common than "significant at the 5% level" would suggestBy walking through colorful examples of statistics gone awry, the book offers approachable lessons on proper methodology, and each chapter ends with pro tips for practicing scientists and statisticians. No matter what your level of experience, "Statistics Done Wrong" will teach you how to be a better analyst, data scientist, or researcher.

NSHipster: Obscure Topics in Cocoa & Objective C


Mattt Thompson - 2013
    In cultivating a deep understanding and appreciation of Objective-C, its frameworks and ecosystem, one is able to create apps that delight and inspire users. Combining articles from NSHipster.com with new essays, this book is the essential guide for modern iOS and Mac OS X developers.

Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, or Die


Eric Siegel - 2013
    Rather than a "how to" for hands-on techies, the book entices lay-readers and experts alike by covering new case studies and the latest state-of-the-art techniques.You have been predicted — by companies, governments, law enforcement, hospitals, and universities. Their computers say, "I knew you were going to do that!" These institutions are seizing upon the power to predict whether you're going to click, buy, lie, or die.Why? For good reason: predicting human behavior combats financial risk, fortifies healthcare, conquers spam, toughens crime fighting, and boosts sales.How? Prediction is powered by the world's most potent, booming unnatural resource: data. Accumulated in large part as the by-product of routine tasks, data is the unsalted, flavorless residue deposited en masse as organizations churn away. Surprise! This heap of refuse is a gold mine. Big data embodies an extraordinary wealth of experience from which to learn.Predictive analytics unleashes the power of data. With this technology, the computer literally learns from data how to predict the future behavior of individuals. Perfect prediction is not possible, but putting odds on the future — lifting a bit of the fog off our hazy view of tomorrow — means pay dirt.In this rich, entertaining primer, former Columbia University professor and Predictive Analytics World founder Eric Siegel reveals the power and perils of prediction: -What type of mortgage risk Chase Bank predicted before the recession. -Predicting which people will drop out of school, cancel a subscription, or get divorced before they are even aware of it themselves. -Why early retirement decreases life expectancy and vegetarians miss fewer flights. -Five reasons why organizations predict death, including one health insurance company. -How U.S. Bank, European wireless carrier Telenor, and Obama's 2012 campaign calculated the way to most strongly influence each individual. -How IBM's Watson computer used predictive modeling to answer questions and beat the human champs on TV's Jeopardy! -How companies ascertain untold, private truths — how Target figures out you're pregnant and Hewlett-Packard deduces you're about to quit your job. -How judges and parole boards rely on crime-predicting computers to decide who stays in prison and who goes free. -What's predicted by the BBC, Citibank, ConEd, Facebook, Ford, Google, IBM, the IRS, Match.com, MTV, Netflix, Pandora, PayPal, Pfizer, and Wikipedia. A truly omnipresent science, predictive analytics affects everyone, every day. Although largely unseen, it drives millions of decisions, determining whom to call, mail, investigate, incarcerate, set up on a date, or medicate.Predictive analytics transcends human perception. This book's final chapter answers the riddle: What often happens to you that cannot be witnessed, and that you can't even be sure has happened afterward — but that can be predicted in advance?Whether you are a consumer of it — or consumed by it — get a handle on the power of Predictive Analytics.

Building Machine Learning Systems with Python


Willi Richert - 2013
    

The Anatomy of the Swipe: Making Money Move


Ahmed Siddiqui - 2020
    Yet, not many people understand how payment systems in the US work. Those that do "get it" are unlocking multi-billion dollar opportunities. If you've ever wondered what happens when you actually swipe/dip/tap your credit card or debit card then The Anatomy of the Swipe breaks down the details in the simplest manner possible. Here are some questions answered within these pages:How does money move from my credit card to my favorite coffee shop? How can I build a neo-bank? How can I build my own debit or credit card? How can I accept card based payments? The Anatomy of the Swipe speaks to software developers and entrepreneurs who are looking at implementing card-based payments for the first time, merchants who want to be able to accept payments for a website or store, or those who want to issue their own debit/credit card. This book walks beginners through modern innovations created because of card-based payments, as well as the motivations and revenue models of each party in the payments ecosystem.

Data Structures Using C++


D.S. Malik - 2003
    D.S. Malik is ideal for a one-semester course focused on data structures. Clearly written with the student in mind, this text focuses on Data Structures and includes advanced topics in C++ such as Linked Lists and the Standard Template Library (STL). This student-friendly text features abundant Programming Examples and extensive use of visual diagrams to reinforce difficult topics. Students will find Dr. Malik's use of complete programming code and clear display of syntax, explanation, and example easy to read and conducive to learning.

Competing on Analytics: The New Science of Winning


Thomas H. Davenport - 2007
    But are you using it to “out-think” your rivals? If not, you may be missing out on a potent competitive tool.In Competing on Analytics: The New Science of Winning, Thomas H. Davenport and Jeanne G. Harris argue that the frontier for using data to make decisions has shifted dramatically. Certain high-performing enterprises are now building their competitive strategies around data-driven insights that in turn generate impressive business results. Their secret weapon? Analytics: sophisticated quantitative and statistical analysis and predictive modeling.Exemplars of analytics are using new tools to identify their most profitable customers and offer them the right price, to accelerate product innovation, to optimize supply chains, and to identify the true drivers of financial performance. A wealth of examples—from organizations as diverse as Amazon, Barclay’s, Capital One, Harrah’s, Procter & Gamble, Wachovia, and the Boston Red Sox—illuminate how to leverage the power of analytics.

Python Machine Learning


Sebastian Raschka - 2015
    We are living in an age where data comes in abundance, and thanks to the self-learning algorithms from the field of machine learning, we can turn this data into knowledge. Automated speech recognition on our smart phones, web search engines, e-mail spam filters, the recommendation systems of our favorite movie streaming services – machine learning makes it all possible.Thanks to the many powerful open-source libraries that have been developed in recent years, machine learning is now right at our fingertips. Python provides the perfect environment to build machine learning systems productively.This book will teach you the fundamentals of machine learning and how to utilize these in real-world applications using Python. Step-by-step, you will expand your skill set with the best practices for transforming raw data into useful information, developing learning algorithms efficiently, and evaluating results.You will discover the different problem categories that machine learning can solve and explore how to classify objects, predict continuous outcomes with regression analysis, and find hidden structures in data via clustering. You will build your own machine learning system for sentiment analysis and finally, learn how to embed your model into a web app to share with the world

Laravel: Code Bright


Dayle Rees - 2013
    At $29 and cheaper than a good pizza, you will get the book in its current partial form, along with all future chapters, updates, and fixes for free. As of the day I wrote this description, Code Bright had 130 pages and was just getting started. To give you some perspective on how detailed it is, Code Happy was 127 pages in its complete state. Want to know more? Carry on reading.Welcome back to Laravel. Last year I wrote a book about the Laravel PHP framework. It started as a collection of tutorials on my blog, and eventually became a full book. I definitely didn’t expect it to be as popular as it was. Code Happy has sold almost 3000 copies, and is considered to be one of the most valuable resourcesfor learning the Laravel framework.Code Bright is the spiritual successor to Code Happy. The framework has grown a lot in the past year, and has changed enough to merit a new title. With Code Bright I hope to improve on Code Happy with every way, my goal is, to once again, build the most comprehensive learning experience for the framework. Oh, and to still be funny. That’s very important to me.Laravel Code Bright will contain a complete learning experience for all of the framework’s features. The style of writing will make it approachable for beginners, and a wonderful reference resource for experienced developers alike.You see, people have told me that they enjoyed reading Code Happy, not only for its educational content, but for its humour, and for my down to earth writing style. This is very important to me. I like to write my books as if we were having a conversation in a bar.When I wrote Code Happy last year, I was simply a framework enthusiast. One of the first to share information about the framework. However, since then I have become a committed member of the core development team. Working directly with the framework author to make Laravel a wonderful experience for the developers of the world.One other important feature of both books, is that they are published while in progress. This means that the book is available in an incomplete state, but will grow over time into a complete title. All future updates will be provided for free.What this means is that I don’t have to worry about deadlines, or a fixed point of completion. It leads to less stress and better writing. If I think of a better way to explain something, I can go back and change it. In a sense, the book will never be completed. I can constantly add more information to it, until it becomes the perfect resource.Given that this time I am using the majority of my spare time to write the title (yes, I have a full time job too!), I have raised the price a little to justify my invested time. I was told by many of my past readers that they found the previous title very cheap for the resource that it grew into, so if you are worried about the new price, then let me remind you what you will get for your 29 bucks.The successor to Code Happy, seen by many as the #1 learning resource for the Laravel PHP framework.An unending source of information, chapters will be constantly added as needed until the book becomes a giant vault of framework knowledge.Comedy, and a little cheesy, but very friendly writing.

Bayesian Methods for Hackers: Probabilistic Programming and Bayesian Inference


Cameron Davidson-Pilon - 2014
    However, most discussions of Bayesian inference rely on intensely complex mathematical analyses and artificial examples, making it inaccessible to anyone without a strong mathematical background. Now, though, Cameron Davidson-Pilon introduces Bayesian inference from a computational perspective, bridging theory to practice-freeing you to get results using computing power. Bayesian Methods for Hackers illuminates Bayesian inference through probabilistic programming with the powerful PyMC language and the closely related Python tools NumPy, SciPy, and Matplotlib. Using this approach, you can reach effective solutions in small increments, without extensive mathematical intervention. Davidson-Pilon begins by introducing the concepts underlying Bayesian inference, comparing it with other techniques and guiding you through building and training your first Bayesian model. Next, he introduces PyMC through a series of detailed examples and intuitive explanations that have been refined after extensive user feedback. You'll learn how to use the Markov Chain Monte Carlo algorithm, choose appropriate sample sizes and priors, work with loss functions, and apply Bayesian inference in domains ranging from finance to marketing. Once you've mastered these techniques, you'll constantly turn to this guide for the working PyMC code you need to jumpstart future projects. Coverage includes - Learning the Bayesian "state of mind" and its practical implications - Understanding how computers perform Bayesian inference - Using the PyMC Python library to program Bayesian analyses - Building and debugging models with PyMC - Testing your model's "goodness of fit" - Opening the "black box" of the Markov Chain Monte Carlo algorithm to see how and why it works - Leveraging the power of the "Law of Large Numbers" - Mastering key concepts, such as clustering, convergence, autocorrelation, and thinning - Using loss functions to measure an estimate's weaknesses based on your goals and desired outcomes - Selecting appropriate priors and understanding how their influence changes with dataset size - Overcoming the "exploration versus exploitation" dilemma: deciding when "pretty good" is good enough - Using Bayesian inference to improve A/B testing - Solving data science problems when only small amounts of data are available Cameron Davidson-Pilon has worked in many areas of applied mathematics, from the evolutionary dynamics of genes and diseases to stochastic modeling of financial prices. His contributions to the open source community include lifelines, an implementation of survival analysis in Python. Educated at the University of Waterloo and at the Independent University of Moscow, he currently works with the online commerce leader Shopify.

Automate This: How Algorithms Came to Rule Our World


Christopher Steiner - 2012
    It used to be that to diagnose an illness, interpret legal documents, analyze foreign policy, or write a newspaper article you needed a human being with specific skills—and maybe an advanced degree or two. These days, high-level tasks are increasingly being handled by algorithms that can do precise work not only with speed but also with nuance. These “bots” started with human programming and logic, but now their reach extends beyond what their creators ever expected. In this fascinating, frightening book, Christopher Steiner tells the story of how algorithms took over—and shows why the “bot revolution” is about to spill into every aspect of our lives, often silently, without our knowledge. The May 2010 “Flash Crash” exposed Wall Street’s reliance on trading bots to the tune of a 998-point market drop and $1 trillion in vanished market value. But that was just the beginning. In Automate This, we meet bots that are driving cars, penning haiku, and writing music mistaken for Bach’s. They listen in on our customer service calls and figure out what Iran would do in the event of a nuclear standoff. There are algorithms that can pick out the most cohesive crew of astronauts for a space mission or identify the next Jeremy Lin. Some can even ingest statistics from baseball games and spit out pitch-perfect sports journalism indistinguishable from that produced by humans. The interaction of man and machine can make our lives easier. But what will the world look like when algorithms control our hospitals, our roads, our culture, and our national security? What hap­pens to businesses when we automate judgment and eliminate human instinct? And what role will be left for doctors, lawyers, writers, truck drivers, and many others?  Who knows—maybe there’s a bot learning to do your job this minute.