Big Data Now: Current Perspectives from O'Reilly Radar


O'Reilly Radar Team - 2011
    Mike Loukides kicked things off in June 2010 with “What is data science?” and from there we’ve pursued the various threads and themes that naturally emerged. Now, roughly a year later, we can look back over all we’ve covered and identify a number of core data areas: Data issues -- The opportunities and ambiguities of the data space are evident in discussions around privacy, the implications of data-centric industries, and the debate about the phrase “data science” itself. The application of data: products and processes – A “data product” can emerge from virtually any domain, including everything from data startups to established enterprises to media/journalism to education and research. Data science and data tools -- The tools and technologies that drive data science are of course essential to this space, but the varied techniques being applied are also key to understanding the big data arena.The business of data – Take a closer look at the actions connected to data -- the finding, organizing, and analyzing that provide organizations of all sizes with the information they need to compete.

Think Stats


Allen B. Downey - 2011
    This concise introduction shows you how to perform statistical analysis computationally, rather than mathematically, with programs written in Python.You'll work with a case study throughout the book to help you learn the entire data analysis process—from collecting data and generating statistics to identifying patterns and testing hypotheses. Along the way, you'll become familiar with distributions, the rules of probability, visualization, and many other tools and concepts.Develop your understanding of probability and statistics by writing and testing codeRun experiments to test statistical behavior, such as generating samples from several distributionsUse simulations to understand concepts that are hard to grasp mathematicallyLearn topics not usually covered in an introductory course, such as Bayesian estimationImport data from almost any source using Python, rather than be limited to data that has been cleaned and formatted for statistics toolsUse statistical inference to answer questions about real-world data

The Hard Thing About Hard Things: Building a Business When There Are No Easy Answers


Ben Horowitz - 2014
    His blog has garnered a devoted following of millions of readers who have come to rely on him to help them run their businesses. A lifelong rap fan, Horowitz amplifies business lessons with lyrics from his favorite songs and tells it straight about everything from firing friends to poaching competitors, from cultivating and sustaining a CEO mentality to knowing the right time to cash in.His advice is grounded in anecdotes from his own hard-earned rise—from cofounding the early cloud service provider Loudcloud to building the phenomenally successful Andreessen Horowitz venture capital firm, both with fellow tech superstar Marc Andreessen (inventor of Mosaic, the Internet's first popular Web browser). This is no polished victory lap; he analyzes issues with no easy answers through his trials, includingdemoting (or firing) a loyal friend;whether you should incorporate titles and promotions, and how to handle them;if it's OK to hire people from your friend's company;how to manage your own psychology, while the whole company is relying on you;what to do when smart people are bad employees;why Andreessen Horowitz prefers founder CEOs, and how to become one;whether you should sell your company, and how to do it.Filled with Horowitz's trademark humor and straight talk, and drawing from his personal and often humbling experiences, The Hard Thing About Hard Things is invaluable for veteran entrepreneurs as well as those aspiring to their own new ventures.

Dark Pools: The Rise of Artificially Intelligent Trading Machines and the Looming Threat to Wall Street


Scott Patterson - 2012
    In the beginning was Josh Levine, an idealistic programming genius who dreamed of wresting control of the market from the big exchanges that, again and again, gave the giant institutions an advantage over the little guy. Levine created a computerized trading hub named Island where small traders swapped stocks, and over time his invention morphed into a global electronic stock market that sent trillions in capital through a vast jungle of fiber-optic cables. By then, the market that Levine had sought to fix had turned upside down, birthing secretive exchanges called dark pools and a new species of trading machines that could think, and that seemed, ominously, to be slipping the control of their human masters. Dark Pools is the fascinating story of how global markets have been hijacked by trading robots--many so self-directed that humans can't predict what they'll do next.

The Lean Startup: How Today's Entrepreneurs Use Continuous Innovation to Create Radically Successful Businesses


Eric Ries - 2011
    But many of those failures are preventable. The Lean Startup is a new approach being adopted across the globe, changing the way companies are built and new products are launched. Eric Ries defines a startup as an organization dedicated to creating something new under conditions of extreme uncertainty. This is just as true for one person in a garage or a group of seasoned professionals in a Fortune 500 boardroom. What they have in common is a mission to penetrate that fog of uncertainty to discover a successful path to a sustainable business.The Lean Startup approach fosters companies that are both more capital efficient and that leverage human creativity more effectively. Inspired by lessons from lean manufacturing, it relies on "validated learning," rapid scientific experimentation, as well as a number of counter-intuitive practices that shorten product development cycles, measure actual progress without resorting to vanity metrics, and learn what customers really want. It enables a company to shift directions with agility, altering plans inch by inch, minute by minute.Rather than wasting time creating elaborate business plans, The Lean Startup offers entrepreneurs - in companies of all sizes - a way to test their vision continuously, to adapt and adjust before it's too late. Ries provides a scientific approach to creating and managing successful startups in a age when companies need to innovate more than ever.

Automate the Boring Stuff with Python: Practical Programming for Total Beginners


Al Sweigart - 2014
    But what if you could have your computer do them for you?In "Automate the Boring Stuff with Python," you'll learn how to use Python to write programs that do in minutes what would take you hours to do by hand no prior programming experience required. Once you've mastered the basics of programming, you'll create Python programs that effortlessly perform useful and impressive feats of automation to: Search for text in a file or across multiple filesCreate, update, move, and rename files and foldersSearch the Web and download online contentUpdate and format data in Excel spreadsheets of any sizeSplit, merge, watermark, and encrypt PDFsSend reminder emails and text notificationsFill out online formsStep-by-step instructions walk you through each program, and practice projects at the end of each chapter challenge you to improve those programs and use your newfound skills to automate similar tasks.Don't spend your time doing work a well-trained monkey could do. Even if you've never written a line of code, you can make your computer do the grunt work. Learn how in "Automate the Boring Stuff with Python.""

The Master Algorithm: How the Quest for the Ultimate Learning Machine Will Remake Our World


Pedro Domingos - 2015
    In The Master Algorithm, Pedro Domingos lifts the veil to give us a peek inside the learning machines that power Google, Amazon, and your smartphone. He assembles a blueprint for the future universal learner--the Master Algorithm--and discusses what it will mean for business, science, and society. If data-ism is today's philosophy, this book is its bible.

Predictive Analytics for Dummies


Anasse Bari - 2013
    Predictive Analytics For Dummies explores the power of predictive analytics and how you can use it to make valuable predictions for your business, or in fields such as advertising, fraud detection, politics, and others. This practical book does not bog you down with loads of mathematical or scientific theory, but instead helps you quickly see how to use the right algorithms and tools to collect and analyze data and apply it to make predictions.Topics include using structured and unstructured data, building models, creating a predictive analysis roadmap, setting realistic goals, budgeting, and much more.Shows readers how to use Big Data and data mining to discover patterns and make predictions for tech-savvy businesses Helps readers see how to shepherd predictive analytics projects through their companies Explains just enough of the science and math, but also focuses on practical issues such as protecting project budgets, making good presentations, and more Covers nuts-and-bolts topics including predictive analytics basics, using structured and unstructured data, data mining, and algorithms and techniques for analyzing data Also covers clustering, association, and statistical models; creating a predictive analytics roadmap; and applying predictions to the web, marketing, finance, health care, and elsewhere Propose, produce, and protect predictive analytics projects through your company with Predictive Analytics For Dummies.

Introduction to Algorithms


Thomas H. Cormen - 1989
    Each chapter is relatively self-contained and can be used as a unit of study. The algorithms are described in English and in a pseudocode designed to be readable by anyone who has done a little programming. The explanations have been kept elementary without sacrificing depth of coverage or mathematical rigor.

97 Things Every Programmer Should Know: Collective Wisdom from the Experts


Kevlin Henney - 2010
    With the 97 short and extremely useful tips for programmers in this book, you'll expand your skills by adopting new approaches to old problems, learning appropriate best practices, and honing your craft through sound advice.With contributions from some of the most experienced and respected practitioners in the industry--including Michael Feathers, Pete Goodliffe, Diomidis Spinellis, Cay Horstmann, Verity Stob, and many more--this book contains practical knowledge and principles that you can apply to all kinds of projects.A few of the 97 things you should know:"Code in the Language of the Domain" by Dan North"Write Tests for People" by Gerard Meszaros"Convenience Is Not an -ility" by Gregor Hohpe"Know Your IDE" by Heinz Kabutz"A Message to the Future" by Linda Rising"The Boy Scout Rule" by Robert C. Martin (Uncle Bob)"Beware the Share" by Udi Dahan

Scrum: The Art of Doing Twice the Work in Half the Time


Jeff Sutherland - 2014
    It already drives most of the world’s top technology companies. And now it’s starting to spread to every domain where leaders wrestle with complex projects. If you’ve ever been startled by how fast the world is changing, Scrum is one of the reasons why. Productivity gains of as much as 1200% have been recorded, and there’s no more lucid – or compelling – explainer of Scrum and its bright promise than Jeff Sutherland, the man who put together the first Scrum team more than twenty years ago. The thorny problem Jeff began tackling back then boils down to this: people are spectacularly bad at doing things with agility and efficiency. Best laid plans go up in smoke. Teams often work at cross purposes to each other. And when the pressure rises, unhappiness soars. Drawing on his experience as a West Point-educated fighter pilot, biometrics expert, early innovator of ATM technology, and V.P. of engineering or CTO at eleven different technology companies, Jeff began challenging those dysfunctional realities, looking for solutions that would have global impact. In this book you’ll journey to Scrum’s front lines where Jeff’s system of deep accountability, team interaction, and constant iterative improvement is, among other feats, bringing the FBI into the 21st century, perfecting the design of an affordable 140 mile per hour/100 mile per gallon car, helping NPR report fast-moving action in the Middle East, changing the way pharmacists interact with patients, reducing poverty in the Third World, and even helping people plan their weddings and accomplish weekend chores. Woven with insights from martial arts, judicial decision making, advanced aerial combat, robotics, and many other disciplines, Scrum is consistently riveting. But the most important reason to read this book is that it may just help you achieve what others consider unachievable – whether it be inventing a trailblazing technology, devising a new system of education, pioneering a way to feed the hungry, or, closer to home, a building a foundation for your family to thrive and prosper.

Hadoop: The Definitive Guide


Tom White - 2009
    Ideal for processing large datasets, the Apache Hadoop framework is an open source implementation of the MapReduce algorithm on which Google built its empire. This comprehensive resource demonstrates how to use Hadoop to build reliable, scalable, distributed systems: programmers will find details for analyzing large datasets, and administrators will learn how to set up and run Hadoop clusters. Complete with case studies that illustrate how Hadoop solves specific problems, this book helps you:Use the Hadoop Distributed File System (HDFS) for storing large datasets, and run distributed computations over those datasets using MapReduce Become familiar with Hadoop's data and I/O building blocks for compression, data integrity, serialization, and persistence Discover common pitfalls and advanced features for writing real-world MapReduce programs Design, build, and administer a dedicated Hadoop cluster, or run Hadoop in the cloud Use Pig, a high-level query language for large-scale data processing Take advantage of HBase, Hadoop's database for structured and semi-structured data Learn ZooKeeper, a toolkit of coordination primitives for building distributed systems If you have lots of data -- whether it's gigabytes or petabytes -- Hadoop is the perfect solution. Hadoop: The Definitive Guide is the most thorough book available on the subject. "Now you have the opportunity to learn about Hadoop from a master-not only of the technology, but also of common sense and plain talk." -- Doug Cutting, Hadoop Founder, Yahoo!

Advanced Analytics with Spark


Sandy Ryza - 2015
    

Code Complete


Steve McConnell - 1993
    Now this classic book has been fully updated and revised with leading-edge practices--and hundreds of new code samples--illustrating the art and science of software construction. Capturing the body of knowledge available from research, academia, and everyday commercial practice, McConnell synthesizes the most effective techniques and must-know principles into clear, pragmatic guidance. No matter what your experience level, development environment, or project size, this book will inform and stimulate your thinking--and help you build the highest quality code. Discover the timeless techniques and strategies that help you: Design for minimum complexity and maximum creativity Reap the benefits of collaborative development Apply defensive programming techniques to reduce and flush out errors Exploit opportunities to refactor--or evolve--code, and do it safely Use construction practices that are right-weight for your project Debug problems quickly and effectively Resolve critical construction issues early and correctly Build quality into the beginning, middle, and end of your project

Life After Google: The Fall of Big Data and the Rise of the Blockchain Economy


George Gilder - 2018
    Gilder says or writes is ever delivered at anything less than the fullest philosophical decibel... Mr. Gilder sounds less like a tech guru than a poet, and his words tumble out in a romantic cascade." “Google’s algorithms assume the world’s future is nothing more than the next moment in a random process. George Gilder shows how deep this assumption goes, what motivates people to make it, and why it’s wrong: the future depends on human action.” — Peter Thiel, founder of PayPal and Palantir Technologies and author of Zero to One: Notes on Startups, or How to Build the Future The Age of Google, built on big data and machine intelligence, has been an awesome era. But it’s coming to an end. In Life after Google, George Gilder—the peerless visionary of technology and culture—explains why Silicon Valley is suffering a nervous breakdown and what to expect as the post-Google age dawns. Google’s astonishing ability to “search and sort” attracts the entire world to its search engine and countless other goodies—videos, maps, email, calendars….And everything it offers is free, or so it seems. Instead of paying directly, users submit to advertising. The system of “aggregate and advertise” works—for a while—if you control an empire of data centers, but a market without prices strangles entrepreneurship and turns the Internet into a wasteland of ads. The crisis is not just economic. Even as advances in artificial intelligence induce delusions of omnipotence and transcendence, Silicon Valley has pretty much given up on security. The Internet firewalls supposedly protecting all those passwords and personal information have proved hopelessly permeable. The crisis cannot be solved within the current computer and network architecture. The future lies with the “cryptocosm”—the new architecture of the blockchain and its derivatives. Enabling cryptocurrencies such as bitcoin and ether, NEO and Hashgraph, it will provide the Internet a secure global payments system, ending the aggregate-and-advertise Age of Google. Silicon Valley, long dominated by a few giants, faces a “great unbundling,” which will disperse computer power and commerce and transform the economy and the Internet. Life after Google is almost here.   For fans of "Wealth and Poverty," "Knowledge and Power," and "The Scandal of Money."