Book picks similar to
Development Workflows for Data Scientists: Enabling Fast, Efficient, and Reproducible Results for Data Science Teams by Ciara Byrne
data-science
data
computer-science
development
Pattern Recognition and Machine Learning
Christopher M. Bishop - 2006
However, these activities can be viewed as two facets of the same field, and together they have undergone substantial development over the past ten years. In particular, Bayesian methods have grown from a specialist niche to become mainstream, while graphical models have emerged as a general framework for describing and applying probabilistic models. Also, the practical applicability of Bayesian methods has been greatly enhanced through the development of a range of approximate inference algorithms such as variational Bayes and expectation propagation. Similarly, new models based on kernels have had a significant impact on both algorithms and applications. This new textbook reflects these recent developments while providing a comprehensive introduction to the fields of pattern recognition and machine learning. It is aimed at advanced undergraduates or first-year PhD students, as well as researchers and practitioners, and assumes no previous knowledge of pattern recognition or machine learning concepts. Knowledge of multivariate calculus and basic linear algebra is required, and some familiarity with probabilities would be helpful though not essential as the book includes a self-contained introduction to basic probability theory.
Power Pivot and Power BI: The Excel User's Guide to DAX, Power Query, Power BI & Power Pivot in Excel 2010-2016
Rob Collie - 2016
Written by the world’s foremost PowerPivot blogger and practitioner, the book’s concepts and approach are introduced in a simple, step-by-step manner tailored to the learning style of Excel users everywhere. The techniques presented allow users to produce, in hours or even minutes, results that formerly would have taken entire teams weeks or months to produce. It includes lessons on the difference between calculated columns and measures; how formulas can be reused across reports of completely different shapes; how to merge disjointed sets of data into unified reports; how to make certain columns in a pivot behave as if the pivot were filtered while other columns do not; and how to create time-intelligent calculations in pivot tables such as “Year over Year” and “Moving Averages” whether they use a standard, fiscal, or a complete custom calendar. The “pattern-like” techniques and best practices contained in this book have been developed and refined over two years of onsite training with Excel users around the world, and the key lessons from those seminars costing thousands of dollars per day are now available to within the pages of this easy-to-follow guide. This updated second edition covers new features introduced with Office 2015.
The Art of Computer Programming, Volumes 1-4a Boxed Set
Donald Ervin Knuth - 2011
Scientists have marveled at the beauty and elegance of his analysis, while ordinary programmers have successfully applied his "cookbook" solutions to their day-to-day problems. All have admired Knuth for the breadth, clarity, accuracy, and good humor found in his books. "I can't begin to tell you how many pleasurable hours of study and recreation they have afforded me I have pored over them in cars, restaurants, at work, at home... and even at a Little League game when my son wasn't in the line-up.""--"Charles Long Primarily written as a reference, some people have nevertheless found it possible and interesting to read each volume from beginning to end. A programmer in China even compared the experience to reading a poem. "If you think you're a really good programmer... read Knuth's] "Art of Computer Programming.".. You should definitely send me a resume if you can read the whole thing.""--"Bill Gates Whatever your background, if you need to do any serious computer programming, you will find your own good reason to make each volume in this series a readily accessible part of your scholarly or professional library. "It's always a pleasure when a problem is hard enough that you have to get the Knuths off the shelf. I find that merely opening one has a very useful terrorizing effect on computers.""--"Jonathan LaventholIn describing the new fourth volume, one reviewer listed the qualities that distinguish all of Knuth's work. In sum: ] "detailed coverage of the basics, illustrated with well-chosen examples; occasional forays into more esoteric topics and problems at the frontiers of research; impeccable writing peppered with occasional bits of humor; extensive collections of exercises, all with solutions or helpful hints; a careful attention to history; implementations of many of the algorithms in his classic step-by-step form."--Frank RuskeyThese four books comprise what easily could be the most important set of information on any serious programmer's bookshelf.
Mindf*ck: Cambridge Analytica and the Plot to Break America
Christopher Wylie - 2019
Bannon had long sensed that deep within America's soul lurked an explosive tension. Cambridge Analytica had the data to prove it, and in 2016 Bannon had a presidential campaign to use as his proving ground.Christopher Wylie might have seemed an unlikely figure to be at the center of such an operation. Canadian and liberal in his politics, he was only twenty-four when he got a job with a London firm that worked with the U.K. Ministry of Defense and was charged putatively with helping to build a team of data scientists to create new tools to identify and combat radical extremism online. In short order, those same military tools were turned to political purposes, and Cambridge Analytica was born. Wylie's decision to become a whistleblower prompted the largest data crime investigation in history. His story is both exposé and dire warning about a sudden problem born of very new and powerful capabilities. It has not only exposed the profound vulnerabilities and profound carelessness in the enormous companies that drive the attention economy, it has also exposed the profound vulnerabilities of democracy itself. What happened in 2016 was just a trial run. Ruthless actors are coming for your data, and they want to control what you think.
The Filter Bubble: What the Internet is Hiding From You
Eli Pariser - 2011
Instead of giving you the most broadly popular result, Google now tries to predict what you are most likely to click on. According to MoveOn.org board president Eli Pariser, Google's change in policy is symptomatic of the most significant shift to take place on the Web in recent years - the rise of personalization. In this groundbreaking investigation of the new hidden Web, Pariser uncovers how this growing trend threatens to control how we consume and share information as a society-and reveals what we can do about it.Though the phenomenon has gone largely undetected until now, personalized filters are sweeping the Web, creating individual universes of information for each of us. Facebook - the primary news source for an increasing number of Americans - prioritizes the links it believes will appeal to you so that if you are a liberal, you can expect to see only progressive links. Even an old-media bastion like "The Washington Post" devotes the top of its home page to a news feed with the links your Facebook friends are sharing. Behind the scenes a burgeoning industry of data companies is tracking your personal information to sell to advertisers, from your political leanings to the color you painted your living room to the hiking boots you just browsed on Zappos.In a personalized world, we will increasingly be typed and fed only news that is pleasant, familiar, and confirms our beliefs - and because these filters are invisible, we won't know what is being hidden from us. Our past interests will determine what we are exposed to in the future, leaving less room for the unexpected encounters that spark creativity, innovation, and the democratic exchange of ideas.While we all worry that the Internet is eroding privacy or shrinking our attention spans, Pariser uncovers a more pernicious and far-reaching trend on the Internet and shows how we can - and must - change course. With vivid detail and remarkable scope, The Filter Bubble reveals how personalization undermines the Internet's original purpose as an open platform for the spread of ideas and could leave us all in an isolated, echoing world.
Data, A Love Story: How I Gamed Online Dating to Meet My Match
Amy Webb - 2013
Most don’t find true love. Thanks to Data, a Love Story, their odds just got a whole lot better. Data, A Love Story: How I Gamed Online Dating to Meet My Match is a lively, thought-provoking memoir about how one woman “gamed” the world of online dating—and met her eventual husband.
Machine Learning in Action
Peter Harrington - 2011
"Machine learning," the process of automating tasks once considered the domain of highly-trained analysts and mathematicians, is the key to efficiently extracting useful information from this sea of raw data. Machine Learning in Action is a unique book that blends the foundational theories of machine learning with the practical realities of building tools for everyday data analysis. In it, the author uses the flexible Python programming language to show how to build programs that implement algorithms for data classification, forecasting, recommendations, and higher-level features like summarization and simplification.
IoT Inc.: How Your Company Can Use the Internet of Things to Win in the Outcome Economy
Bruce Sinclair - 2017
They’re in our companies, in our homes, in our pockets. People love these products. But what they love more is what these products do—and for anyone running a business today, outcomes are the key. The Internet of Things (IoT) is the point of connection between products and the results they deliver—it’s where products become software. IoT Inc. explains everything you need to know to position your company within this powerful new network. And once you do, you’ll leave the competition in the dust. Founder and president of today’s leading IoT business consulting firm, Bruce Sinclair has been helping companies develop IoT strategies for a decade—far longer than the term has even existed. This essential guide provides an in-depth look into IoT—how it works and how it is transforming business; methods for seeing your own business, customers, and competitors through the lens of IoT, and a deep dive into how to develop and implement a powerful IoT strategy. IoT isn’t a new business trend. It’s the new way of business. Period. The IoT wave is heading for your industry. You can either meet it head-on, and ride it to success, or you can turn your back and let it swamp you. This is your playbook for transforming your company into a major player in the IoT Outcome economy.
Too Big to Ignore: The Business Case for Big Data
Phil Simon - 2013
Progressive Insurance tracks real-time customer driving patterns and uses that information to offer rates truly commensurate with individual safety. Google accurately predicts local flu outbreaks based upon thousands of user search queries. Amazon provides remarkably insightful, relevant, and timely product recommendations to its hundreds of millions of customers. Quantcast lets companies target precise audiences and key demographics throughout the Web. NASA runs contests via gamification site TopCoder, awarding prizes to those with the most innovative and cost-effective solutions to its problems. Explorys offers penetrating and previously unknown insights into healthcare behavior.How do these organizations and municipalities do it? Technology is certainly a big part, but in each case the answer lies deeper than that. Individuals at these organizations have realized that they don't have to be Nate Silver to reap massive benefits from today's new and emerging types of data. And each of these organizations has embraced Big Data, allowing them to make astute and otherwise impossible observations, actions, and predictions.It's time to start thinking big.In Too Big to Ignore, recognized technology expert and award-winning author Phil Simon explores an unassailably important trend: Big Data, the massive amounts, new types, and multifaceted sources of information streaming at us faster than ever. Never before have we seen data with the volume, velocity, and variety of today. Big Data is no temporary blip of fad. In fact, it is only going to intensify in the coming years, and its ramifications for the future of business are impossible to overstate.Too Big to Ignore explains why Big Data is a big deal. Simon provides commonsense, jargon-free advice for people and organizations looking to understand and leverage Big Data. Rife with case studies, examples, analysis, and quotes from real-world Big Data practitioners, the book is required reading for chief executives, company owners, industry leaders, and business professionals.
The Theory That Would Not Die: How Bayes' Rule Cracked the Enigma Code, Hunted Down Russian Submarines, and Emerged Triumphant from Two Centuries of Controversy
Sharon Bertsch McGrayne - 2011
To its adherents, it is an elegant statement about learning from experience. To its opponents, it is subjectivity run amok.In the first-ever account of Bayes' rule for general readers, Sharon Bertsch McGrayne explores this controversial theorem and the human obsessions surrounding it. She traces its discovery by an amateur mathematician in the 1740s through its development into roughly its modern form by French scientist Pierre Simon Laplace. She reveals why respected statisticians rendered it professionally taboo for 150 years—at the same time that practitioners relied on it to solve crises involving great uncertainty and scanty information (Alan Turing's role in breaking Germany's Enigma code during World War II), and explains how the advent of off-the-shelf computer technology in the 1980s proved to be a game-changer. Today, Bayes' rule is used everywhere from DNA de-coding to Homeland Security.Drawing on primary source material and interviews with statisticians and other scientists, The Theory That Would Not Die is the riveting account of how a seemingly simple theorem ignited one of the greatest controversies of all time.
Pragmatic Project Automation
Mike Clark - 2004
Indeed, that's what computers are for. You can enlist your own computer to automate all of your project's repetitive tasks, ranging from individual builds and running unit tests through to full product release, customer deployment, and monitoring the system.Many teams try to do these tasks by hand. That's usually a really bad idea: people just aren't as good at repetitive tasks as machines. You run the risk of doing it differently the one time it matters, on one machine but not another, or doing it just plain wrong. But the computer can do these tasks for you the same way, time after time, without bothering you. You can transform these labor-intensive, boring and potentially risky chores into automatic, background processes that just work.In this eagerly anticipated book, you'll find a variety of popular, open-source tools to help automate your project. With this book, you will learn: How to make your build processes accurate, reliable, fast, and easy. How to build complex systems at the touch of a button. How to build, test, and release software automatically, with no human intervention. Technologies and tools available for automation: which to use and when. Tricks and tips from the masters (do you know how to have your cell phone tell you that your build just failed?) You'll find easy-to-implement recipes to automate your Java project, using the same popular style as the rest of our Jolt Productivity Award-winning Starter Kit books. Armed with plenty of examples and concrete, pragmatic advice, you'll find it's easy to get started and reap the benefits of modern software development. You can begin to enjoy pragmatic, automatic, unattended software production that's reliable and accurate every time.
Schaum's Outline of Theory and Problems of Data Structures
Seymour Lipschutz - 1986
This guide, which can be used with any text or can stand alone, contains at the beginning of each chapter a list of key definitions, a summary of major concepts, step by step solutions to dozens of problems, and additional practice problems.
Learning SQL
Alan Beaulieu - 2005
If you're working with a relational database--whether you're writing applications, performing administrative tasks, or generating reports--you need to know how to interact with your data. Even if you are using a tool that generates SQL for you, such as a reporting tool, there may still be cases where you need to bypass the automatic generation feature and write your own SQL statements.To help you attain this fundamental SQL knowledge, look to "Learning SQL," an introductory guide to SQL, designed primarily for developers just cutting their teeth on the language."Learning SQL" moves you quickly through the basics and then on to some of the more commonly used advanced features. Among the topics discussed: The history of the computerized databaseSQL Data Statements--those used to create, manipulate, and retrieve data stored in your database; example statements include select, update, insert, and deleteSQL Schema Statements--those used to create database objects, such as tables, indexes, and constraintsHow data sets can interact with queriesThe importance of subqueriesData conversion and manipulation via SQL's built-in functionsHow conditional logic can be used in Data StatementsBest of all, "Learning SQL" talks to you in a real-world manner, discussing various platform differences that you're likely to encounter and offering a series of chapter exercises that walk you through the learning process. Whenever possible, the book sticks to the features included in the ANSI SQL standards. This means you'll be able to apply what you learn to any of several different databases; the book covers MySQL, Microsoft SQL Server, and Oracle Database, but the features and syntax should apply just as well (perhaps with some tweaking) to IBM DB2, Sybase Adaptive Server, and PostgreSQL.Put the power and flexibility of SQL to work. With "Learning SQL" you can master this important skill and know that the SQL statements you write are indeed correct.
Feature Engineering for Machine Learning
Alice Zheng - 2018
With this practical book, you’ll learn techniques for extracting and transforming features—the numeric representations of raw data—into formats for machine-learning models. Each chapter guides you through a single data problem, such as how to represent text or image data. Together, these examples illustrate the main principles of feature engineering.Rather than simply teach these principles, authors Alice Zheng and Amanda Casari focus on practical application with exercises throughout the book. The closing chapter brings everything together by tackling a real-world, structured dataset with several feature-engineering techniques. Python packages including numpy, Pandas, Scikit-learn, and Matplotlib are used in code examples.