Book picks similar to
The Art of Data Science: A Guide for Anyone Who Works with Data by Roger D. Peng
data-science
data
tech
science
Nine Algorithms That Changed the Future: The Ingenious Ideas That Drive Today's Computers
John MacCormick - 2012
A simple web search picks out a handful of relevant needles from the world's biggest haystack: the billions of pages on the World Wide Web. Uploading a photo to Facebook transmits millions of pieces of information over numerous error-prone network links, yet somehow a perfect copy of the photo arrives intact. Without even knowing it, we use public-key cryptography to transmit secret information like credit card numbers; and we use digital signatures to verify the identity of the websites we visit. How do our computers perform these tasks with such ease? This is the first book to answer that question in language anyone can understand, revealing the extraordinary ideas that power our PCs, laptops, and smartphones. Using vivid examples, John MacCormick explains the fundamental "tricks" behind nine types of computer algorithms, including artificial intelligence (where we learn about the "nearest neighbor trick" and "twenty questions trick"), Google's famous PageRank algorithm (which uses the "random surfer trick"), data compression, error correction, and much more. These revolutionary algorithms have changed our world: this book unlocks their secrets, and lays bare the incredible ideas that our computers use every day.
Planning for Big Data
Edd Wilder-James - 2004
From creating new data-driven products through to increasing operational efficiency, big data has the potential to make your organization both more competitive and more innovative. As this emerging field transitions from the bleeding edge to enterprise infrastructure, it's vital to understand not only the technologies involved, but the organizational and cultural demands of being data-driven. Written by O'Reilly Radar's experts on big data, this anthology describes:
- The broad industry changes heralded by the big data era
- What big data is, what it means to your business, and how to start solving data problems
- The software that makes up the Hadoop big data stack, and the major enterprise vendors' Hadoop solutions
- The landscape of NoSQL databases and their relative merits
- How visualization plays an important part in data work
Discovering Statistics Using R
Andy Field - 2012
Like its sister textbook, Discovering Statistics Using R is written in an irreverent style and follows the same ground-breaking structure and pedagogical approach. The core material is enhanced by a cast of characters to help the reader on their way, hundreds of examples, self-assessment tests to consolidate knowledge, and additional website material for those wanting to learn more.
Probably Approximately Correct: Nature's Algorithms for Learning and Prospering in a Complex World
Leslie Valiant - 2013
We nevertheless muddle through even in the absence of theories of how to act. But how do we do it? In Probably Approximately Correct, computer scientist Leslie Valiant presents a masterful synthesis of learning and evolution to show how both individually and collectively we not only survive, but prosper in a world as complex as our own. The key is “probably approximately correct” algorithms, a concept Valiant developed to explain how effective behavior can be learned. The model shows that pragmatically coping with a problem can provide a satisfactory solution in the absence of any theory of the problem. After all, finding a mate does not require a theory of mating. Valiant’s theory reveals the shared computational nature of evolution and learning, and sheds light on perennial questions such as nature versus nurture and the limits of artificial intelligence. Offering a powerful and elegant model that encompasses life’s complexity, Probably Approximately Correct has profound implications for how we think about behavior, cognition, biological evolution, and the possibilities and limits of human and machine intelligence.
Bayesian Reasoning and Machine Learning
David Barber - 2012
They are established tools in a wide range of industrial applications, including search engines, DNA sequencing, stock market analysis, and robot locomotion, and their use is spreading rapidly. People who know the methods have their choice of rewarding jobs. This hands-on text opens these opportunities to computer science students with modest mathematical backgrounds. It is designed for final-year undergraduates and master's students with limited background in linear algebra and calculus. Comprehensive and coherent, it develops everything from basic reasoning to advanced techniques within the framework of graphical models. Students learn more than a menu of techniques; they develop analytical and problem-solving skills that equip them for the real world. Numerous examples and exercises, both computer based and theoretical, are included in every chapter. Resources for students and instructors, including a MATLAB toolbox, are available online.
Algorithms of Oppression: How Search Engines Reinforce Racism
Safiya Umoja Noble - 2018
But, if you type in "white girls," the results are radically different. The suggested porn sites and un-moderated discussions about "why black women are so sassy" or "why black women are so angry" present a disturbing portrait of black womanhood in modern society. In Algorithms of Oppression, Safiya Umoja Noble challenges the idea that search engines like Google offer an equal playing field for all forms of ideas, identities, and activities. Data discrimination is a real social problem; Noble argues that the combination of private interests in promoting certain sites, along with the monopoly status of a relatively small number of Internet search engines, leads to a biased set of search algorithms that privilege whiteness and discriminate against people of color, specifically women of color. Through an analysis of textual and media searches as well as extensive research on paid online advertising, Noble exposes a culture of racism and sexism in the way discoverability is created online. As search engines and their related companies grow in importance - operating as a source for email, a major vehicle for primary and secondary school learning, and beyond - understanding and reversing these disquieting trends and discriminatory practices is of utmost importance. An original, surprising and, at times, disturbing account of bias on the internet, Algorithms of Oppression contributes to our understanding of how racism is created, maintained, and disseminated in the 21st century.
Genetic Algorithms in Search, Optimization, and Machine Learning
David Edward Goldberg - 1989
Major concepts are illustrated with running examples, and major algorithms are illustrated by Pascal computer programs. No prior knowledge of GAs or genetics is assumed, and only a minimum of computer programming and mathematics background is required.
Pro Git
Scott Chacon - 2009
It took the open source world by storm since its inception in 2005, and is used by small development shops and giants like Google, Red Hat, and IBM, and of course many open source projects. A book by Git experts to turn you into a Git expert. Introduces the world of distributed version control. Shows how to build a Git development workflow.
The Art of Computer Programming, Volume 1: Fundamental Algorithms
Donald Ervin Knuth - 1973
-Byte, September 1995
I can't begin to tell you how many pleasurable hours of study and recreation they have afforded me! I have pored over them in cars, restaurants, at work, at home... and even at a Little League game when my son wasn't in the line-up. -Charles Long
If you think you're a really good programmer... read [Knuth's] Art of Computer Programming... You should definitely send me a resume if you can read the whole thing. -Bill Gates
It's always a pleasure when a problem is hard enough that you have to get the Knuths off the shelf. I find that merely opening one has a very useful terrorizing effect on computers. -Jonathan Laventhol
This first volume in the series begins with basic programming concepts and techniques, then focuses more particularly on information structures: the representation of information inside a computer, the structural relationships between data elements, and how to deal with them efficiently. Elementary applications are given to simulation, numerical methods, symbolic computing, software and system design. Dozens of simple and important algorithms and techniques have been added to those of the previous edition. The section on mathematical preliminaries has been extensively revised to match present trends in research.
Ebook (PDF version) produced by Mathematical Sciences Publishers (MSP), http://msp.org
Infonomics: How to Monetize, Manage, and Measure Information as an Asset for Competitive Advantage
Douglas B. Laney - 2017
They report to the board on the health of their workforce, their financials, their customers, and their partnerships, but rarely the health of their information assets. Corporations typically exhibit greater discipline in tracking and accounting for their office furniture than their data. Infonomics is the theory, study, and discipline of asserting economic significance to information. It strives to apply both economic and asset management principles and practices to the valuation, handling, and deployment of information assets. This book specifically shows:
CEOs and business leaders how to more fully wield information as a corporate asset
CIOs how to improve the flow and accessibility of information
CFOs how to help their organizations measure the actual and latent value in their information assets.
More directly, this book is for the burgeoning force of chief data officers (CDOs) and other information and analytics leaders in their valiant struggle to help their organizations become more infosavvy. Author Douglas Laney has spent years researching and developing Infonomics and advising organizations on the infinite opportunities to monetize, manage, and measure information. This book delivers a set of new ideas, frameworks, evidence, and even approaches adapted from other disciplines on how to administer, wield, and understand the value of information. Infonomics can help organizations not only to better develop, sell, and market their offerings, but to transform their organizations altogether.
Python Tricks: A Buffet of Awesome Python Features
Dan Bader - 2017
Discover the “hidden gold” in Python’s standard library and start writing clean and Pythonic code today.
Who Should Read This Book:
If you’re wondering which lesser known parts in Python you should know about, you’ll get a roadmap with this book. Discover cool (yet practical!) Python tricks and blow your coworkers’ minds in your next code review.
If you’ve got experience with legacy versions of Python, the book will get you up to speed with modern patterns and features introduced in Python 3 and backported to Python 2.
If you’ve worked with other programming languages and you want to get up to speed with Python, you’ll pick up the idioms and practical tips you need to become a confident and effective Pythonista.
If you want to make Python your own and learn how to write clean and Pythonic code, you’ll discover best practices and little-known tricks to round out your knowledge.
What Python Developers Say About The Book:
"I kept thinking that I wished I had access to a book like this when I started learning Python many years ago." - Mariatta Wijaya, Python Core Developer
"This book makes you write better Python code!" - Bob Belderbos, Software Developer at Oracle
"Far from being just a shallow collection of snippets, this book will leave the attentive reader with a deeper understanding of the inner workings of Python as well as an appreciation for its beauty." - Ben Felder, Pythonista
"It's like having a seasoned tutor explaining, well, tricks!" - Daniel Meyer, Sr. Desktop Administrator at Tesla Inc.
The Googlization of Everything: (And Why We Should Worry)
Siva Vaidhyanathan - 2010
Into this creative chaos came Google with its dazzling mission—“To organize the world’s information and make it universally accessible”—and its much-quoted motto, “Don’t be evil.” In this provocative book, Siva Vaidhyanathan examines the ways we have used and embraced Google—and the growing resistance to its expansion across the globe. He exposes the dark side of our Google fantasies, raising red flags about issues of intellectual property and the much-touted Google Book Search. He assesses Google’s global impact, particularly in China, and explains the insidious effect of Googlization on the way we think. Finally, Vaidhyanathan proposes the construction of an Internet ecosystem designed to benefit the whole world and keep one brilliant and powerful company from falling into the “evil” it pledged to avoid.
Introducing Python: Modern Computing in Simple Packages
Bill Lubanovic - 2013
In addition to giving a strong foundation in the language itself, Lubanovic shows how to use it for a range of applications in business, science, and the arts, drawing on the rich collection of open source packages developed by Python fans. It's impressive how many commercial and production-critical programs are written now in Python. Developed to be easy to read and maintain, it has proven a boon to anyone who wants applications that are quick to write but robust and able to remain in production for the long haul. This book focuses on the current version of Python, 3.x, while including sidebars about important differences with 2.x for readers who may have to deal with programs in that version.
Macroanalysis: Digital Methods and Literary History
Matthew L. Jockers - 2013
Jockers introduces readers to large-scale literary computing and the revolutionary potential of macroanalysis--a new approach to the study of the literary record designed for probing the digital-textual world as it exists today, in digital form and in large quantities. Using computational analysis to retrieve key words, phrases, and linguistic patterns across thousands of texts in digital libraries, researchers can draw conclusions based on quantifiable evidence regarding how literary trends are employed over time, across periods, within regions, or within demographic groups, as well as how cultural, historical, and societal linkages may bind individual authors, texts, and genres into an aggregate literary culture. Moving beyond the limitations of literary interpretation based on the "close-reading" of individual works, Jockers describes how this new method of studying large collections of digital material can help us to better understand and contextualize the individual works within those collections.
The Art of Doing Science and Engineering: Learning to Learn
Richard Hamming - 1996
By presenting actual experiences and analyzing them as they are described, the author conveys the developmental thought processes employed and shows that a style of thinking that leads to successful results can be learned. Along with spectacular successes, the author also conveys how failures contributed to shaping the thought processes. The book provides the reader with a style of thinking that will enhance a person's ability to function as a problem-solver of complex technical issues. It consists of a collection of stories about the author's participation in significant discoveries, relating how those discoveries came about and, most importantly, analyzing the thought processes and reasoning that took place as the author and his associates progressed through engineering problems.