Book picks similar to
The Data Revolution: Big Data, Open Data, Data Infrastructures & Their Consequences by Rob Kitchin
data
data-science
tech
non-fiction
SQL Queries for Mere Mortals: A Hands-on Guide to Data Manipulation in SQL
John L. Viescas - 2007
The authors have taken the mystery out of complex queries and explained principles and techniques with such clarity that a "Mere Mortal" will indeed be empowered to perform the superhuman. Do not walk past this book "--Graham Mandeno, Database Consultant""SQL Queries for Mere Mortals" provides a step-by-step, easy-to-read introduction to writing SQL queries. It includes hundreds of examples with detailed explanations. This book provides the tools you need to understand, modify, and create SQL queries"--Keith W. Hare, Convenor, ISO/IEC JTC1 SC32 WG3--the International SQL Standards Committee"I learned SQL primarily from the first edition of this book, and I am pleased to see a second edition of this book so that others can continue to benefit from its organized presentation of the language. Starting from how to design your tables so that SQL can be effective (a common problem for database beginners), and then continuing through the various aspects of SQL construction and capabilities, the reader can become a moderate expert upon completing the book and its samples. Learning how to convert a question in English into a meaningful SQL statement will greatly facilitate your mastery of the language. Numerous examples from real life will help you visualize how to use SQL to answer the questions about the data in your database. Just one of the "watch out for this trap" items will save you more than the cost of the book when you avoid that problem when writing your queries. I highly recommend this book if you want to tap the full potential of your database."--Kenneth D. Snell, Ph.D., Database Designer/Programmer"I don't think they do this in public schools any more, and it is a shame, but do you remember in the seventh and eighth grades when you learned to diagram a sentence? Those of you who do may no longer remember how you did it, but all of you do write better sentences because of it. John Viescas and Mike Hernandez must have remembered because they take everyday English queries and literally translate them into SQL. This is an important book for all database designers. It takes the complexity of mathematical Set Theory and of First Order Predicate Logic, as outlined in E. F. Codd's original treatise on relational database design, and makes it easy for anyone to understand. If you want an elementary- through intermediate-level course on SQL, this is the one book that is a requirement, no matter how many others you buy."--Arvin Meyer, MCP, MVP"Even in this day of wizards and code generators, successful database developers still require a sound knowledge of Structured Query Language (SQL, the standard language for communicating with most database systems). In this book, John and Mike do a marvelous job of making what's usually a dry and difficult subject come alive, presenting the material with humor in a logical manner, with plenty of relevant examples. I would say that this book should feature prominently in the collection on the bookshelf of all serious developers, except that I'm sure it'll get so much use that it won't spend much time on the shelf "-- Doug Steele, Microsoft Access Developer and author"Over the last several decades, SQL has evolved from a language known only to computer specialists to a widely used international standard of the computer industry. The number of new applications deployed each year using SQL now totals in the millions. If you are accessing corporate information from the Internet or from an internal network, you are probably using SQL. This new edition of "SQL Queries for Mere Mortals" helps new users learn the foundations of SQL queries, and is an essential reference guide for intermediate and advanced users.The accompanying CD contains five sample databases used for the example queries throughout the book in four different formats: Microsoft SQL Server 2000 and later, Microsoft Access 2000 and later, MySQL version 5.0 and later, and SQL scripts that can be used with most other implementations of the language.
Information Doesn't Want to Be Free: Laws for the Internet Age
Cory Doctorow - 2014
Can small artists still thrive in the Internet era? Can giant record labels avoid alienating their audiences? This is a book about the pitfalls and the opportunities that creative industries (and individuals) are confronting today — about how the old models have failed or found new footing, and about what might soon replace them. An essential read for anyone with a stake in the future of the arts, Information Doesn’t Want to Be Free offers a vivid guide to the ways creativity and the Internet interact today, and to what might be coming next.
R for Everyone: Advanced Analytics and Graphics
Jared P. Lander - 2013
R has traditionally been difficult for non-statisticians to learn, and most R books assume far too much knowledge to be of help. R for Everyone is the solution. Drawing on his unsurpassed experience teaching new users, professional data scientist Jared P. Lander has written the perfect tutorial for anyone new to statistical programming and modeling. Organized to make learning easy and intuitive, this guide focuses on the 20 percent of R functionality you'll need to accomplish 80 percent of modern data tasks. Lander's self-contained chapters start with the absolute basics, offering extensive hands-on practice and sample code. You'll download and install R; navigate and use the R environment; master basic program control, data import, and manipulation; and walk through several essential tests. Then, building on this foundation, you'll construct several complete models, both linear and nonlinear, and use some data mining techniques. By the time you're done, you won't just know how to write R programs, you'll be ready to tackle the statistical problems you care about most. COVERAGE INCLUDES - Exploring R, RStudio, and R packages - Using R for math: variable types, vectors, calling functions, and more - Exploiting data structures, including data.frames, matrices, and lists - Creating attractive, intuitive statistical graphics - Writing user-defined functions - Controlling program flow with if, ifelse, and complex checks - Improving program efficiency with group manipulations - Combining and reshaping multiple datasets - Manipulating strings using R's facilities and regular expressions - Creating normal, binomial, and Poisson probability distributions - Programming basic statistics: mean, standard deviation, and t-tests - Building linear, generalized linear, and nonlinear models - Assessing the quality of models and variable selection - Preventing overfitting, using the Elastic Net and Bayesian methods - Analyzing univariate and multivariate time series data - Grouping data via K-means and hierarchical clustering - Preparing reports, slideshows, and web pages with knitr - Building reusable R packages with devtools and Rcpp - Getting involved with the R global community
Learning Python
Mark Lutz - 2003
Python is considered easy to learn, but there's no quicker way to mastery of the language than learning from an expert teacher. This edition of "Learning Python" puts you in the hands of two expert teachers, Mark Lutz and David Ascher, whose friendly, well-structured prose has guided many a programmer to proficiency with the language. "Learning Python," Second Edition, offers programmers a comprehensive learning tool for Python and object-oriented programming. Thoroughly updated for the numerous language and class presentation changes that have taken place since the release of the first edition in 1999, this guide introduces the basic elements of the latest release of Python 2.3 and covers new features, such as list comprehensions, nested scopes, and iterators/generators. Beyond language features, this edition of "Learning Python" also includes new context for less-experienced programmers, including fresh overviews of object-oriented programming and dynamic typing, new discussions of program launch and configuration options, new coverage of documentation sources, and more. There are also new use cases throughout to make the application of language features more concrete. The first part of "Learning Python" gives programmers all the information they'll need to understand and construct programs in the Python language, including types, operators, statements, classes, functions, modules and exceptions. The authors then present more advanced material, showing how Python performs common tasks by offering real applications and the libraries available for those applications. Each chapter ends with a series of exercises that will test your Python skills and measure your understanding."Learning Python," Second Edition is a self-paced book that allows readers to focus on the core Python language in depth. As you work through the book, you'll gain a deep and complete understanding of the Python language that will help you to understand the larger application-level examples that you'll encounter on your own. If you're interested in learning Python--and want to do so quickly and efficiently--then "Learning Python," Second Edition is your best choice.
Advances in Financial Machine Learning
Marcos López de Prado - 2018
Today, ML algorithms accomplish tasks that - until recently - only expert humans could perform. And finance is ripe for disruptive innovations that will transform how the following generations understand money and invest.In the book, readers will learn how to:Structure big data in a way that is amenable to ML algorithms Conduct research with ML algorithms on big data Use supercomputing methods and back test their discoveries while avoiding false positives Advances in Financial Machine Learning addresses real life problems faced by practitioners every day, and explains scientifically sound solutions using math, supported by code and examples. Readers become active users who can test the proposed solutions in their individual setting.Written by a recognized expert and portfolio manager, this book will equip investment professionals with the groundbreaking tools needed to succeed in modern finance.
The Master Algorithm: How the Quest for the Ultimate Learning Machine Will Remake Our World
Pedro Domingos - 2015
In The Master Algorithm, Pedro Domingos lifts the veil to give us a peek inside the learning machines that power Google, Amazon, and your smartphone. He assembles a blueprint for the future universal learner--the Master Algorithm--and discusses what it will mean for business, science, and society. If data-ism is today's philosophy, this book is its bible.
Mastering Bitcoin: Unlocking Digital Cryptocurrencies
Andreas M. Antonopoulos - 2014
Whether you're building the next killer app, investing in a startup, or simply curious about the technology, this practical book is essential reading.Bitcoin, the first successful decentralized digital currency, is still in its infancy and it's already spawned a multi-billion dollar global economy. This economy is open to anyone with the knowledge and passion to participate. Mastering Bitcoin provides you with the knowledge you need (passion not included).This book includes:A broad introduction to bitcoin--ideal for non-technical users, investors, and business executivesAn explanation of the technical foundations of bitcoin and cryptographic currencies for developers, engineers, and software and systems architectsDetails of the bitcoin decentralized network, peer-to-peer architecture, transaction lifecycle, and security principlesOffshoots of the bitcoin and blockchain inventions, including alternative chains, currencies, and applicationsUser stories, analogies, examples, and code snippets illustrating key technical concepts
The Art of R Programming: A Tour of Statistical Software Design
Norman Matloff - 2011
No statistical knowledge is required, and your programming skills can range from hobbyist to pro.Along the way, you'll learn about functional and object-oriented programming, running mathematical simulations, and rearranging complex data into simpler, more useful formats. You'll also learn to: Create artful graphs to visualize complex data sets and functions Write more efficient code using parallel R and vectorization Interface R with C/C++ and Python for increased speed or functionality Find new R packages for text analysis, image manipulation, and more Squash annoying bugs with advanced debugging techniques Whether you're designing aircraft, forecasting the weather, or you just need to tame your data, The Art of R Programming is your guide to harnessing the power of statistical computing.
R for Data Science: Import, Tidy, Transform, Visualize, and Model Data
Hadley Wickham - 2016
This book introduces you to R, RStudio, and the tidyverse, a collection of R packages designed to work together to make data science fast, fluent, and fun. Suitable for readers with no previous programming experience, R for Data Science is designed to get you doing data science as quickly as possible.
Authors Hadley Wickham and Garrett Grolemund guide you through the steps of importing, wrangling, exploring, and modeling your data and communicating the results. You’ll get a complete, big-picture understanding of the data science cycle, along with basic tools you need to manage the details. Each section of the book is paired with exercises to help you practice what you’ve learned along the way.
You’ll learn how to:
Wrangle—transform your datasets into a form convenient for analysis
Program—learn powerful R tools for solving data problems with greater clarity and ease
Explore—examine your data, generate hypotheses, and quickly test them
Model—provide a low-dimensional summary that captures true "signals" in your dataset
Communicate—learn R Markdown for integrating prose, code, and results
Understanding Social Networks: Theories, Concepts, and Findings
Charles Kadushin - 2011
Understanding Social Networks fills that gap by explaining the big ideas that underlie the social network phenomenon. Written for those interested in this fast moving area but who are not mathematically inclined, it covers fundamental concepts, then discusses networks and their core themes in increasing order of complexity. Kadushin demystifies the concepts, theories, and findings developed by network experts. He selects material that serves as basic building blocks and examples of best practices that will allow the reader to understand and evaluate new developments as they emerge. Understanding Social Networks will be useful to social scientists who encounter social network research in their reading, students new to the network field, as well as managers, marketers, and others who constantly encounter social networks in their work.
The Data Detective: Ten Easy Rules to Make Sense of Statistics
Tim Harford - 2020
That’s a mistake, Tim Harford says in The Data Detective. We shouldn’t be suspicious of statistics—we need to understand what they mean and how they can improve our lives: they are, at heart, human behavior seen through the prism of numbers and are often “the only way of grasping much of what is going on around us.” If we can toss aside our fears and learn to approach them clearly—understanding how our own preconceptions lead us astray—statistics can point to ways we can live better and work smarter.As “perhaps the best popular economics writer in the world” (New Statesman), Tim Harford is an expert at taking complicated ideas and untangling them for millions of readers. In The Data Detective, he uses new research in science and psychology to set out ten strategies for using statistics to erase our biases and replace them with new ideas that use virtues like patience, curiosity, and good sense to better understand ourselves and the world. As a result, The Data Detective is a big-idea book about statistics and human behavior that is fresh, unexpected, and insightful.
What Is Data Science?
Mike Loukides - 2011
Five years ago, in What is Web 2.0, Tim O'Reilly said that "data is the next Intel Inside." But what does that statement mean? Why do we suddenly care about statistics and about data? This report examines the many sides of data science -- the technologies, the companies and the unique skill sets.The web is full of "data-driven apps." Almost any e-commerce application is a data-driven application. There's a database behind a web front end, and middleware that talks to a number of other databases and data services (credit card processing companies, banks, and so on). But merely using data isn't really what we mean by "data science." A data application acquires its value from the data itself, and creates more data as a result. It's not just an application with data; it's a data product. Data science enables the creation of data products.
The Information: A History, a Theory, a Flood
James Gleick - 2011
The story of information begins in a time profoundly unlike our own, when every thought and utterance vanishes as soon as it is born. From the invention of scripts and alphabets to the long-misunderstood talking drums of Africa, Gleick tells the story of information technologies that changed the very nature of human consciousness. He provides portraits of the key figures contributing to the inexorable development of our modern understanding of information: Charles Babbage, the idiosyncratic inventor of the first great mechanical computer; Ada Byron, the brilliant and doomed daughter of the poet, who became the first true programmer; pivotal figures like Samuel Morse and Alan Turing; and Claude Shannon, the creator of information theory itself. And then the information age arrives. Citizens of this world become experts willy-nilly: aficionados of bits and bytes. And we sometimes feel we are drowning, swept by a deluge of signs and signals, news and images, blogs and tweets. The Information is the story of how we got here and where we are heading.
Automating Inequality: How High-Tech Tools Profile, Police, and Punish the Poor
Virginia Eubanks - 2018
In Pittsburgh, a child welfare agency uses a statistical model to try to predict which children might be future victims of abuse or neglect.Since the dawn of the digital age, decision-making in finance, employment, politics, health and human services has undergone revolutionary change. Today, automated systems—rather than humans—control which neighborhoods get policed, which families attain needed resources, and who is investigated for fraud. While we all live under this new regime of data, the most invasive and punitive systems are aimed at the poor.In Automating Inequality, Virginia Eubanks systematically investigates the impacts of data mining, policy algorithms, and predictive risk models on poor and working-class people in America. The book is full of heart-wrenching and eye-opening stories, from a woman in Indiana whose benefits are literally cut off as she lays dying to a family in Pennsylvania in daily fear of losing their daughter because they fit a certain statistical profile.The U.S. has always used its most cutting-edge science and technology to contain, investigate, discipline and punish the destitute. Like the county poorhouse and scientific charity before them, digital tracking and automated decision-making hide poverty from the middle-class public and give the nation the ethical distance it needs to make inhumane choices: which families get food and which starve, who has housing and who remains homeless, and which families are broken up by the state. In the process, they weaken democracy and betray our most cherished national values.This deeply researched and passionate book could not be more timely.Naomi Klein: "This book is downright scary."Ethan Zuckerman, MIT: "Should be required reading."Dorothy Roberts, author of Killing the Black Body: "A must-read for everyone concerned about modern tools of inequality in America."Astra Taylor, author of The People's Platform: "This is the single most important book about technology you will read this year."