Find a book to read

Book picks similar to
Text Analytics with Python: A Practical Real-World Approach to Gaining Actionable Insights from Your Data by Dipanjan Sarkar

reference

_ebook

linguistics

computers

The Little Book on CoffeeScript

Alex MacCaw - 2012

This little book shows JavaScript developers how to build superb web applications with CoffeeScript, the remarkable little language that’s gaining considerable interest.

Through example code, this guide demonstrates how CoffeeScript abstracts JavaScript, providing syntactical sugar and preventing many common errors. You’ll learn CoffeeScript’s syntax and idioms step by step, from basic variables and functions to complex comprehensions and classes.Written by Alex MacCaw, author of JavaScript Web Applications (O’Reilly), with contributions from CoffeeScript creator Jeremy Ashkenas, this book quickly teaches you best practices for using this language—not just on the client side, but for server-side applications as well. It’s time to take a ride with the little language that could.Discover how CoffeeScript’s syntax differs from JavaScriptLearn about features such as array comprehensions, destructuring assignments, and classesExplore CoffeeScript idioms and compare them to their JavaScript counterpartsCompile CoffeeScript files in static sites with the Cake build systemUse CommonJS modules to structure and deploy CoffeeScript client-side applicationsExamine JavaScript’s bad parts—including features CoffeeScript was able to fix

Taming Text: How to Find, Organize, and Manipulate It

Grant S. Ingersoll - 2011

It is no secret that the world is drowning in text and data.

This causes real problems for everyday users who need to make sense of all the information available, and for software engineers who want to make their text-based applications more useful and user-friendly. Whether building a search engine for a corporate website, automatically organizing email, or extracting important nuggets of information from the news, dealing with unstructured text can be daunting.Taming Text is a hands-on, example-driven guide to working with unstructured text in the context of real-world applications. It explores how to automatically organize text, using approaches such as full-text search, proper name recognition, clustering, tagging, information extraction, and summarization. This book gives examples illustrating each of these topics, as well as the foundations upon which they are built.Purchase of the print book comes with an offer of a free PDF, ePub, and Kindle eBook from Manning. Also available is all code from the book.

The Alignment Problem: Machine Learning and Human Values

science

non-fiction

nonfiction

Brian Christian - 2020

A jaw-dropping exploration of everything that goes wrong when we build AI systems and the movement to fix them.

Today’s "machine-learning" systems, trained by data, are so effective that we’ve invited them to see and hear for us?and to make decisions on our behalf. But alarm bells are ringing. Recent years have seen an eruption of concern as the field of machine learning advances. When the systems we attempt to teach will not, in the end, do what we want or what we expect, ethical and potentially existential risks emerge. Researchers call this the alignment problem.Systems cull résumés until, years later, we discover that they have inherent gender biases. Algorithms decide bail and parole?and appear to assess Black and White defendants differently. We can no longer assume that our mortgage application, or even our medical tests, will be seen by human eyes. And as autonomous vehicles share our streets, we are increasingly putting our lives in their hands.The mathematical and computational models driving these changes range in complexity from something that can fit on a spreadsheet to a complex system that might credibly be called “artificial intelligence.” They are steadily replacing both human judgment and explicitly programmed software.In best-selling author Brian Christian’s riveting account, we meet the alignment problem’s “first-responders,” and learn their ambitious plan to solve it before our hands are completely off the wheel. In a masterful blend of history and on-the ground reporting, Christian traces the explosive growth in the field of machine learning and surveys its current, sprawling frontier. Readers encounter a discipline finding its legs amid exhilarating and sometimes terrifying progress. Whether they—and we—succeed or fail in solving the alignment problem will be a defining human story.The Alignment Problem offers an unflinching reckoning with humanity’s biases and blind spots, our own unstated assumptions and often contradictory goals. A dazzlingly interdisciplinary work, it takes a hard look not only at our technology but at our culture—and finds a story by turns harrowing and hopeful.

Hadoop: The Definitive Guide

Tom White - 2009

Hadoop: The Definitive Guide helps you harness the power of your data.

Ideal for processing large datasets, the Apache Hadoop framework is an open source implementation of the MapReduce algorithm on which Google built its empire. This comprehensive resource demonstrates how to use Hadoop to build reliable, scalable, distributed systems: programmers will find details for analyzing large datasets, and administrators will learn how to set up and run Hadoop clusters. Complete with case studies that illustrate how Hadoop solves specific problems, this book helps you:Use the Hadoop Distributed File System (HDFS) for storing large datasets, and run distributed computations over those datasets using MapReduce Become familiar with Hadoop's data and I/O building blocks for compression, data integrity, serialization, and persistence Discover common pitfalls and advanced features for writing real-world MapReduce programs Design, build, and administer a dedicated Hadoop cluster, or run Hadoop in the cloud Use Pig, a high-level query language for large-scale data processing Take advantage of HBase, Hadoop's database for structured and semi-structured data Learn ZooKeeper, a toolkit of coordination primitives for building distributed systems If you have lots of data -- whether it's gigabytes or petabytes -- Hadoop is the perfect solution. Hadoop: The Definitive Guide is the most thorough book available on the subject. "Now you have the opportunity to learn about Hadoop from a master-not only of the technology, but also of common sense and plain talk." -- Doug Cutting, Hadoop Founder, Yahoo!

Text Mining with R: A Tidy Approach

Julia Silge - 2017

Much of the data available today is unstructured and text-heavy, making it challenging for analysts to apply their usual data wrangling and visualization tools.

With this practical book, you'll explore text-mining techniques with tidytext, a package that authors Julia Silge and David Robinson developed using the tidy principles behind R packages like ggraph and dplyr. You'll learn how tidytext and other tidy tools in R can make text analysis easier and more effective.The authors demonstrate how treating text as data frames enables you to manipulate, summarize, and visualize characteristics of text. You'll also learn how to integrate natural language processing (NLP) into effective workflows. Practical code examples and data explorations will help you generate real insights from literature, news, and social media.Learn how to apply the tidy text format to NLPUse sentiment analysis to mine the emotional content of textIdentify a document's most important terms with frequency measurementsExplore relationships and connections between words with the ggraph and widyr packagesConvert back and forth between R's tidy and non-tidy text formatsUse topic modeling to classify document collections into natural groupsExamine case studies that compare Twitter archives, dig into NASA metadata, and analyze thousands of Usenet messages

Machine Learning With Random Forests And Decision Trees: A Mostly Intuitive Guide, But Also Some Python

data-science

non-fiction

tech

Scott Hartshorn - 2016

Random Forests are one type of machine learning algorithm.

They are typically used to categorize something based on other data that you have. The purpose of this book is to help you understand how Random Forests work, as well as the different options that you have when using them to analyze a problem. Additionally, since Decision Trees are a fundamental part of Random Forests, this book explains how they work. This book is focused on understanding Random Forests at the conceptual level. Knowing how they work, why they work the way that they do, and what options are available to improve results. This book covers how Random Forests work in an intuitive way, and also explains the equations behind many of the functions, but it only has a small amount of actual code (in python). This book is focused on giving examples and providing analogies for the most fundamental aspects of how random forests and decision trees work. The reason is that those are easy to understand and they stick with you. There are also some really interesting aspects of random forests, such as information gain, feature importances, or out of bag error, that simply cannot be well covered without diving into the equations of how they work. For those the focus is providing the information in a straight forward and easy to understand way.

Mining the Social Web: Analyzing Data from Facebook, Twitter, LinkedIn, and Other Social Media Sites

Matthew A. Russell - 2011

Want to tap the tremendous amount of valuable social data in Facebook, Twitter, LinkedIn, and Google+? This refreshed edition helps you discover who’s making connections with social media, what they’re talking about, and where they’re located.

You’ll learn how to combine social web data, analysis techniques, and visualization to find what you’ve been looking for in the social haystack—as well as useful information you didn’t know existed.Each standalone chapter introduces techniques for mining data in different areas of the social Web, including blogs and email. All you need to get started is a programming background and a willingness to learn basic Python tools.Get a straightforward synopsis of the social web landscapeUse adaptable scripts on GitHub to harvest data from social network APIs such as Twitter, Facebook, LinkedIn, and Google+Learn how to employ easy-to-use Python tools to slice and dice the data you collectExplore social connections in microformats with the XHTML Friends NetworkApply advanced mining techniques such as TF-IDF, cosine similarity, collocation analysis, document summarization, and clique detectionBuild interactive visualizations with web technologies based upon HTML5 and JavaScript toolkits"A rich, compact, useful, practical introduction to a galaxy of tools, techniques, and theories for exploring structured and unstructured data." --Alex Martelli, Senior Staff Engineer, Google

Data Analysis with Open Source Tools: A Hands-On Guide for Programmers and Data Scientists

Philipp K. Janert - 2010

Collecting data is relatively easy, but turning raw information into something useful requires that you know how to extract precisely what you need.

With this insightful book, intermediate to experienced programmers interested in data analysis will learn techniques for working with data in a business environment. You'll learn how to look at data to discover what it contains, how to capture those ideas in conceptual models, and then feed your understanding back into the organization through business plans, metrics dashboards, and other applications.Along the way, you'll experiment with concepts through hands-on workshops at the end of each chapter. Above all, you'll learn how to think about the results you want to achieve -- rather than rely on tools to think for you.Use graphics to describe data with one, two, or dozens of variablesDevelop conceptual models using back-of-the-envelope calculations, as well asscaling and probability argumentsMine data with computationally intensive methods such as simulation and clusteringMake your conclusions understandable through reports, dashboards, and other metrics programsUnderstand financial calculations, including the time-value of moneyUse dimensionality reduction techniques or predictive analytics to conquer challenging data analysis situationsBecome familiar with different open source programming environments for data analysisFinally, a concise reference for understanding how to conquer piles of data.--Austin King, Senior Web Developer, MozillaAn indispensable text for aspiring data scientists.--Michael E. Driscoll, CEO/Founder, Dataspora

An Introduction to Statistical Learning: With Applications in R

Gareth James - 2013

An Introduction to Statistical Learning provides an accessible overview of the field of statistical learning, an essential toolset for making sense of the vast and complex data sets that have emerged in fields ranging from biology to finance to marketing to astrophysics in the past twenty years.

This book presents some of the most important modeling and prediction techniques, along with relevant applications. Topics include linear regression, classification, resampling methods, shrinkage approaches, tree- based methods, support vector machines, clustering, and more. Color graphics and real-world examples are used to illustrate the methods presented. Since the goal of this textbook is to facilitate the use of these statistical learning techniques by practitioners in science, industry, and other fields, each chapter contains a tutorial on implementing the analyses and methods presented in R, an extremely popular open source statistical software platform. Two of the authors co-wrote The Elements of Statistical Learning (Hastie, Tibshirani and Friedman, 2nd edition 2009), a popular reference book for statistics and machine learning researchers. An Introduction to Statistical Learning covers many of the same topics, but at a level accessible to a much broader audience. This book is targeted at statisticians and non-statisticians alike who wish to use cutting-edge statistical learning techniques to analyze their data. The text assumes only a previous course in linear regression and no knowledge of matrix algebra.

Build Your Own Database Driven Website Using PHP & MySQL

Kevin Yank - 2001

Build Your Own Database-Driven Website Using PHP & MySQL is a practical guide for first-time users of PHP & MySQL that teaches readers by creating a fully working Content Management System, Shopping Cart and other real-world applications.

There has been a marked increase in the adoption of PHP, most notably in the beginning to intermediate levels. PHP now boasts over 30% of the server side scripting market (Source: php.weblogs.com).The previous edition sold over 17,000 copies exclusively through Sitepoint.com alone. With the release of PHP 5, SitePoint have updated this bestseller to reflect best practice web development using PHP 5 and MySQL 4.The 3rd Edition includes more code examples and also a new bonus chapter on structured PHP Programming which introduces techniques for organizing real world PHP applications to avoid code duplication and ensure code is manageable and maintainable. The chapter introduces features like include files, user-defined function libraries and constants, which are combined to produce a fully functional access control system suitable for use on any PHP Website.

The Hundred-Page Machine Learning Book

machine-learning

data-science

computer-science

Andriy Burkov - 2019

Concise and to the point — the book can be read during a week.

During that week, you will learn almost everything modern machine learning has to offer. The author and other practitioners have spent years learning these concepts.Companion wiki — the book has a continuously updated wiki that extends some book chapters with additional information: Q&A, code snippets, further reading, tools, and other relevant resources.Flexible price and formats — choose from a variety of formats and price options: Kindle, hardcover, paperback, EPUB, PDF. If you buy an EPUB or a PDF, you decide the price you pay!Read first, buy later — download book chapters for free, read them and share with your friends and colleagues. Only if you liked the book or found it useful in your work, study or business, then buy it.

Doing Data Science

Cathy O'Neil - 2013

Now that people are aware that data can make the difference in an election or a business model, data science as an occupation is gaining ground.

But how can you get started working in a wide-ranging, interdisciplinary field that’s so clouded in hype? This insightful book, based on Columbia University’s Introduction to Data Science class, tells you what you need to know.In many of these chapter-long lectures, data scientists from companies such as Google, Microsoft, and eBay share new algorithms, methods, and models by presenting case studies and the code they use. If you’re familiar with linear algebra, probability, and statistics, and have programming experience, this book is an ideal introduction to data science.Topics include:Statistical inference, exploratory data analysis, and the data science processAlgorithmsSpam filters, Naive Bayes, and data wranglingLogistic regressionFinancial modelingRecommendation engines and causalityData visualizationSocial networks and data journalismData engineering, MapReduce, Pregel, and HadoopDoing Data Science is collaboration between course instructor Rachel Schutt, Senior VP of Data Science at News Corp, and data science consultant Cathy O’Neil, a senior data scientist at Johnson Research Labs, who attended and blogged about the course.

The D Programming Language

Andrei Alexandrescu - 2010

"To the best of my knowledge, D offers an unprecedentedly adroit integration of several powerful programming paradigms: imperative, object-oriented, functional, and meta." --From the Foreword by Walter Bright"This is a book by a skilled author describing an interesting programming language.

I'm sure you'll find the read rewarding." --From the Foreword by Scott Meyers D is a programming language built to help programmers address the challenges of modern software development. It does so by fostering modules interconnected through precise interfaces, a federation of tightly integrated programming paradigms, language-enforced thread isolation, modular type safety, an efficient memory model, and more. The D Programming Language is an authoritative and comprehensive introduction to D. Reflecting the author's signature style, the writing is casual and conversational, but never at the expense of focus and pre-cision. It covers all aspects of the language (such as expressions, statements, types, functions, contracts, and modules), but it is much more than an enumeration of features. Inside the book you will find In-depth explanations, with idiomatic examples, for all language features How feature groups support major programming paradigms Rationale and best-use advice for each major feature Discussion of cross-cutting issues, such as error handling, contract programming, and concurrency Tables, figures, and "cheat sheets" that serve as a handy quick reference for day-to-day problem solving with D Written for the working programmer, The D Programming Language not only introduces the D language--it presents a compendium of good practices and idioms to help both your coding with D and your coding in general.

Learning the bash Shell

Cameron Newham - 1995

Learning the bash Shell, Third Edition, is the definitive guide to bash, the Free Software Foundation's "Bourne Again Shell." It's a freely available replacement for the UNIX Bourne shell, and is the shell of choice for users of Linux, Mac OS X, BSD, and other UNIX systems.You'll find this guide valuable whether you're interested in bash as a user interface or for its powerful programming capabilities.

This book will teach you how to use bash's advanced command-line features, such as command history, command-line editing, and command completion.This book also introduces shell programming,a skill no UNIX or Linus user should be without. The book demonstrates what you can do with bash's programming features. You'll learn about flow control, signal handling, and command-line processing and I/O. There is also a chapter on debugging your bash programs.Finally, Learning the bash Shell, Third Edition, shows you how to acquire, install, configure, and customize bash, and gives advice to system administrators managing bash for their user communities.This Third Edition covers all of the features of bash Version 3.0, while still applying to Versions 1.x and 2.x. It includes a debugger for the bash shell, both as an extended example and as a useful piece of working code. Since shell scripts are a significant part of many software projects, the book also discusses how to write maintainable shell scripts. And, of course, it discusses the many features that have been introduced to bash over the years: one-dimensional arrays, parameter expansion, pattern-matching operations, new commands, and security improvements.Unfailingly practical and packed with examples and questions for future study, Learning the bash Shell Third Edition is a valuable asset for Linux and other UNIX users.--back cover

The Quick Python Book

Naomi R. Ceder - 2000

The Quick Python Book, Second Edition, is a clear, concise introduction to Python 3, aimed at programmers new to Python.

This updated edition includes all the changes in Python 3, itself a significant shift from earlier versions of Python.The book begins with basic but useful programs that teach the core features of syntax, control flow, and data structures. It then moves to larger applications involving code management, object-oriented programming, web development, and converting code from earlier versions of Python.True to his audience of experienced developers, the author covers common programming language features concisely, while giving more detail to those features unique to Python.Purchase of the print book comes with an offer of a free PDF, ePub, and Kindle eBook from Manning. Also available is all code from the book.

Book picks similar toText Analytics with Python: A Practical Real-World Approach to Gaining Actionable Insights from Your Data by Dipanjan Sarkar

The Little Book on CoffeeScript

Taming Text: How to Find, Organize, and Manipulate It

The Alignment Problem: Machine Learning and Human Values

Hadoop: The Definitive Guide

Text Mining with R: A Tidy Approach

Machine Learning With Random Forests And Decision Trees: A Mostly Intuitive Guide, But Also Some Python

Mining the Social Web: Analyzing Data from Facebook, Twitter, LinkedIn, and Other Social Media Sites

Data Analysis with Open Source Tools: A Hands-On Guide for Programmers and Data Scientists

An Introduction to Statistical Learning: With Applications in R

Build Your Own Database Driven Website Using PHP & MySQL

The Hundred-Page Machine Learning Book

Doing Data Science

The D Programming Language

Learning the bash Shell

The Quick Python Book

Book picks similar to
Text Analytics with Python: A Practical Real-World Approach to Gaining Actionable Insights from Your Data by Dipanjan Sarkar