Book picks similar to
Data Mining with Rattle and R: The Art of Excavating Data for Knowledge Discovery by Graham J. Williams
data-science
data-mining
quant
statistics
Machine Learning: A Probabilistic Perspective
Kevin P. Murphy - 2012
Machine learning provides these, developing methods that can automatically detect patterns in data and then use the uncovered patterns to predict future data. This textbook offers a comprehensive and self-contained introduction to the field of machine learning, based on a unified, probabilistic approach. The coverage combines breadth and depth, offering necessary background material on such topics as probability, optimization, and linear algebra as well as discussion of recent developments in the field, including conditional random fields, L1 regularization, and deep learning. The book is written in an informal, accessible style, complete with pseudo-code for the most important algorithms. All topics are copiously illustrated with color images and worked examples drawn from such application domains as biology, text processing, computer vision, and robotics. Rather than providing a cookbook of different heuristic methods, the book stresses a principled model-based approach, often using the language of graphical models to specify models in a concise and intuitive way. Almost all the models described have been implemented in a MATLAB software package, PMTK (probabilistic modeling toolkit), that is freely available online. The book is suitable for upper-level undergraduates with an introductory-level college math background and beginning graduate students.
Think Like a Programmer: An Introduction to Creative Problem Solving
V. Anton Spraul - 2012
In this one-of-a-kind text, author V. Anton Spraul breaks down the ways that programmers solve problems and teaches you what other introductory books often ignore: how to Think Like a Programmer. Each chapter tackles a single programming concept, like classes, pointers, and recursion, and open-ended exercises throughout challenge you to apply your knowledge. You'll also learn how to:
Split problems into discrete components to make them easier to solve
Make the most of code reuse with functions, classes, and libraries
Pick the perfect data structure for a particular job
Master more advanced programming tools like recursion and dynamic memory
Organize your thoughts and develop strategies to tackle particular types of problems
Although the book's examples are written in C++, the creative problem-solving concepts they illustrate go beyond any particular language; in fact, they often reach outside the realm of computer science. As the most skillful programmers know, writing great code is a creative art, and the first step in creating your masterpiece is learning to Think Like a Programmer.
Hadoop: The Definitive Guide
Tom White - 2009
Ideal for processing large datasets, the Apache Hadoop framework is an open source implementation of the MapReduce algorithm on which Google built its empire. This comprehensive resource demonstrates how to use Hadoop to build reliable, scalable, distributed systems: programmers will find details for analyzing large datasets, and administrators will learn how to set up and run Hadoop clusters. Complete with case studies that illustrate how Hadoop solves specific problems, this book helps you:
Use the Hadoop Distributed File System (HDFS) for storing large datasets, and run distributed computations over those datasets using MapReduce
Become familiar with Hadoop's data and I/O building blocks for compression, data integrity, serialization, and persistence
Discover common pitfalls and advanced features for writing real-world MapReduce programs
Design, build, and administer a dedicated Hadoop cluster, or run Hadoop in the cloud
Use Pig, a high-level query language for large-scale data processing
Take advantage of HBase, Hadoop's database for structured and semi-structured data
Learn ZooKeeper, a toolkit of coordination primitives for building distributed systems
If you have lots of data -- whether it's gigabytes or petabytes -- Hadoop is the perfect solution. Hadoop: The Definitive Guide is the most thorough book available on the subject. "Now you have the opportunity to learn about Hadoop from a master -- not only of the technology, but also of common sense and plain talk." -- Doug Cutting, Hadoop Founder, Yahoo!
The Hundred-Page Machine Learning Book
Andriy Burkov - 2019
During that week, you will learn almost everything modern machine learning has to offer. The author and other practitioners have spent years learning these concepts.
Companion wiki: the book has a continuously updated wiki that extends some book chapters with additional information: Q&A, code snippets, further reading, tools, and other relevant resources.
Flexible price and formats: choose from a variety of formats and price options: Kindle, hardcover, paperback, EPUB, PDF. If you buy an EPUB or a PDF, you decide the price you pay!
Read first, buy later: download book chapters for free, read them and share with your friends and colleagues. Only if you liked the book or found it useful in your work, study or business, then buy it.
Hello World: Being Human in the Age of Algorithms
Hannah Fry - 2018
It’s time we stand face-to-digital-face with the true powers and limitations of the algorithms that already automate important decisions in healthcare, transportation, crime, and commerce. Hello World is indispensable preparation for the moral quandaries of a world run by code, and with the unfailingly entertaining Hannah Fry as our guide, we’ll be discussing these issues long after the last page is turned.
Numsense! Data Science for the Layman: No Math Added
Annalyn Ng - 2017
Sold in over 85 countries and translated into more than 5 languages.
Want to get started on data science? Our promise: no math added.
This book has been written in layman's terms as a gentle introduction to data science and its algorithms. Each algorithm has its own dedicated chapter that explains how it works, and shows an example of a real-world application. To help you grasp key concepts, we stick to intuitive explanations and visuals.
Popular concepts covered include:
- A/B Testing
- Anomaly Detection
- Association Rules
- Clustering
- Decision Trees and Random Forests
- Regression Analysis
- Social Network Analysis
- Neural Networks
Features:
- Intuitive explanations and visuals
- Real-world applications to illustrate each algorithm
- Point summaries at the end of each chapter
- Reference sheets comparing the pros and cons of algorithms
- Glossary list of commonly-used terms
With this book, we hope to give you a practical understanding of data science, so that you, too, can leverage its strengths in making better decisions.
Naked Statistics: Stripping the Dread from the Data
Charles Wheelan - 2012
How can we catch schools that cheat on standardized tests? How does Netflix know which movies you’ll like? What is causing the rising incidence of autism? As best-selling author Charles Wheelan shows us in Naked Statistics, the right data and a few well-chosen statistical tools can help us answer these questions and more. For those who slept through Stats 101, this book is a lifesaver. Wheelan strips away the arcane and technical details and focuses on the underlying intuition that drives statistical analysis. He clarifies key concepts such as inference, correlation, and regression analysis, reveals how biased or careless parties can manipulate or misrepresent data, and shows us how brilliant and creative researchers are exploiting the valuable data from natural experiments to tackle thorny questions. And in Wheelan’s trademark style, there’s not a dull page in sight. You’ll encounter clever Schlitz Beer marketers leveraging basic probability, an International Sausage Festival illuminating the tenets of the central limit theorem, and a head-scratching choice from the famous game show Let’s Make a Deal, and you’ll come away with insights each time. With the wit, accessibility, and sheer fun that turned Naked Economics into a bestseller, Wheelan defies the odds yet again by bringing another essential, formerly unglamorous discipline to life.
Machine Learning
Tom M. Mitchell - 1986
Mitchell covers the field of machine learning, the study of algorithms that allow computer programs to automatically improve through experience and that automatically infer general laws from specific data.
Foundations of Statistical Natural Language Processing
Christopher D. Manning - 1999
This foundational text is the first comprehensive introduction to statistical natural language processing (NLP) to appear. The book contains all the theory and algorithms needed for building NLP tools. It provides broad but rigorous coverage of mathematical and linguistic foundations, as well as detailed discussion of statistical methods, allowing students and researchers to construct their own implementations. The book covers collocation finding, word sense disambiguation, probabilistic parsing, information retrieval, and other applications.
Hadoop Explained
Aravind Shenoy - 2014
Hadoop allowed small and medium-sized companies to store huge amounts of data on racks of cheap commodity servers. The introduction of Big Data has allowed businesses to make decisions based on quantifiable analysis. Hadoop is now implemented in major organizations such as Amazon, IBM, Cloudera, and Dell, to name a few. This book introduces you to Hadoop and to concepts such as ‘MapReduce’, ‘Rack Awareness’, ‘YARN’ and ‘HDFS Federation’, which will help you get acquainted with the technology.
Taming Text: How to Find, Organize, and Manipulate It
Grant S. Ingersoll - 2011
This causes real problems for everyday users who need to make sense of all the information available, and for software engineers who want to make their text-based applications more useful and user-friendly. Whether building a search engine for a corporate website, automatically organizing email, or extracting important nuggets of information from the news, dealing with unstructured text can be daunting. Taming Text is a hands-on, example-driven guide to working with unstructured text in the context of real-world applications. It explores how to automatically organize text, using approaches such as full-text search, proper name recognition, clustering, tagging, information extraction, and summarization. This book gives examples illustrating each of these topics, as well as the foundations upon which they are built. Purchase of the print book comes with an offer of a free PDF, ePub, and Kindle eBook from Manning. Also available is all code from the book.
Laravel: Code Bright
Dayle Rees - 2013
At $29 and cheaper than a good pizza, you will get the book in its current partial form, along with all future chapters, updates, and fixes for free. As of the day I wrote this description, Code Bright had 130 pages and was just getting started. To give you some perspective on how detailed it is, Code Happy was 127 pages in its complete state. Want to know more? Carry on reading. Welcome back to Laravel. Last year I wrote a book about the Laravel PHP framework. It started as a collection of tutorials on my blog, and eventually became a full book. I definitely didn’t expect it to be as popular as it was. Code Happy has sold almost 3000 copies, and is considered to be one of the most valuable resources for learning the Laravel framework. Code Bright is the spiritual successor to Code Happy. The framework has grown a lot in the past year, and has changed enough to merit a new title. With Code Bright I hope to improve on Code Happy in every way; my goal is, once again, to build the most comprehensive learning experience for the framework. Oh, and to still be funny. That’s very important to me. Laravel Code Bright will contain a complete learning experience for all of the framework’s features. The style of writing will make it approachable for beginners, and a wonderful reference resource for experienced developers alike. You see, people have told me that they enjoyed reading Code Happy, not only for its educational content, but for its humour, and for my down to earth writing style. This is very important to me. I like to write my books as if we were having a conversation in a bar. When I wrote Code Happy last year, I was simply a framework enthusiast, one of the first to share information about the framework. However, since then I have become a committed member of the core development team, working directly with the framework author to make Laravel a wonderful experience for the developers of the world. One other important feature of both books is that they are published while in progress. This means that the book is available in an incomplete state, but will grow over time into a complete title. All future updates will be provided for free. What this means is that I don’t have to worry about deadlines, or a fixed point of completion. It leads to less stress and better writing. If I think of a better way to explain something, I can go back and change it. In a sense, the book will never be completed. I can constantly add more information to it, until it becomes the perfect resource. Given that this time I am using the majority of my spare time to write the title (yes, I have a full time job too!), I have raised the price a little to justify my invested time. I was told by many of my past readers that they found the previous title very cheap for the resource that it grew into, so if you are worried about the new price, then let me remind you what you will get for your 29 bucks.
The successor to Code Happy, seen by many as the #1 learning resource for the Laravel PHP framework.
An unending source of information; chapters will be constantly added as needed until the book becomes a giant vault of framework knowledge.
Comedy, and a little cheesy but very friendly writing.
Bayesian Data Analysis
Andrew Gelman - 1995
Its world-class authors provide guidance on all aspects of Bayesian data analysis and include examples of real statistical analyses, based on their own research, that demonstrate how to solve complicated problems. Changes in the new edition include:
Stronger focus on MCMC
Revision of the computational advice in Part III
New chapters on nonlinear models and decision analysis
Several additional applied examples from the authors' recent research
Additional chapters on current models for Bayesian data analysis such as nonlinear models, generalized linear mixed models, and more
Reorganization of chapters 6 and 7 on model checking and data collection
Bayesian computation is currently at a stage where there are many reasonable ways to compute any given posterior distribution. However, the best approach is not always clear ahead of time. Reflecting this, the new edition offers a more pluralistic presentation, giving advice on performing computations from many perspectives while making clear the importance of being aware that there are different ways to implement any given iterative simulation computation. The new approach, additional examples, and updated information make Bayesian Data Analysis an excellent introductory text and a reference that working scientists will use throughout their professional life.
Python Programming for Beginners: An Introduction to the Python Computer Language and Computer Programming (Python, Python 3, Python Tutorial)
Jason Cannon - 2014
There can be so much information available that you can't even decide where to start. Or worse, you start down the path of learning and quickly discover too many concepts, commands, and nuances that aren't explained. This kind of experience is frustrating and leaves you with more questions than answers. Python Programming for Beginners doesn't make any assumptions about your background or knowledge of Python or computer programming. You need no prior knowledge to benefit from this book. You will be guided step by step using a logical and systematic approach. As new concepts, commands, or jargon are encountered they are explained in plain language, making it easy for anyone to understand. Here is what you will learn by reading Python Programming for Beginners:
When to use Python 2 and when to use Python 3.
How to install Python on Windows, Mac, and Linux. Screenshots included.
How to prepare your computer for programming in Python.
The various ways to run a Python program on Windows, Mac, and Linux.
Suggested text editors and integrated development environments to use when coding in Python.
How to work with various data types including strings, lists, tuples, dictionaries, booleans, and more.
What variables are and when to use them.
How to perform mathematical operations using Python.
How to capture input from a user.
Ways to control the flow of your programs.
The importance of white space in Python.
How to organize your Python programs -- Learn what goes where.
What modules are, when you should use them, and how to create your own.
How to define and use functions.
Important built-in Python functions that you'll use often.
How to read from and write to files.
The difference between binary and text files.
Various ways of getting help and finding Python documentation.
Much more...
Every single code example in the book is available to download, providing you with all the Python code you need at your fingertips!
Dear Data
Giorgia Lupi - 2016
The result is described as “a thought-provoking visual feast”.