Find a book to read

Book picks similar to
Data Mining: Concepts, Models, Methods, and Algorithms by Mehmed Kantardzic

data-mining

techrefs

testing-analytics

Machine Learning With Random Forests And Decision Trees: A Mostly Intuitive Guide, But Also Some Python

data-science

non-fiction

tech

Scott Hartshorn - 2016

Random Forests are one type of machine learning algorithm.

They are typically used to categorize something based on other data that you have. The purpose of this book is to help you understand how Random Forests work, as well as the different options that you have when using them to analyze a problem. Additionally, since Decision Trees are a fundamental part of Random Forests, this book explains how they work. This book is focused on understanding Random Forests at the conceptual level. Knowing how they work, why they work the way that they do, and what options are available to improve results. This book covers how Random Forests work in an intuitive way, and also explains the equations behind many of the functions, but it only has a small amount of actual code (in python). This book is focused on giving examples and providing analogies for the most fundamental aspects of how random forests and decision trees work. The reason is that those are easy to understand and they stick with you. There are also some really interesting aspects of random forests, such as information gain, feature importances, or out of bag error, that simply cannot be well covered without diving into the equations of how they work. For those the focus is providing the information in a straight forward and easy to understand way.

R Packages

data-science

programming

data

Hadley Wickham - 2015

Turn your R code into packages that others can easily download and use.

This practical book shows you how to bundle reusable R functions, sample data, and documentation together by applying author Hadley Wickham’s package development philosophy. In the process, you’ll work with devtools, roxygen, and testthat, a set of R packages that automate common development tasks. Devtools encapsulates best practices that Hadley has learned from years of working with this programming language. Ideal for developers, data scientists, and programmers with various backgrounds, this book starts you with the basics and shows you how to improve your package writing over time. You’ll learn to focus on what you want your package to do, rather than think about package structure. Learn about the most useful components of an R package, including vignettes and unit tests Automate anything you can, taking advantage of the years of development experience embodied in devtools Get tips on good style, such as organizing functions into files Streamline your development process with devtools Learn the best way to submit your package to the Comprehensive R Archive Network (CRAN) Learn from a well-respected member of the R community who created 30 R packages, including ggplot2, dplyr, and tidyr

Data Smart: Using Data Science to Transform Information into Insight

John W. Foreman - 2013

Data Science gets thrown around in the press like it's magic.

Major retailers are predicting everything from when their customers are pregnant to when they want a new pair of Chuck Taylors. It's a brave new world where seemingly meaningless data can be transformed into valuable insight to drive smart business decisions.But how does one exactly do data science? Do you have to hire one of these priests of the dark arts, the "data scientist," to extract this gold from your data? Nope.Data science is little more than using straight-forward steps to process raw data into actionable insight. And in Data Smart, author and data scientist John Foreman will show you how that's done within the familiar environment of a spreadsheet. Why a spreadsheet? It's comfortable! You get to look at the data every step of the way, building confidence as you learn the tricks of the trade. Plus, spreadsheets are a vendor-neutral place to learn data science without the hype. But don't let the Excel sheets fool you. This is a book for those serious about learning the analytic techniques, the math and the magic, behind big data.Each chapter will cover a different technique in a spreadsheet so you can follow along: - Mathematical optimization, including non-linear programming and genetic algorithms- Clustering via k-means, spherical k-means, and graph modularity- Data mining in graphs, such as outlier detection- Supervised AI through logistic regression, ensemble models, and bag-of-words models- Forecasting, seasonal adjustments, and prediction intervals through monte carlo simulation- Moving from spreadsheets into the R programming languageYou get your hands dirty as you work alongside John through each technique. But never fear, the topics are readily applicable and the author laces humor throughout. You'll even learn what a dead squirrel has to do with optimization modeling, which you no doubt are dying to know.

Artificial Intelligence for Humans, Volume 1: Fundamental Algorithms

computer-science

artificial-intelligence

computer

Jeff Heaton - 2013

A great building requires a strong foundation.

This book teaches basic Artificial Intelligence algorithms such as dimensionality, distance metrics, clustering, error calculation, hill climbing, Nelder Mead, and linear regression. These are not just foundational algorithms for the rest of the series, but are very useful in their own right. The book explains all algorithms using actual numeric calculations that you can perform yourself. Artificial Intelligence for Humans is a book series meant to teach AI to those without an extensive mathematical background. The reader needs only a knowledge of basic college algebra or computer programming—anything more complicated than that is thoroughly explained. Every chapter also includes a programming example. Examples are currently provided in Java, C#, R, Python and C. Other languages planned.

Data Science For Dummies

Lillian Pierson - 2014

Discover how data science can help you gain in-depth insight into your business – the easy way! Jobs in data science abound, but few people have the data science skills needed to fill these increasingly important roles in organizations.

Data Science For Dummies is the perfect starting point for IT professionals and students interested in making sense of their organization’s massive data sets and applying their findings to real-world business scenarios. From uncovering rich data sources to managing large amounts of data within hardware and software limitations, ensuring consistency in reporting, merging various data sources, and beyond, you’ll develop the know-how you need to effectively interpret data and tell a story that can be understood by anyone in your organization. Provides a background in data science fundamentals before moving on to working with relational databases and unstructured data and preparing your data for analysis Details different data visualization techniques that can be used to showcase and summarize your data Explains both supervised and unsupervised machine learning, including regression, model validation, and clustering techniques Includes coverage of big data processing tools like MapReduce, Hadoop, Dremel, Storm, and Spark It’s a big, big data world out there – let Data Science For Dummies help you harness its power and gain a competitive edge for your organization.

A History of Modern Computing

Paul E. Ceruzzi - 1998

From the first digital computer to the dot-com crash - a story of individuals, institutions, and the forces that led to a series of dramatic transformations.This engaging history covers modern computing from the development of the first electronic digital computer through the dot-com crash.

The author concentrates on five key moments of transition: the transformation of the computer in the late 1940s from a specialized scientific instrument to a commercial product; the emergence of small systems in the late 1960s; the beginning of personal computing in the 1970s; the spread of networking after 1985; and, in a chapter written for this edition, the period 1995-2001.The new material focuses on the Microsoft antitrust suit, the rise and fall of the dot-coms, and the advent of open source software, particularly Linux. Within the chronological narrative, the book traces several overlapping threads: the evolution of the computer's internal design; the effect of economic trends and the Cold War; the long-term role of IBM as a player and as a target for upstart entrepreneurs; the growth of software from a hidden element to a major character in the story of computing; and the recurring issue of the place of information and computing in a democratic society.The focus is on the United States (though Europe and Japan enter the story at crucial points), on computing per se rather than on applications such as artificial intelligence, and on systems that were sold commercially and installed in quantities.

Data Science from Scratch: First Principles with Python

Joel Grus - 2015

Data science libraries, frameworks, modules, and toolkits are great for doing data science, but they’re also a good way to dive into the discipline without actually understanding data science.

In this book, you’ll learn how many of the most fundamental data science tools and algorithms work by implementing them from scratch. If you have an aptitude for mathematics and some programming skills, author Joel Grus will help you get comfortable with the math and statistics at the core of data science, and with hacking skills you need to get started as a data scientist. Today’s messy glut of data holds answers to questions no one’s even thought to ask. This book provides you with the know-how to dig those answers out. Get a crash course in Python Learn the basics of linear algebra, statistics, and probability—and understand how and when they're used in data science Collect, explore, clean, munge, and manipulate data Dive into the fundamentals of machine learning Implement models such as k-nearest Neighbors, Naive Bayes, linear and logistic regression, decision trees, neural networks, and clustering Explore recommender systems, natural language processing, network analysis, MapReduce, and databases

A Practical Guide to Linux Commands, Editors, and Shell Programming

Mark G. Sobell - 2005

The essential reference for core commands that Linux users need daily, along with superior tutorial on shell programming and much more.

The book is a complete revision of the commands section of Sobell's Practical Guide to Linux - a proven best-seller. The book is Linux distribution and release agnostic. It will appeal to users of ALL Linux distributions. Superior examples make this book the the best option on the market! System administrators, software developers, quality assurance engineers and others working on a Linux system need to work from the command line in order to be effective. Linux is famous for its huge number of command line utility programs, and the programs themselves are famous for their large numbers of options, switches, and configuration files. But the truth is that users will only use a limited (but still significant) number of these utilities on a recurring basis, and then only with a subset of the most important and useful options, switches and configuration files. This book cuts through all the noise and shows them which utilities are most useful, and which options most important. And it contains examples, lot's and lot's of examples. programmability. Utilities are designed, by default, to work wtih other utilities within shell programs as a way of automating system tasks. This book contains a superb introduction to Linux shell programming. And since shell programmers need to write their programs in text editors, this book covers the two most popular ones: vi and emacs.

Data Science for Business: What you need to know about data mining and data-analytic thinking

Foster Provost - 2013

Written by renowned data science experts Foster Provost and Tom Fawcett, Data Science for Business introduces the fundamental principles of data science, and walks you through the "data-analytic thinking" necessary for extracting useful knowledge and business value from the data you collect.

This guide also helps you understand the many data-mining techniques in use today.Based on an MBA course Provost has taught at New York University over the past ten years, Data Science for Business provides examples of real-world business problems to illustrate these principles. You’ll not only learn how to improve communication between business stakeholders and data scientists, but also how participate intelligently in your company’s data science projects. You’ll also discover how to think data-analytically, and fully appreciate how data science methods can support business decision-making.Understand how data science fits in your organization—and how you can use it for competitive advantageTreat data as a business asset that requires careful investment if you’re to gain real valueApproach business problems data-analytically, using the data-mining process to gather good data in the most appropriate wayLearn general concepts for actually extracting knowledge from dataApply data science principles when interviewing data science job candidates

Bayesian Statistics the Fun Way: Understanding Statistics and Probability with Star Wars, Lego, and Rubber Ducks

Will Kurt - 2019

Fun guide to learning Bayesian statistics and probability through unusual and illustrative examples.Probability and statistics are increasingly important in a huge range of professions.

But many people use data in ways they don't even understand, meaning they aren't getting the most from it. Bayesian Statistics the Fun Way will change that.This book will give you a complete understanding of Bayesian statistics through simple explanations and un-boring examples. Find out the probability of UFOs landing in your garden, how likely Han Solo is to survive a flight through an asteroid shower, how to win an argument about conspiracy theories, and whether a burglary really was a burglary, to name a few examples.By using these off-the-beaten-track examples, the author actually makes learning statistics fun. And you'll learn real skills, like how to:- How to measure your own level of uncertainty in a conclusion or belief- Calculate Bayes theorem and understand what it's useful for- Find the posterior, likelihood, and prior to check the accuracy of your conclusions- Calculate distributions to see the range of your data- Compare hypotheses and draw reliable conclusions from themNext time you find yourself with a sheaf of survey results and no idea what to do with them, turn to Bayesian Statistics the Fun Way to get the most value from your data.

Bayesian Methods for Hackers: Probabilistic Programming and Bayesian Inference

Cameron Davidson-Pilon - 2014

Master Bayesian Inference through Practical Examples and Computation-Without Advanced Mathematical Analysis Bayesian methods of inference are deeply natural and extremely powerful.

However, most discussions of Bayesian inference rely on intensely complex mathematical analyses and artificial examples, making it inaccessible to anyone without a strong mathematical background. Now, though, Cameron Davidson-Pilon introduces Bayesian inference from a computational perspective, bridging theory to practice-freeing you to get results using computing power. Bayesian Methods for Hackers illuminates Bayesian inference through probabilistic programming with the powerful PyMC language and the closely related Python tools NumPy, SciPy, and Matplotlib. Using this approach, you can reach effective solutions in small increments, without extensive mathematical intervention. Davidson-Pilon begins by introducing the concepts underlying Bayesian inference, comparing it with other techniques and guiding you through building and training your first Bayesian model. Next, he introduces PyMC through a series of detailed examples and intuitive explanations that have been refined after extensive user feedback. You'll learn how to use the Markov Chain Monte Carlo algorithm, choose appropriate sample sizes and priors, work with loss functions, and apply Bayesian inference in domains ranging from finance to marketing. Once you've mastered these techniques, you'll constantly turn to this guide for the working PyMC code you need to jumpstart future projects. Coverage includes - Learning the Bayesian "state of mind" and its practical implications - Understanding how computers perform Bayesian inference - Using the PyMC Python library to program Bayesian analyses - Building and debugging models with PyMC - Testing your model's "goodness of fit" - Opening the "black box" of the Markov Chain Monte Carlo algorithm to see how and why it works - Leveraging the power of the "Law of Large Numbers" - Mastering key concepts, such as clustering, convergence, autocorrelation, and thinning - Using loss functions to measure an estimate's weaknesses based on your goals and desired outcomes - Selecting appropriate priors and understanding how their influence changes with dataset size - Overcoming the "exploration versus exploitation" dilemma: deciding when "pretty good" is good enough - Using Bayesian inference to improve A/B testing - Solving data science problems when only small amounts of data are available Cameron Davidson-Pilon has worked in many areas of applied mathematics, from the evolutionary dynamics of genes and diseases to stochastic modeling of financial prices. His contributions to the open source community include lifelines, an implementation of survival analysis in Python. Educated at the University of Waterloo and at the Independent University of Moscow, he currently works with the online commerce leader Shopify.

Python Machine Learning

Sebastian Raschka - 2015

Link to the GitHub Repository containing the code examples and additional material: https://github.com/rasbt/python-machi...Many of the most innovative breakthroughs and exciting new technologies can be attributed to applications of machine learning.

We are living in an age where data comes in abundance, and thanks to the self-learning algorithms from the field of machine learning, we can turn this data into knowledge. Automated speech recognition on our smart phones, web search engines, e-mail spam filters, the recommendation systems of our favorite movie streaming services – machine learning makes it all possible.Thanks to the many powerful open-source libraries that have been developed in recent years, machine learning is now right at our fingertips. Python provides the perfect environment to build machine learning systems productively.This book will teach you the fundamentals of machine learning and how to utilize these in real-world applications using Python. Step-by-step, you will expand your skill set with the best practices for transforming raw data into useful information, developing learning algorithms efficiently, and evaluating results.You will discover the different problem categories that machine learning can solve and explore how to classify objects, predict continuous outcomes with regression analysis, and find hidden structures in data via clustering. You will build your own machine learning system for sentiment analysis and finally, learn how to embed your model into a web app to share with the world

The Practice of System and Network Administration

Thomas A. Limoncelli - 2001

The first edition of The Practice of System and Network Administration introduced a generation of system and network administrators to a modern IT methodology.

Whether you use Linux, Unix, or Windows, this newly revised edition describes the essential practices previously handed down only from mentor to protege. This wonderfully lucid, often funny cornucopia of information introduces beginners to advanced frameworks valuable for their entire career, yet is structured to help even the most advanced experts through difficult projects.The book's four major sections build your knowledge with the foundational elements of system administration. These sections guide you through better techniques for upgrades and change management, catalog best practices for IT services, and explore various management topics. Chapters are divided into The Basics and The Icing. When you get the Basics right it makes every other aspect of the job easier--such as automating the right things first. The Icing sections contain all the powerful things that can be done on top of the basics to wow customers and managers.Inside, you'll find advice on topics such asThe key elements your networks and systems need in order to make all other services run better Building and running reliable, scalable services, including web, storage, email, printing, and remote access Creating and enforcing security policies Upgrading multiple hosts at one time without creating havoc Planning for and performing flawless scheduled maintenance windows Managing superior helpdesks and customer care Avoiding the -temporary fix- trap Building data centers that improve server uptime Designing networks for speed and reliability Web scaling and security issues Why building a backup system isn't about backups Monitoring what you have and predicting what you will need How technically oriented workers can maintain their job's technical focus (and avoid an unwanted management role) Technical management issues, including morale, organization building, coaching, and maintaining positive visibility Personal skill techniques, including secrets for getting more done each day, ethical dilemmas, managing your boss, and loving your job System administration salary negotiation It's no wonder the first edition received Usenix SAGE's 2005 Outstanding Achievement Award!This eagerly anticipated second edition updates this time-proven classic:Chapters reordered for easier navigationThousands of updates and clarifications based on reader feedbackPlus three entirely new chapters: Web Services, Data Storage, and Documentation

The Elements of Statistical Learning: Data Mining, Inference, and Prediction

Trevor Hastie - 2001

During the past decade there has been an explosion in computation and information technology.

With it has come vast amounts of data in a variety of fields such as medicine, biology, finance, and marketing. The challenge of understanding these data has led to the development of new tools in the field of statistics, and spawned new areas such as data mining, machine learning, and bioinformatics. Many of these tools have common underpinnings but are often expressed with different terminology. This book describes the important ideas in these areas in a common conceptual framework. While the approach is statistical, the emphasis is on concepts rather than mathematics. Many examples are given, with a liberal use of color graphics. It should be a valuable resource for statisticians and anyone interested in data mining in science or industry. The book's coverage is broad, from supervised learning (prediction) to unsupervised learning. The many topics include neural networks, support vector machines, classification trees and boosting—the first comprehensive treatment of this topic in any book. Trevor Hastie, Robert Tibshirani, and Jerome Friedman are professors of statistics at Stanford University. They are prominent researchers in this area: Hastie and Tibshirani developed generalized additive models and wrote a popular book of that title. Hastie wrote much of the statistical modeling software in S-PLUS and invented principal curves and surfaces. Tibshirani proposed the Lasso and is co-author of the very successful An Introduction to the Bootstrap. Friedman is the co-inventor of many data-mining tools including CART, MARS, and projection pursuit.

Iron and Blood

Auston Habershaw - 2015

Part II of the Saga of the Redeemed picks up right where The Iron Ring left off …After Tyvian Reldamar gets double-crossed by his business partner, he is affixed with an iron ring that prevents the wearer from any evildoing.

Not one to be deterred by this setback, he quickly puts into motion a plan for revenge—one that will use every dirty trick in the book.But things are never simple for mastermind Tyvian, especially not after he uncovers a sinister plot: evil wizard Banric Sahand is planning to decimate the city of Freegate. Now Tyvian must learn to work with—and rely on—his motley crew of accomplices, including an adolescent pickpocket, an obese secret-monger, a fearsome gnoll, and a Mage Defender…who is also trying to get him arrested. Time is running out for Tyvian's plan for revenge—while the fate of the city hangs in the balance.

Book picks similar toData Mining: Concepts, Models, Methods, and Algorithms by Mehmed Kantardzic

Machine Learning With Random Forests And Decision Trees: A Mostly Intuitive Guide, But Also Some Python

R Packages

Data Smart: Using Data Science to Transform Information into Insight

Artificial Intelligence for Humans, Volume 1: Fundamental Algorithms

Data Science For Dummies

A History of Modern Computing

Data Science from Scratch: First Principles with Python

A Practical Guide to Linux Commands, Editors, and Shell Programming

Data Science for Business: What you need to know about data mining and data-analytic thinking

Bayesian Statistics the Fun Way: Understanding Statistics and Probability with Star Wars, Lego, and Rubber Ducks

Bayesian Methods for Hackers: Probabilistic Programming and Bayesian Inference

Python Machine Learning

The Practice of System and Network Administration

The Elements of Statistical Learning: Data Mining, Inference, and Prediction

Iron and Blood

Book picks similar to
Data Mining: Concepts, Models, Methods, and Algorithms by Mehmed Kantardzic