The Elements of Statistical Learning: Data Mining, Inference, and Prediction


Trevor Hastie - 2001
    With it has come vast amounts of data in a variety of fields such as medicine, biology, finance, and marketing. The challenge of understanding these data has led to the development of new tools in the field of statistics, and spawned new areas such as data mining, machine learning, and bioinformatics. Many of these tools have common underpinnings but are often expressed with different terminology. This book describes the important ideas in these areas in a common conceptual framework. While the approach is statistical, the emphasis is on concepts rather than mathematics. Many examples are given, with a liberal use of color graphics. It should be a valuable resource for statisticians and anyone interested in data mining in science or industry. The book's coverage is broad, from supervised learning (prediction) to unsupervised learning. The many topics include neural networks, support vector machines, classification trees and boosting—the first comprehensive treatment of this topic in any book. Trevor Hastie, Robert Tibshirani, and Jerome Friedman are professors of statistics at Stanford University. They are prominent researchers in this area: Hastie and Tibshirani developed generalized additive models and wrote a popular book of that title. Hastie wrote much of the statistical modeling software in S-PLUS and invented principal curves and surfaces. Tibshirani proposed the Lasso and is co-author of the very successful An Introduction to the Bootstrap. Friedman is the co-inventor of many data-mining tools including CART, MARS, and projection pursuit.

The Craft of Research


Wayne C. Booth - 1995
    Seasoned researchers and educators Gregory G. Colomb and Joseph M. Williams present an updated third edition of their classic handbook, whose first and second editions were written in collaboration with the late Wayne C. Booth. The Craft of Research explains how to build an argument that motivates readers to accept a claim; how to anticipate the reservations of readers and to respond to them appropriately; and how to create introductions and conclusions that answer that most demanding question, “So what?” The third edition includes an expanded discussion of the essential early stages of a research task: planning and drafting a paper. The authors have revised and fully updated their section on electronic research, emphasizing the need to distinguish between trustworthy sources (such as those found in libraries) and less reliable sources found with a quick Web search. A chapter on warrants has also been thoroughly reviewed to make this difficult subject easier for researchers. Throughout, the authors have preserved the amiable tone, the reliable voice, and the sense of directness that have made this book indispensable for anyone undertaking a research project.

Data Analysis Using Regression and Multilevel/Hierarchical Models


Andrew Gelman - 2006
    The book introduces a wide variety of models, whilst at the same time instructing the reader in how to fit these models using available software packages. The book illustrates the concepts by working through scores of real data examples that have arisen from the authors' own applied research, with programming code provided for each one. Topics covered include causal inference, including regression, poststratification, matching, regression discontinuity, and instrumental variables, as well as multilevel logistic regression and missing-data imputation. Practical tips regarding building, fitting, and understanding are provided throughout. Author resource page: http://www.stat.columbia.edu/~gelman/arm/

Mining of Massive Datasets


Anand Rajaraman - 2011
    This book focuses on practical algorithms that have been used to solve key problems in data mining and which can be used on even the largest datasets. It begins with a discussion of the map-reduce framework, an important tool for parallelizing algorithms automatically. The authors explain the tricks of locality-sensitive hashing and stream processing algorithms for mining data that arrives too fast for exhaustive processing. The PageRank idea and related tricks for organizing the Web are covered next. Other chapters cover the problems of finding frequent itemsets and clustering. The final chapters cover two applications: recommendation systems and Web advertising, each vital in e-commerce. Written by two authorities in database and Web technologies, this book is essential reading for students and practitioners alike.

The Professor Is In: The Essential Guide To Turning Your Ph.D. Into a Job


Karen Kelsky - 2015
    Each year tens of thousands of students will, after years of hard work and enormous amounts of money, earn their Ph.D. And each year only a small percentage of them will land a job that justifies and rewards their investment. For every comfortably tenured professor or well-paid former academic, there are countless underpaid and overworked adjuncts, and many more who simply give up in frustration. Those who do make it share an important asset that separates them from the pack: they have a plan. They understand exactly what they need to do to set themselves up for success. They know what really moves the needle in academic job searches, how to avoid the all-too-common mistakes that sink so many of their peers, and how to decide when to point their Ph.D. toward other, non-academic options. Karen Kelsky has made it her mission to help readers join the select few who get the most out of their Ph.D. As a former tenured professor and department head who oversaw numerous academic job searches, she knows from experience exactly what gets an academic applicant a job. And as the creator of the popular and widely respected advice site The Professor Is In, she has helped countless Ph.D.’s turn themselves into stronger applicants and land their dream careers. Now, for the first time ever, Karen has poured all her best advice into a single handy guide that addresses the most important issues facing any Ph.D., including:
    - When, where, and what to publish
    - Writing a foolproof grant application
    - Cultivating references and crafting the perfect CV
    - Acing the job talk and campus interview
    - Avoiding the adjunct trap
    - Making the leap to nonacademic work, when the time is right
    The Professor Is In addresses all of these issues, and many more.

Doing Bayesian Data Analysis: A Tutorial Introduction with R and BUGS


John K. Kruschke - 2010
    Included are step-by-step instructions on how to carry out Bayesian data analyses.

How to Solve It: A New Aspect of Mathematical Method


George Pólya - 1944
    How to Solve It will show anyone in any field how to think straight. In lucid and appealing prose, Polya reveals how the mathematical method of demonstrating a proof or finding an unknown can be of help in attacking any problem that can be reasoned out--from building a bridge to winning a game of anagrams. Generations of readers have relished Polya's deft--indeed, brilliant--instructions on stripping away irrelevancies and going straight to the heart of the problem.

Statistical Methods for the Social Sciences


Alan Agresti - 1986
    No previous knowledge of statistics is assumed, and mathematical background is assumed to be minimal (lowest-level high-school algebra). This text may be used in a one- or two-course sequence. Such sequences are commonly required of social science graduate students in sociology, political science, and psychology. Students in geography, anthropology, journalism, and speech also are sometimes required to take at least one statistics course.

SPSS Survival Manual: A Step by Step Guide to Data Analysis Using SPSS for Windows


Julie Pallant - 2001
    It helps in the process of choosing the right statistical technique and includes a detailed guide to interpreting SPSS output.

The Visual Display of Quantitative Information


Edward R. Tufte - 1983
    Theory and practice in the design of data graphics, 250 illustrations of the best (and a few of the worst) statistical graphics, with detailed analysis of how to display data for precise, effective, quick analysis. Design of the high-resolution displays, small multiples. Editing and improving graphics. The data-ink ratio. Time-series, relational graphics, data maps, multivariate designs. Detection of graphical deception: design variation vs. data variation. Sources of deception. Aesthetics and data graphical displays. This is the second edition of The Visual Display of Quantitative Information. Recently published, this new edition provides excellent color reproductions of the many graphics of William Playfair, adds color to other images, and includes all the changes and corrections accumulated during 17 printings of the first edition.

Social Research Methods: Quantitative and Qualitative Approaches


W. Lawrence Neuman - 1991
    Dozens of new examples from actual research studies illustrate concepts and methods. Key terms are now called out and defined in boxes at the bottom of the pages where they appear, for easier study and review. In Chapter 1, there are now separate descriptions and examples of the steps in the research process for quantitative and qualitative approaches, to underscore some of the fundamental differences. Chapter 2 has new discussions of participatory action research, instrumental and reflexive knowledge, the various audiences for social research findings, and researcher autonomy when research is commissioned. The discussion of social theories in Chapter 3 now covers levels of abstraction and relationships among concepts.

Data Science from Scratch: First Principles with Python


Joel Grus - 2015
    In this book, you’ll learn how many of the most fundamental data science tools and algorithms work by implementing them from scratch. If you have an aptitude for mathematics and some programming skills, author Joel Grus will help you get comfortable with the math and statistics at the core of data science, and with the hacking skills you need to get started as a data scientist. Today’s messy glut of data holds answers to questions no one’s even thought to ask. This book provides you with the know-how to dig those answers out.
    - Get a crash course in Python
    - Learn the basics of linear algebra, statistics, and probability, and understand how and when they're used in data science
    - Collect, explore, clean, munge, and manipulate data
    - Dive into the fundamentals of machine learning
    - Implement models such as k-nearest neighbors, Naive Bayes, linear and logistic regression, decision trees, neural networks, and clustering
    - Explore recommender systems, natural language processing, network analysis, MapReduce, and databases

Experimental and Quasi-Experimental Designs for Generalized Causal Inference


William R. Shadish - 2001
    The book covers four major topics in field experimentation:

Essentials of Statistics for the Behavioral Sciences


Frederick J. Gravetter - 1991
    The authors take time to explain statistical procedures so that you can go beyond memorizing formulas and gain a conceptual understanding of statistics. The authors also take care to show you how having an understanding of statistical procedures will help you comprehend published findings and will lead you to become a savvy consumer of information. Known for its exceptional accuracy and examples, this text also has a complete supplements package to support your learning.

Artificial Intelligence: A Modern Approach


Stuart Russell - 1994
    The long-anticipated revision of this best-selling text offers the most comprehensive, up-to-date introduction to the theory and practice of artificial intelligence.
    - NEW: Nontechnical learning material. Accompanies each part of the book.
    - NEW: The Internet as a sample application for intelligent systems. Added in several places including logical agents, planning, and natural language.
    - NEW: Increased coverage of material. Includes expanded coverage of: default reasoning and truth maintenance systems, including multi-agent/distributed AI and game theory; probabilistic approaches to learning including EM; more detailed descriptions of probabilistic inference algorithms.
    - NEW: Updated and expanded exercises. 75% of the exercises are revised, with 100 new exercises.
    - NEW: On-line Java software. Makes it easy for students to do projects on the web using intelligent agents.
    - A unified, agent-based approach to AI. Organizes the material around the task of building intelligent agents.
    - Comprehensive, up-to-date coverage. Includes a unified view of the field organized around the rational decision making pa