Book picks similar to
Outlier Analysis by Charu C. Aggarwal


data-science
mathematics
statistics
datascience

Ubuntu Linux Toolbox: 1000+ Commands for Ubuntu and Debian Power Users


Christopher Negus - 2007
    Try out more than 1,000 commands to find and get software, monitor system health and security, and access network resources. Then, apply the skills you learn from this book to use and administer desktops and servers running Ubuntu, Debian, and KNOPPIX or any other Linux distribution.

Statistical Inference


George Casella - 2001
    Starting from the basics of probability, the authors develop the theory of statistical inference using techniques, definitions, and concepts that are statistical and are natural extensions and consequences of previous concepts. This book can be used for readers who have a solid mathematics background. It can also be used in a way that stresses the more practical uses of statistical theory, being more concerned with understanding basic statistical concepts and deriving reasonable statistical procedures for a variety of situations, and less concerned with formal optimality investigations.

Probabilistic Robotics


Sebastian Thrun - 2005
    Building on the field of mathematical statistics, probabilistic robotics endows robots with a new level of robustness in real-world situations. This book introduces the reader to a wealth of techniques and algorithms in the field. All algorithms are based on a single overarching mathematical foundation. Each chapter provides example implementations in pseudo code, detailed mathematical derivations, discussions from a practitioner's perspective, and extensive lists of exercises and class projects. The book's Web site, www.probabilistic-robotics.org, has additional material. The book is relevant for anyone involved in robotic software development and scientific research. It will also be of interest to applied statisticians and engineers dealing with real-world sensor data.

Networks: An Introduction


M.E.J. Newman - 2010
    The rise of the Internet and the wide availability of inexpensive computers have made it possible to gather and analyze network data on a large scale, and the development of a variety of new theoretical tools has allowed us to extract new knowledge from many different kinds of networks. The study of networks is broadly interdisciplinary and important developments have occurred in many fields, including mathematics, physics, computer and information sciences, biology, and the social sciences. This book brings together for the first time the most important breakthroughs in each of these fields and presents them in a coherent fashion, highlighting the strong interconnections between work in different areas. Subjects covered include the measurement and structure of networks in many branches of science; methods for analyzing network data, including methods developed in physics, statistics, and sociology; the fundamentals of graph theory, computer algorithms, and spectral methods; mathematical models of networks, including random graph models and generative models; and theories of dynamical processes taking place on networks.

Data Mining: Concepts and Techniques (The Morgan Kaufmann Series in Data Management Systems)


Jiawei Han - 2000
    Not only are all of our business, scientific, and government transactions now computerized, but the widespread use of digital cameras, publication tools, and bar codes also generates data. On the collection side, scanned text and image platforms, satellite remote sensing systems, and the World Wide Web have flooded us with a tremendous amount of data. This explosive growth has generated an even more urgent need for new techniques and automated tools that can help us transform this data into useful information and knowledge. Like the first edition, voted the most popular data mining book by KD Nuggets readers, this book explores concepts and techniques for the discovery of patterns hidden in large data sets, focusing on issues relating to their feasibility, usefulness, effectiveness, and scalability. However, since the publication of the first edition, great progress has been made in the development of new data mining methods, systems, and applications. This new edition substantially enhances the first edition, and new chapters have been added to address recent developments on mining complex types of data, including stream data, sequence data, graph structured data, social network data, and multi-relational data.
    A comprehensive, practical look at the concepts and techniques you need to know to get the most out of real business data.
    Updates that incorporate input from readers, changes in the field, and more material on statistics and machine learning.
    Dozens of algorithms and implementation examples, all in easily understood pseudo-code and suitable for use in real-world, large-scale data mining projects.
    Complete classroom support for instructors at the www.mkp.com/datamining2e companion site.

Proofiness: The Dark Arts of Mathematical Deception


Charles Seife - 2010
    According to MSNBC, having a child makes you stupid. You actually lose IQ points. Good Morning America has announced that natural blondes will be extinct within two hundred years. Pundits estimated that there were more than a million demonstrators at a tea party rally in Washington, D.C., even though roughly sixty thousand were there. Numbers have peculiar powers: they can disarm skeptics, befuddle journalists, and hoodwink the public into believing almost anything. "Proofiness," as Charles Seife explains in this eye-opening book, is the art of using pure mathematics for impure ends, and he reminds readers that bad mathematics has a dark side. It is used to bring down beloved government officials and to appoint undeserving ones (both Democratic and Republican), to convict the innocent and acquit the guilty, to ruin our economy, and to fix the outcomes of future elections. This penetrating look at the intersection of math and society will appeal to readers of Freakonomics and the books of Malcolm Gladwell.

Econometric Analysis of Cross Section and Panel Data


Jeffrey M. Wooldridge - 2001
    The book makes clear that applied microeconometrics is about the estimation of marginal and treatment effects, and that parametric estimation is simply a means to this end. It also clarifies the distinction between causality and statistical association. The book focuses specifically on cross section and panel data methods. Population assumptions are stated separately from sampling assumptions, leading to simple statements as well as to important insights. The unified approach to linear and nonlinear models and to cross section and panel data enables straightforward coverage of more advanced methods. The numerous end-of-chapter problems are an important component of the book. Some problems contain important points not fully described in the text, and others cover new ideas that can be analyzed using tools presented in the current and previous chapters. Several problems require the use of the data sets located at the author's website.

Discovering Statistics Using R


Andy Field - 2012
    Like its sister textbook, Discovering Statistics Using R is written in an irreverent style and follows the same ground-breaking structure and pedagogical approach. The core material is enhanced by a cast of characters to help the reader on their way, hundreds of examples, self-assessment tests to consolidate knowledge, and additional website material for those wanting to learn more.

Social Network Analysis: Methods and Applications


Stanley Wasserman - 1994
    Social Network Analysis: Methods and Applications reviews and discusses methods for the analysis of social networks with a focus on applications of these methods to many substantive examples. As the first book to provide comprehensive coverage of the methodology and applications of the field, this study is both a reference book and a textbook.

Applied Multiple Regression/Correlation Analysis for the Behavioral Sciences


Jacob Cohen - 1975
    Readers profit from its verbal-conceptual exposition and frequent use of examples. The applied emphasis provides clear illustrations of the principles and provides worked examples of the types of applications that are possible. Researchers learn how to specify regression models that directly address their research questions. An overview of the fundamental ideas of multiple regression and a review of bivariate correlation and regression and other elementary statistical concepts provide a strong foundation for understanding the rest of the text. The third edition features an increased emphasis on graphics and the use of confidence intervals and effect size measures, and an accompanying website with data for most of the numerical examples along with the computer code for SPSS, SAS, and SYSTAT, at www.psypress.com/9780805822236. Applied Multiple Regression serves as both a textbook for graduate students and as a reference tool for researchers in psychology, education, health sciences, communications, business, sociology, political science, anthropology, and economics. An introductory knowledge of statistics is required. Self-standing chapters minimize the need for researchers to refer to previous chapters.

The Numerati


Stephen Baker - 2008
    Now, in one of the greatest undertakings of the twenty-first century, a savvy group of mathematicians and computer scientists is beginning to sift through this data to dissect us and map out our next steps. Their goal? To manipulate our behavior -- what we buy, how we vote -- without our even realizing it. In this tour de force of original reporting and analysis, journalist Stephen Baker provides us with a fascinating guide to the world we're all entering -- and to the people controlling that world. The Numerati have infiltrated every realm of human affairs, profiling us as workers, shoppers, patients, voters, potential terrorists -- and lovers. The implications are vast. Our privacy evaporates. Our bosses can monitor and measure our every move (then reward or punish us). Politicians can find the swing voters among us, by plunking us all into new political groupings with names like "Hearth Keepers" and "Crossing Guards." It can sound scary. But the Numerati can also work on our behalf, diagnosing an illness before we're aware of the symptoms, or even helping us find our soul mate. Surprising, enlightening, and deeply relevant, The Numerati shows how a powerful new endeavor -- the mathematical modeling of humanity -- will transform every aspect of our lives. STEPHEN BAKER has written for BusinessWeek for over twenty years, covering Mexico and Latin America, the Rust Belt, European technology, and a host of other topics, including blogs, math, and nanotechnology. But he's always considered himself a foreign correspondent. This, he says, was especially useful as he met the Numerati. "While I came from the world of words, they inhabited the symbolic realms of math and computer science. This was foreign to me. My reporting became an anthropological mission." Baker has written for many publications, including the Wall Street Journal, the Los Angeles Times, and the Boston Globe. He won an Overseas Press Club Award for his portrait of the rising Mexican auto industry. He is the coauthor of blogspotting.net, featured by the New York Times as one of fifty blogs to watch.

Time Series Analysis


James Douglas Hamilton - 1994
    This book synthesizes these recent advances and makes them accessible to first-year graduate students. James Hamilton provides the first adequate textbook treatments of important innovations such as vector autoregressions, generalized method of moments, the economic and statistical consequences of unit roots, time-varying variances, and nonlinear time series models. In addition, he presents basic tools for analyzing dynamic systems (including linear representations, autocovariance generating functions, spectral analysis, and the Kalman filter) in a way that integrates economic theory with the practical difficulties of analyzing and interpreting real-world data. Time Series Analysis fills an important need for a textbook that integrates economic theory, econometrics, and new results. The book is intended to provide students and researchers with a self-contained survey of time series analysis. It starts from first principles and should be readily accessible to any beginning graduate student, while it is also intended to serve as a reference book for researchers. -- "Journal of Economics"

An Introduction to Systems Biology: Design Principles of Biological Circuits


Uri Alon - 2006
    It provides a simple mathematical framework which can be used to understand and even design biological circuits. The text avoids specialist terms, focusing instead on several well-studied biological systems that concisely demonstrate key principles. An Introduction to Systems Biology: Design Principles of Biological Circuits builds a solid foundation for the intuitive understanding of general principles. It encourages the reader to ask why a system is designed in a particular way and then proceeds to answer with simplified models.

Machine Learning: The Art and Science of Algorithms That Make Sense of Data


Peter Flach - 2012
    Peter Flach's clear, example-based approach begins by discussing how a spam filter works, which gives an immediate introduction to machine learning in action, with a minimum of technical fuss. Flach provides case studies of increasing complexity and variety with well-chosen examples and illustrations throughout. He covers a wide range of logical, geometric and statistical models and state-of-the-art topics such as matrix factorisation and ROC analysis. Particular attention is paid to the central role played by features. The use of established terminology is balanced with the introduction of new and useful concepts, and summaries of relevant background material are provided with pointers for revision if necessary. These features ensure Machine Learning will set a new standard as an introductory textbook.

The Elements of Data Analytic Style


Jeffrey Leek - 2015
    This book is focused on the details of data analysis that sometimes fall through the cracks in traditional statistics classes and textbooks. It is based in part on the author's blog posts, lecture materials, and tutorials. The author is one of the co-developers of the Johns Hopkins Specialization in Data Science, the largest data science program in the world, which has enrolled more than 1.76 million people. The book is useful as a companion to introductory courses in data science or data analysis. It is also a useful reference tool for people tasked with reading and critiquing data analyses. It is based on the author's popular open-source guides available through his GitHub account (https://github.com/jtleek). It is also available through Leanpub (https://leanpub.com/datastyle); if the book is purchased on that platform, you are entitled to lifetime free updates.