Hadoop Explained


Aravind Shenoy - 2014
    Hadoop allowed small and medium-sized companies to store huge amounts of data on cheap commodity servers in racks. The introduction of Big Data has allowed businesses to make decisions based on quantifiable analysis. Hadoop is now implemented in major organizations such as Amazon, IBM, Cloudera, and Dell, to name a few. This book introduces you to Hadoop and to concepts such as ‘MapReduce’, ‘Rack Awareness’, ‘YARN’, and ‘HDFS Federation’, which will help you get acquainted with the technology.

The Elements of Statistical Learning: Data Mining, Inference, and Prediction


Trevor Hastie - 2001
    With it has come vast amounts of data in a variety of fields such as medicine, biology, finance, and marketing. The challenge of understanding these data has led to the development of new tools in the field of statistics, and spawned new areas such as data mining, machine learning, and bioinformatics. Many of these tools have common underpinnings but are often expressed with different terminology. This book describes the important ideas in these areas in a common conceptual framework. While the approach is statistical, the emphasis is on concepts rather than mathematics. Many examples are given, with a liberal use of color graphics. It should be a valuable resource for statisticians and anyone interested in data mining in science or industry. The book's coverage is broad, from supervised learning (prediction) to unsupervised learning. The many topics include neural networks, support vector machines, classification trees and boosting—the first comprehensive treatment of this topic in any book. Trevor Hastie, Robert Tibshirani, and Jerome Friedman are professors of statistics at Stanford University. They are prominent researchers in this area: Hastie and Tibshirani developed generalized additive models and wrote a popular book of that title. Hastie wrote much of the statistical modeling software in S-PLUS and invented principal curves and surfaces. Tibshirani proposed the Lasso and is co-author of the very successful An Introduction to the Bootstrap. Friedman is the co-inventor of many data-mining tools including CART, MARS, and projection pursuit.

A Manual for Writers of Research Papers, Theses, and Dissertations: Chicago Style for Students and Researchers


Kate L. Turabian - 1955
    Bellow. Strauss. Friedman. The University of Chicago has been the home of some of the most important thinkers of the modern age. But perhaps no name has been spoken with more respect than Turabian. The dissertation secretary at Chicago for decades, Kate Turabian literally wrote the book on the successful completion and submission of the student paper. Her Manual for Writers of Research Papers, Theses, and Dissertations, created from her years of experience with research projects across all fields, has sold more than seven million copies since it was first published in 1937. Now, with this seventh edition, Turabian’s Manual has undergone its most extensive revision, ensuring that it will remain the most valuable handbook for writers at every level, from first-year undergraduates, to dissertation writers apprehensively submitting final manuscripts, to senior scholars who may be old hands at research and writing but less familiar with new media citation styles. Gregory G. Colomb, Joseph M. Williams, and the late Wayne C. Booth (the gifted team behind The Craft of Research) and the University of Chicago Press Editorial Staff combined their wide-ranging expertise to remake this classic resource. They preserve Turabian’s clear and practical advice while fully embracing the new modes of research, writing, and source citation brought about by the age of the Internet. Booth, Colomb, and Williams significantly expand the scope of previous editions by creating a guide, generous in length and tone, to the art of research and writing. Growing out of the authors’ best-selling Craft of Research, this new section provides students with an overview of every step of the research and writing process, from formulating the right questions to reading critically to building arguments and revising drafts.
This leads naturally to the second part of the Manual for Writers, which offers an authoritative overview of citation practices in scholarly writing, as well as detailed information on the two main citation styles (“notes-bibliography” and “author-date”). This section has been fully revised to reflect the recommendations of the fifteenth edition of The Chicago Manual of Style and to present an expanded array of source types and updated examples, including guidance on citing electronic sources. The final section of the book treats issues of style: the details that go into making a strong paper. Here writers will find advice on a wide range of topics, including punctuation, table formatting, and use of quotations. The appendix draws together everything writers need to know about formatting research papers, theses, and dissertations and preparing them for submission. This material has been thoroughly vetted by dissertation officials at colleges and universities across the country. This seventh edition of Turabian’s Manual for Writers of Research Papers, Theses, and Dissertations is a classic reference revised for a new age. It is tailored to a new generation of writers using tools its original author could not have imagined, while retaining the clarity and authority that generations of scholars have come to associate with the name Turabian.

Big Data: A Revolution That Will Transform How We Live, Work, and Think


Viktor Mayer-Schönberger - 2013
    “Big data” refers to our burgeoning ability to crunch vast collections of information, analyze it instantly, and draw sometimes profoundly surprising conclusions from it. This emerging science can translate myriad phenomena, from the price of airline tickets to the text of millions of books, into searchable form, and uses our increasing computing power to unearth epiphanies that we never could have seen before. A revolution on par with the Internet or perhaps even the printing press, big data will change the way we think about business, health, politics, education, and innovation in the years to come. It also poses fresh threats, from the inevitable end of privacy as we know it to the prospect of being penalized for things we haven’t even done yet, based on big data’s ability to predict our future behavior. In this brilliantly clear, often surprising work, two leading experts explain what big data is, how it will change our lives, and what we can do to protect ourselves from its hazards. Big Data is the first big book about the next big thing. www.big-data-book.com

Python Pocket Reference


Mark Lutz - 1998
    Hundreds of thousands of Python developers around the world rely on Python for general-purpose tasks, Internet scripting, systems programming, user interfaces, and product customization. Available on all major computing platforms, including commercial versions of Unix, Linux, Windows, and Mac OS X, Python is portable, powerful, and remarkably easy to use. With its convenient, quick-reference format, "Python Pocket Reference," 3rd Edition is the perfect on-the-job reference. More importantly, it's now been refreshed to cover the language's latest release, Python 2.4. For experienced Python developers, this book is a compact toolbox that delivers need-to-know information at the flip of a page. This third edition also includes an easy-lookup index to help developers find answers fast! Python 2.4 is more than just optimization and library enhancements; it's also chock-full of bug fixes and upgrades. And these changes are addressed in the "Python Pocket Reference," 3rd Edition. New language features, new and upgraded built-ins, and new and upgraded modules and packages--they're all clarified in detail. The "Python Pocket Reference," 3rd Edition serves as the perfect companion to "Learning Python" and "Programming Python."

The Functional Art: An Introduction to Information Graphics and Visualization


Alberto Cairo - 2011
    With the right tools, we can start to make sense of all this data to see patterns and trends that would otherwise be invisible to us. By transforming numbers into graphical shapes, we allow readers to understand the stories those numbers hide. In this practical introduction to understanding and using information graphics, you'll learn how to use data visualizations as tools to see beyond lists of numbers and variables and achieve new insights into the complex world around us. Regardless of the kind of data you're working with (business, science, politics, sports, or even your own personal finances), this book will show you how to use statistical charts, maps, and explanation diagrams to spot the stories in the data and learn new things from it. You'll also get to peek into the creative process of some of the world's most talented designers and visual journalists, including Condé Nast Traveler's John Grimwade, National Geographic Magazine's Fernando Baptista, The New York Times' Steve Duenes, The Washington Post's Hannah Fairfield, Hans Rosling of the Gapminder Foundation, Stanford's Geoff McGhee, and European superstars Moritz Stefaner, Jan Willem Tulp, Stefanie Posavec, and Gregor Aisch.
The book also includes a DVD-ROM containing over 90 minutes of video lessons that expand on core concepts explained within the book and includes even more inspirational information graphics from the world's leading designers.
The first book to offer a broad, hands-on introduction to information graphics and visualization, The Functional Art reveals:
- Why data visualization should be thought of as "functional art" rather than fine art
- How to use color, type, and other graphic tools to make your information graphics more effective, not just better looking
- The science of how our brains perceive and remember information
- Best practices for creating interactive information graphics
- A comprehensive look at the creative process behind successful information graphics
- An extensive gallery of inspirational work from the world's top designers and visual artists
On the DVD-ROM: In this introductory video course on information graphics, Alberto Cairo goes into greater detail with even more visual examples of how to create effective information graphics that function as practical tools for aiding perception. You'll learn how to incorporate basic design principles in your visualizations, create simple interfaces for interactive graphics, and choose the appropriate type of graphic forms for your data. Cairo also deconstructs successful information graphics from The New York Times and National Geographic magazine with sketches and images not shown in the book.

How College Affects Students: Volume 2 - A Third Decade of Research


Ernest T. Pascarella - 2005
    The authors review their earlier findings and then synthesize what has been learned since 1990 about college's influences on students' learning. The book also discusses the implications of the findings for research, practice, and public policy. This authoritative and comprehensive analysis of the literature on college impact is required reading for anyone interested in higher education practice, policy, and promise: faculty, administrators, researchers, policy analysts, and decision-makers at every level.

R in a Nutshell: A Desktop Quick Reference


Joseph Adler - 2009
    R in a Nutshell provides a quick and practical way to learn this increasingly popular open source language and environment. You'll not only learn how to program in R, but also how to find the right user-contributed R packages for statistical modeling, visualization, and bioinformatics. The author introduces you to the R environment, including the R graphical user interface and console, and takes you through the fundamentals of the object-oriented R language. Then, through a variety of practical examples from medicine, business, and sports, you'll learn how you can use this remarkable tool to solve your own data analysis problems.
- Understand the basics of the language, including the nature of R objects
- Learn how to write R functions and build your own packages
- Work with data through visualization, statistical analysis, and other methods
- Explore the wealth of packages contributed by the R community
- Become familiar with the lattice graphics package for high-level data visualization
- Learn about bioinformatics packages provided by Bioconductor
"I am excited about this book. R in a Nutshell is a great introduction to R, as well as a comprehensive reference for using R in data analytics and visualization. Adler provides 'real world' examples, practical advice, and scripts, making it accessible to anyone working with data, not just professional statisticians."

Predictive Analytics for Dummies


Anasse Bari - 2013
    Predictive Analytics For Dummies explores the power of predictive analytics and how you can use it to make valuable predictions for your business, or in fields such as advertising, fraud detection, politics, and others. This practical book does not bog you down with loads of mathematical or scientific theory, but instead helps you quickly see how to use the right algorithms and tools to collect and analyze data and apply it to make predictions. Topics include using structured and unstructured data, building models, creating a predictive analysis roadmap, setting realistic goals, budgeting, and much more.
- Shows readers how to use Big Data and data mining to discover patterns and make predictions for tech-savvy businesses
- Helps readers see how to shepherd predictive analytics projects through their companies
- Explains just enough of the science and math, but also focuses on practical issues such as protecting project budgets, making good presentations, and more
- Covers nuts-and-bolts topics including predictive analytics basics, using structured and unstructured data, data mining, and algorithms and techniques for analyzing data
- Also covers clustering, association, and statistical models; creating a predictive analytics roadmap; and applying predictions to the web, marketing, finance, health care, and elsewhere
Propose, produce, and protect predictive analytics projects through your company with Predictive Analytics For Dummies.

Concrete Mathematics: A Foundation for Computer Science


Ronald L. Graham - 1988
    "More concretely," the authors explain, "it is the controlled manipulation of mathematical formulas, using a collection of techniques for solving problems."

The Visual Display of Quantitative Information


Edward R. Tufte - 1983
    Theory and practice in the design of data graphics, 250 illustrations of the best (and a few of the worst) statistical graphics, with detailed analysis of how to display data for precise, effective, quick analysis. Design of the high-resolution displays, small multiples. Editing and improving graphics. The data-ink ratio. Time-series, relational graphics, data maps, multivariate designs. Detection of graphical deception: design variation vs. data variation. Sources of deception. Aesthetics and data graphical displays. This is the second edition of The Visual Display of Quantitative Information. Recently published, this new edition provides excellent color reproductions of the many graphics of William Playfair, adds color to other images, and includes all the changes and corrections accumulated during 17 printings of the first edition.

Applied Multiple Regression/Correlation Analysis for the Behavioral Sciences


Jacob Cohen - 1975
    Readers profit from its verbal-conceptual exposition and frequent use of examples. The applied emphasis provides clear illustrations of the principles and provides worked examples of the types of applications that are possible. Researchers learn how to specify regression models that directly address their research questions. An overview of the fundamental ideas of multiple regression and a review of bivariate correlation and regression and other elementary statistical concepts provide a strong foundation for understanding the rest of the text. The third edition features an increased emphasis on graphics and the use of confidence intervals and effect size measures, and an accompanying website with data for most of the numerical examples along with the computer code for SPSS, SAS, and SYSTAT, at www.psypress.com/9780805822236. Applied Multiple Regression serves as both a textbook for graduate students and as a reference tool for researchers in psychology, education, health sciences, communications, business, sociology, political science, anthropology, and economics. An introductory knowledge of statistics is required. Self-standing chapters minimize the need for researchers to refer to previous chapters.

Head First Data Analysis: A Learner's Guide to Big Numbers, Statistics, and Good Decisions


Michael G. Milton - 2009
    If your job requires you to manage and analyze all kinds of data, turn to Head First Data Analysis, where you'll quickly learn how to collect and organize data, sort the distractions from the truth, find meaningful patterns, draw conclusions, predict the future, and present your findings to others. Whether you're a product developer researching the market viability of a new product or service, a marketing manager gauging or predicting the effectiveness of a campaign, a salesperson who needs data to support product presentations, or a lone entrepreneur responsible for all of these data-intensive functions and more, the unique approach in Head First Data Analysis is by far the most efficient way to learn what you need to know to convert raw data into a vital business tool. You'll learn how to:
- Determine which data sources to use for collecting information
- Assess data quality and distinguish signal from noise
- Build basic data models to illuminate patterns, and assimilate new information into the models
- Cope with ambiguous information
- Design experiments to test hypotheses and draw conclusions
- Use segmentation to organize your data within discrete market groups
- Visualize data distributions to reveal new relationships and persuade others
- Predict the future with sampling and probability models
- Clean your data to make it useful
- Communicate the results of your analysis to your audience
Using the latest research in cognitive science and learning theory to craft a multi-sensory learning experience, Head First Data Analysis uses a visually rich format designed for the way your brain works, not a text-heavy approach that puts you to sleep.

Python for Data Analysis


Wes McKinney - 2011
    It is also a practical, modern introduction to scientific computing in Python, tailored for data-intensive applications. This is a book about the parts of the Python language and libraries you'll need to effectively solve a broad set of data analysis problems. This book is not an exposition on analytical methods using Python as the implementation language. Written by Wes McKinney, the main author of the pandas library, this hands-on book is packed with practical case studies. It's ideal for analysts new to Python and for Python programmers new to scientific computing.
- Use the IPython interactive shell as your primary development environment
- Learn basic and advanced NumPy (Numerical Python) features
- Get started with data analysis tools in the pandas library
- Use high-performance tools to load, clean, transform, merge, and reshape data
- Create scatter plots and static or interactive visualizations with matplotlib
- Apply the pandas groupby facility to slice, dice, and summarize datasets
- Measure data by points in time, whether it's specific instances, fixed periods, or intervals
- Learn how to solve problems in web analytics, social sciences, finance, and economics, through detailed examples

Beginning Programming with Python for Dummies


John Paul Mueller - 2014
    It requires three to five times less time than developing in Java, is a great building block for learning both procedural and object-oriented programming concepts, and is an ideal language for data analysis. Beginning Programming with Python For Dummies is the perfect guide to this dynamic and powerful programming language--even if you've never coded before! Author John Paul Mueller draws on his vast programming knowledge and experience to guide you step-by-step through the syntax and logic of programming with Python and provides several real-world programming examples to give you hands-on experience trying out what you've learned.
- Provides a solid understanding of basic computer programming concepts and helps familiarize you with syntax and logic
- Explains the fundamentals of procedural and object-oriented programming
- Shows how Python is being used for data analysis and other applications
- Includes short, practical programming samples to apply your skills to real-world programming scenarios
Whether you've never written a line of code or are just trying to pick up Python, there's nothing to fear with the fun and friendly Beginning Programming with Python For Dummies leading the way.