Book picks similar to
Statistics in Language Studies by Anthony Woods
linguistics
statistics
slush
ww-university-courses
The Alignment Problem: Machine Learning and Human Values
Brian Christian - 2020
Today’s "machine-learning" systems, trained by data, are so effective that we’ve invited them to see and hear for us?and to make decisions on our behalf. But alarm bells are ringing. Recent years have seen an eruption of concern as the field of machine learning advances. When the systems we attempt to teach will not, in the end, do what we want or what we expect, ethical and potentially existential risks emerge. Researchers call this the alignment problem.Systems cull résumés until, years later, we discover that they have inherent gender biases. Algorithms decide bail and parole?and appear to assess Black and White defendants differently. We can no longer assume that our mortgage application, or even our medical tests, will be seen by human eyes. And as autonomous vehicles share our streets, we are increasingly putting our lives in their hands.The mathematical and computational models driving these changes range in complexity from something that can fit on a spreadsheet to a complex system that might credibly be called “artificial intelligence.” They are steadily replacing both human judgment and explicitly programmed software.In best-selling author Brian Christian’s riveting account, we meet the alignment problem’s “first-responders,” and learn their ambitious plan to solve it before our hands are completely off the wheel. In a masterful blend of history and on-the ground reporting, Christian traces the explosive growth in the field of machine learning and surveys its current, sprawling frontier. Readers encounter a discipline finding its legs amid exhilarating and sometimes terrifying progress. Whether they—and we—succeed or fail in solving the alignment problem will be a defining human story.The Alignment Problem offers an unflinching reckoning with humanity’s biases and blind spots, our own unstated assumptions and often contradictory goals. A dazzlingly interdisciplinary work, it takes a hard look not only at our technology but at our culture—and finds a story by turns harrowing and hopeful.
Word Myths: Debunking Linguistic Urban Legends
David Wilton - 2004
David Wilton debunks the most persistently wrong word histories, and gives, to the best of our actual knowledge, the real stories behind these perennially mis-etymologized words. In addition, he explains why these wrong stories are created, disseminated, and persist, even after being corrected time and time again. What makes us cling to these stories, when the truth behind these words and phrases is available, for the most part, at any library or on the Internet? Arranged by chapters, this book avoids a dry A-Z format. Chapters separate misetymologies by kind, including The Perils of Political Correctness (picnics have nothing to do with lynchings), Posh, Phat Pommies (the problems of bacronyming--the desire to make every word into an acronym), and CANOE (which stands for the Conspiracy to Attribute Nautical Origins to Everything). Word Myths corrects long-held and far-flung examples of wrong etymologies, without taking the fun out of etymology itself. It's the best of both worlds: not only do you learn the many wrong stories behind these words, you also learn why and how they are created--and what the real story is.
Python Data Science Handbook: Tools and Techniques for Developers
Jake Vanderplas - 2016
Several resources exist for individual pieces of this data science stack, but only with the Python Data Science Handbook do you get them all—IPython, NumPy, Pandas, Matplotlib, Scikit-Learn, and other related tools.Working scientists and data crunchers familiar with reading and writing Python code will find this comprehensive desk reference ideal for tackling day-to-day issues: manipulating, transforming, and cleaning data; visualizing different types of data; and using data to build statistical or machine learning models. Quite simply, this is the must-have reference for scientific computing in Python.With this handbook, you’ll learn how to use: * IPython and Jupyter: provide computational environments for data scientists using Python * NumPy: includes the ndarray for efficient storage and manipulation of dense data arrays in Python * Pandas: features the DataFrame for efficient storage and manipulation of labeled/columnar data in Python * Matplotlib: includes capabilities for a flexible range of data visualizations in Python * Scikit-Learn: for efficient and clean Python implementations of the most important and established machine learning algorithms
Machine Learning with R
Brett Lantz - 2014
This practical guide that covers all of the need to know topics in a very systematic way. For each machine learning approach, each step in the process is detailed, from preparing the data for analysis to evaluating the results. These steps will build the knowledge you need to apply them to your own data science tasks.Intended for those who want to learn how to use R's machine learning capabilities and gain insight from your data. Perhaps you already know a bit about machine learning, but have never used R; or perhaps you know a little R but are new to machine learning. In either case, this book will get you up and running quickly. It would be helpful to have a bit of familiarity with basic programming concepts, but no prior experience is required.
Information Dashboard Design: The Effective Visual Communication of Data
Stephen Few - 2006
Although dashboards are potentially powerful, this potential is rarely realized. The greatest display technology in the world won't solve this if you fail to use effective visual design. And if a dashboard fails to tell you precisely what you need to know in an instant, you'll never use it, even if it's filled with cute gauges, meters, and traffic lights. Don't let your investment in dashboard technology go to waste.This book will teach you the visual design skills you need to create dashboards that communicate clearly, rapidly, and compellingly. Information Dashboard Design will explain how to:Avoid the thirteen mistakes common to dashboard design Provide viewers with the information they need quickly and clearly Apply what we now know about visual perception to the visual presentation of information Minimize distractions, cliches, and unnecessary embellishments that create confusion Organize business information to support meaning and usability Create an aesthetically pleasing viewing experience Maintain consistency of design to provide accurate interpretation Optimize the power of dashboard technology by pairing it with visual effectiveness Stephen Few has over 20 years of experience as an IT innovator, consultant, and educator. As Principal of the consultancy Perceptual Edge, Stephen focuses on data visualization for analyzing and communicating quantitative business information. He provides consulting and training services, speaks frequently at conferences, and teaches in the MBA program at the University of California in Berkeley. He is also the author of Show Me the Numbers: Designing Tables and Graphs to Enlighten. Visit his website at www.perceptualedge.com.
Data Visualization: A Practical Introduction
Kieran Healy - 2018
It explains what makes some graphs succeed while others fail, how to make high-quality figures from data using powerful and reproducible methods, and how to think about data visualization in an honest and effective way.Data Visualization builds the reader's expertise in ggplot2, a versatile visualization library for the R programming language. Through a series of worked examples, this accessible primer then demonstrates how to create plots piece by piece, beginning with summaries of single variables and moving on to more complex graphics. Topics include plotting continuous and categorical variables; layering information on graphics; producing effective "small multiple" plots; grouping, summarizing, and transforming data for plotting; creating maps; working with the output of statistical models; and refining plots to make them more comprehensible.Effective graphics are essential to communicating ideas and a great way to better understand data. This book provides the practical skills students and practitioners need to visualize quantitative data and get the most out of their research findings.Provides hands-on instruction using R and ggplot2Shows how the "tidyverse" of data analysis tools makes working with R easier and more consistentIncludes a library of data sets, code, and functions
The Gold Bat
P.G. Wodehouse - 1904
Letters appeared in every second number of the Wrykinian, some short, others long, some from members of the school, others from Old Boys, all protesting against the condition of the first, second, and third fifteen dressing-rooms. 'Indignant" would inquire acidly, in half a page of small type, if the editor happened to be aware that there was no hair-brush in the second room, and only half a comb. 'Disgusted O. W." would remark that when he came down with the Wandering Zephyrs to play against the third fifteen, the water supply had suddenly and mysteriously failed, and the W.Z.'s had been obliged to go home as they were, in a state of primeval grime, and he thought that this was 'a very bad thing in a school of over six hundred boys," though what the number of boys had to do with the fact that there was no water he omitted to explain. The editor would express his regret in brackets, and things would go on as before.
Machine Learning in Action
Peter Harrington - 2011
"Machine learning," the process of automating tasks once considered the domain of highly-trained analysts and mathematicians, is the key to efficiently extracting useful information from this sea of raw data. Machine Learning in Action is a unique book that blends the foundational theories of machine learning with the practical realities of building tools for everyday data analysis. In it, the author uses the flexible Python programming language to show how to build programs that implement algorithms for data classification, forecasting, recommendations, and higher-level features like summarization and simplification.
Forecasting: Principles and Practice
Rob J. Hyndman - 2013
Deciding whether to build another power generation plant in the next five years requires forecasts of future demand. Scheduling staff in a call centre next week requires forecasts of call volumes. Stocking an inventory requires forecasts of stock requirements. Telecommunication routing requires traffic forecasts a few minutes ahead. Whatever the circumstances or time horizons involved, forecasting is an important aid in effective and efficient planning. This textbook provides a comprehensive introduction to forecasting methods and presents enough information about each method for readers to use them sensibly. Examples use R with many data sets taken from the authors' own consulting experience.
Introductory Graph Theory
Gary Chartrand - 1984
Introductory Graph Theory presents a nontechnical introduction to this exciting field in a clear, lively, and informative style. Author Gary Chartrand covers the important elementary topics of graph theory and its applications. In addition, he presents a large variety of proofs designed to strengthen mathematical techniques and offers challenging opportunities to have fun with mathematics. Ten major topics — profusely illustrated — include: Mathematical Models, Elementary Concepts of Graph Theory, Transportation Problems, Connection Problems, Party Problems, Digraphs and Mathematical Models, Games and Puzzles, Graphs and Social Psychology, Planar Graphs and Coloring Problems, and Graphs and Other Mathematics. A useful Appendix covers Sets, Relations, Functions, and Proofs, and a section devoted to exercises — with answers, hints, and solutions — is especially valuable to anyone encountering graph theory for the first time. Undergraduate mathematics students at every level, puzzlists, and mathematical hobbyists will find well-organized coverage of the fundamentals of graph theory in this highly readable and thoroughly enjoyable book.
Programming in Haskell
Graham Hutton - 2006
This introduction is ideal for beginners: it requires no previous programming experience and all concepts are explained from first principles via carefully chosen examples. Each chapter includes exercises that range from the straightforward to extended projects, plus suggestions for further reading on more advanced topics. The author is a leading Haskell researcher and instructor, well-known for his teaching skills. The presentation is clear and simple, and benefits from having been refined and class-tested over several years. The result is a text that can be used with courses, or for self-learning. Features include freely accessible Powerpoint slides for each chapter, solutions to exercises and examination questions (with solutions) available to instructors, and a downloadable code that's fully compliant with the latest Haskell release.
Pattern Classification
David G. Stork - 1973
Now with the second edition, readers will find information on key new topics such as neural networks and statistical pattern recognition, the theory of machine learning, and the theory of invariances. Also included are worked examples, comparisons between different methods, extensive graphics, expanded exercises and computer project topics.An Instructor's Manual presenting detailed solutions to all the problems in the book is available from the Wiley editorial department.
Visualize This: The FlowingData Guide to Design, Visualization, and Statistics
Nathan Yau - 2011
Wouldn't it be wonderful if we could actually visualize data in such a way that we could maximize its potential and tell a story in a clear, concise manner? Thanks to the creative genius of Nathan Yau, we can. With this full-color book, data visualization guru and author Nathan Yau uses step-by-step tutorials to show you how to visualize and tell stories with data. He explains how to gather, parse, and format data and then design high quality graphics that help you explore and present patterns, outliers, and relationships.Presents a unique approach to visualizing and telling stories with data, from a data visualization expert and the creator of flowingdata.com, Nathan Yau Offers step-by-step tutorials and practical design tips for creating statistical graphics, geographical maps, and information design to find meaning in the numbers Details tools that can be used to visualize data-native graphics for the Web, such as ActionScript, Flash libraries, PHP, and JavaScript and tools to design graphics for print, such as R and Illustrator Contains numerous examples and descriptions of patterns and outliers and explains how to show them Visualize This demonstrates how to explain data visually so that you can present your information in a way that is easy to understand and appealing.
Machine Learning for Hackers
Drew Conway - 2012
Authors Drew Conway and John Myles White help you understand machine learning and statistics tools through a series of hands-on case studies, instead of a traditional math-heavy presentation.Each chapter focuses on a specific problem in machine learning, such as classification, prediction, optimization, and recommendation. Using the R programming language, you'll learn how to analyze sample datasets and write simple machine learning algorithms. "Machine Learning for Hackers" is ideal for programmers from any background, including business, government, and academic research.Develop a naive Bayesian classifier to determine if an email is spam, based only on its textUse linear regression to predict the number of page views for the top 1,000 websitesLearn optimization techniques by attempting to break a simple letter cipherCompare and contrast U.S. Senators statistically, based on their voting recordsBuild a "whom to follow" recommendation system from Twitter data
Fundamentals of Data Visualization: A Primer on Making Informative and Compelling Figures
Claus O. Wilke - 2019
But with the increasing power of visualization software today, scientists, engineers, and business analysts often have to navigate a bewildering array of visualization choices and options.This practical book takes you through many commonly encountered visualization problems, and it provides guidelines on how to turn large datasets into clear and compelling figures. What visualization type is best for the story you want to tell? How do you make informative figures that are visually pleasing? Author Claus O. Wilke teaches you the elements most critical to successful data visualization.Explore the basic concepts of color as a tool to highlight, distinguish, or represent a valueUnderstand the importance of redundant coding to ensure you provide key information in multiple waysUse the book's visualizations directory, a graphical guide to commonly used types of data visualizationsGet extensive examples of good and bad figuresLearn how to use figures in a document or report and how employ them effectively to tell a compelling story