Book picks similar to
Approaching (Almost) Any Machine Learning Problem by Abhishek Thakur
machine-learning
data-science
artificial-intelligence
ml
Probabilistic Graphical Models: Principles and Techniques
Daphne Koller - 2009
The framework of probabilistic graphical models, presented in this book, provides a general approach for this task. The approach is model-based, allowing interpretable models to be constructed and then manipulated by reasoning algorithms. These models can also be learned automatically from data, allowing the approach to be used in cases where manually constructing a model is difficult or even impossible. Because uncertainty is an inescapable aspect of most real-world applications, the book focuses on probabilistic models, which make the uncertainty explicit and provide models that are more faithful to reality.Probabilistic Graphical Models discusses a variety of models, spanning Bayesian networks, undirected Markov networks, discrete and continuous models, and extensions to deal with dynamical systems and relational data. For each class of models, the text describes the three fundamental cornerstones: representation, inference, and learning, presenting both basic concepts and advanced techniques. Finally, the book considers the use of the proposed framework for causal reasoning and decision making under uncertainty. The main text in each chapter provides the detailed technical development of the key ideas. Most chapters also include boxes with additional material: skill boxes, which describe techniques; case study boxes, which discuss empirical cases related to the approach described in the text, including applications in computer vision, robotics, natural language understanding, and computational biology; and concept boxes, which present significant concepts drawn from the material in the chapter. Instructors (and readers) can group chapters in various combinations, from core topics to more technically advanced material, to suit their particular needs.
R for Everyone: Advanced Analytics and Graphics
Jared P. Lander - 2013
R has traditionally been difficult for non-statisticians to learn, and most R books assume far too much knowledge to be of help. R for Everyone is the solution. Drawing on his unsurpassed experience teaching new users, professional data scientist Jared P. Lander has written the perfect tutorial for anyone new to statistical programming and modeling. Organized to make learning easy and intuitive, this guide focuses on the 20 percent of R functionality you'll need to accomplish 80 percent of modern data tasks. Lander's self-contained chapters start with the absolute basics, offering extensive hands-on practice and sample code. You'll download and install R; navigate and use the R environment; master basic program control, data import, and manipulation; and walk through several essential tests. Then, building on this foundation, you'll construct several complete models, both linear and nonlinear, and use some data mining techniques. By the time you're done, you won't just know how to write R programs, you'll be ready to tackle the statistical problems you care about most. COVERAGE INCLUDES - Exploring R, RStudio, and R packages - Using R for math: variable types, vectors, calling functions, and more - Exploiting data structures, including data.frames, matrices, and lists - Creating attractive, intuitive statistical graphics - Writing user-defined functions - Controlling program flow with if, ifelse, and complex checks - Improving program efficiency with group manipulations - Combining and reshaping multiple datasets - Manipulating strings using R's facilities and regular expressions - Creating normal, binomial, and Poisson probability distributions - Programming basic statistics: mean, standard deviation, and t-tests - Building linear, generalized linear, and nonlinear models - Assessing the quality of models and variable selection - Preventing overfitting, using the Elastic Net and Bayesian methods - Analyzing univariate and multivariate time series data - Grouping data via K-means and hierarchical clustering - Preparing reports, slideshows, and web pages with knitr - Building reusable R packages with devtools and Rcpp - Getting involved with the R global community
All of Statistics: A Concise Course in Statistical Inference
Larry Wasserman - 2003
But in spirit, the title is apt, as the book does cover a much broader range of topics than a typical introductory book on mathematical statistics. This book is for people who want to learn probability and statistics quickly. It is suitable for graduate or advanced undergraduate students in computer science, mathematics, statistics, and related disciplines. The book includes modern topics like nonparametric curve estimation, bootstrapping, and clas- sification, topics that are usually relegated to follow-up courses. The reader is presumed to know calculus and a little linear algebra. No previous knowledge of probability and statistics is required. Statistics, data mining, and machine learning are all concerned with collecting and analyzing data. For some time, statistics research was con- ducted in statistics departments while data mining and machine learning re- search was conducted in computer science departments. Statisticians thought that computer scientists were reinventing the wheel. Computer scientists thought that statistical theory didn't apply to their problems. Things are changing. Statisticians now recognize that computer scientists are making novel contributions while computer scientists now recognize the generality of statistical theory and methodology. Clever data mining algo- rithms are more scalable than statisticians ever thought possible. Formal sta- tistical theory is more pervasive than computer scientists had realized.
The Linux Programming Interface: A Linux and Unix System Programming Handbook
Michael Kerrisk - 2010
You'll learn how to:Read and write files efficiently Use signals, clocks, and timers Create processes and execute programs Write secure programs Write multithreaded programs using POSIX threads Build and use shared libraries Perform interprocess communication using pipes, message queues, shared memory, and semaphores Write network applications with the sockets API While The Linux Programming Interface covers a wealth of Linux-specific features, including epoll, inotify, and the /proc file system, its emphasis on UNIX standards (POSIX.1-2001/SUSv3 and POSIX.1-2008/SUSv4) makes it equally valuable to programmers working on other UNIX platforms.The Linux Programming Interface is the most comprehensive single-volume work on the Linux and UNIX programming interface, and a book that's destined to become a new classic.Praise for The Linux Programming Interface "If I had to choose a single book to sit next to my machine when writing software for Linux, this would be it." —Martin Landers, Software Engineer, Google "This book, with its detailed descriptions and examples, contains everything you need to understand the details and nuances of the low-level programming APIs in Linux . . . no matter what the level of reader, there will be something to be learnt from this book." —Mel Gorman, Author of Understanding the Linux Virtual Memory Manager "Michael Kerrisk has not only written a great book about Linux programming and how it relates to various standards, but has also taken care that bugs he noticed got fixed and the man pages were (greatly) improved. In all three ways, he has made Linux programming easier. The in-depth treatment of topics in The Linux Programming Interface . . . makes it a must-have reference for both new and experienced Linux programmers." —Andreas Jaeger, Program Manager, openSUSE, Novell "Michael's inexhaustible determination to get his information right, and to express it clearly and concisely, has resulted in a strong reference source for programmers. While this work is targeted at Linux programmers, it will be of value to any programmer working in the UNIX/POSIX ecosystem." —David Butenhof, Author of Programming with POSIX Threads and Contributor to the POSIX and UNIX Standards ". . . a very thorough—yet easy to read—explanation of UNIX system and network programming, with an emphasis on Linux systems. It's certainly a book I'd recommend to anybody wanting to get into UNIX programming (in general) or to experienced UNIX programmers wanting to know 'what's new' in the popular GNU/Linux system." —Fernando Gont, Network Security Researcher, IETF Participant, and RFC Author ". . . encyclopedic in the breadth and depth of its coverage, and textbook-like in its wealth of worked examples and exercises. Each topic is clearly and comprehensively covered, from theory to hands-on working code. Professionals, students, educators, this is the Linux/UNIX reference that you have been waiting for." —Anthony Robins, Associate Professor of Computer Science, The University of Otago "I've been very impressed by the precision, the quality and the level of detail Michael Kerrisk put in his book. He is a great expert of Linux system calls and lets us share his knowledge and understanding of the Linux APIs." —Christophe Blaess, Author of Programmation systeme en C sous Linux ". . . an essential resource for the serious or professional Linux and UNIX systems programmer. Michael Kerrisk covers the use of all the key APIs across both the Linux and UNIX system interfaces with clear descriptions and tutorial examples and stresses the importance and benefits of following standards such as the Single UNIX Specification and POSIX 1003.1." —Andrew Josey, Director, Standards, The Open Group, and Chair of the POSIX 1003.1 Working Group "What could be better than an encyclopedic reference to the Linux system, from the standpoint of the system programmer, written by none other than the maintainer of the man pages himself? The Linux Programming Interface is comprehensive and detailed. I firmly expect it to become an indispensable addition to my programming bookshelf." —Bill Gallmeister, Author of POSIX.4 Programmer's Guide: Programming for the Real World ". . . the most complete and up-to-date book about Linux and UNIX system programming. If you're new to Linux system programming, if you're a UNIX veteran focused on portability while interested in learning the Linux way, or if you're simply looking for an excellent reference about the Linux programming interface, then Michael Kerrisk's book is definitely the companion you want on your bookshelf." —Loic Domaigne, Chief Software Architect (Embedded), Corpuls.com
The Ethical Algorithm: The Science of Socially Aware Algorithm Design
Michael Kearns - 2019
Algorithms have made our lives more efficient, more entertaining, and, sometimes, better informed. At the same time, complex algorithms are increasingly violating the basic rights of individual citizens. Allegedly anonymized datasets routinely leak our most sensitive personal information; statistical models for everything from mortgages to college admissions reflect racial and gender bias. Meanwhile, users manipulate algorithms to "game" search engines, spam filters, online reviewing services, and navigation apps.Understanding and improving the science behind the algorithms that run our lives is rapidly becoming one of the most pressing issues of this century. Traditional fixes, such as laws, regulations and watchdog groups, have proven woefully inadequate. Reporting from the cutting edge of scientific research, The Ethical Algorithm offers a new approach: a set of principled solutions based on the emerging and exciting science of socially aware algorithm design. Michael Kearns and Aaron Roth explain how we can better embed human principles into machine code - without halting the advance of data-driven scientific exploration. Weaving together innovative research with stories of citizens, scientists, and activists on the front lines, The Ethical Algorithm offers a compelling vision for a future, one in which we can better protect humans from the unintended impacts of algorithms while continuing to inspire wondrous advances in technology.
The Fourth Paradigm: Data-Intensive Scientific Discovery
Tony Hey - 2009
Increasingly, scientific breakthroughs will be powered by advanced computing capabilities that help researchers manipulate and explore massive datasets. The speed at which any given scientific discipline advances will depend on how well its researchers collaborate with one another, and with technologists, in areas of eScience such as databases, workflow management, visualization, and cloud-computing technologies. This collection of essays expands on the vision of pioneering computer scientist Jim Gray for a new, fourth paradigm of discovery based on data-intensive science and offers insights into how it can be fully realized.
Pattern Classification
David G. Stork - 1973
Now with the second edition, readers will find information on key new topics such as neural networks and statistical pattern recognition, the theory of machine learning, and the theory of invariances. Also included are worked examples, comparisons between different methods, extensive graphics, expanded exercises and computer project topics.An Instructor's Manual presenting detailed solutions to all the problems in the book is available from the Wiley editorial department.
The Data Warehouse Toolkit: The Complete Guide to Dimensional Modeling
Ralph Kimball - 1996
Here is a complete library of dimensional modeling techniques-- the most comprehensive collection ever written. Greatly expanded to cover both basic and advanced techniques for optimizing data warehouse design, this second edition to Ralph Kimball's classic guide is more than sixty percent updated.The authors begin with fundamental design recommendations and gradually progress step-by-step through increasingly complex scenarios. Clear-cut guidelines for designing dimensional models are illustrated using real-world data warehouse case studies drawn from a variety of business application areas and industries, including:* Retail sales and e-commerce* Inventory management* Procurement* Order management* Customer relationship management (CRM)* Human resources management* Accounting* Financial services* Telecommunications and utilities* Education* Transportation* Health care and insuranceBy the end of the book, you will have mastered the full range of powerful techniques for designing dimensional databases that are easy to understand and provide fast query response. You will also learn how to create an architected framework that integrates the distributed data warehouse using standardized dimensions and facts.This book is also available as part of the Kimball's Data Warehouse Toolkit Classics Box Set (ISBN: 9780470479575) with the following 3 books:The Data Warehouse Toolkit, 2nd Edition (9780471200246)The Data Warehouse Lifecycle Toolkit, 2nd Edition (9780470149775)The Data Warehouse ETL Toolkit (9780764567575)
The Dream Machine: J.C.R. Licklider and the Revolution That Made Computing Personal
M. Mitchell Waldrop - 2001
C. R. Licklider, whose visionary dream of a human-computer symbiosis transformed the course of modern science and led to the development of the personal computer. Reprint.
Dark Pools: The Rise of Artificially Intelligent Trading Machines and the Looming Threat to Wall Street
Scott Patterson - 2012
In the beginning was Josh Levine, an idealistic programming genius who dreamed of wresting control of the market from the big exchanges that, again and again, gave the giant institutions an advantage over the little guy. Levine created a computerized trading hub named Island where small traders swapped stocks, and over time his invention morphed into a global electronic stock market that sent trillions in capital through a vast jungle of fiber-optic cables. By then, the market that Levine had sought to fix had turned upside down, birthing secretive exchanges called dark pools and a new species of trading machines that could think, and that seemed, ominously, to be slipping the control of their human masters. Dark Pools is the fascinating story of how global markets have been hijacked by trading robots--many so self-directed that humans can't predict what they'll do next.
Apprenticeship Patterns: Guidance for the Aspiring Software Craftsman
Dave Hoover - 2009
To grow professionally, you also need soft skills and effective learning techniques. Honing those skills is what this book is all about. Authors Dave Hoover and Adewale Oshineye have cataloged dozens of behavior patterns to help you perfect essential aspects of your craft. Compiled from years of research, many interviews, and feedback from O'Reilly's online forum, these patterns address difficult situations that programmers, administrators, and DBAs face every day. And it's not just about financial success. Apprenticeship Patterns also approaches software development as a means to personal fulfillment. Discover how this book can help you make the best of both your life and your career. Solutions to some common obstacles that this book explores in-depth include:Burned out at work? "Nurture Your Passion" by finding a pet project to rediscover the joy of problem solving.Feeling overwhelmed by new information? Re-explore familiar territory by building something you've built before, then use "Retreat into Competence" to move forward again.Stuck in your learning? Seek a team of experienced and talented developers with whom you can "Be the Worst" for a while. "Brilliant stuff! Reading this book was like being in a time machine that pulled me back to those key learning moments in my career as a professional software developer and, instead of having to learn best practices the hard way, I had a guru sitting on my shoulder guiding me every step towards master craftsmanship. I'll certainly be recommending this book to clients. I wish I had this book 14 years ago!" -Russ Miles, CEO, OpenCredo
Advances in Financial Machine Learning
Marcos López de Prado - 2018
Today, ML algorithms accomplish tasks that - until recently - only expert humans could perform. And finance is ripe for disruptive innovations that will transform how the following generations understand money and invest.In the book, readers will learn how to:Structure big data in a way that is amenable to ML algorithms Conduct research with ML algorithms on big data Use supercomputing methods and back test their discoveries while avoiding false positives Advances in Financial Machine Learning addresses real life problems faced by practitioners every day, and explains scientifically sound solutions using math, supported by code and examples. Readers become active users who can test the proposed solutions in their individual setting.Written by a recognized expert and portfolio manager, this book will equip investment professionals with the groundbreaking tools needed to succeed in modern finance.
Big Data for Dummies
Judith Hurwitz - 2013
Data sets such as customer transactions for a mega-retailer, weather patterns monitored by meteorologists, or social network activity can quickly outpace the capacity of traditional data management tools. If you need to develop or manage big data solutions, you'll appreciate how these four experts define, explain, and guide you through this new and often confusing concept. You'll learn what it is, why it matters, and how to choose and implement solutions that work.Effectively managing big data is an issue of growing importance to businesses, not-for-profit organizations, government, and IT professionals Authors are experts in information management, big data, and a variety of solutions Explains big data in detail and discusses how to select and implement a solution, security concerns to consider, data storage and presentation issues, analytics, and much more Provides essential information in a no-nonsense, easy-to-understand style that is empowering Big Data For Dummies cuts through the confusion and helps you take charge of big data solutions for your organization.
Refactoring to Patterns
Joshua Kerievsky - 2004
In 1999, "Refactoring" revolutionized design by introducing an effective process for improving code. With the highly anticipated " Refactoring to Patterns ," Joshua Kerievsky has changed our approach to design by forever uniting patterns with the evolutionary process of refactoring.This book introduces the theory and practice of pattern-directed refactorings: sequences of low-level refactorings that allow designers to safely move designs to, towards, or away from pattern implementations. Using code from real-world projects, Kerievsky documents the thinking and steps underlying over two dozen pattern-based design transformations. Along the way he offers insights into pattern differences and how to implement patterns in the simplest possible ways.Coverage includes: A catalog of twenty-seven pattern-directed refactorings, featuring real-world code examples Descriptions of twelve design smells that indicate the need for this book s refactorings General information and new insights about patterns and refactoringDetailed implementation mechanics: how low-level refactorings are combined to implement high-level patterns Multiple ways to implement the same pattern and when to use each Practical ways to get started even if you have little experience with patterns or refactoring"Refactoring to Patterns" reflects three years of refinement and the insights of more than sixty software engineering thought leaders in the global patterns, refactoring, and agile development communities. Whether you re focused on legacy or greenfield development, this book will make you a better software designer by helping you learn how to make important design changes safely and effectively. "
The Art of Computer Programming, Volumes 1-4a Boxed Set
Donald Ervin Knuth - 2011
Scientists have marveled at the beauty and elegance of his analysis, while ordinary programmers have successfully applied his "cookbook" solutions to their day-to-day problems. All have admired Knuth for the breadth, clarity, accuracy, and good humor found in his books. "I can't begin to tell you how many pleasurable hours of study and recreation they have afforded me I have pored over them in cars, restaurants, at work, at home... and even at a Little League game when my son wasn't in the line-up.""--"Charles Long Primarily written as a reference, some people have nevertheless found it possible and interesting to read each volume from beginning to end. A programmer in China even compared the experience to reading a poem. "If you think you're a really good programmer... read Knuth's] "Art of Computer Programming.".. You should definitely send me a resume if you can read the whole thing.""--"Bill Gates Whatever your background, if you need to do any serious computer programming, you will find your own good reason to make each volume in this series a readily accessible part of your scholarly or professional library. "It's always a pleasure when a problem is hard enough that you have to get the Knuths off the shelf. I find that merely opening one has a very useful terrorizing effect on computers.""--"Jonathan LaventholIn describing the new fourth volume, one reviewer listed the qualities that distinguish all of Knuth's work. In sum: ] "detailed coverage of the basics, illustrated with well-chosen examples; occasional forays into more esoteric topics and problems at the frontiers of research; impeccable writing peppered with occasional bits of humor; extensive collections of exercises, all with solutions or helpful hints; a careful attention to history; implementations of many of the algorithms in his classic step-by-step form."--Frank RuskeyThese four books comprise what easily could be the most important set of information on any serious programmer's bookshelf.