Book picks similar to
Pattern Recognition and Classification: An Introduction by Geoff Dougherty
statistics
curiosity
computer-science
machine-learning
Spark: The Definitive Guide: Big Data Processing Made Simple
Bill Chambers - 2018
With an emphasis on improvements and new features in Spark 2.0, authors Bill Chambers and Matei Zaharia break down Spark topics into distinct sections, each with unique goals.
You’ll explore the basic operations and common functions of Spark’s structured APIs, as well as Structured Streaming, a new high-level API for building end-to-end streaming applications. Developers and system administrators will learn the fundamentals of monitoring, tuning, and debugging Spark, and explore machine learning techniques and scenarios for employing MLlib, Spark’s scalable machine-learning library.
Get a gentle overview of big data and Spark
Learn about DataFrames, SQL, and Datasets—Spark’s core APIs—through worked examples
Dive into Spark’s low-level APIs, RDDs, and execution of SQL and DataFrames
Understand how Spark runs on a cluster
Debug, monitor, and tune Spark clusters and applications
Learn the power of Structured Streaming, Spark’s stream-processing engine
Learn how you can apply MLlib to a variety of problems, including classification or recommendation
Discovering Statistics Using R
Andy Field - 2012
Like its sister textbook, Discovering Statistics Using R is written in an irreverent style and follows the same ground-breaking structure and pedagogical approach. The core material is enhanced by a cast of characters to help the reader on their way, hundreds of examples, self-assessment tests to consolidate knowledge, and additional website material for those wanting to learn more.
Numerical Optimization
Jorge Nocedal - 2000
One can trace its roots to the Calculus of Variations and the work of Euler and Lagrange. This natural and reasonable approach to mathematical programming covers numerical methods for finite-dimensional optimization problems. It begins with very simple ideas progressing through more complicated concepts, concentrating on methods for both unconstrained and constrained optimization.
High Performance Python: Practical Performant Programming for Humans
Micha Gorelick - 2013
Updated for Python 3, this expanded edition shows you how to locate performance bottlenecks and significantly speed up your code in high-data-volume programs. By exploring the fundamental theory behind design choices, High Performance Python helps you gain a deeper understanding of Python's implementation.
How do you take advantage of multicore architectures or clusters? Or build a system that scales up and down without losing reliability? Experienced Python programmers will learn concrete solutions to many issues, along with war stories from companies that use high-performance Python for social media analytics, productionized machine learning, and more.
Get a better grasp of NumPy, Cython, and profilers
Learn how Python abstracts the underlying computer architecture
Use profiling to find bottlenecks in CPU time and memory usage
Write efficient programs by choosing appropriate data structures
Speed up matrix and vector computations
Use tools to compile Python down to machine code
Manage multiple I/O and computational operations concurrently
Convert multiprocessing code to run on local or remote clusters
Deploy code faster using tools like Docker
SEO 2016: Learn Search Engine Optimization (SEO Books Series)
R.L. Adams - 2015
It's certainly no walk in the park. And, depending on where you've been for your information when it comes to SEO, it might be outdated, or just flat-out wrong. Why is that? Search has been evolving at an uncanny rate in recent years. And, if you're not in the know, then you could end up spinning your wheels and wasting valuable and precious time and resources on techniques that no longer work. The main reason for the recent changes: to increase relevancy. Google's sole mission is to provide the most relevant search results at the top of its searches, in the quickest manner possible. But, in recent years, due to some mischievous behavior at the hand of a small group of people, relevancy began to wane.
SEO 2016 :: Understanding Google's Algorithm Adjustments
The field of SEO has been changing, all led by Google's onslaught of algorithm adjustments that have decimated and razed some sites while uplifting and building others. Since 2011, Google has made it its mission to hunt out and demote spammy sites that sacrifice user-experience, focus on thin content, or simply spend their time trying to trick and deceive their way to the top of its search results. At the same time, Google has increased its reliance on four major components of trust that work at the heart of its search algorithm:
Trust in Age
Trust in Authority
Trust in Content
Relevancy
In this book, you'll learn just how each of these affects Google's search results, and just how you can best optimize your site and content to ensure that you're playing by Google's many rules. And, although there have been many algorithm adjustments over the years, four major ones have shaped and forever changed the search engine landscape:
Google Panda
Google Penguin
Google Hummingbird
Google Mobilegeddon
We'll discuss the nature of these changes and just how each of these algorithm adjustments has shaped the current landscape in search engine optimization. So what does it take to rank your site today?
In order to compete at any level in SEO, you have to earn trust - Google's trust that is. But, what does that take? How can we build trust quickly without jumping through all the hoops? SEO is by no means a small feat. It takes hard work applied consistently over time. There are no overnight success stories when it comes to SEO. But there are certainly ways to navigate the stormy online waters of Google's highly competitive search.
Download SEO 2016 :: Learn Search Engine Optimization
Lift the veil on Google's complex search algorithm, and understand just what it takes to rank on Google searches today, not yesterday.
Social and Economic Networks
Matthew O. Jackson - 2008
The many aspects of our lives that are governed by social networks make it critical to understand how they impact behavior, which network structures are likely to emerge in a society, and why we organize ourselves as we do. In Social and Economic Networks, Matthew Jackson offers a comprehensive introduction to social and economic networks, drawing on the latest findings in economics, sociology, computer science, physics, and mathematics. He provides empirical background on networks and the regularities that they exhibit, and discusses random graph-based models and strategic models of network formation. He helps readers to understand behavior in networked societies, with a detailed analysis of learning and diffusion in networks, decision making by individuals who are influenced by their social neighbors, game theory and markets on networks, and a host of related subjects. Jackson also describes the varied statistical and modeling techniques used to analyze social networks. Each chapter includes exercises to aid students in their analysis of how networks function.
This book is an indispensable resource for students and researchers in economics, mathematics, physics, sociology, and business.
Bad Data Handbook: Cleaning Up The Data So You Can Get Back To Work
Q. Ethan McCallum - 2012
In this handbook, data expert Q. Ethan McCallum has gathered 19 colleagues from every corner of the data arena to reveal how they’ve recovered from nasty data problems.
From cranky storage to poor representation to misguided policy, there are many paths to bad data. Bottom line? Bad data is data that gets in the way. This book explains effective ways to get around it.
Among the many topics covered, you’ll discover how to:
Test drive your data to see if it’s ready for analysis
Work spreadsheet data into a usable form
Handle encoding problems that lurk in text data
Develop a successful web-scraping effort
Use NLP tools to reveal the real sentiment of online reviews
Address cloud computing issues that can impact your analysis effort
Avoid policies that create data analysis roadblocks
Take a systematic approach to data quality analysis
The Black Box Society: The Secret Algorithms That Control Money and Information
Frank Pasquale - 2014
The data compiled and portraits created are incredibly detailed, to the point of being invasive. But who connects the dots about what firms are doing with this information? The Black Box Society argues that we all need to be able to do so--and to set limits on how big data affects our lives.
Hidden algorithms can make (or ruin) reputations, decide the destiny of entrepreneurs, or even devastate an entire economy. Shrouded in secrecy and complexity, decisions at major Silicon Valley and Wall Street firms were long assumed to be neutral and technical. But leaks, whistleblowers, and legal disputes have shed new light on automated judgment. Self-serving and reckless behavior is surprisingly common, and easy to hide in code protected by legal and real secrecy. Even after billions of dollars of fines have been levied, underfunded regulators may have only scratched the surface of this troubling behavior.
Frank Pasquale exposes how powerful interests abuse secrecy for profit and explains ways to rein them in. Demanding transparency is only the first step. An intelligible society would assure that key decisions of its most important firms are fair, nondiscriminatory, and open to criticism. Silicon Valley and Wall Street need to accept as much accountability as they impose on others.
Regular Expression Pocket Reference: Regular Expressions for Perl, Ruby, PHP, Python, C, Java and .NET
Tony Stubblebine - 2007
Ideal as a quick reference, Regular Expression Pocket Reference covers the regular expression APIs for Perl 5.8, Ruby (including some upcoming 1.9 features), Java, PHP, .NET and C#, Python, vi, JavaScript, and the PCRE regular expression libraries. This concise and easy-to-use reference puts a very powerful tool for manipulating text and data right at your fingertips. Composed of a mixture of symbols and text, regular expressions can be an outlet for creativity, for brilliant programming, and for the elegant solution. Regular Expression Pocket Reference offers an introduction to regular expressions, pattern matching, metacharacters, modes and constructs, and then provides separate sections for each of the language APIs, with complete regex listings including:
Supported metacharacters for each language API
Regular expression classes and interfaces for Ruby, Java, .NET, and C#
Regular expression operators for Perl 5.8
Regular expression module objects and functions for Python
Pattern-matching functions for PHP and the vi editor
Pattern-matching methods and objects for JavaScript
Unicode Support for each of the languages
With plenty of examples and other resources, Regular Expression Pocket Reference summarizes the complex rules for performing this critical text-processing function, and presents this often-confusing topic in a friendly and well-organized format. This guide makes an ideal on-the-job companion.
Algorithms Illuminated (Part 1): The Basics
Tim Roughgarden - 2017
Their applications range from network routing and computational genomics to public-key cryptography and database system implementation. Studying algorithms can make you a better programmer, a clearer thinker, and a master of technical interviews. Algorithms Illuminated is an accessible introduction to the subject---a transcript of what an expert algorithms tutor would say over a series of one-on-one lessons. The exposition is rigorous but emphasizes the big picture and conceptual understanding over low-level implementation and mathematical details. Part 1 of the book series covers asymptotic analysis and big-O notation, divide-and-conquer algorithms and the master method, randomized algorithms, and several famous algorithms for sorting and selection.
Data Visualization: A Practical Introduction
Kieran Healy - 2018
It explains what makes some graphs succeed while others fail, how to make high-quality figures from data using powerful and reproducible methods, and how to think about data visualization in an honest and effective way.
Data Visualization builds the reader's expertise in ggplot2, a versatile visualization library for the R programming language. Through a series of worked examples, this accessible primer then demonstrates how to create plots piece by piece, beginning with summaries of single variables and moving on to more complex graphics. Topics include plotting continuous and categorical variables; layering information on graphics; producing effective "small multiple" plots; grouping, summarizing, and transforming data for plotting; creating maps; working with the output of statistical models; and refining plots to make them more comprehensible.
Effective graphics are essential to communicating ideas and a great way to better understand data. This book provides the practical skills students and practitioners need to visualize quantitative data and get the most out of their research findings.
Provides hands-on instruction using R and ggplot2
Shows how the "tidyverse" of data analysis tools makes working with R easier and more consistent
Includes a library of data sets, code, and functions
Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics and Speech Recognition
Dan Jurafsky - 2000
This comprehensive work covers both statistical and symbolic approaches to language processing; it shows how they can be applied to important tasks such as speech recognition, spelling and grammar correction, information extraction, search engines, machine translation, and the creation of spoken-language dialog agents. The following distinguishing features make the text both an introduction to the field and an advanced reference guide.
- UNIFIED AND COMPREHENSIVE COVERAGE OF THE FIELD: Covers the fundamental algorithms of each field, whether proposed for spoken or written language, whether logical or statistical in origin.
- EMPHASIS ON WEB AND OTHER PRACTICAL APPLICATIONS: Gives readers an understanding of how language-related algorithms can be applied to important real-world problems.
- EMPHASIS ON SCIENTIFIC EVALUATION: Offers a description of how systems are evaluated within each problem domain.
- EMPIRICIST/STATISTICAL/MACHINE LEARNING APPROACHES TO LANGUAGE PROCESSING: Covers all the new statistical approaches, while still completely covering the earlier more structured and rule-based methods.
The Ethical Algorithm: The Science of Socially Aware Algorithm Design
Michael Kearns - 2019
Algorithms have made our lives more efficient, more entertaining, and, sometimes, better informed. At the same time, complex algorithms are increasingly violating the basic rights of individual citizens. Allegedly anonymized datasets routinely leak our most sensitive personal information; statistical models for everything from mortgages to college admissions reflect racial and gender bias. Meanwhile, users manipulate algorithms to "game" search engines, spam filters, online reviewing services, and navigation apps.
Understanding and improving the science behind the algorithms that run our lives is rapidly becoming one of the most pressing issues of this century. Traditional fixes, such as laws, regulations and watchdog groups, have proven woefully inadequate. Reporting from the cutting edge of scientific research, The Ethical Algorithm offers a new approach: a set of principled solutions based on the emerging and exciting science of socially aware algorithm design. Michael Kearns and Aaron Roth explain how we can better embed human principles into machine code - without halting the advance of data-driven scientific exploration. Weaving together innovative research with stories of citizens, scientists, and activists on the front lines, The Ethical Algorithm offers a compelling vision for a future, one in which we can better protect humans from the unintended impacts of algorithms while continuing to inspire wondrous advances in technology.
Mathematics for 3D Game Programming and Computer Graphics
Eric Lengyel - 2001
Unfortunately, most programmers frequently have a limited understanding of these essential mathematics and physics concepts. MATHEMATICS AND PHYSICS FOR PROGRAMMERS, THIRD EDITION provides a simple but thorough grounding in the mathematics and physics topics that programmers require to write algorithms and programs using a non-language-specific approach. Applications and examples from game programming are included throughout, and exercises follow each chapter for additional practice. The book's companion website provides sample code illustrating the mathematical and physics topics discussed in the book.
Fundamentals of Data Visualization: A Primer on Making Informative and Compelling Figures
Claus O. Wilke - 2019
But with the increasing power of visualization software today, scientists, engineers, and business analysts often have to navigate a bewildering array of visualization choices and options.
This practical book takes you through many commonly encountered visualization problems, and it provides guidelines on how to turn large datasets into clear and compelling figures. What visualization type is best for the story you want to tell? How do you make informative figures that are visually pleasing? Author Claus O. Wilke teaches you the elements most critical to successful data visualization.
Explore the basic concepts of color as a tool to highlight, distinguish, or represent a value
Understand the importance of redundant coding to ensure you provide key information in multiple ways
Use the book's visualizations directory, a graphical guide to commonly used types of data visualizations
Get extensive examples of good and bad figures
Learn how to use figures in a document or report and how to employ them effectively to tell a compelling story