Web Scraping with Python: Collecting Data from the Modern Web


Ryan Mitchell - 2015
    With this practical guide, you’ll learn how to use Python scripts and web APIs to gather and process data from thousands—or even millions—of web pages at once. Ideal for programmers, security professionals, and web administrators familiar with Python, this book not only teaches basic web scraping mechanics, but also delves into more advanced topics, such as analyzing raw data or using scrapers for frontend website testing. Code samples are available to help you understand the concepts in practice. Learn how to parse complicated HTML pages Traverse multiple pages and sites Get a general overview of APIs and how they work Learn several methods for storing the data you scrape Download, read, and extract data from documents Use tools and techniques to clean badly formatted data Read and write natural languages Crawl through forms and logins Understand how to scrape JavaScript Learn image processing and text recognition

The Visual Display of Quantitative Information


Edward R. Tufte - 1983
    Theory and practice in the design of data graphics, 250 illustrations of the best (and a few of the worst) statistical graphics, with detailed analysis of how to display data for precise, effective, quick analysis. Design of the high-resolution displays, small multiples. Editing and improving graphics. The data-ink ratio. Time-series, relational graphics, data maps, multivariate designs. Detection of graphical deception: design variation vs. data variation. Sources of deception. Aesthetics and data graphical displays. This is the second edition of The Visual Display of Quantitative Information. Recently published, this new edition provides excellent color reproductions of the many graphics of William Playfair, adds color to other images, and includes all the changes and corrections accumulated during 17 printings of the first edition.

Bayesian Statistics the Fun Way: Understanding Statistics and Probability with Star Wars, Lego, and Rubber Ducks


Will Kurt - 2019
    But many people use data in ways they don't even understand, meaning they aren't getting the most from it. Bayesian Statistics the Fun Way will change that.This book will give you a complete understanding of Bayesian statistics through simple explanations and un-boring examples. Find out the probability of UFOs landing in your garden, how likely Han Solo is to survive a flight through an asteroid shower, how to win an argument about conspiracy theories, and whether a burglary really was a burglary, to name a few examples.By using these off-the-beaten-track examples, the author actually makes learning statistics fun. And you'll learn real skills, like how to:- How to measure your own level of uncertainty in a conclusion or belief- Calculate Bayes theorem and understand what it's useful for- Find the posterior, likelihood, and prior to check the accuracy of your conclusions- Calculate distributions to see the range of your data- Compare hypotheses and draw reliable conclusions from themNext time you find yourself with a sheaf of survey results and no idea what to do with them, turn to Bayesian Statistics the Fun Way to get the most value from your data.

Debugging Teams: Better Productivity Through Collaboration


Brian W. Fitzpatrick - 2015
    Their conclusion? Even among people who have spent decades learning the technical side of their jobs, most haven't really focused on the human component. Learning to collaborate is just as important to success. If you invest in the soft skills of your job, you can have a much greater impact for the same amount of effort.The authors share their insights on how to lead a team effectively, navigate an organization, and build a healthy relationship with the users of your software. This is valuable information from two respected software engineers whose popular series of talks--including Working with Poisonous People--has attracted hundreds of thousands of followers.

Think Stats


Allen B. Downey - 2011
    This concise introduction shows you how to perform statistical analysis computationally, rather than mathematically, with programs written in Python.You'll work with a case study throughout the book to help you learn the entire data analysis process—from collecting data and generating statistics to identifying patterns and testing hypotheses. Along the way, you'll become familiar with distributions, the rules of probability, visualization, and many other tools and concepts.Develop your understanding of probability and statistics by writing and testing codeRun experiments to test statistical behavior, such as generating samples from several distributionsUse simulations to understand concepts that are hard to grasp mathematicallyLearn topics not usually covered in an introductory course, such as Bayesian estimationImport data from almost any source using Python, rather than be limited to data that has been cleaned and formatted for statistics toolsUse statistical inference to answer questions about real-world data

Let Over Lambda


Doug Hoyte - 2008
    Starting with the fundamentals, it describes the most advanced features of the most advanced language: Common Lisp. Only the top percentile of programmers use lisp and if you can understand this book you are in the top percentile of lisp programmers. If you are looking for a dry coding manual that re-hashes common-sense techniques in whatever langue du jour, this book is not for you. This book is about pushing the boundaries of what we know about programming. While this book teaches useful skills that can help solve your programming problems today and now, it has also been designed to be entertaining and inspiring. If you have ever wondered what lisp or even programming itself is really about, this is the book you have been looking for.

Working at the Ubuntu Command-Line Prompt


Keir Thomas - 2011
    His books have been read by over 1,000,000 people and are #1 best-sellers. His book Beginning Ubuntu Linux recently entered its sixth edition, and picked-up a Linux Journal award along the way. Thomas is also the author of Ubuntu Kung Fu. * * * * * * * * * * * * * * * * * Get to grips with the Ubuntu command-line with this #1 best-selling and concise guide. "Best buck I've spent yet" — Amazon review.* Readable, accessible and easy to understand;* Learn essential Ubuntu vocational skills, or read just for fun;* Covers Ubuntu commands, syntax, the filesystem, plus advanced techniques;* For ANY version of Linux based on Debian, such as Linux Mint--not just Ubuntu!;* Includes BONUS introduction to Ubuntu chapter, plus a glossary appendix and a guide to reading Linux/Unix documentation.

VMware vSphere 5 Clustering Technical Deepdive


Frank Denneman - 2011
    It covers the basic steps needed to create a vSphere HA and vSphere DRS cluster and to implement vSphere Storage DRS. Even more important, it explains the concepts and mechanisms behind HA, DRS and Storage DRS which will enable you to make well educated decisions. This book will take you in to the trenches of HA, DRS and Storage DRS and will give you the tools to understand and implement e.g. HA admission control policies, DRS resource pools, Datastore Clusters and resource allocation settings. On top of that each section contains basic design principles that can be used for designing, implementing or improving VMware infrastructures and fundamental supporting features like (Storage) vMotion, Storage I/O Control and much more are described in detail for the very first time. This book is also the ultimate guide to be prepared for any HA, DRS or Storage DRS related question or case study that might be presented during VMware VCDX, VCP and or VCAP exams.Coverage includes: HA node types HA isolation detection and response HA admission control VM Monitoring HA and DRS integration DRS imbalance algorithm Resource Pools Impact of reservations and limits CPU Resource Scheduling Memory Scheduler DPM Datastore Clusters Storage DRS algorithm Influencing SDRS recommendationsBe prepared to dive deep!

Chaos Engineering


Casey Rosenthal - 2017
    You’ll never be able to prevent all possible failure modes, but you can identify many of the weaknesses in your system before they’re triggered by these events. This report introduces you to Chaos Engineering, a method of experimenting on infrastructure that lets you expose weaknesses before they become a real problem.Members of the Netflix team that developed Chaos Engineering explain how to apply these principles to your own system. By introducing controlled experiments, you’ll learn how emergent behavior from component interactions can cause your system to drift into an unsafe, chaotic state.- Hypothesize about steady state by collecting data on the health of the system- Vary real-world events by turning off a server to simulate regional failures- Run your experiments as close to the production environment as possible- Ramp up your experiment by automating it to run continuously- Minimize the effects of your experiments to keep from blowing everything up- Learn the process for designing chaos engineering experiments- Use the Chaos Maturity Model to map the state of your chaos program, including realistic goals

Blockchain Basics: A Non-Technical Introduction in 25 Steps


Daniel Drescher - 2017
    No mathematical formulas, program code, or computer science jargon are used.No previous knowledge in computer science, mathematics, programming, or cryptography is required. Terminology is explained through pictures, analogies, and metaphors.This book bridges the gap that exists between purely technical books about the blockchain and purely business-focused books. It does so by explaining both the technical concepts that make up the blockchain and their role in business-relevant applications.What You Will Learn: • What the blockchain is• Why it is needed and what problem it solves• Why there is so much excitement about the blockchain and its potential• Major components and their purpose• How various components of the blockchain work and interact• Limitations, why they exist, and what has been done to overcome them• Major application scenariosWho This Book Is For: Everyone who wants to get a general idea of what blockchain technology is, how it works, and how it will potentially change the financial system as we know it.

Python Machine Learning


Sebastian Raschka - 2015
    We are living in an age where data comes in abundance, and thanks to the self-learning algorithms from the field of machine learning, we can turn this data into knowledge. Automated speech recognition on our smart phones, web search engines, e-mail spam filters, the recommendation systems of our favorite movie streaming services – machine learning makes it all possible.Thanks to the many powerful open-source libraries that have been developed in recent years, machine learning is now right at our fingertips. Python provides the perfect environment to build machine learning systems productively.This book will teach you the fundamentals of machine learning and how to utilize these in real-world applications using Python. Step-by-step, you will expand your skill set with the best practices for transforming raw data into useful information, developing learning algorithms efficiently, and evaluating results.You will discover the different problem categories that machine learning can solve and explore how to classify objects, predict continuous outcomes with regression analysis, and find hidden structures in data via clustering. You will build your own machine learning system for sentiment analysis and finally, learn how to embed your model into a web app to share with the world

Programming Collective Intelligence: Building Smart Web 2.0 Applications


Toby Segaran - 2002
    With the sophisticated algorithms in this book, you can write smart programs to access interesting datasets from other web sites, collect data from users of your own applications, and analyze and understand the data once you've found it.Programming Collective Intelligence takes you into the world of machine learning and statistics, and explains how to draw conclusions about user experience, marketing, personal tastes, and human behavior in general -- all from information that you and others collect every day. Each algorithm is described clearly and concisely with code that can immediately be used on your web site, blog, Wiki, or specialized application. This book explains:Collaborative filtering techniques that enable online retailers to recommend products or media Methods of clustering to detect groups of similar items in a large dataset Search engine features -- crawlers, indexers, query engines, and the PageRank algorithm Optimization algorithms that search millions of possible solutions to a problem and choose the best one Bayesian filtering, used in spam filters for classifying documents based on word types and other features Using decision trees not only to make predictions, but to model the way decisions are made Predicting numerical values rather than classifications to build price models Support vector machines to match people in online dating sites Non-negative matrix factorization to find the independent features in a dataset Evolving intelligence for problem solving -- how a computer develops its skill by improving its own code the more it plays a game Each chapter includes exercises for extending the algorithms to make them more powerful. Go beyond simple database-backed applications and put the wealth of Internet data to work for you. "Bravo! I cannot think of a better way for a developer to first learn these algorithms and methods, nor can I think of a better way for me (an old AI dog) to reinvigorate my knowledge of the details."-- Dan Russell, Google "Toby's book does a great job of breaking down the complex subject matter of machine-learning algorithms into practical, easy-to-understand examples that can be directly applied to analysis of social interaction across the Web today. If I had this book two years ago, it would have saved precious time going down some fruitless paths."-- Tim Wolters, CTO, Collective Intellect

The Elements of Statistical Learning: Data Mining, Inference, and Prediction


Trevor Hastie - 2001
    With it has come vast amounts of data in a variety of fields such as medicine, biology, finance, and marketing. The challenge of understanding these data has led to the development of new tools in the field of statistics, and spawned new areas such as data mining, machine learning, and bioinformatics. Many of these tools have common underpinnings but are often expressed with different terminology. This book describes the important ideas in these areas in a common conceptual framework. While the approach is statistical, the emphasis is on concepts rather than mathematics. Many examples are given, with a liberal use of color graphics. It should be a valuable resource for statisticians and anyone interested in data mining in science or industry. The book's coverage is broad, from supervised learning (prediction) to unsupervised learning. The many topics include neural networks, support vector machines, classification trees and boosting—the first comprehensive treatment of this topic in any book. Trevor Hastie, Robert Tibshirani, and Jerome Friedman are professors of statistics at Stanford University. They are prominent researchers in this area: Hastie and Tibshirani developed generalized additive models and wrote a popular book of that title. Hastie wrote much of the statistical modeling software in S-PLUS and invented principal curves and surfaces. Tibshirani proposed the Lasso and is co-author of the very successful An Introduction to the Bootstrap. Friedman is the co-inventor of many data-mining tools including CART, MARS, and projection pursuit.

Hands-On Programming with R: Write Your Own Functions and Simulations


Garrett Grolemund - 2014
    With this book, you'll learn how to load data, assemble and disassemble data objects, navigate R's environment system, write your own functions, and use all of R's programming tools.RStudio Master Instructor Garrett Grolemund not only teaches you how to program, but also shows you how to get more from R than just visualizing and modeling data. You'll gain valuable programming skills and support your work as a data scientist at the same time.Work hands-on with three practical data analysis projects based on casino gamesStore, retrieve, and change data values in your computer's memoryWrite programs and simulations that outperform those written by typical R usersUse R programming tools such as if else statements, for loops, and S3 classesLearn how to write lightning-fast vectorized R codeTake advantage of R's package system and debugging toolsPractice and apply R programming concepts as you learn them

Windows 8.1 For Dummies


Andy Rathbone - 2013
    Parts cover: Windows 8.1 Stuff Everybody Thinks You Already Know - an introduction to the dual interfaces, basic mechanics, file storage, and instruction on how to get the free upgrade to Windows 8.1.Working with Programs, Apps and Files - the basics of finding and launching apps, getting help, and printingGetting Things Done on the Internet - instructions for connecting a Windows 8.1 device, using web and social apps, and maintaining privacyCustomizing and Upgrading Windows 8.1 - Windows 8.1 offers big changes to what a user can customize on the OS. This section shows how to manipulate app tiles, give Windows the look you in, set up boot-to-desktop capabilities, connect to a network, and create user accounts.Music, Photos and Movies - Windows 8.1 offers new apps and capabilities for working with onboard and online media, all covered in this chapterHelp! - includes guidance on how to fix common problems, interpret strange messages, move files to a new PC, and use the built-in help systemThe Part of Tens - quick tips for avoiding common annoyances and working with Windows 8.1 on a touch device