High Performance MySQL: Optimization, Backups, Replication & Load Balancing


Jeremy D. Zawodny - 2004
    This book is an insider's guide to these little understood topics.Author Jeremy Zawodny has managed large numbers of MySQL servers for mission-critical work at Yahoo!, maintained years of contacts with the MySQL AB team, and presents regularly at conferences. Jeremy and Derek have spent months experimenting, interviewing major users of MySQL, talking to MySQL AB, benchmarking, and writing some of their own tools in order to produce the information in this book.In "High Performance MySQL" you will learn about MySQL indexing and optimization in depth so you can make better use of these key features. You will learn practical replication, backup, and load-balancing strategies with information that goes beyond available tools to discuss their effects in real-life environments. And you'll learn the supporting techniques you need to carry out these tasks, including advanced configuration, benchmarking, and investigating logs.Topics include: A review of configuration and setup optionsStorage engines and table typesBenchmarkingIndexesQuery OptimizationApplication DesignServer PerformanceReplicationLoad-balancingBackup and RecoverySecurity

Taming Text: How to Find, Organize, and Manipulate It


Grant S. Ingersoll - 2011
    This causes real problems for everyday users who need to make sense of all the information available, and for software engineers who want to make their text-based applications more useful and user-friendly. Whether building a search engine for a corporate website, automatically organizing email, or extracting important nuggets of information from the news, dealing with unstructured text can be daunting.Taming Text is a hands-on, example-driven guide to working with unstructured text in the context of real-world applications. It explores how to automatically organize text, using approaches such as full-text search, proper name recognition, clustering, tagging, information extraction, and summarization. This book gives examples illustrating each of these topics, as well as the foundations upon which they are built.Purchase of the print book comes with an offer of a free PDF, ePub, and Kindle eBook from Manning. Also available is all code from the book.

Information Theory: A Tutorial Introduction


James V. Stone - 2015
    In this richly illustrated book, accessible examples are used to show how information theory can be understood in terms of everyday games like '20 Questions', and the simple MatLab programs provided give hands-on experience of information theory in action. Written in a tutorial style, with a comprehensive glossary, this text represents an ideal primer for novices who wish to become familiar with the basic principles of information theory.Download chapter 1 from http://jim-stone.staff.shef.ac.uk/Boo...

Growing Rails Applications in Practice


Henning Koch - 2014
    

Hadoop: The Definitive Guide


Tom White - 2009
    Ideal for processing large datasets, the Apache Hadoop framework is an open source implementation of the MapReduce algorithm on which Google built its empire. This comprehensive resource demonstrates how to use Hadoop to build reliable, scalable, distributed systems: programmers will find details for analyzing large datasets, and administrators will learn how to set up and run Hadoop clusters. Complete with case studies that illustrate how Hadoop solves specific problems, this book helps you:Use the Hadoop Distributed File System (HDFS) for storing large datasets, and run distributed computations over those datasets using MapReduce Become familiar with Hadoop's data and I/O building blocks for compression, data integrity, serialization, and persistence Discover common pitfalls and advanced features for writing real-world MapReduce programs Design, build, and administer a dedicated Hadoop cluster, or run Hadoop in the cloud Use Pig, a high-level query language for large-scale data processing Take advantage of HBase, Hadoop's database for structured and semi-structured data Learn ZooKeeper, a toolkit of coordination primitives for building distributed systems If you have lots of data -- whether it's gigabytes or petabytes -- Hadoop is the perfect solution. Hadoop: The Definitive Guide is the most thorough book available on the subject. "Now you have the opportunity to learn about Hadoop from a master-not only of the technology, but also of common sense and plain talk." -- Doug Cutting, Hadoop Founder, Yahoo!

Dive Into Python 3


Mark Pilgrim - 2009
    As in the original book, Dive Into Python, each chapter starts with a real, complete code sample, proceeds to pick it apart and explain the pieces, and then puts it all back together in a summary at the end.This book includes:Example programs completely rewritten to illustrate powerful new concepts now available in Python 3: sets, iterators, generators, closures, comprehensions, and much more A detailed case study of porting a major library from Python 2 to Python 3 A comprehensive appendix of all the syntactic and semantic changes in Python 3 This is the perfect resource for you if you need to port applications to Python 3, or if you like to jump into languages fast and get going right away.

Decision Trees and Random Forests: A Visual Introduction For Beginners: A Simple Guide to Machine Learning with Decision Trees


Chris Smith - 2017
     They are also used in countless industries such as medicine, manufacturing and finance to help companies make better decisions and reduce risk. Whether coded or scratched out by hand, both algorithms are powerful tools that can make a significant impact. This book is a visual introduction for beginners that unpacks the fundamentals of decision trees and random forests. If you want to dig into the basics with a visual twist plus create your own machine learning algorithms in Python, this book is for you.

Bad Data Handbook: Cleaning Up The Data So You Can Get Back To Work


Q. Ethan McCallum - 2012
    In this handbook, data expert Q. Ethan McCallum has gathered 19 colleagues from every corner of the data arena to reveal how they’ve recovered from nasty data problems.From cranky storage to poor representation to misguided policy, there are many paths to bad data. Bottom line? Bad data is data that gets in the way. This book explains effective ways to get around it.Among the many topics covered, you’ll discover how to:Test drive your data to see if it’s ready for analysisWork spreadsheet data into a usable formHandle encoding problems that lurk in text dataDevelop a successful web-scraping effortUse NLP tools to reveal the real sentiment of online reviewsAddress cloud computing issues that can impact your analysis effortAvoid policies that create data analysis roadblocksTake a systematic approach to data quality analysis

Data Analysis Using SQL and Excel


Gordon S. Linoff - 2007
    This book helps you use SQL and Excel to extract business information from relational databases and use that data to define business dimensions, store transactions about customers, produce results, and more. Each chapter explains when and why to perform a particular type of business analysis in order to obtain useful results, how to design and perform the analysis using SQL and Excel, and what the results should look like.

Black Hat Python: Python Programming for Hackers and Pentesters


Justin Seitz - 2014
    But just how does the magic happen?In Black Hat Python, the latest from Justin Seitz (author of the best-selling Gray Hat Python), you'll explore the darker side of Python's capabilities writing network sniffers, manipulating packets, infecting virtual machines, creating stealthy trojans, and more. You'll learn how to:Create a trojan command-and-control using GitHubDetect sandboxing and automate common malware tasks, like keylogging and screenshottingEscalate Windows privileges with creative process controlUse offensive memory forensics tricks to retrieve password hashes and inject shellcode into a virtual machineExtend the popular Burp Suite web-hacking toolAbuse Windows COM automation to perform a man-in-the-browser attackExfiltrate data from a network most sneakilyInsider techniques and creative challenges throughout show you how to extend the hacks and how to write your own exploits.When it comes to offensive security, your ability to create powerful tools on the fly is indispensable. Learn how in Black Hat Python."

Schaum's Outline of Theory and Problems of Data Structures


Seymour Lipschutz - 1986
    This guide, which can be used with any text or can stand alone, contains at the beginning of each chapter a list of key definitions, a summary of major concepts, step by step solutions to dozens of problems, and additional practice problems.

The Art of Data Science: A Guide for Anyone Who Works with Data


Roger D. Peng - 2015
    The authors have extensive experience both managing data analysts and conducting their own data analyses, and have carefully observed what produces coherent results and what fails to produce useful insights into data. This book is a distillation of their experience in a format that is applicable to both practitioners and managers in data science.

HBase: The Definitive Guide


Lars George - 2011
    As the open source implementation of Google's BigTable architecture, HBase scales to billions of rows and millions of columns, while ensuring that write and read performance remain constant. Many IT executives are asking pointed questions about HBase. This book provides meaningful answers, whether you’re evaluating this non-relational database or planning to put it into practice right away. Discover how tight integration with Hadoop makes scalability with HBase easier Distribute large datasets across an inexpensive cluster of commodity servers Access HBase with native Java clients, or with gateway servers providing REST, Avro, or Thrift APIs Get details on HBase’s architecture, including the storage format, write-ahead log, background processes, and more Integrate HBase with Hadoop's MapReduce framework for massively parallelized data processing jobs Learn how to tune clusters, design schemas, copy tables, import bulk data, decommission nodes, and many other tasks

Programming Rust: Fast, Safe Systems Development


Jim Blandy - 2015
    Rust's modern, flexible types ensure your program is free of null pointer dereferences, double frees, dangling pointers, and similar bugs, all at compile time, without runtime overhead. In multi-threaded code, Rust catches data races at compile time, making concurrency much easier to use.Written by two experienced systems programmers, this book explains how Rust manages to bridge the gap between performance and safety, and how you can take advantage of it. Topics include:How Rust represents values in memory (with diagrams)Complete explanations of ownership, moves, borrows, and lifetimesCargo, rustdoc, unit tests, and how to publish your code on crates.io, Rust's public package repositoryHigh-level features like generic code, closures, collections, and iterators that make Rust productive and flexibleConcurrency in Rust: threads, mutexes, channels, and atomics, all much safer to use than in C or C++Unsafe code, and how to preserve the integrity of ordinary code that uses itExtended examples illustrating how pieces of the language fit together

Build Your Own Database Driven Website Using PHP & MySQL


Kevin Yank - 2001
    There has been a marked increase in the adoption of PHP, most notably in the beginning to intermediate levels. PHP now boasts over 30% of the server side scripting market (Source: php.weblogs.com).The previous edition sold over 17,000 copies exclusively through Sitepoint.com alone. With the release of PHP 5, SitePoint have updated this bestseller to reflect best practice web development using PHP 5 and MySQL 4.The 3rd Edition includes more code examples and also a new bonus chapter on structured PHP Programming which introduces techniques for organizing real world PHP applications to avoid code duplication and ensure code is manageable and maintainable. The chapter introduces features like include files, user-defined function libraries and constants, which are combined to produce a fully functional access control system suitable for use on any PHP Website.