Feature Engineering for Machine Learning


Alice Zheng - 2018
    With this practical book, you’ll learn techniques for extracting and transforming features—the numeric representations of raw data—into formats for machine-learning models. Each chapter guides you through a single data problem, such as how to represent text or image data. Together, these examples illustrate the main principles of feature engineering.Rather than simply teach these principles, authors Alice Zheng and Amanda Casari focus on practical application with exercises throughout the book. The closing chapter brings everything together by tackling a real-world, structured dataset with several feature-engineering techniques. Python packages including numpy, Pandas, Scikit-learn, and Matplotlib are used in code examples.

Secrets and Lies: Digital Security in a Networked World


Bruce Schneier - 2000
    Identity Theft. Corporate Espionage. National secrets compromised. Can anyone promise security in our digital world?The man who introduced cryptography to the boardroom says no. But in this fascinating read, he shows us how to come closer by developing security measures in terms of context, tools, and strategy. Security is a process, not a product – one that system administrators and corporate executives alike must understand to survive.This edition updated with new information about post-9/11 security.

Cryptography Engineering: Design Principles and Practical Applications


Niels Ferguson - 2010
    Cryptography is vital to keeping information safe, in an era when the formula to do so becomes more and more challenging. Written by a team of world-renowned cryptography experts, this essential guide is the definitive introduction to all major areas of cryptography: message security, key negotiation, and key management. You'll learn how to think like a cryptographer. You'll discover techniques for building cryptography into products from the start and you'll examine the many technical changes in the field.After a basic overview of cryptography and what it means today, this indispensable resource covers such topics as block ciphers, block modes, hash functions, encryption modes, message authentication codes, implementation issues, negotiation protocols, and more. Helpful examples and hands-on exercises enhance your understanding of the multi-faceted field of cryptography.An author team of internationally recognized cryptography experts updates you on vital topics in the field of cryptography Shows you how to build cryptography into products from the start Examines updates and changes to cryptography Includes coverage on key servers, message security, authentication codes, new standards, block ciphers, message authentication codes, and more Cryptography Engineering gets you up to speed in the ever-evolving field of cryptography.

R Graphics Cookbook: Practical Recipes for Visualizing Data


Winston Chang - 2012
    Each recipe tackles a specific problem with a solution you can apply to your own project, and includes a discussion of how and why the recipe works.Most of the recipes use the ggplot2 package, a powerful and flexible way to make graphs in R. If you have a basic understanding of the R language, you're ready to get started.Use R's default graphics for quick exploration of dataCreate a variety of bar graphs, line graphs, and scatter plotsSummarize data distributions with histograms, density curves, box plots, and other examplesProvide annotations to help viewers interpret dataControl the overall appearance of graphicsRender data groups alongside each other for easy comparisonUse colors in plotsCreate network graphs, heat maps, and 3D scatter plotsStructure data for graphing

Mining the Social Web: Analyzing Data from Facebook, Twitter, LinkedIn, and Other Social Media Sites


Matthew A. Russell - 2011
    You’ll learn how to combine social web data, analysis techniques, and visualization to find what you’ve been looking for in the social haystack—as well as useful information you didn’t know existed.Each standalone chapter introduces techniques for mining data in different areas of the social Web, including blogs and email. All you need to get started is a programming background and a willingness to learn basic Python tools.Get a straightforward synopsis of the social web landscapeUse adaptable scripts on GitHub to harvest data from social network APIs such as Twitter, Facebook, LinkedIn, and Google+Learn how to employ easy-to-use Python tools to slice and dice the data you collectExplore social connections in microformats with the XHTML Friends NetworkApply advanced mining techniques such as TF-IDF, cosine similarity, collocation analysis, document summarization, and clique detectionBuild interactive visualizations with web technologies based upon HTML5 and JavaScript toolkits"A rich, compact, useful, practical introduction to a galaxy of tools, techniques, and theories for exploring structured and unstructured data." --Alex Martelli, Senior Staff Engineer, Google

Scala Cookbook


Alvin Alexander - 2013
    With more than 250 ready-to-use recipes and 700 code examples, this comprehensive cookbook covers the most common problems you’ll encounter when using the Scala language, libraries, and tools. It’s ideal not only for experienced Scala developers, but also for programmers learning to use this JVM language.Author Alvin Alexander (creator of DevDaily.com) provides solutions based on his experience using Scala for highly scalable, component-based applications that support concurrency and distribution. Packed with real-world scenarios, this book provides recipes for:Strings, numeric types, and control structuresClasses, methods, objects, traits, and packagingFunctional programming in a variety of situationsCollections covering Scala's wealth of classes and methodsConcurrency, using the Akka Actors libraryUsing the Scala REPL and the Simple Build Tool (SBT)Web services on both the client and server sidesInteracting with SQL and NoSQL databasesBest practices in Scala development

An Introduction to Statistical Learning: With Applications in R


Gareth James - 2013
    This book presents some of the most important modeling and prediction techniques, along with relevant applications. Topics include linear regression, classification, resampling methods, shrinkage approaches, tree- based methods, support vector machines, clustering, and more. Color graphics and real-world examples are used to illustrate the methods presented. Since the goal of this textbook is to facilitate the use of these statistical learning techniques by practitioners in science, industry, and other fields, each chapter contains a tutorial on implementing the analyses and methods presented in R, an extremely popular open source statistical software platform. Two of the authors co-wrote The Elements of Statistical Learning (Hastie, Tibshirani and Friedman, 2nd edition 2009), a popular reference book for statistics and machine learning researchers. An Introduction to Statistical Learning covers many of the same topics, but at a level accessible to a much broader audience. This book is targeted at statisticians and non-statisticians alike who wish to use cutting-edge statistical learning techniques to analyze their data. The text assumes only a previous course in linear regression and no knowledge of matrix algebra.

Absolute OpenBSD: Unix for the Practical Paranoid


Michael W. Lucas - 2003
    The author assumes a knowledge of basic UNIX commands, design, and permissions. The book takes you through the intricacies of the platform and teaches how to manage your system, offering friendly explanations, background information, troubleshooting suggestions, and copious examples throughout.

Hacking: The Art of Exploitation


Jon Erickson - 2003
    This book explains the technical aspects of hacking, including stack based overflows, heap based overflows, string exploits, return-into-libc, shellcode, and cryptographic attacks on 802.11b.

Learning the vi and Vim Editors


Arnold Robbins - 1987
    Editors are the subject of adoration and worship, or of scorn and ridicule, depending upon whether the topic of discussion is your editor or someone else's.vi has been the standard editor for close to 30 years. Popular on Unix and Linux, it has a growing following on Windows systems, too. Most experienced system administrators cite vi as their tool of choice. And since 1986, this book has been the guide for vi. However, Unix systems are not what they were 30 years ago, and neither is this book. While retaining all the valuable features of previous editions, the 7th edition of Learning the vi and vim Editors has been expanded to include detailed information on vim, the leading vi clone. vim is the default version of vi on most Linux systems and on Mac OS X, and is available for many other operating systems too. With this guide, you learn text editing basics and advanced tools for both editors, such as multi-window editing, how to write both interactive macros and scripts to extend the editor, and power tools for programmers -- all in the easy-to-follow style that has made this book a classic.Learning the vi and vim Editors includes:A complete introduction to text editing with vi:How to move around vi in a hurry Beyond the basics, such as using buffers vi's global search and replacement Advanced editing, including customizing vi and executing Unix commandsHow to make full use of vim: Extended text objects and more powerful regular expressions Multi-window editing and powerful vim scripts How to make full use of the GUI version of vim, called gvim vim's enhancements for programmers, such as syntax highlighting, folding and extended tags Coverage of three other popular vi clones -- nvi, elvis, and vile -- is also included. You'll find several valuable appendixes, including an alphabetical quick reference to both vi and ex mode commands for regular vi and for vim, plus an updated appendix on vi and the Internet. Learning either vi or vim is required knowledge if you use Linux or Unix, and in either case, reading this book is essential. After reading this book, the choice of editor will be obvious for you too.

Grokking Algorithms An Illustrated Guide For Programmers and Other Curious People


Aditya Y. Bhargava - 2015
    The algorithms you'll use most often as a programmer have already been discovered, tested, and proven. If you want to take a hard pass on Knuth's brilliant but impenetrable theories and the dense multi-page proofs you'll find in most textbooks, this is the book for you. This fully-illustrated and engaging guide makes it easy for you to learn how to use algorithms effectively in your own programs.Grokking Algorithms is a disarming take on a core computer science topic. In it, you'll learn how to apply common algorithms to the practical problems you face in day-to-day life as a programmer. You'll start with problems like sorting and searching. As you build up your skills in thinking algorithmically, you'll tackle more complex concerns such as data compression or artificial intelligence. Whether you're writing business software, video games, mobile apps, or system utilities, you'll learn algorithmic techniques for solving problems that you thought were out of your grasp. For example, you'll be able to:Write a spell checker using graph algorithmsUnderstand how data compression works using Huffman codingIdentify problems that take too long to solve with naive algorithms, and attack them with algorithms that give you an approximate answer insteadEach carefully-presented example includes helpful diagrams and fully-annotated code samples in Python. By the end of this book, you will know some of the most widely applicable algorithms as well as how and when to use them.

High Performance Spark: Best Practices for Scaling and Optimizing Apache Spark


Holden Karau - 2017
    But if you haven't seen the performance improvements you expected, or still don't feel confident enough to use Spark in production, this practical book is for you. Authors Holden Karau and Rachel Warren demonstrate performance optimizations to help your Spark queries run faster and handle larger data sizes, while using fewer resources.Ideal for software engineers, data engineers, developers, and system administrators working with large-scale data applications, this book describes techniques that can reduce data infrastructure costs and developer hours. Not only will you gain a more comprehensive understanding of Spark, you'll also learn how to make it sing.With this book, you'll explore:How Spark SQL's new interfaces improve performance over SQL's RDD data structureThe choice between data joins in Core Spark and Spark SQLTechniques for getting the most out of standard RDD transformationsHow to work around performance issues in Spark's key/value pair paradigmWriting high-performance Spark code without Scala or the JVMHow to test for functionality and performance when applying suggested improvementsUsing Spark MLlib and Spark ML machine learning librariesSpark's Streaming components and external community packages

Jenkins: The Definitive Guide


John Ferguson Smart - 2011
    This complete guide shows you how to automate your build, integration, release, and deployment processes with Jenkins—and demonstrates how CI can save you time, money, and many headaches. Ideal for developers, software architects, and project managers, Jenkins: The Definitive Guide is both a CI tutorial and a comprehensive Jenkins reference. Through its wealth of best practices and real-world tips, you'll discover how easy it is to set up a CI service with Jenkins. Learn how to install, configure, and secure your Jenkins server Organize and monitor general-purpose build jobs Integrate automated tests to verify builds, and set up code quality reporting Establish effective team notification strategies and techniques Configure build pipelines, parameterized jobs, matrix builds, and other advanced jobs Manage a farm of Jenkins servers to run distributed builds Implement automated deployment and continuous delivery

The Visual Display of Quantitative Information


Edward R. Tufte - 1983
    Theory and practice in the design of data graphics, 250 illustrations of the best (and a few of the worst) statistical graphics, with detailed analysis of how to display data for precise, effective, quick analysis. Design of the high-resolution displays, small multiples. Editing and improving graphics. The data-ink ratio. Time-series, relational graphics, data maps, multivariate designs. Detection of graphical deception: design variation vs. data variation. Sources of deception. Aesthetics and data graphical displays. This is the second edition of The Visual Display of Quantitative Information. Recently published, this new edition provides excellent color reproductions of the many graphics of William Playfair, adds color to other images, and includes all the changes and corrections accumulated during 17 printings of the first edition.

The R Book


Michael J. Crawley - 2007
    The R language is recognised as one of the most powerful and flexible statistical software packages, and it enables the user to apply many statistical techniques that would be impossible without such software to help implement such large data sets.