High Performance Spark: Best Practices for Scaling and Optimizing Apache Spark


Holden Karau - 2017
    But if you haven't seen the performance improvements you expected, or still don't feel confident enough to use Spark in production, this practical book is for you. Authors Holden Karau and Rachel Warren demonstrate performance optimizations to help your Spark queries run faster and handle larger data sizes, while using fewer resources.Ideal for software engineers, data engineers, developers, and system administrators working with large-scale data applications, this book describes techniques that can reduce data infrastructure costs and developer hours. Not only will you gain a more comprehensive understanding of Spark, you'll also learn how to make it sing.With this book, you'll explore:How Spark SQL's new interfaces improve performance over SQL's RDD data structureThe choice between data joins in Core Spark and Spark SQLTechniques for getting the most out of standard RDD transformationsHow to work around performance issues in Spark's key/value pair paradigmWriting high-performance Spark code without Scala or the JVMHow to test for functionality and performance when applying suggested improvementsUsing Spark MLlib and Spark ML machine learning librariesSpark's Streaming components and external community packages

Algorithms in a Nutshell


George T. Heineman - 2008
    Algorithms in a Nutshell describes a large number of existing algorithms for solving a variety of problems, and helps you select and implement the right algorithm for your needs -- with just enough math to let you understand and analyze algorithm performance. With its focus on application, rather than theory, this book provides efficient code solutions in several programming languages that you can easily adapt to a specific project. Each major algorithm is presented in the style of a design pattern that includes information to help you understand why and when the algorithm is appropriate. With this book, you will:Solve a particular coding problem or improve on the performance of an existing solutionQuickly locate algorithms that relate to the problems you want to solve, and determine why a particular algorithm is the right one to useGet algorithmic solutions in C, C++, Java, and Ruby with implementation tipsLearn the expected performance of an algorithm, and the conditions it needs to perform at its bestDiscover the impact that similar design decisions have on different algorithmsLearn advanced data structures to improve the efficiency of algorithmsWith Algorithms in a Nutshell, you'll learn how to improve the performance of key algorithms essential for the success of your software applications.

SSH, The Secure Shell: The Definitive Guide


Daniel J. Barrett - 2001
    It supports secure remote logins, secure file transfer between computers, and a unique "tunneling" capability that adds encryption to otherwise insecure network applications. Best of all, SSH is free, with feature-filled commercial versions available as well.SSH: The Secure Shell: The Definitive Guide covers the Secure Shell in detail for both system administrators and end users. It demystifies the SSH man pages and includes thorough coverage of:SSH1, SSH2, OpenSSH, and F-Secure SSH for Unix, plus Windows and Macintosh products: the basics, the internals, and complex applications.Configuring SSH servers and clients, both system-wide and per user, with recommended settings to maximize security.Advanced key management using agents, agent forwarding, and forced commands.Forwarding (tunneling) of TCP and X11 applications in depth, even in the presence of firewalls and network address translation (NAT).Undocumented behaviors of popular SSH implementations.Installing and maintaining SSH systems.Whether you're communicating on a small LAN or across the Internet, SSH can ship your data from "here" to "there" efficiently and securely. So throw away those insecure .rhosts and hosts.equiv files, move up to SSH, and make your network a safe place to live and work.

Data Analysis with Open Source Tools: A Hands-On Guide for Programmers and Data Scientists


Philipp K. Janert - 2010
    With this insightful book, intermediate to experienced programmers interested in data analysis will learn techniques for working with data in a business environment. You'll learn how to look at data to discover what it contains, how to capture those ideas in conceptual models, and then feed your understanding back into the organization through business plans, metrics dashboards, and other applications.Along the way, you'll experiment with concepts through hands-on workshops at the end of each chapter. Above all, you'll learn how to think about the results you want to achieve -- rather than rely on tools to think for you.Use graphics to describe data with one, two, or dozens of variablesDevelop conceptual models using back-of-the-envelope calculations, as well asscaling and probability argumentsMine data with computationally intensive methods such as simulation and clusteringMake your conclusions understandable through reports, dashboards, and other metrics programsUnderstand financial calculations, including the time-value of moneyUse dimensionality reduction techniques or predictive analytics to conquer challenging data analysis situationsBecome familiar with different open source programming environments for data analysisFinally, a concise reference for understanding how to conquer piles of data.--Austin King, Senior Web Developer, MozillaAn indispensable text for aspiring data scientists.--Michael E. Driscoll, CEO/Founder, Dataspora

Machine Learning in Action


Peter Harrington - 2011
    "Machine learning," the process of automating tasks once considered the domain of highly-trained analysts and mathematicians, is the key to efficiently extracting useful information from this sea of raw data. Machine Learning in Action is a unique book that blends the foundational theories of machine learning with the practical realities of building tools for everyday data analysis. In it, the author uses the flexible Python programming language to show how to build programs that implement algorithms for data classification, forecasting, recommendations, and higher-level features like summarization and simplification.

Doing Data Science


Cathy O'Neil - 2013
    But how can you get started working in a wide-ranging, interdisciplinary field that’s so clouded in hype? This insightful book, based on Columbia University’s Introduction to Data Science class, tells you what you need to know.In many of these chapter-long lectures, data scientists from companies such as Google, Microsoft, and eBay share new algorithms, methods, and models by presenting case studies and the code they use. If you’re familiar with linear algebra, probability, and statistics, and have programming experience, this book is an ideal introduction to data science.Topics include:Statistical inference, exploratory data analysis, and the data science processAlgorithmsSpam filters, Naive Bayes, and data wranglingLogistic regressionFinancial modelingRecommendation engines and causalityData visualizationSocial networks and data journalismData engineering, MapReduce, Pregel, and HadoopDoing Data Science is collaboration between course instructor Rachel Schutt, Senior VP of Data Science at News Corp, and data science consultant Cathy O’Neil, a senior data scientist at Johnson Research Labs, who attended and blogged about the course.

R for Data Science: Import, Tidy, Transform, Visualize, and Model Data


Hadley Wickham - 2016
    This book introduces you to R, RStudio, and the tidyverse, a collection of R packages designed to work together to make data science fast, fluent, and fun. Suitable for readers with no previous programming experience, R for Data Science is designed to get you doing data science as quickly as possible. Authors Hadley Wickham and Garrett Grolemund guide you through the steps of importing, wrangling, exploring, and modeling your data and communicating the results. You’ll get a complete, big-picture understanding of the data science cycle, along with basic tools you need to manage the details. Each section of the book is paired with exercises to help you practice what you’ve learned along the way. You’ll learn how to: Wrangle—transform your datasets into a form convenient for analysis Program—learn powerful R tools for solving data problems with greater clarity and ease Explore—examine your data, generate hypotheses, and quickly test them Model—provide a low-dimensional summary that captures true "signals" in your dataset Communicate—learn R Markdown for integrating prose, code, and results

Machine Learning for Hackers


Drew Conway - 2012
    Authors Drew Conway and John Myles White help you understand machine learning and statistics tools through a series of hands-on case studies, instead of a traditional math-heavy presentation.Each chapter focuses on a specific problem in machine learning, such as classification, prediction, optimization, and recommendation. Using the R programming language, you'll learn how to analyze sample datasets and write simple machine learning algorithms. "Machine Learning for Hackers" is ideal for programmers from any background, including business, government, and academic research.Develop a naive Bayesian classifier to determine if an email is spam, based only on its textUse linear regression to predict the number of page views for the top 1,000 websitesLearn optimization techniques by attempting to break a simple letter cipherCompare and contrast U.S. Senators statistically, based on their voting recordsBuild a "whom to follow" recommendation system from Twitter data

Arduino Cookbook: Recipes to Begin, Expand, and Enhance Your Projects


Michael Margolis - 2010
    You'll find more than 200 tips and techniques for building a variety of objects and prototypes such as IoT solutions, environmental monitors, location and position-aware systems, and products that can respond to touch, sound, heat, and light.Updated for the Arduino 1.8 release, the recipes in this third edition include practical examples and guidance to help you begin, expand, and enhance your projects right away--whether you're an engineer, designer, artist, student, or hobbyist.Get up to speed on the Arduino board and essential software concepts quicklyLearn basic techniques for reading digital and analog signalsUse Arduino with a variety of popular input devices and sensorsDrive visual displays, generate sound, and control several types of motorsConnect Arduino to wired and wireless networksLearn techniques for handling time delays and time measurementApply advanced coding and memory-handling techniques

Maven: The Definitive Guide


Timothy O'Brien - 2008
    Now there's help. The long-awaited official documentation to Maven is here. Written by Maven creator Jason Van Zyl and his team at Sonatype, Maven: The Definitive Guide clearly explains how this tool can bring order to your software development projects. Maven is largely replacing Ant as the build tool of choice for large open source Java projects because, unlike Ant, Maven is also a project management tool that can run reports, generate a project website, and facilitate communication among members of a working team. To use Maven, everything you need to know is in this guide. The first part demonstrates the tool's capabilities through the development, from ideation to deployment, of several sample applications -- a simple software development project, a simple web application, a multi-module project, and a multi-module enterprise project. The second part offers a complete reference guide that includes:The POM and Project Relationships The Build Lifecycle Plugins Project website generation Advanced site generation Reporting Properties Build Profiles The Maven Repository Team Collaboration Writing Plugins IDEs such as Eclipse, IntelliJ, ands NetBeans Using and creating assemblies Developing with Maven ArchetypesSeveral sources for Maven have appeared online for some time, but nothing served as an introduction and comprehensive reference guide to this tool -- until now. Maven: The Definitive Guide is the ideal book to help you manage development projects for software, web applications, and enterprise applications. And it comes straight from the source.

Bike Repair and Maintenance for Dummies


Dennis Bailey - 2009
    If you have a little or a lot of experience in using tools on your bike, this book can show you how to keep your bike in top working order, from tires to handlebars, without all the technical jargon.If biking is already a part of your life - or you'd like it to be - this book can help you tackle your own bike maintenance and repair, so you don't have to take it to the shop for routine tune-ups or call for help if you break down in the middle of nowhere. Of course, sometimes you'll need to seek expert help, so the book covers when to attack a problem yourself and when to call in the pros for backup.And although this book is written in easy-to-understand language without a lot of biking jargon, Bike Repair & Maintenance For Dummies is still a comprehensive guide. Seasoned bike riders looking for additional tips and tricks to keep their bikes in top condition won't be disappointed.This book will help you repair - and, if necessary, replace - the parts on your bicycle. You'll discover how to make basic bike repairs, such as:Removing a wheel, tire, or tubePatching a tube or fixing a tireWorking on hubs and spokesInstalling new brakes and pads or addressing other brake issuesAdjusting your saddleUsing suspension seat postsDealing with common chain problemsInspecting, cleaning, and lubricating cassettes and freewheelsAfter you nail the basics, you can dive into advanced repairs and maintenance, including:Knowing how a frame is built and inspecting one for problemsAdjusting and maintaining a bike's suspensionRemoving, installing, and adjusting the rear and front derailleursRemoving and installing shiftersTaping your handlebarsAdjusting and overhauling your headsetGet your copy of Bike Repair & Maintenance For Dummies to learn all of that, plus tips on staying safe, ensuring your bike is always a good fit for you, and improving your bike's performance.

Learning Java


Patrick Niemeyer - 1996
    With Java 5.0, you'll not only find substantial changes in the platform, but to the language itself-something that developers of Java took five years to complete. The main goal of Java 5.0 is to make it easier for you to develop safe, powerful code, but none of these improvements makes Java any easier to learn, even if you've programmed with Java for years. And that means our bestselling hands-on tutorial takes on even greater significance."Learning Java" is the most widely sought introduction to the programming language that's changed the way we think about computing. Our updated third edition takes an objective, no-nonsense approach to the new features in Java 5.0, some of which are drastically different from the way things were done in any previous versions. The most essential change is the addition of "generics," a feature that allows developers to write, test, and deploy code once, and then reuse the code again and again for different data types. The beauty of generics is that more problems will be caught during development, and "Learning Java" will show you exactly how it's done.Java 5.0 also adds more than 1,000 new classes to the Java library. That means 1,000 new things you can do without having to program it in yourself. That's a huge change. With our book's practical examples, you'll come up to speed quickly on this and other new features such as loops and threads. The new edition also includes an introduction to Eclipse, the open source IDE that is growing in popularity. "Learning Java," 3rd Edition addresses all of the important uses of Java, such as web applications, servlets, and XML that are increasingly driving enterprise applications.

Introducing Regular Expressions


Michael J. Fitzgerald - 2012
    You’ll learn the fundamentals step-by-step with the help of numerous examples, discovering first-hand how to match, extract, and transform text by matching specific words, characters, and patterns.Regular expressions are an essential part of a programmer’s toolkit, available in various Unix utlilities as well as programming languages such as Perl, Java, JavaScript, and C#. When you’ve finished this book, you’ll be familiar with the most commonly used syntax in regular expressions, and you’ll understand how using them will save you considerable time.Discover what regular expressions are and how they workLearn many of the differences between regular expressions used with command-line tools and in various programming languagesApply simple methods for finding patterns in text, including digits, letters, Unicode characters, and string literalsLearn how to use zero-width assertions and lookaroundsWork with groups, backreferences, character classes, and quantifiersUse regular expressions to mark up plain text with HTML5

Programming Scala: Scalability = Functional Programming + Objects


Dean Wampler - 2009
    With this book, you'll discover why Scala is ideal for highly scalable, component-based applications that support concurrency and distribution.Programming Scala clearly explains the advantages of Scala as a JVM language. You'll learn how to leverage the wealth of Java class libraries to meet the practical needs of enterprise and Internet projects more easily. Packed with code examples, this book provides useful information on Scala's command-line tools, third-party tools, libraries, and available language-aware plugins for editors and IDEs.Learn how Scala's succinct and flexible code helps you program fasterDiscover the notable improvements Scala offers over Java's object modelGet a concise overview of functional programming, and learn how Scala's support for it offers a better approach to concurrencyKnow how to use mixin composition with traits, pattern matching, concurrency with Actors, and other essential featuresTake advantage of Scala's built-in support for XMLLearn how to develop domain-specific languagesUnderstand the basics for designing test-driven Scala applications

Raspberry Pi User Guide


Eben Upton - 2012
    And now you can learn how to use this amazing computer from its co-creator, Eben Upton, in Raspberry Pi User Guide. Cowritten with Gareth Halfacree, this guide gets you up and running on Raspberry Pi, whether you're an educator, hacker, hobbyist, or kid. Learn how to connect your Pi to other hardware, install software, write basic programs, and set it up to run robots, multimedia centers, and more.Gets you up and running on Raspberry Pi, a high-tech computer the size of a credit card Helps educators teach students how to program Covers connecting Raspberry Pi to other hardware, such as monitors and keyboards, how to install software, and how to configure Raspberry Pi Shows you how to set up Raspberry Pi as a simple productivity computer, write basic programs in Python, connect to servos and sensors, and drive a robot or multimedia center Adults, kids, and devoted hardware hackers, now that you've got a Raspberry Pi, get the very most out of it with Raspberry Pi User Guide.