Advanced Analytics with Spark


Sandy Ryza - 2015
    

Streaming Systems


Tyler Akidau - 2018
    As more and more businesses seek to tame the massive unbounded data sets that pervade our world, streaming systems have finally reached a level of maturity sufficient for mainstream adoption. With this practical guide, data engineers, data scientists, and developers will learn how to work with streaming data in a conceptual and platform-agnostic way.Expanded from Tyler Akidau's popular blog posts Streaming 101 and Streaming 102, this book takes you from an introductory level to a nuanced understanding of the what, where, when, and how of processing real-time data streams. You'll also dive deep into watermarks and exactly-once processing with co-authors Slava Chernyak and Reuven Lax.You'll explore:How streaming and batch data processing patterns compareThe core principles and concepts behind robust out-of-order data processingHow watermarks track progress and completeness in infinite datasetsHow exactly-once data processing techniques ensure correctnessHow the concepts of streams and tables form the foundations of both batch and streaming data processingThe practical motivations behind a powerful persistent state mechanism, driven by a real-world exampleHow time-varying relations provide a link between stream processing and the world of SQL and relational algebra

Data Science at the Command Line: Facing the Future with Time-Tested Tools


Jeroen Janssens - 2014
    You'll learn how to combine small, yet powerful, command-line tools to quickly obtain, scrub, explore, and model your data.To get you started--whether you're on Windows, OS X, or Linux--author Jeroen Janssens introduces the Data Science Toolbox, an easy-to-install virtual environment packed with over 80 command-line tools.Discover why the command line is an agile, scalable, and extensible technology. Even if you're already comfortable processing data with, say, Python or R, you'll greatly improve your data science workflow by also leveraging the power of the command line.Obtain data from websites, APIs, databases, and spreadsheetsPerform scrub operations on plain text, CSV, HTML/XML, and JSONExplore data, compute descriptive statistics, and create visualizationsManage your data science workflow using DrakeCreate reusable tools from one-liners and existing Python or R codeParallelize and distribute data-intensive pipelines using GNU ParallelModel data with dimensionality reduction, clustering, regression, and classification algorithms

Cassandra: The Definitive Guide


Eben Hewitt - 2010
    Cassandra: The Definitive Guide provides the technical details and practical examples you need to assess this database management system and put it to work in a production environment.Author Eben Hewitt demonstrates the advantages of Cassandra's nonrelational design, and pays special attention to data modeling. If you're a developer, DBA, application architect, or manager looking to solve a database scaling issue or future-proof your application, this guide shows you how to harness Cassandra's speed and flexibility.Understand the tenets of Cassandra's column-oriented structureLearn how to write, update, and read Cassandra dataDiscover how to add or remove nodes from the cluster as your application requiresExamine a working application that translates from a relational model to Cassandra's data modelUse examples for writing clients in Java, Python, and C#Use the JMX interface to monitor a cluster's usage, memory patterns, and moreTune memory settings, data storage, and caching for better performance

High Performance Spark: Best Practices for Scaling and Optimizing Apache Spark


Holden Karau - 2017
    But if you haven't seen the performance improvements you expected, or still don't feel confident enough to use Spark in production, this practical book is for you. Authors Holden Karau and Rachel Warren demonstrate performance optimizations to help your Spark queries run faster and handle larger data sizes, while using fewer resources.Ideal for software engineers, data engineers, developers, and system administrators working with large-scale data applications, this book describes techniques that can reduce data infrastructure costs and developer hours. Not only will you gain a more comprehensive understanding of Spark, you'll also learn how to make it sing.With this book, you'll explore:How Spark SQL's new interfaces improve performance over SQL's RDD data structureThe choice between data joins in Core Spark and Spark SQLTechniques for getting the most out of standard RDD transformationsHow to work around performance issues in Spark's key/value pair paradigmWriting high-performance Spark code without Scala or the JVMHow to test for functionality and performance when applying suggested improvementsUsing Spark MLlib and Spark ML machine learning librariesSpark's Streaming components and external community packages

R in a Nutshell: A Desktop Quick Reference


Joseph Adler - 2009
    R in a Nutshell provides a quick and practical way to learn this increasingly popular open source language and environment. You'll not only learn how to program in R, but also how to find the right user-contributed R packages for statistical modeling, visualization, and bioinformatics.The author introduces you to the R environment, including the R graphical user interface and console, and takes you through the fundamentals of the object-oriented R language. Then, through a variety of practical examples from medicine, business, and sports, you'll learn how you can use this remarkable tool to solve your own data analysis problems.Understand the basics of the language, including the nature of R objectsLearn how to write R functions and build your own packagesWork with data through visualization, statistical analysis, and other methodsExplore the wealth of packages contributed by the R communityBecome familiar with the lattice graphics package for high-level data visualizationLearn about bioinformatics packages provided by Bioconductor"I am excited about this book. R in a Nutshell is a great introduction to R, as well as a comprehensive reference for using R in data analytics and visualization. Adler provides 'real world' examples, practical advice, and scripts, making it accessible to anyone working with data, not just professional statisticians."

Java 8 Lambdas: Pragmatic Functional Programming


Richard Warburton - 2014
    Starting with basic examples, this book is focused solely on Java 8 language changes and related API changes, so you don’t need to buy and read a 900 page book in order to brush up. Lambdas make a programmer's job easier, and this book will teach you how. Coverage includes introductory syntax for lambda expressions, method references that allow you to reuse existing named methods from your codebase, and the collection library in Java 8.

Arduino Cookbook: Recipes to Begin, Expand, and Enhance Your Projects


Michael Margolis - 2010
    You'll find more than 200 tips and techniques for building a variety of objects and prototypes such as IoT solutions, environmental monitors, location and position-aware systems, and products that can respond to touch, sound, heat, and light.Updated for the Arduino 1.8 release, the recipes in this third edition include practical examples and guidance to help you begin, expand, and enhance your projects right away--whether you're an engineer, designer, artist, student, or hobbyist.Get up to speed on the Arduino board and essential software concepts quicklyLearn basic techniques for reading digital and analog signalsUse Arduino with a variety of popular input devices and sensorsDrive visual displays, generate sound, and control several types of motorsConnect Arduino to wired and wireless networksLearn techniques for handling time delays and time measurementApply advanced coding and memory-handling techniques

Learning React Native: Building Native Mobile Apps with JavaScript


Bonnie Eisenman - 2016
    With this hands-on guide, you'll learn how to build applications that target iOS, Android, and other mobile platforms instead of browsers. You'll also discover how to access platform features such as the camera, user location, and local storage.With code examples and step-by-step instructions, author Bonnie Eisenman shows web developers and frontend engineers how to build and style interfaces, use mobile components, and debug and deploy apps. Along the way, you'll build several increasingly sophisticated sample apps with React Native before putting everything together at the end.Learn how React Native provides an interface to native UI componentsExamine how the framework uses native components analogous to HTML elementsCreate and style your own React Native components and applicationsInstall modules for APIs and features not supported by the frameworkGet tools for debugging your code, and for handling issues outside of JavaScriptPut it all together with the Zebreto effective-memorization flashcard appDeploy apps to the iOS App Store and Google's Play Store

Java Generics and Collections: Speed Up the Java Development Process


Maurice Naftalin - 2006
    Generics and the greatly expanded collection libraries have tremendously increased the power of Java 5 and Java 6. But they have also confused many developers who haven't known how to take advantage of these new features.Java Generics and Collections covers everything from the most basic uses of generics to the strangest corner cases. It teaches you everything you need to know about the collections libraries, so you'll always know which collection is appropriate for any given task, and how to use it.Topics covered include:• Fundamentals of generics: type parameters and generic methods• Other new features: boxing and unboxing, foreach loops, varargs• Subtyping and wildcards• Evolution not revolution: generic libraries with legacy clients and generic clients with legacy libraries• Generics and reflection• Design patterns for generics• Sets, Queues, Lists, Maps, and their implementations• Concurrent programming and thread safety with collections• Performance implications of different collectionsGenerics and the new collection libraries they inspired take Java to a new level. If you want to take your software development practice to a new level, this book is essential reading.Philip Wadler is Professor of Theoretical Computer Science at the University of Edinburgh, where his research focuses on the design of programming languages. He is a co-designer of GJ, work that became the basis for generics in Sun's Java 5.0.Maurice Naftalin is Technical Director at Morningside Light Ltd., a software consultancy in the United Kingdom. He has most recently served as an architect and mentor at NSB Retail Systems plc, and as the leader of the client development team of a major UK government social service system."A brilliant exposition of generics. By far the best book on the topic, it provides a crystal clear tutorial that starts with the basics and ends leaving the reader with a deep understanding of both the use and design of generics." Gilad Bracha, Java Generics Lead, Sun Microsystems

Learning Java


Patrick Niemeyer - 1996
    With Java 5.0, you'll not only find substantial changes in the platform, but to the language itself-something that developers of Java took five years to complete. The main goal of Java 5.0 is to make it easier for you to develop safe, powerful code, but none of these improvements makes Java any easier to learn, even if you've programmed with Java for years. And that means our bestselling hands-on tutorial takes on even greater significance."Learning Java" is the most widely sought introduction to the programming language that's changed the way we think about computing. Our updated third edition takes an objective, no-nonsense approach to the new features in Java 5.0, some of which are drastically different from the way things were done in any previous versions. The most essential change is the addition of "generics," a feature that allows developers to write, test, and deploy code once, and then reuse the code again and again for different data types. The beauty of generics is that more problems will be caught during development, and "Learning Java" will show you exactly how it's done.Java 5.0 also adds more than 1,000 new classes to the Java library. That means 1,000 new things you can do without having to program it in yourself. That's a huge change. With our book's practical examples, you'll come up to speed quickly on this and other new features such as loops and threads. The new edition also includes an introduction to Eclipse, the open source IDE that is growing in popularity. "Learning Java," 3rd Edition addresses all of the important uses of Java, such as web applications, servlets, and XML that are increasingly driving enterprise applications.

R Cookbook: Proven Recipes for Data Analysis, Statistics, and Graphics


Paul Teetor - 2011
    The R language provides everything you need to do statistical work, but its structure can be difficult to master. This collection of concise, task-oriented recipes makes you productive with R immediately, with solutions ranging from basic tasks to input and output, general statistics, graphics, and linear regression.Each recipe addresses a specific problem, with a discussion that explains the solution and offers insight into how it works. If you're a beginner, R Cookbook will help get you started. If you're an experienced data programmer, it will jog your memory and expand your horizons. You'll get the job done faster and learn more about R in the process.Create vectors, handle variables, and perform other basic functionsInput and output dataTackle data structures such as matrices, lists, factors, and data framesWork with probability, probability distributions, and random variablesCalculate statistics and confidence intervals, and perform statistical testsCreate a variety of graphic displaysBuild statistical models with linear regressions and analysis of variance (ANOVA)Explore advanced statistical techniques, such as finding clusters in your dataWonderfully readable, R Cookbook serves not only as a solutions manual of sorts, but as a truly enjoyable way to explore the R language--one practical example at a time.--Jeffrey Ryan, software consultant and R package author

Windows PowerShell Cookbook: The Complete Guide to Scripting Microsoft's Command Shell


Lee Holmes - 2007
    Intermediate to advanced system administrators will find more than 100 tried-and-tested scripts they can copy and use immediately.Updated for PowerShell 3.0, this comprehensive cookbook includes hands-on recipes for common tasks and administrative jobs that you can apply whether you’re on the client or server version of Windows. You also get quick references to technologies used in conjunction with PowerShell, including format specifiers and frequently referenced registry keys to selected .NET, COM, and WMI classes.Learn how to use PowerShell on Windows 8 and Windows Server 2012Tour PowerShell’s core features, including the command model, object-based pipeline, and ubiquitous scriptingMaster fundamentals such as the interactive shell, pipeline, and object conceptsPerform common tasks that involve working with files, Internet-connected scripts, user interaction, and moreSolve tasks in systems and enterprise management, such as working with Active Directory and the filesystem

PHP Cookbook


David Sklar - 2002
    With our Cookbook's unique format, you can learn how to build dynamic web applications that work on any web browser. This revised new edition makes it easy to find specific solutions for programming challenges.PHP Cookbook has a wealth of solutions for problems that you'll face regularly. With topics that range from beginner questions to advanced web programming techniques, this guide contains practical examples -- or "recipes" -- for anyone who uses this scripting language to generate dynamic web content. Updated for PHP 5, this book provides solutions that explain how to use the new language features in detail, including the vastly improved object-oriented capabilities and the new PDO data access extension. New sections on classes and objects are included, along with new material on processing XML, building web services with PHP, and working with SOAP/REST architectures. With each recipe, the authors include a discussion that explains the logic and concepts underlying the solution.

Head First Data Analysis: A Learner's Guide to Big Numbers, Statistics, and Good Decisions


Michael G. Milton - 2009
    If your job requires you to manage and analyze all kinds of data, turn to Head First Data Analysis, where you'll quickly learn how to collect and organize data, sort the distractions from the truth, find meaningful patterns, draw conclusions, predict the future, and present your findings to others. Whether you're a product developer researching the market viability of a new product or service, a marketing manager gauging or predicting the effectiveness of a campaign, a salesperson who needs data to support product presentations, or a lone entrepreneur responsible for all of these data-intensive functions and more, the unique approach in Head First Data Analysis is by far the most efficient way to learn what you need to know to convert raw data into a vital business tool. You'll learn how to:Determine which data sources to use for collecting information Assess data quality and distinguish signal from noise Build basic data models to illuminate patterns, and assimilate new information into the models Cope with ambiguous information Design experiments to test hypotheses and draw conclusions Use segmentation to organize your data within discrete market groups Visualize data distributions to reveal new relationships and persuade others Predict the future with sampling and probability models Clean your data to make it useful Communicate the results of your analysis to your audience Using the latest research in cognitive science and learning theory to craft a multi-sensory learning experience, Head First Data Analysis uses a visually rich format designed for the way your brain works, not a text-heavy approach that puts you to sleep.