Book picks similar to
High Performance Spark: Best Practices for Scaling and Optimizing Apache Spark by Holden Karau
big-data
data-science
technical
computer-science
Doing Data Science
Cathy O'Neil - 2013
But how can you get started working in a wide-ranging, interdisciplinary field that’s so clouded in hype? This insightful book, based on Columbia University’s Introduction to Data Science class, tells you what you need to know.
In many of these chapter-long lectures, data scientists from companies such as Google, Microsoft, and eBay share new algorithms, methods, and models by presenting case studies and the code they use. If you’re familiar with linear algebra, probability, and statistics, and have programming experience, this book is an ideal introduction to data science.
Topics include:
Statistical inference, exploratory data analysis, and the data science process
Algorithms
Spam filters, Naive Bayes, and data wrangling
Logistic regression
Financial modeling
Recommendation engines and causality
Data visualization
Social networks and data journalism
Data engineering, MapReduce, Pregel, and Hadoop
Doing Data Science is a collaboration between course instructor Rachel Schutt, Senior VP of Data Science at News Corp, and data science consultant Cathy O’Neil, a senior data scientist at Johnson Research Labs, who attended and blogged about the course.
Scala Cookbook
Alvin Alexander - 2013
With more than 250 ready-to-use recipes and 700 code examples, this comprehensive cookbook covers the most common problems you’ll encounter when using the Scala language, libraries, and tools. It’s ideal not only for experienced Scala developers, but also for programmers learning to use this JVM language.
Author Alvin Alexander (creator of DevDaily.com) provides solutions based on his experience using Scala for highly scalable, component-based applications that support concurrency and distribution. Packed with real-world scenarios, this book provides recipes for:
Strings, numeric types, and control structures
Classes, methods, objects, traits, and packaging
Functional programming in a variety of situations
Collections covering Scala's wealth of classes and methods
Concurrency, using the Akka Actors library
Using the Scala REPL and the Simple Build Tool (SBT)
Web services on both the client and server sides
Interacting with SQL and NoSQL databases
Best practices in Scala development
Windows PowerShell Cookbook: The Complete Guide to Scripting Microsoft's Command Shell
Lee Holmes - 2007
Intermediate to advanced system administrators will find more than 100 tried-and-tested scripts they can copy and use immediately.
Updated for PowerShell 3.0, this comprehensive cookbook includes hands-on recipes for common tasks and administrative jobs that you can apply whether you’re on the client or server version of Windows. You also get quick references to technologies used in conjunction with PowerShell, including format specifiers and frequently referenced registry keys to selected .NET, COM, and WMI classes.
Learn how to use PowerShell on Windows 8 and Windows Server 2012
Tour PowerShell’s core features, including the command model, object-based pipeline, and ubiquitous scripting
Master fundamentals such as the interactive shell, pipeline, and object concepts
Perform common tasks that involve working with files, Internet-connected scripts, user interaction, and more
Solve tasks in systems and enterprise management, such as working with Active Directory and the filesystem
Learning Python
Mark Lutz - 2003
Python is considered easy to learn, but there's no quicker way to mastery of the language than learning from an expert teacher. This edition of "Learning Python" puts you in the hands of two expert teachers, Mark Lutz and David Ascher, whose friendly, well-structured prose has guided many a programmer to proficiency with the language. "Learning Python," Second Edition, offers programmers a comprehensive learning tool for Python and object-oriented programming. Thoroughly updated for the numerous language and class presentation changes that have taken place since the release of the first edition in 1999, this guide introduces the basic elements of the latest release of Python 2.3 and covers new features, such as list comprehensions, nested scopes, and iterators/generators. Beyond language features, this edition of "Learning Python" also includes new context for less-experienced programmers, including fresh overviews of object-oriented programming and dynamic typing, new discussions of program launch and configuration options, new coverage of documentation sources, and more. There are also new use cases throughout to make the application of language features more concrete. The first part of "Learning Python" gives programmers all the information they'll need to understand and construct programs in the Python language, including types, operators, statements, classes, functions, modules and exceptions. The authors then present more advanced material, showing how Python performs common tasks by offering real applications and the libraries available for those applications. Each chapter ends with a series of exercises that will test your Python skills and measure your understanding.
"Learning Python," Second Edition is a self-paced book that allows readers to focus on the core Python language in depth. As you work through the book, you'll gain a deep and complete understanding of the Python language that will help you to understand the larger application-level examples that you'll encounter on your own. If you're interested in learning Python--and want to do so quickly and efficiently--then "Learning Python," Second Edition is your best choice.
Streaming Systems
Tyler Akidau - 2018
As more and more businesses seek to tame the massive unbounded data sets that pervade our world, streaming systems have finally reached a level of maturity sufficient for mainstream adoption. With this practical guide, data engineers, data scientists, and developers will learn how to work with streaming data in a conceptual and platform-agnostic way.
Expanded from Tyler Akidau's popular blog posts Streaming 101 and Streaming 102, this book takes you from an introductory level to a nuanced understanding of the what, where, when, and how of processing real-time data streams. You'll also dive deep into watermarks and exactly-once processing with co-authors Slava Chernyak and Reuven Lax.
You'll explore:
How streaming and batch data processing patterns compare
The core principles and concepts behind robust out-of-order data processing
How watermarks track progress and completeness in infinite datasets
How exactly-once data processing techniques ensure correctness
How the concepts of streams and tables form the foundations of both batch and streaming data processing
The practical motivations behind a powerful persistent state mechanism, driven by a real-world example
How time-varying relations provide a link between stream processing and the world of SQL and relational algebra
Cloud Native Infrastructure: Patterns for Scalable Infrastructure and Applications in a Dynamic Environment
Justin Garrison - 2017
This practical guide shows you how to design and maintain infrastructure capable of managing the full lifecycle of these implementations.
Engineers Justin Garrison (Walt Disney Animation Studios) and Kris Nova (Deis, Inc.) reveal hard-earned lessons on architecting infrastructure for massive scale and best-in-class monitoring, alerting, and troubleshooting. The authors focus on Cloud Native Computing Foundation projects and explain where each is crucial to managing modern applications.
Understand the fundamentals of cloud native application design, and how it differs from traditional application design
Learn how cloud native infrastructure is different from traditional infrastructure
Manage application lifecycles running on cloud native infrastructure, using Kubernetes for application deployment, scaling, and upgrades
Monitor cloud native infrastructure and applications, using Fluentd for logging and Prometheus and Grafana for visualizing data
Debug running applications and learn how to trace a distributed application and dig deep into a running system with OpenTracing
97 Things Every Programmer Should Know: Collective Wisdom from the Experts
Kevlin Henney - 2010
With the 97 short and extremely useful tips for programmers in this book, you'll expand your skills by adopting new approaches to old problems, learning appropriate best practices, and honing your craft through sound advice.
With contributions from some of the most experienced and respected practitioners in the industry--including Michael Feathers, Pete Goodliffe, Diomidis Spinellis, Cay Horstmann, Verity Stob, and many more--this book contains practical knowledge and principles that you can apply to all kinds of projects.
A few of the 97 things you should know:
"Code in the Language of the Domain" by Dan North
"Write Tests for People" by Gerard Meszaros
"Convenience Is Not an -ility" by Gregor Hohpe
"Know Your IDE" by Heinz Kabutz
"A Message to the Future" by Linda Rising
"The Boy Scout Rule" by Robert C. Martin (Uncle Bob)
"Beware the Share" by Udi Dahan
Jenkins: The Definitive Guide
John Ferguson Smart - 2011
This complete guide shows you how to automate your build, integration, release, and deployment processes with Jenkins—and demonstrates how CI can save you time, money, and many headaches.
Ideal for developers, software architects, and project managers, Jenkins: The Definitive Guide is both a CI tutorial and a comprehensive Jenkins reference. Through its wealth of best practices and real-world tips, you'll discover how easy it is to set up a CI service with Jenkins.
Learn how to install, configure, and secure your Jenkins server
Organize and monitor general-purpose build jobs
Integrate automated tests to verify builds, and set up code quality reporting
Establish effective team notification strategies and techniques
Configure build pipelines, parameterized jobs, matrix builds, and other advanced jobs
Manage a farm of Jenkins servers to run distributed builds
Implement automated deployment and continuous delivery
Lex & Yacc
John R. Levine - 1990
These tools help programmers build compilers and interpreters, but they also have a wider range of applications.
The second edition contains completely revised tutorial sections for novice users and reference sections for advanced users. This edition is twice the size of the first and has an expanded index.
The following material has been added:
Each utility is explained in a chapter that covers basic usage and simple, stand-alone applications
How to implement a full SQL grammar, with full sample code
Major MS-DOS and Unix versions of lex and yacc are explored in depth, including AT&T lex and yacc, Berkeley yacc, Berkeley/GNU Flex, GNU Bison, MKS lex and yacc, and Abraxas PCYACC
Hadoop Explained
Aravind Shenoy - 2014
Hadoop allowed small and medium-sized companies to store huge amounts of data on cheap commodity servers in racks. The introduction of Big Data has allowed businesses to make decisions based on quantifiable analysis. Hadoop is now implemented in major organizations such as Amazon, IBM, Cloudera, and Dell, to name a few. This book introduces you to Hadoop and to concepts such as ‘MapReduce’, ‘Rack Awareness’, ‘YARN’ and ‘HDFS Federation’, which will help you get acquainted with the technology.
Write Great Code: Volume 1: Understanding the Machine
Randall Hyde - 2004
A dirty little secret assembly language programmers rarely admit to, however, is that what you really need to learn is machine organization, not assembly language programming. Write Great Code Vol I, the first in a series from assembly language expert Randall Hyde, dives right into machine organization without the extra overhead of learning assembly language programming at the same time. And since Write Great Code Vol I concentrates on the machine organization, not assembly language, the reader will learn in greater depth those subjects that are language-independent and of concern to a high level language programmer. Write Great Code Vol I will help programmers make wiser choices with respect to programming statements and data types when writing software, no matter which language they use.
Data Analysis with Open Source Tools: A Hands-On Guide for Programmers and Data Scientists
Philipp K. Janert - 2010
With this insightful book, intermediate to experienced programmers interested in data analysis will learn techniques for working with data in a business environment. You'll learn how to look at data to discover what it contains, how to capture those ideas in conceptual models, and then feed your understanding back into the organization through business plans, metrics dashboards, and other applications.
Along the way, you'll experiment with concepts through hands-on workshops at the end of each chapter. Above all, you'll learn how to think about the results you want to achieve -- rather than rely on tools to think for you.
Use graphics to describe data with one, two, or dozens of variables
Develop conceptual models using back-of-the-envelope calculations, as well as scaling and probability arguments
Mine data with computationally intensive methods such as simulation and clustering
Make your conclusions understandable through reports, dashboards, and other metrics programs
Understand financial calculations, including the time-value of money
Use dimensionality reduction techniques or predictive analytics to conquer challenging data analysis situations
Become familiar with different open source programming environments for data analysis
"Finally, a concise reference for understanding how to conquer piles of data." --Austin King, Senior Web Developer, Mozilla
"An indispensable text for aspiring data scientists." --Michael E. Driscoll, CEO/Founder, Dataspora
Learning React Native: Building Native Mobile Apps with JavaScript
Bonnie Eisenman - 2016
With this hands-on guide, you'll learn how to build applications that target iOS, Android, and other mobile platforms instead of browsers. You'll also discover how to access platform features such as the camera, user location, and local storage.
With code examples and step-by-step instructions, author Bonnie Eisenman shows web developers and frontend engineers how to build and style interfaces, use mobile components, and debug and deploy apps. Along the way, you'll build several increasingly sophisticated sample apps with React Native before putting everything together at the end.
Learn how React Native provides an interface to native UI components
Examine how the framework uses native components analogous to HTML elements
Create and style your own React Native components and applications
Install modules for APIs and features not supported by the framework
Get tools for debugging your code, and for handling issues outside of JavaScript
Put it all together with the Zebreto effective-memorization flashcard app
Deploy apps to the iOS App Store and Google's Play Store
High Performance MySQL: Optimization, Backups, and Replication
Baron Schwartz - 2008
This guide also teaches you safe and practical ways to scale applications through replication, load balancing, high availability, and failover.
Updated to reflect recent advances in MySQL and InnoDB performance, features, and tools, this third edition not only offers specific examples of how MySQL works, it also teaches you why this system works as it does, with illustrative stories and case studies that demonstrate MySQL’s principles in action. With this book, you’ll learn how to think in MySQL.
Learn the effects of new features in MySQL 5.5, including stored procedures, partitioned databases, triggers, and views
Implement improvements in replication, high availability, and clustering
Achieve high performance when running MySQL in the cloud
Optimize advanced querying features, such as full-text searches
Take advantage of modern multi-core CPUs and solid-state disks
Explore backup and recovery strategies—including new tools for hot online backups
RESTful Java with JAX-RS
Bill Burke - 2009
With this hands-on reference, you'll focus on implementation rather than theory, and discover why the RESTful method is far better than technologies like CORBA and SOAP. It's easy to get started with services based on the REST architecture. RESTful Java with JAX-RS includes a technical guide that explains REST and JAX-RS, how they work, and when to use them. With the RESTEasy workbook that follows, you get step-by-step instructions for installing, configuring, and running several working JAX-RS examples using the JBoss RESTEasy implementation of JAX-RS.
Work on the design of a distributed RESTful interface, and develop it in Java as a JAX-RS service
Dispatch HTTP requests in JAX-RS, and learn how to extract information from them
Deploy your web services within Java Enterprise Edition using the Application class, Default Component Model, EJB Integration, Spring Integration, and JPA
Discover several options for securing your web services
Learn how to implement RESTful design patterns using JAX-RS
Write RESTful clients in Java using libraries and frameworks such as java.net.URL, Apache HTTP Client, and RESTEasy Proxy