Book picks similar to
Hadoop in Action by Chuck Lam


programming
data-science
big-data
hadoop

Spring in Action


Craig Walls - 2007
    

Algorithm Design


Jon Kleinberg - 2005
    The book teaches a range of design and analysis techniques for problems that arise in computing applications. The text encourages an understanding of the algorithm design process and an appreciation of the role of algorithms in the broader field of computer science.

A Practical Guide to Linux Commands, Editors, and Shell Programming


Mark G. Sobell - 2005
    The book is a complete revision of the commands section of Sobell's Practical Guide to Linux - a proven best-seller. The book is Linux distribution and release agnostic. It will appeal to users of ALL Linux distributions. Superior examples make this book the the best option on the market! System administrators, software developers, quality assurance engineers and others working on a Linux system need to work from the command line in order to be effective. Linux is famous for its huge number of command line utility programs, and the programs themselves are famous for their large numbers of options, switches, and configuration files. But the truth is that users will only use a limited (but still significant) number of these utilities on a recurring basis, and then only with a subset of the most important and useful options, switches and configuration files. This book cuts through all the noise and shows them which utilities are most useful, and which options most important. And it contains examples, lot's and lot's of examples. programmability. Utilities are designed, by default, to work wtih other utilities within shell programs as a way of automating system tasks. This book contains a superb introduction to Linux shell programming. And since shell programmers need to write their programs in text editors, this book covers the two most popular ones: vi and emacs.

The Art of Invisibility: The World's Most Famous Hacker Teaches You How to Be Safe in the Age of Big Brother and Big Data


Kevin D. Mitnick - 2017
    Consumer's identities are being stolen, and a person's every step is being tracked and stored. What once might have been dismissed as paranoia is now a hard truth, and privacy is a luxury few can afford or understand.In this explosive yet practical book, Kevin Mitnick illustrates what is happening without your knowledge--and he teaches you "the art of invisibility." Mitnick is the world's most famous--and formerly the Most Wanted--computer hacker. He has hacked into some of the country's most powerful and seemingly impenetrable agencies and companies, and at one point he was on a three-year run from the FBI. Now, though, Mitnick is reformed and is widely regarded as the expert on the subject of computer security. He knows exactly how vulnerabilities can be exploited and just what to do to prevent that from happening. In THE ART OF INVISIBILITY Mitnick provides both online and real life tactics and inexpensive methods to protect you and your family, in easy step-by-step instructions. He even talks about more advanced "elite" techniques, which, if used properly, can maximize your privacy. Invisibility isn't just for superheroes--privacy is a power you deserve and need in this modern age.

Linux Kernel Development


Robert Love - 2003
    The book details the major subsystems and features of the Linux kernel, including its design, implementation, and interfaces. It covers the Linux kernel with both a practical and theoretical eye, which should appeal to readers with a variety of interests and needs. The author, a core kernel developer, shares valuable knowledge and experience on the 2.6 Linux kernel. Specific topics covered include process management, scheduling, time management and timers, the system call interface, memory addressing, memory management, the page cache, the VFS, kernel synchronization, portability concerns, and debugging techniques. This book covers the most interesting features of the Linux 2.6 kernel, including the CFS scheduler, preemptive kernel, block I/O layer, and I/O schedulers. The third edition of Linux Kernel Development includes new and updated material throughout the book:An all-new chapter on kernel data structuresDetails on interrupt handlers and bottom halvesExtended coverage of virtual memory and memory allocationTips on debugging the Linux kernelIn-depth coverage of kernel synchronization and lockingUseful insight into submitting kernel patches and working with the Linux kernel community

Programming in Lua


Roberto Ierusalimschy - 2001
    Currently, Lua is being used in areas ranging from embedded systems to Web development and is widely spread in the game industry, where knowledge of Lua is an indisputable asset. "Programming in Lua" is the official book about the language, giving a solid base for any programmer who wants to use Lua. Authored by Roberto Ierusalimschy, the chief architect of the language, it covers all aspects of Lua 5---from the basics to its API with C---explaining how to make good use of its features and giving numerous code examples. "Programming in Lua" is targeted at people with some programming background, but does not assume any prior knowledge about Lua or other scripting languages. This Second Edition updates the text to Lua 5.1 and brings substantial new material, including numerous new examples, a detailed explanation of the new module system, and two new chapters centered on multiple states and garbage collection.

ZooKeeper: Distributed process coordination


Flavio Junqueira - 2013
    This practical guide shows how Apache ZooKeeper helps you manage distributed systems, so you can focus mainly on application logic. Even with ZooKeeper, implementing coordination tasks is not trivial, but this book provides good practices to give you a head start, and points out caveats that developers and administrators alike need to watch for along the way.In three separate sections, ZooKeeper contributors Flavio Junqueira and Benjamin Reed introduce the principles of distributed systems, provide ZooKeeper programming techniques, and include the information you need to administer this service.Learn how ZooKeeper solves common coordination tasksExplore the ZooKeeper API’s Java and C implementations and how they differUse methods to track and react to ZooKeeper state changesHandle failures of the network, application processes, and ZooKeeper itselfLearn about ZooKeeper’s trickier aspects dealing with concurrency, ordering, and configurationUse the Curator high-level interface for connection managementBecome familiar with ZooKeeper internals and administration tools

The C++ Programming Language


Bjarne Stroustrup - 1986
    For this special hardcover edition, two new appendixes on locales and standard library exception safety (also available at www.research.att.com/ bs/) have been added. The result is complete, authoritative coverage of the C++ language, its standard library, and key design techniques. Based on the ANSI/ISO C++ standard, The C++ Programming Language provides current and comprehensive coverage of all C++ language features and standard library components. For example:abstract classes as interfaces class hierarchies for object-oriented programming templates as the basis for type-safe generic software exceptions for regular error handling namespaces for modularity in large-scale software run-time type identification for loosely coupled systems the C subset of C++ for C compatibility and system-level work standard containers and algorithms standard strings, I/O streams, and numerics C compatibility, internationalization, and exception safety Bjarne Stroustrup makes C++ even more accessible to those new to the language, while adding advanced information and techniques that even expert C++ programmers will find invaluable.

Applied Cryptography: Protocols, Algorithms, and Source Code in C


Bruce Schneier - 1993
    … The book the National Security Agency wanted never to be published." –Wired Magazine "…monumental… fascinating… comprehensive… the definitive work on cryptography for computer programmers…" –Dr. Dobb's Journal"…easily ranks as one of the most authoritative in its field." —PC Magazine"…the bible of code hackers." –The Millennium Whole Earth CatalogThis new edition of the cryptography classic provides you with a comprehensive survey of modern cryptography. The book details how programmers and electronic communications professionals can use cryptography—the technique of enciphering and deciphering messages-to maintain the privacy of computer data. It describes dozens of cryptography algorithms, gives practical advice on how to implement them into cryptographic software, and shows how they can be used to solve security problems. Covering the latest developments in practical cryptographic techniques, this new edition shows programmers who design computer applications, networks, and storage systems how they can build security into their software and systems. What's new in the Second Edition? * New information on the Clipper Chip, including ways to defeat the key escrow mechanism * New encryption algorithms, including algorithms from the former Soviet Union and South Africa, and the RC4 stream cipher * The latest protocols for digital signatures, authentication, secure elections, digital cash, and more * More detailed information on key management and cryptographic implementations

Building Microservices: Designing Fine-Grained Systems


Sam Newman - 2014
    But developing these systems brings its own set of headaches. With lots of examples and practical advice, this book takes a holistic view of the topics that system architects and administrators must consider when building, managing, and evolving microservice architectures.Microservice technologies are moving quickly. Author Sam Newman provides you with a firm grounding in the concepts while diving into current solutions for modeling, integrating, testing, deploying, and monitoring your own autonomous services. You'll follow a fictional company throughout the book to learn how building a microservice architecture affects a single domain.Discover how microservices allow you to align your system design with your organization's goalsLearn options for integrating a service with the rest of your systemTake an incremental approach when splitting monolithic codebasesDeploy individual microservices through continuous integrationExamine the complexities of testing and monitoring distributed servicesManage security with user-to-service and service-to-service modelsUnderstand the challenges of scaling microservice architectures

Data Smart: Using Data Science to Transform Information into Insight


John W. Foreman - 2013
    Major retailers are predicting everything from when their customers are pregnant to when they want a new pair of Chuck Taylors. It's a brave new world where seemingly meaningless data can be transformed into valuable insight to drive smart business decisions.But how does one exactly do data science? Do you have to hire one of these priests of the dark arts, the "data scientist," to extract this gold from your data? Nope.Data science is little more than using straight-forward steps to process raw data into actionable insight. And in Data Smart, author and data scientist John Foreman will show you how that's done within the familiar environment of a spreadsheet. Why a spreadsheet? It's comfortable! You get to look at the data every step of the way, building confidence as you learn the tricks of the trade. Plus, spreadsheets are a vendor-neutral place to learn data science without the hype. But don't let the Excel sheets fool you. This is a book for those serious about learning the analytic techniques, the math and the magic, behind big data.Each chapter will cover a different technique in a spreadsheet so you can follow along: - Mathematical optimization, including non-linear programming and genetic algorithms- Clustering via k-means, spherical k-means, and graph modularity- Data mining in graphs, such as outlier detection- Supervised AI through logistic regression, ensemble models, and bag-of-words models- Forecasting, seasonal adjustments, and prediction intervals through monte carlo simulation- Moving from spreadsheets into the R programming languageYou get your hands dirty as you work alongside John through each technique. But never fear, the topics are readily applicable and the author laces humor throughout. You'll even learn what a dead squirrel has to do with optimization modeling, which you no doubt are dying to know.

Linux Bible


Christopher Negus - 2005
    Whether you're new to Linux or need a reliable update and reference, this is an excellent resource. Veteran bestselling author Christopher Negus provides a complete tutorial packed with major updates, revisions, and hands-on exercises so that you can confidently start using Linux today. Offers a complete restructure, complete with exercises, to make the book a better learning tool Places a strong focus on the Linux command line tools and can be used with all distributions and versions of Linux Features in-depth coverage of the tools that a power user and a Linux administrator need to get startedThis practical learning tool is ideal for anyone eager to set up a new Linux desktop system at home or curious to learn how to manage Linux server systems at work.

Rails Recipes


Chad Fowler - 2006
    How do you use it effectively? How do you harness the power? And, most important, how do you get high quality, real-world applications written?From the latest Ajax effects to time-saving automation tips for your development process, "Rails Recipes" will show you how the experts have already solved the problems you have.Use generators to automate repetitive coding tasks.Create sophisticated role-based authentication schemes.Add live search and live preview to your site.Run tests when anyone checks code in.How to create tagged data the right way.and many, many more...Owning "Rails Recipes" is like having the best Rails programmers sitting next to you while you code.

Data Science at the Command Line: Facing the Future with Time-Tested Tools


Jeroen Janssens - 2014
    You'll learn how to combine small, yet powerful, command-line tools to quickly obtain, scrub, explore, and model your data.To get you started--whether you're on Windows, OS X, or Linux--author Jeroen Janssens introduces the Data Science Toolbox, an easy-to-install virtual environment packed with over 80 command-line tools.Discover why the command line is an agile, scalable, and extensible technology. Even if you're already comfortable processing data with, say, Python or R, you'll greatly improve your data science workflow by also leveraging the power of the command line.Obtain data from websites, APIs, databases, and spreadsheetsPerform scrub operations on plain text, CSV, HTML/XML, and JSONExplore data, compute descriptive statistics, and create visualizationsManage your data science workflow using DrakeCreate reusable tools from one-liners and existing Python or R codeParallelize and distribute data-intensive pipelines using GNU ParallelModel data with dimensionality reduction, clustering, regression, and classification algorithms

Big Data: Principles and best practices of scalable realtime data systems


Nathan Marz - 2012
    As scale and demand increase, so does Complexity. Fortunately, scalability and simplicity are not mutually exclusive—rather than using some trendy technology, a different approach is needed. Big data systems use many machines working in parallel to store and process data, which introduces fundamental challenges unfamiliar to most developers.Big Data shows how to build these systems using an architecture that takes advantage of clustered hardware along with new tools designed specifically to capture and analyze web-scale data. It describes a scalable, easy to understand approach to big data systems that can be built and run by a small team. Following a realistic example, this book guides readers through the theory of big data systems, how to use them in practice, and how to deploy and operate them once they're built.Purchase of the print book comes with an offer of a free PDF, ePub, and Kindle eBook from Manning. Also available is all code from the book.