Book picks similar to
Planning for Big Data by Edd Wilder-James
big-data
ebook
non-fiction
professional
Introduction to Information Retrieval
Christopher D. Manning - 2008
Written from a computer science perspective by three leading experts in the field, it gives an up-to-date treatment of all aspects of the design and implementation of systems for gathering, indexing, and searching documents; methods for evaluating systems; and an introduction to the use of machine learning methods on text collections. All the important ideas are explained using examples and figures, making it perfect for introductory courses in information retrieval for advanced undergraduates and graduate students in computer science. Based on feedback from extensive classroom experience, the book has been carefully structured in order to make teaching more natural and effective. Although originally designed as the primary text for a graduate or advanced undergraduate course in information retrieval, the book will also create a buzz for researchers and professionals alike.
NoSQL Distilled: A Brief Guide to the Emerging World of Polyglot Persistence
Pramod J. Sadalage - 2012
Advocates of NoSQL databases claim they can be used to build systems that are more performant, scale better, and are easier to program." ""NoSQL Distilled" is a concise but thorough introduction to this rapidly emerging technology. Pramod J. Sadalage and Martin Fowler explain how NoSQL databases work and the ways that they may be a superior alternative to a traditional RDBMS. The authors provide a fast-paced guide to the concepts you need to know in order to evaluate whether NoSQL databases are right for your needs and, if so, which technologies you should explore further. The first part of the book concentrates on core concepts, including schemaless data models, aggregates, new distribution models, the CAP theorem, and map-reduce. In the second part, the authors explore architectural and design issues associated with implementing NoSQL. They also present realistic use cases that demonstrate NoSQL databases at work and feature representative examples using Riak, MongoDB, Cassandra, and Neo4j. In addition, by drawing on Pramod Sadalage's pioneering work, "NoSQL Distilled" shows how to implement evolutionary design with schema migration: an essential technique for applying NoSQL databases. The book concludes by describing how NoSQL is ushering in a new age of Polyglot Persistence, where multiple data-storage worlds coexist, and architects can choose the technology best optimized for each type of data access.
Data Mining: Concepts and Techniques (The Morgan Kaufmann Series in Data Management Systems)
Jiawei Han - 2000
Not only are all of our business, scientific, and government transactions now computerized, but the widespread use of digital cameras, publication tools, and bar codes also generate data. On the collection side, scanned text and image platforms, satellite remote sensing systems, and the World Wide Web have flooded us with a tremendous amount of data. This explosive growth has generated an even more urgent need for new techniques and automated tools that can help us transform this data into useful information and knowledge.Like the first edition, voted the most popular data mining book by KD Nuggets readers, this book explores concepts and techniques for the discovery of patterns hidden in large data sets, focusing on issues relating to their feasibility, usefulness, effectiveness, and scalability. However, since the publication of the first edition, great progress has been made in the development of new data mining methods, systems, and applications. This new edition substantially enhances the first edition, and new chapters have been added to address recent developments on mining complex types of data- including stream data, sequence data, graph structured data, social network data, and multi-relational data.A comprehensive, practical look at the concepts and techniques you need to know to get the most out of real business dataUpdates that incorporate input from readers, changes in the field, and more material on statistics and machine learningDozens of algorithms and implementation examples, all in easily understood pseudo-code and suitable for use in real-world, large-scale data mining projectsComplete classroom support for instructors at www.mkp.com/datamining2e companion site
Purely Functional Data Structures
Chris Okasaki - 1996
However, data structures for these languages do not always translate well to functional languages such as Standard ML, Haskell, or Scheme. This book describes data structures from the point of view of functional languages, with examples, and presents design techniques that allow programmers to develop their own functional data structures. The author includes both classical data structures, such as red-black trees and binomial queues, and a host of new data structures developed exclusively for functional languages. All source code is given in Standard ML and Haskell, and most of the programs are easily adaptable to other functional languages. This handy reference for professional programmers working with functional languages can also be used as a tutorial or for self-study.
What's New in Java 7?
Madhusudhan Konda - 2011
Madhusudhan Konda provides an overview of these, including strings in switch statements, multi-catch exception handling, try-with-resource statements, the new File System API, extensions of the JVM, support for dynamically-typed languages, and the fork and join framework for task parallelism.
New Programmer's Survival Manual
Joshua Carter - 2011
You've got the programming chops, you're up on the latest tech, you're sitting at your workstation... now what? New Programmer's Survival Manual gives your career the jolt it needs to get going: essential industry skills to help you apply your raw programming talent and make a name for yourself. It's a no-holds-barred look at what
really
goes on in the office--and how to not only survive, but thrive in your first job and beyond. Programming at industry level requires new skills - you'll build programs that dwarf anything you've done on your own. This book introduces you to practices for working on large-scale, long-lived programs at a professional level of quality. You'll find out how to work efficiently with your current tools, and discover essential new tools. But the tools are only part of the story; you've got to get street-smart too. Succeeding in the corporate working environment requires its own savvy. You'll learn how to navigate the office, work with your teammates, and how to deal with other people outside of your department. You'll understand where you fit into the big picture and how you contribute to the company's success. You'll also get a candid look at the tougher aspects of the job: stress, conflict, and office politics. Finally, programming is a job you can do for the long haul. This book helps you look ahead to the years to come, and your future opportunities--either as a programmer or in another role you grow into. There's nothing quite like the satisfaction of shipping a product and knowing, "I built that." Whether you work on embedded systems or web-based applications, in trendy technologies or legacy systems, this book helps you get from raw skill to an accomplished professional.
SQL Pocket Guide
Jonathan Gennick - 2003
It's used to create and maintain database objects, place data into those objects, query the data, modify the data, and, finally, delete data that is no longer needed. Databases lie at the heart of many, if not most business applications. Chances are very good that if you're involved with software development, you're using SQL to some degree. And if you're using SQL, you should own a good reference or two.Now available in an updated second edition, our very popular "SQL Pocket Guide" is a major help to programmers, database administrators, and everyone who uses SQL in their day-to-day work. The "SQL Pocket Guide" is a concise reference to frequently used SQL statements and commonly used SQL functions. Not just an endless collection of syntax diagrams, this portable guide addresses the language's complexity head on and leads by example. The information in this edition has been updated to reflect the latest versions of the most commonly used SQL variants including: Oracle Database 10g, Release 2 (includingthe free Oracle Database 10g Express Edition (XE))Microsoft SQL Server 2005MySQL 5IBM DB2 8.2PostreSQL 8.1 database
The Art of Doing Science and Engineering: Learning to Learn
Richard Hamming - 1996
By presenting actual experiences and analyzing them as they are described, the author conveys the developmental thought processes employed and shows a style of thinking that leads to successful results is something that can be learned. Along with spectacular successes, the author also conveys how failures contributed to shaping the thought processes. Provides the reader with a style of thinking that will enhance a person's ability to function as a problem-solver of complex technical issues. Consists of a collection of stories about the author's participation in significant discoveries, relating how those discoveries came about and, most importantly, provides analysis about the thought processes and reasoning that took place as the author and his associates progressed through engineering problems.
How Google Tests Software
James A. Whittaker - 2012
Legendary testing expert James Whittaker, until recently a Google testing leader, and two top Google experts reveal exactly how Google tests software, offering brand-new best practices you can use even if you're not quite Google's size...yet! Breakthrough Techniques You Can Actually Use Discover 100% practical, amazingly scalable techniques for analyzing risk and planning tests...thinking like real users...implementing exploratory, black box, white box, and acceptance testing...getting usable feedback...tracking issues...choosing and creating tools...testing "Docs & Mocks," interfaces, classes, modules, libraries, binaries, services, and infrastructure...reviewing code and refactoring...using test hooks, presubmit scripts, queues, continuous builds, and more. With these techniques, you can transform testing from a bottleneck into an accelerator-and make your whole organization more productive!
Naked Statistics: Stripping the Dread from the Data
Charles Wheelan - 2012
How can we catch schools that cheat on standardized tests? How does Netflix know which movies you’ll like? What is causing the rising incidence of autism? As best-selling author Charles Wheelan shows us in Naked Statistics, the right data and a few well-chosen statistical tools can help us answer these questions and more.For those who slept through Stats 101, this book is a lifesaver. Wheelan strips away the arcane and technical details and focuses on the underlying intuition that drives statistical analysis. He clarifies key concepts such as inference, correlation, and regression analysis, reveals how biased or careless parties can manipulate or misrepresent data, and shows us how brilliant and creative researchers are exploiting the valuable data from natural experiments to tackle thorny questions.And in Wheelan’s trademark style, there’s not a dull page in sight. You’ll encounter clever Schlitz Beer marketers leveraging basic probability, an International Sausage Festival illuminating the tenets of the central limit theorem, and a head-scratching choice from the famous game show Let’s Make a Deal—and you’ll come away with insights each time. With the wit, accessibility, and sheer fun that turned Naked Economics into a bestseller, Wheelan defies the odds yet again by bringing another essential, formerly unglamorous discipline to life.
Bandit Algorithms for Website Optimization
John Myles White - 2012
Author John Myles White shows you how this powerful class of algorithms can help you boost website traffic, convert visitors to customers, and increase many other measures of success.This is the first developer-focused book on bandit algorithms, which were previously described only in research papers. You’ll quickly learn the benefits of several simple algorithms—including the epsilon-Greedy, Softmax, and Upper Confidence Bound (UCB) algorithms—by working through code examples written in Python, which you can easily adapt for deployment on your own website.Learn the basics of A/B testing—and recognize when it’s better to use bandit algorithmsDevelop a unit testing framework for debugging bandit algorithmsGet additional code examples written in Julia, Ruby, and JavaScript with supplemental online materials
Kubernetes: Up & Running
Kelsey Hightower - 2016
How's that possible? Google revealed the secret through a project called Kubernetes, an open source cluster orchestrator (based on its internal Borg system) that radically simplifies the task of building, deploying, and maintaining scalable distributed systems in the cloud. This practical guide shows you how Kubernetes and container technology can help you achieve new levels of velocity, agility, reliability, and efficiency.Authors Kelsey Hightower, Brendan Burns, and Joe Beda--who've worked on Kubernetes at Google--explain how this system fits into the lifecycle of a distributed application. You will learn how to use tools and APIs to automate scalable distributed systems, whether it is for online services, machine-learning applications, or a cluster of Raspberry Pi computers.Explore the distributed system challenges that Kubernetes addressesDive into containerized application development, using containers such as DockerCreate and run containers on Kubernetes, using Docker's Image format and container runtimeExplore specialized objects essential for running applications in productionReliably roll out new software versions without downtime or errorsGet examples of how to develop and deploy real-world applications in Kubernetes
Mac OS X Snow Leopard: The Missing Manual
David Pogue - 2009
Fortunately, David Pogue is back, with the humor and expertise that have made this the #1 bestselling Mac book for eight years straight. You get all the answers with jargon-free introductions to:Big-ticket changes. A 64-bit overhaul. Faster everything. A rewritten Finder. Microsoft Exchange compatibility. All-new QuickTime Player. If Apple wrote it, this book covers it.Snow Leopard Spots. This book demystifies the hundreds of smaller enhancements, too, in all 50 programs that come with the Mac: Safari, Mail, iChat, Preview, Time Machine.Shortcuts. This must be the tippiest, trickiest Mac book ever written. Undocumented surprises await on every page.Power usage. Security, networking, build-your-own Services, file sharing with Windows, even Mac OS X's Unix chassis-this one witty, expert guide makes it all crystal clear.
The Practice of System and Network Administration
Thomas A. Limoncelli - 2001
Whether you use Linux, Unix, or Windows, this newly revised edition describes the essential practices previously handed down only from mentor to protege. This wonderfully lucid, often funny cornucopia of information introduces beginners to advanced frameworks valuable for their entire career, yet is structured to help even the most advanced experts through difficult projects.The book's four major sections build your knowledge with the foundational elements of system administration. These sections guide you through better techniques for upgrades and change management, catalog best practices for IT services, and explore various management topics. Chapters are divided into The Basics and The Icing. When you get the Basics right it makes every other aspect of the job easier--such as automating the right things first. The Icing sections contain all the powerful things that can be done on top of the basics to wow customers and managers.Inside, you'll find advice on topics such asThe key elements your networks and systems need in order to make all other services run better Building and running reliable, scalable services, including web, storage, email, printing, and remote access Creating and enforcing security policies Upgrading multiple hosts at one time without creating havoc Planning for and performing flawless scheduled maintenance windows Managing superior helpdesks and customer care Avoiding the -temporary fix- trap Building data centers that improve server uptime Designing networks for speed and reliability Web scaling and security issues Why building a backup system isn't about backups Monitoring what you have and predicting what you will need How technically oriented workers can maintain their job's technical focus (and avoid an unwanted management role) Technical management issues, including morale, organization building, coaching, and maintaining positive visibility Personal skill techniques, including secrets for getting more done each day, ethical dilemmas, managing your boss, and loving your job System administration salary negotiation It's no wonder the first edition received Usenix SAGE's 2005 Outstanding Achievement Award!This eagerly anticipated second edition updates this time-proven classic:Chapters reordered for easier navigationThousands of updates and clarifications based on reader feedbackPlus three entirely new chapters: Web Services, Data Storage, and Documentation