Book picks similar to
Data Analysis Using SQL and Excel by Gordon S. Linoff
data-science
analytics
data-analysis
reference
Groovy in Action
Dierk König - 2007
Groovy in Action is a comprehensive guide to Groovy programming, introducing Java developers to the new dynamic features that Groovy provides. To bring you Groovy in Action, Manning again went to the source by working with a team of expert authors including both members and the Manager of the Groovy Project team. The result is the true definitive guide to the new Groovy language.Groovy in Action introduces Groovy by example, presenting lots of reusable code while explaining the underlying concepts. Java developers new to Groovy find a smooth transition into the dynamic programming world. Groovy experts gain a solid reference that challenges them to explore Groovy deeply and creatively.Because Groovy is so new, most readers will be learning it from scratch. Groovy in Action quickly moves through the Groovy basics, including:Simple and collective Groovy data types Working with Closures and Groovy Control Structures Dynamic Object Orientation, Groovy styleReaders are presented with rich and detailed examples illustrating Groovy's enhancements to Java, includingHow to Work with Builders and the GDK Database programming with GroovyGroovy in Action then demonstrates how to Integrate Groovy with XML, and provides:Tips and Tricks Unit Testing and Build Support Groovy on WindowsAn additional bonus is a chapter dedicated to Grails, the Groovy Web Application Framework.Purchase of the print book comes with an offer of a free PDF eBook from Manning. Also available is all code from the book.
Kubernetes: Up & Running
Kelsey Hightower - 2016
How's that possible? Google revealed the secret through a project called Kubernetes, an open source cluster orchestrator (based on its internal Borg system) that radically simplifies the task of building, deploying, and maintaining scalable distributed systems in the cloud. This practical guide shows you how Kubernetes and container technology can help you achieve new levels of velocity, agility, reliability, and efficiency.Authors Kelsey Hightower, Brendan Burns, and Joe Beda--who've worked on Kubernetes at Google--explain how this system fits into the lifecycle of a distributed application. You will learn how to use tools and APIs to automate scalable distributed systems, whether it is for online services, machine-learning applications, or a cluster of Raspberry Pi computers.Explore the distributed system challenges that Kubernetes addressesDive into containerized application development, using containers such as DockerCreate and run containers on Kubernetes, using Docker's Image format and container runtimeExplore specialized objects essential for running applications in productionReliably roll out new software versions without downtime or errorsGet examples of how to develop and deploy real-world applications in Kubernetes
Planning for Big Data
Edd Wilder-James - 2004
From creating new data-driven products through to increasing operational efficiency, big data has the potential to makeyour organization both more competitive and more innovative.As this emerging field transitions from the bleeding edge to enterprise infrastructure, it's vital to understand not only the technologies involved, but the organizational and cultural demands of being data-driven.Written by O'Reilly Radar's experts on big data, this anthology describes:- The broad industry changes heralded by the big data era- What big data is, what it means to your business, and how to start solving data problems- The software that makes up the Hadoop big data stack, and the major enterprise vendors' Hadoop solutions- The landscape of NoSQL databases and their relative merits- How visualization plays an important part in data work
Seven Languages in Seven Weeks
Bruce A. Tate - 2010
But if one per year is good, how about Seven Languages in Seven Weeks? In this book you'll get a hands-on tour of Clojure, Haskell, Io, Prolog, Scala, Erlang, and Ruby. Whether or not your favorite language is on that list, you'll broaden your perspective of programming by examining these languages side-by-side. You'll learn something new from each, and best of all, you'll learn how to learn a language quickly. Ruby, Io, Prolog, Scala, Erlang, Clojure, Haskell. With Seven Languages in Seven Weeks, by Bruce A. Tate, you'll go beyond the syntax-and beyond the 20-minute tutorial you'll find someplace online. This book has an audacious goal: to present a meaningful exploration of seven languages within a single book. Rather than serve as a complete reference or installation guide, Seven Languages hits what's essential and unique about each language. Moreover, this approach will help teach you how to grok new languages. For each language, you'll solve a nontrivial problem, using techniques that show off the language's most important features. As the book proceeds, you'll discover the strengths and weaknesses of the languages, while dissecting the process of learning languages quickly--for example, finding the typing and programming models, decision structures, and how you interact with them. Among this group of seven, you'll explore the most critical programming models of our time. Learn the dynamic typing that makes Ruby, Python, and Perl so flexible and compelling. Understand the underlying prototype system that's at the heart of JavaScript. See how pattern matching in Prolog shaped the development of Scala and Erlang. Discover how pure functional programming in Haskell is different from the Lisp family of languages, including Clojure. Explore the concurrency techniques that are quickly becoming the backbone of a new generation of Internet applications. Find out how to use Erlang's let-it-crash philosophy for building fault-tolerant systems. Understand the actor model that drives concurrency design in Io and Scala. Learn how Clojure uses versioning to solve some of the most difficult concurrency problems. It's all here, all in one place. Use the concepts from one language to find creative solutions in another-or discover a language that may become one of your favorites.
Big Data Now: Current Perspectives from O'Reilly Radar
O'Reilly Radar Team - 2011
Mike Loukides kicked things off in June 2010 with “What is data science?” and from there we’ve pursued the various threads and themes that naturally emerged. Now, roughly a year later, we can look back over all we’ve covered and identify a number of core data areas: Data issues -- The opportunities and ambiguities of the data space are evident in discussions around privacy, the implications of data-centric industries, and the debate about the phrase “data science” itself. The application of data: products and processes – A “data product” can emerge from virtually any domain, including everything from data startups to established enterprises to media/journalism to education and research. Data science and data tools -- The tools and technologies that drive data science are of course essential to this space, but the varied techniques being applied are also key to understanding the big data arena.The business of data – Take a closer look at the actions connected to data -- the finding, organizing, and analyzing that provide organizations of all sizes with the information they need to compete.
Learning Spark: Lightning-Fast Big Data Analysis
Holden Karau - 2013
How can you work with it efficiently? Recently updated for Spark 1.3, this book introduces Apache Spark, the open source cluster computing system that makes data analytics fast to write and fast to run. With Spark, you can tackle big datasets quickly through simple APIs in Python, Java, and Scala. This edition includes new information on Spark SQL, Spark Streaming, setup, and Maven coordinates.
Written by the developers of Spark, this book will have data scientists and engineers up and running in no time. You’ll learn how to express parallel jobs with just a few lines of code, and cover applications from simple batch jobs to stream processing and machine learning.
Quickly dive into Spark capabilities such as distributed datasets, in-memory caching, and the interactive shell
Leverage Spark’s powerful built-in libraries, including Spark SQL, Spark Streaming, and MLlib
Use one programming paradigm instead of mixing and matching tools like Hive, Hadoop, Mahout, and Storm
Learn how to deploy interactive, batch, and streaming applications
Connect to data sources including HDFS, Hive, JSON, and S3
Master advanced topics like data partitioning and shared variables
The Developer's Code: What Real Programmers Do
Ka Wai Cheung - 2012
There are no trite superlatives here. Packed with lessons learned from more than a decade of software development experience, author Ka Wai Cheung takes you through the programming profession from nearly every angle to uncover ways of sustaining a healthy connection with your work. You'll see how to stay productive even on the longest projects. You'll create a workflow that works with you, not against you. And you'll learn how to deal with clients whose goals don't align with your own. If you don't handle them just right, issues such as these can crush even the most seasoned, motivated developer. But with the right approach, you can transcend these common problems and become the professional developer you want to be. In more than 50 nuggets of wisdom, you'll learn: Why many traditional approaches to process and development roles in this industry are wrong - and how to sniff them out. Why you must always say "no" to the software pet project and open-ended timelines. How to incorporate code generation into your development process, and why its benefits go far beyond just faster code output. What to do when your client or end user disagrees with an approach you believe in. How to pay your knowledge forward to future generations of programmers through teaching and evangelism. If you're in this industry for the long run, you'll be coming back to this book again and again.
Python Crash Course: A Hands-On, Project-Based Introduction to Programming
Eric Matthes - 2015
You'll also learn how to make your programs interactive and how to test your code safely before adding it to a project. In the second half of the book, you'll put your new knowledge into practice with three substantial projects: a Space Invaders-inspired arcade game, data visualizations with Python's super-handy libraries, and a simple web app you can deploy online.As you work through Python Crash Course, you'll learn how to: Use powerful Python libraries and tools, including matplotlib, NumPy, and PygalMake 2D games that respond to keypresses and mouse clicks, and that grow more difficult as the game progressesWork with data to generate interactive visualizationsCreate and customize simple web apps and deploy them safely onlineDeal with mistakes and errors so you can solve your own programming problemsIf you've been thinking seriously about digging into programming, Python Crash Course will get you up to speed and have you writing real programs fast. Why wait any longer? Start your engines and code!
The Master Algorithm: How the Quest for the Ultimate Learning Machine Will Remake Our World
Pedro Domingos - 2015
In The Master Algorithm, Pedro Domingos lifts the veil to give us a peek inside the learning machines that power Google, Amazon, and your smartphone. He assembles a blueprint for the future universal learner--the Master Algorithm--and discusses what it will mean for business, science, and society. If data-ism is today's philosophy, this book is its bible.
Learning UML 2.0: A Pragmatic Introduction to UML
Russ Miles - 2006
Every integrated software development environment in the world--open-source, standards-based, and proprietary--now supports UML and, more importantly, the model-driven approach to software development. This makes learning the newest UML standard, UML 2.0, critical for all software developers--and there isn't a better choice than this clear, step-by-step guide to learning the language."--Richard Mark Soley, Chairman and CEO, OMGIf you're like most software developers, you're building systems that are increasingly complex. Whether you're creating a desktop application or an enterprise system, complexity is the big hairy monster you must manage.The Unified Modeling Language (UML) helps you manage this complexity. Whether you're looking to use UML as a blueprint language, a sketch tool, or as a programming language, this book will give you the need-to-know information on how to apply UML to your project. While there are plenty of books available that describe UML, Learning UML 2.0 will show you how to use it. Topics covered include:Capturing your system's requirements in your model to help you ensure that your designs meet your users' needsModeling the parts of your system and their relationshipsModeling how the parts of your system work together to meet your system's requirementsModeling how your system moves into the real world, capturing how your system will be deployedEngaging and accessible, this book shows you how to use UML to craft and communicate your project's design. Russ Miles and Kim Hamilton have written a pragmatic introduction to UML based on hard-earned practice, not theory. Regardless of the software process or methodology you use, this book is the one source you need to get up and running with UML 2.0.Russ Miles is a software engineer for General Dynamics UK, where he works with Java and Distributed Systems, although his passion at the moment is Aspect Orientation and, in particular, AspectJ. Kim Hamilton is a senior software engineer at Northrop Grumman, where she's designed and implemented a variety of systems including web applications and distributed systems, with frequent detours into algorithms development.
Implementing Domain-Driven Design
Vaughn Vernon - 2013
Vaughn Vernon couples guided approaches to implementation with modern architectures, highlighting the importance and value of focusing on the business domain while balancing technical considerations.Building on Eric Evans’ seminal book, Domain-Driven Design, the author presents practical DDD techniques through examples from familiar domains. Each principle is backed up by realistic Java examples–all applicable to C# developers–and all content is tied together by a single case study: the delivery of a large-scale Scrum-based SaaS system for a multitenant environment.The author takes you far beyond “DDD-lite” approaches that embrace DDD solely as a technical toolset, and shows you how to fully leverage DDD’s “strategic design patterns” using Bounded Context, Context Maps, and the Ubiquitous Language. Using these techniques and examples, you can reduce time to market and improve quality, as you build software that is more flexible, more scalable, and more tightly aligned to business goals.
Data Science at the Command Line: Facing the Future with Time-Tested Tools
Jeroen Janssens - 2014
You'll learn how to combine small, yet powerful, command-line tools to quickly obtain, scrub, explore, and model your data.To get you started--whether you're on Windows, OS X, or Linux--author Jeroen Janssens introduces the Data Science Toolbox, an easy-to-install virtual environment packed with over 80 command-line tools.Discover why the command line is an agile, scalable, and extensible technology. Even if you're already comfortable processing data with, say, Python or R, you'll greatly improve your data science workflow by also leveraging the power of the command line.Obtain data from websites, APIs, databases, and spreadsheetsPerform scrub operations on plain text, CSV, HTML/XML, and JSONExplore data, compute descriptive statistics, and create visualizationsManage your data science workflow using DrakeCreate reusable tools from one-liners and existing Python or R codeParallelize and distribute data-intensive pipelines using GNU ParallelModel data with dimensionality reduction, clustering, regression, and classification algorithms
User Stories Applied: For Agile Software Development
Mike Cohn - 2004
In User Stories Applied, Mike Cohn provides you with a front-to-back blueprint for writing these user stories and weaving them into your development lifecycle.You'll learn what makes a great user story, and what makes a bad one. You'll discover practical ways to gather user stories, even when you can't speak with your users. Then, once you've compiled your user stories, Cohn shows how to organize them, prioritize them, and use them for planning, management, and testing.User role modeling: understanding what users have in common, and where they differ Gathering stories: user interviewing, questionnaires, observation, and workshops Working with managers, trainers, salespeople and other proxies Writing user stories for acceptance testing Using stories to prioritize, set schedules, and estimate release costs Includes end-of-chapter practice questions and exercises User Stories Applied will be invaluable to every software developer, tester, analyst, and manager working with any agile method: XP, Scrum... or even your own home-grown approach.
Designing Data-Intensive Applications
Martin Kleppmann - 2015
Difficult issues need to be figured out, such as scalability, consistency, reliability, efficiency, and maintainability. In addition, we have an overwhelming variety of tools, including relational databases, NoSQL datastores, stream or batch processors, and message brokers. What are the right choices for your application? How do you make sense of all these buzzwords?In this practical and comprehensive guide, author Martin Kleppmann helps you navigate this diverse landscape by examining the pros and cons of various technologies for processing and storing data. Software keeps changing, but the fundamental principles remain the same. With this book, software engineers and architects will learn how to apply those ideas in practice, and how to make full use of data in modern applications. Peer under the hood of the systems you already use, and learn how to use and operate them more effectively Make informed decisions by identifying the strengths and weaknesses of different tools Navigate the trade-offs around consistency, scalability, fault tolerance, and complexity Understand the distributed systems research upon which modern databases are built Peek behind the scenes of major online services, and learn from their architectures
Practical SQL: A Beginner's Guide to Storytelling with Data
Anthony DeBarros - 2022
An approachable guide to programming in SQL (Structured Query Language) that will teach even beginning programmers how to build powerful databases and analyze data to find meaningful information.Practical SQL is an approachable and fast-paced guide to SQL (Structured Query Language) written by longtime professional journalist Anthony DeBarros. SQL is the primary tool that programmers, web developers, researchers, journalists, and others use to explore data in a database. DeBarros focuses on using SQL to find the story in data, with the aid of the popular open-source database PostgreSQL and the pgAdmin interface.This thoroughly revised second edition includes a new chapter describing how to set up PostgreSQL and more extensive discussion of pgAdmin's best features. The author has also added a chapter on the JSON data format that shows readers how to store and query JSON data. DeBarros has also updated the data in the book throughout, added coverage of additional topics, and perfected the book's examples.Readers love DeBarros's use of exercises and real-world examples that demonstrate how to:- Create databases and related tables using your own data - Correctly define data typesAggregate, sort, and filter data to find patterns - Clean their data and transfer data as text files - Create advanced queries and automate tasksThis book uses PostgreSQL, but the SQL syntax is applicable to many database applications, including Microsoft SQL Server and MySQL.