Book picks similar to
Spark: The Definitive Guide: Big Data Processing Made Simple by Bill Chambers
big-data
computer-science
data-science
data
A Guide to the Project Management Body of Knowledge (PMBOK® Guide)
Project Management Institute - 1995
This internationally recognized standard provides the essential tools to practice project management and deliver organizational results.
Deep Learning for Coders with Fastai and Pytorch: AI Applications Without a PhD
Jeremy Howard - 2020
But as this hands-on guide demonstrates, programmers comfortable with Python can achieve impressive results in deep learning with little math background, small amounts of data, and minimal code. How? With fastai, the first library to provide a consistent interface to the most frequently used deep learning applications.Authors Jeremy Howard and Sylvain Gugger show you how to train a model on a wide range of tasks using fastai and PyTorch. You'll also dive progressively further into deep learning theory to gain a complete understanding of the algorithms behind the scenes.Train models in computer vision, natural language processing, tabular data, and collaborative filteringLearn the latest deep learning techniques that matter most in practiceImprove accuracy, speed, and reliability by understanding how deep learning models workDiscover how to turn your models into web applicationsImplement deep learning algorithms from scratchConsider the ethical implications of your work
Effective Python: 90 Specific Ways to Write Better Python (Effective Software Development Series)
Brett Slatkin - 2019
However, Python’s unique strengths, charms, and expressiveness can be hard to grasp, and there are hidden pitfalls that can easily trip you up. This second edition of Effective Python will help you master a truly “Pythonic” approach to programming, harnessing Python’s full power to write exceptionally robust and well-performing code. Using the concise, scenario-driven style pioneered in Scott Meyers’ best-selling Effective C++, Brett Slatkin brings together 90 Python best practices, tips, and shortcuts, and explains them with realistic code examples so that you can embrace Python with confidence. Drawing on years of experience building Python infrastructure at Google, Slatkin uncovers little-known quirks and idioms that powerfully impact code behavior and performance. You’ll understand the best way to accomplish key tasks so you can write code that’s easier to understand, maintain, and improve. In addition to even more advice, this new edition substantially revises all items from the first edition to reflect how best practices have evolved. Key features include 30 new actionable guidelines for all major areas of Python Detailed explanations and examples of statements, expressions, and built-in types Best practices for writing functions that clarify intention, promote reuse, and avoid bugs Better techniques and idioms for using comprehensions and generator functions Coverage of how to accurately express behaviors with classes and interfaces Guidance on how to avoid pitfalls with metaclasses and dynamic attributes More efficient and clear approaches to concurrency and parallelism Solutions for optimizing and hardening to maximize performance and quality Techniques and built-in modules that aid in debugging and testing Tools and best practices for collaborative development Effective Python will prepare growing programmers to make a big impact using Python.
What Is Data Science?
Mike Loukides - 2011
Five years ago, in What is Web 2.0, Tim O'Reilly said that "data is the next Intel Inside." But what does that statement mean? Why do we suddenly care about statistics and about data? This report examines the many sides of data science -- the technologies, the companies and the unique skill sets.The web is full of "data-driven apps." Almost any e-commerce application is a data-driven application. There's a database behind a web front end, and middleware that talks to a number of other databases and data services (credit card processing companies, banks, and so on). But merely using data isn't really what we mean by "data science." A data application acquires its value from the data itself, and creates more data as a result. It's not just an application with data; it's a data product. Data science enables the creation of data products.
Camel in Action
Claus Ibsen - 2010
It starts with core concepts like sending, receiving, routing, and transforming data and then shows readers the entire lifecycle. The book goes in depth on how to test, deal with errors, scale, deploy, and monitor apps and even how to build custom tooling. Written by core developers of Camel and the authors of the first edition, this book distills their experience and practical insights so that readers can tackle integration tasks like a pro.Purchase of the print book includes a free eBook in PDF, Kindle, and ePub formats from Manning Publications.
The Art of Monitoring
James Turnbull - 2016
We start small and then build on what you learn to scale out to multi-site, multi-tier applications. The book is written for both developers and sysadmins. We focus on building monitored and measurable applications. We also use tools that are designed to handle the challenges of managing Cloud, containerised and distributed applications and infrastructure.In the book we'll deliver:* An introduction to monitoring, metrics and measurement.* A scalable framework for monitoring hosts (including Docker and containers), services and applications built on top of the Riemann event stream processor. * Graphing and metric storage using Graphite and Grafana.* Logging with Logstash.* A framework for high quality and useful notifications* Techniques for developing and building monitorable applications* A capstone that puts all the pieces together to monitor a multi-tier application.
Cloud Architecture Patterns: Using Microsoft Azure
Bill Wilder - 2012
You’ll learn how each of these platform-agnostic patterns work, when they might be useful in the cloud, and what impact they’ll have on your application architecture. You’ll also see an example of each pattern applied to an application built with Windows Azure.The patterns are organized into four major topics, such as scalability and handling failure, and primer chapters provide background on each topic. With the information in this book, you’ll be able to make informed decisions for designing effective cloud-native applications that maximize the value of cloud services, while also paying attention to user experience and operational efficiency.Learn about architectural patterns for:Scalability. Discover the advantages of horizontal scaling. Patterns covered include Horizontally Scaling Compute, Queue-Centric Workflow, and Auto-Scaling.Big data. Learn how to handle large amounts of data across a distributed system. Eventual consistency is explained, along with the MapReduce and Database Sharding patterns.Handling failure. Understand how multitenant cloud services and commodity hardware influence your applications. Patterns covered include Busy Signal and Node Failure.Distributed users. Learn how to overcome delays due to network latency when building applications for a geographically distributed user base. Patterns covered include Colocation, Valet Key, CDN, and Multi-Site Deployment.
Mining the Social Web: Analyzing Data from Facebook, Twitter, LinkedIn, and Other Social Media Sites
Matthew A. Russell - 2011
You’ll learn how to combine social web data, analysis techniques, and visualization to find what you’ve been looking for in the social haystack—as well as useful information you didn’t know existed.Each standalone chapter introduces techniques for mining data in different areas of the social Web, including blogs and email. All you need to get started is a programming background and a willingness to learn basic Python tools.Get a straightforward synopsis of the social web landscapeUse adaptable scripts on GitHub to harvest data from social network APIs such as Twitter, Facebook, LinkedIn, and Google+Learn how to employ easy-to-use Python tools to slice and dice the data you collectExplore social connections in microformats with the XHTML Friends NetworkApply advanced mining techniques such as TF-IDF, cosine similarity, collocation analysis, document summarization, and clique detectionBuild interactive visualizations with web technologies based upon HTML5 and JavaScript toolkits"A rich, compact, useful, practical introduction to a galaxy of tools, techniques, and theories for exploring structured and unstructured data." --Alex Martelli, Senior Staff Engineer, Google
The Manager's Path: A Guide for Tech Leaders Navigating Growth and Change
Camille Fournier - 2017
Tech companies in general lack the experience, tools, texts, and frameworks to do it well. And the handful of books that share tips and tricks of engineering management don t explain how to supervise employees in the face of growth and change.In this book, author Camille Fournier takes you through the stages of technical management, from mentoring interns to working with the senior staff. You ll get actionable advice for approaching various obstacles in your path, whether you re a new manager, a mentor, or a more experienced leader looking for fresh advice. Pick up this book and learn how to become a better manager and leader in your organization. * Discover how to manage small teams and large/multi-level teams * Understand how to build and bootstrap a unifying culture in teams * Deal with people problems and learn how to mentor other managers and new leaders * Learn how to manage yourself: avoid common pitfalls that challenge many leaders * Obtain several practices that you can incorporate and practice along the way
Hackers: Heroes of the Computer Revolution
Steven Levy - 1984
That was before one pioneering work documented the underground computer revolution that was about to change our world forever. With groundbreaking profiles of Bill Gates, Steve Wozniak, MIT's Tech Model Railroad Club, and more, Steven Levy's Hackers brilliantly captured a seminal moment when the risk-takers and explorers were poised to conquer twentieth-century America's last great frontier. And in the Internet age, the hacker ethic-first espoused here-is alive and well.
Programming Scala: Scalability = Functional Programming + Objects
Dean Wampler - 2009
With this book, you'll discover why Scala is ideal for highly scalable, component-based applications that support concurrency and distribution.Programming Scala clearly explains the advantages of Scala as a JVM language. You'll learn how to leverage the wealth of Java class libraries to meet the practical needs of enterprise and Internet projects more easily. Packed with code examples, this book provides useful information on Scala's command-line tools, third-party tools, libraries, and available language-aware plugins for editors and IDEs.Learn how Scala's succinct and flexible code helps you program fasterDiscover the notable improvements Scala offers over Java's object modelGet a concise overview of functional programming, and learn how Scala's support for it offers a better approach to concurrencyKnow how to use mixin composition with traits, pattern matching, concurrency with Actors, and other essential featuresTake advantage of Scala's built-in support for XMLLearn how to develop domain-specific languagesUnderstand the basics for designing test-driven Scala applications
Test-Driven Development: By Example
Kent Beck - 2002
While some fear is healthy (often viewed as a conscience that tells programmers to be careful!), the author believes that byproducts of fear include tentative, grumpy, and uncommunicative programmers who are unable to absorb constructive criticism. When programming teams buy into TDD, they immediately see positive results. They eliminate the fear involved in their jobs, and are better equipped to tackle the difficult challenges that face them. TDD eliminates tentative traits, it teaches programmers to communicate, and it encourages team members to seek out criticism However, even the author admits that grumpiness must be worked out individually! In short, the premise behind TDD is that code should be continually tested and refactored. Kent Beck teaches programmers by example, so they can painlessly and dramatically increase the quality of their work.
Designing Web APIs: Building APIs That Developers Love
Brenda Jin - 2018
But building a popular API with a thriving developer ecosystem is also one of the most challenging. With this practical guide, developers, architects, and tech leads will learn how to navigate complex decisions for designing, scaling, marketing, and evolving interoperable APIs.Authors Brenda Jin, Saurabh Sahni, and Amir Shevat explain API design theory and provide hands-on exercises for building your web API and managing its operation in production. You'll also learn how to build and maintain a following of app developers. This book includes expert advice, worksheets, checklists, and case studies from companies including Slack, Stripe, Facebook, Microsoft, Cloudinary, Oracle, and GitHub.Get an overview of request-response and event-driven API design paradigmsLearn best practices for designing an API that meets the needs of your usersUse a template to create an API design processScale your web API to support a growing number of API calls and use casesRegularly adapt the API to reflect changes to your product or businessProvide developer resources that include API documentation, samples, and tools
Continuous Delivery: Reliable Software Releases Through Build, Test, and Deployment Automation
Jez Humble - 2010
This groundbreaking new book sets out the principles and technical practices that enable rapid, incremental delivery of high quality, valuable new functionality to users. Through automation of the build, deployment, and testing process, and improved collaboration between developers, testers, and operations, delivery teams can get changes released in a matter of hours-- sometimes even minutes-no matter what the size of a project or the complexity of its code base. Jez Humble and David Farley begin by presenting the foundations of a rapid, reliable, low-risk delivery process. Next, they introduce the "deployment pipeline," an automated process for managing all changes, from check-in to release. Finally, they discuss the "ecosystem" needed to support continuous delivery, from infrastructure, data and configuration management to governance. The authors introduce state-of-the-art techniques, including automated infrastructure management and data migration, and the use of virtualization. For each, they review key issues, identify best practices, and demonstrate how to mitigate risks. Coverage includes - Automating all facets of building, integrating, testing, and deploying software - Implementing deployment pipelines at team and organizational levels - Improving collaboration between developers, testers, and operations - Developing features incrementally on large and distributed teams - Implementing an effective configuration management strategy - Automating acceptance testing, from analysis to implementation - Testing capacity and other non-functional requirements - Implementing continuous deployment and zero-downtime releases - Managing infrastructure, data, components and dependencies - Navigating risk management, compliance, and auditing Whether you're a developer, systems administrator, tester, or manager, this book will help your organization move from idea to release faster than ever--so you can deliver value to your business rapidly and reliably.
Star Schema the Complete Reference
Christopher Adamson - 2010
Star Schema: The Complete Reference offers in-depth coverage of design principles and their underlying rationales. Organized around design concepts and illustrated with detailed examples, this is a step-by-step guidebook for beginners and a comprehensive resource for experts.This all-inclusive volume begins with dimensional design fundamentals and shows how they fit into diverse data warehouse architectures, including those of W.H. Inmon and Ralph Kimball. The book progresses through a series of advanced techniques that help you address real-world complexity, maximize performance, and adapt to the requirements of BI and ETL software products. You are furnished with design tasks and deliverables that can be incorporated into any project, regardless of architecture or methodology.Master the fundamentals of star schema design and slow change processingIdentify situations that call for multiple stars or cubesEnsure compatibility across subject areas as your data warehouse growsAccommodate repeating attributes, recursive hierarchies, and poor data qualitySupport conflicting requirements for historic dataHandle variation within a business process and correlation of disparate activitiesBoost performance using derived schemas and aggregatesLearn when it's appropriate to adjust designs for BI and ETL tools