Book picks similar to
Apache Spark in 24 Hours, Sams Teach Yourself by Jeffrey Aven


data-science
big-data
educational
machine-learning-frameworks-spark

Architecting the Cloud: Design Decisions for Cloud Computing Service Models (Saas, Paas, and Iaas)


Michael J. Kavis - 2013
    However, before you can decide on a cloud model, you need to determine what the ideal cloud service model is for your business. Helping you cut through all the haze, Architecting the Cloud is vendor neutral and guides you in making one of the most critical technology decisions that you will face: selecting the right cloud service model(s) based on a combination of both business and technology requirements.Guides corporations through key cloud design considerations Discusses the pros and cons of each cloud service model Highlights major design considerations in areas such as security, data privacy, logging, data storage, SLA monitoring, and more Clearly defines the services cloud providers offer for each service model and the cloud services IT must provide Arming you with the information you need to choose the right cloud service provider, Architecting the Cloud is a comprehensive guide covering everything you need to be aware of in selecting the right cloud service model for you.

Hadoop in Action


Chuck Lam - 2010
    The intended readers are programmers, architects, and project managers who have to process large amounts of data offline. Hadoop in Action will lead the reader from obtaining a copy of Hadoop to setting it up in a cluster and writing data analytic programs.The book begins by making the basic idea of Hadoop and MapReduce easier to grasp by applying the default Hadoop installation to a few easy-to-follow tasks, such as analyzing changes in word frequency across a body of documents. The book continues through the basic concepts of MapReduce applications developed using Hadoop, including a close look at framework components, use of Hadoop for a variety of data analysis tasks, and numerous examples of Hadoop in action.Hadoop in Action will explain how to use Hadoop and present design patterns and practices of programming MapReduce. MapReduce is a complex idea both conceptually and in its implementation, and Hadoop users are challenged to learn all the knobs and levers for running Hadoop. This book takes you beyond the mechanics of running Hadoop, teaching you to write meaningful programs in a MapReduce framework.This book assumes the reader will have a basic familiarity with Java, as most code examples will be written in Java. Familiarity with basic statistical concepts (e.g. histogram, correlation) will help the reader appreciate the more advanced data processing examples. Purchase of the print book comes with an offer of a free PDF, ePub, and Kindle eBook from Manning. Also available is all code from the book.

Data Smart: Using Data Science to Transform Information into Insight


John W. Foreman - 2013
    Major retailers are predicting everything from when their customers are pregnant to when they want a new pair of Chuck Taylors. It's a brave new world where seemingly meaningless data can be transformed into valuable insight to drive smart business decisions.But how does one exactly do data science? Do you have to hire one of these priests of the dark arts, the "data scientist," to extract this gold from your data? Nope.Data science is little more than using straight-forward steps to process raw data into actionable insight. And in Data Smart, author and data scientist John Foreman will show you how that's done within the familiar environment of a spreadsheet. Why a spreadsheet? It's comfortable! You get to look at the data every step of the way, building confidence as you learn the tricks of the trade. Plus, spreadsheets are a vendor-neutral place to learn data science without the hype. But don't let the Excel sheets fool you. This is a book for those serious about learning the analytic techniques, the math and the magic, behind big data.Each chapter will cover a different technique in a spreadsheet so you can follow along: - Mathematical optimization, including non-linear programming and genetic algorithms- Clustering via k-means, spherical k-means, and graph modularity- Data mining in graphs, such as outlier detection- Supervised AI through logistic regression, ensemble models, and bag-of-words models- Forecasting, seasonal adjustments, and prediction intervals through monte carlo simulation- Moving from spreadsheets into the R programming languageYou get your hands dirty as you work alongside John through each technique. But never fear, the topics are readily applicable and the author laces humor throughout. You'll even learn what a dead squirrel has to do with optimization modeling, which you no doubt are dying to know.

Archery: Steps to Success


Kathleen M. Haywood - 1989
    Describes the skills, techniques, and strategies to shoot safely, accurately, and consistently, using recurve or compound bows, in target and hunting situations. Through step-by-step, progressive instruction, each phase of the shot is covered: stance, draw, aim, release, and follow-through. New to this edition are full color photographs which clarify detailed written descriptions of proper techniques for shooting form, sighting and aiming, and anchoring. Drills and assessment exercises allow readers to apply and develop the physical and mental skills while a scoring system for each exercise promotes progressive skill development. "

A Whirlwind Tour of Python


Jake Vanderplas - 2016
    This report provides a brief yet comprehensive introduction to Python for engineers, researchers, and data scientists who are already familiar with another programming language.Author Jake VanderPlas, an interdisciplinary research director at the University of Washington, explains Python’s essential syntax and semantics, built-in data types and structures, function definitions, control flow statements, and more, using Python 3 syntax.You’ll explore:- Python syntax basics and running Python codeBasic semantics of Python variables, objects, and operators- Built-in simple types and data structures- Control flow statements for executing code blocks conditionally- Methods for creating and using reusable functionsIterators, list comprehensions, and generators- String manipulation and regular expressions- Python’s standard library and third-party modules- Python’s core data science tools- Recommended resources to help you learn more

Building Machine Learning Systems with Python


Willi Richert - 2013
    

Successful Business Intelligence: Secrets to Making BI a Killer App


Cindi Howson - 2007
    Learn about the components of a BI architecture, how to choose the appropriate tools and technologies, and how to roll out a BI strategy throughout the organisation.

The Art of Data Science: A Guide for Anyone Who Works with Data


Roger D. Peng - 2015
    The authors have extensive experience both managing data analysts and conducting their own data analyses, and have carefully observed what produces coherent results and what fails to produce useful insights into data. This book is a distillation of their experience in a format that is applicable to both practitioners and managers in data science.

Data Science from Scratch: First Principles with Python


Joel Grus - 2015
    In this book, you’ll learn how many of the most fundamental data science tools and algorithms work by implementing them from scratch. If you have an aptitude for mathematics and some programming skills, author Joel Grus will help you get comfortable with the math and statistics at the core of data science, and with hacking skills you need to get started as a data scientist. Today’s messy glut of data holds answers to questions no one’s even thought to ask. This book provides you with the know-how to dig those answers out. Get a crash course in Python Learn the basics of linear algebra, statistics, and probability—and understand how and when they're used in data science Collect, explore, clean, munge, and manipulate data Dive into the fundamentals of machine learning Implement models such as k-nearest Neighbors, Naive Bayes, linear and logistic regression, decision trees, neural networks, and clustering Explore recommender systems, natural language processing, network analysis, MapReduce, and databases

Data Science for Business: What you need to know about data mining and data-analytic thinking


Foster Provost - 2013
    This guide also helps you understand the many data-mining techniques in use today.Based on an MBA course Provost has taught at New York University over the past ten years, Data Science for Business provides examples of real-world business problems to illustrate these principles. You’ll not only learn how to improve communication between business stakeholders and data scientists, but also how participate intelligently in your company’s data science projects. You’ll also discover how to think data-analytically, and fully appreciate how data science methods can support business decision-making.Understand how data science fits in your organization—and how you can use it for competitive advantageTreat data as a business asset that requires careful investment if you’re to gain real valueApproach business problems data-analytically, using the data-mining process to gather good data in the most appropriate wayLearn general concepts for actually extracting knowledge from dataApply data science principles when interviewing data science job candidates

Windows 8.1 For Dummies


Andy Rathbone - 2013
    Parts cover: Windows 8.1 Stuff Everybody Thinks You Already Know - an introduction to the dual interfaces, basic mechanics, file storage, and instruction on how to get the free upgrade to Windows 8.1.Working with Programs, Apps and Files - the basics of finding and launching apps, getting help, and printingGetting Things Done on the Internet - instructions for connecting a Windows 8.1 device, using web and social apps, and maintaining privacyCustomizing and Upgrading Windows 8.1 - Windows 8.1 offers big changes to what a user can customize on the OS. This section shows how to manipulate app tiles, give Windows the look you in, set up boot-to-desktop capabilities, connect to a network, and create user accounts.Music, Photos and Movies - Windows 8.1 offers new apps and capabilities for working with onboard and online media, all covered in this chapterHelp! - includes guidance on how to fix common problems, interpret strange messages, move files to a new PC, and use the built-in help systemThe Part of Tens - quick tips for avoiding common annoyances and working with Windows 8.1 on a touch device

Think Stats


Allen B. Downey - 2011
    This concise introduction shows you how to perform statistical analysis computationally, rather than mathematically, with programs written in Python.You'll work with a case study throughout the book to help you learn the entire data analysis process—from collecting data and generating statistics to identifying patterns and testing hypotheses. Along the way, you'll become familiar with distributions, the rules of probability, visualization, and many other tools and concepts.Develop your understanding of probability and statistics by writing and testing codeRun experiments to test statistical behavior, such as generating samples from several distributionsUse simulations to understand concepts that are hard to grasp mathematicallyLearn topics not usually covered in an introductory course, such as Bayesian estimationImport data from almost any source using Python, rather than be limited to data that has been cleaned and formatted for statistics toolsUse statistical inference to answer questions about real-world data

PADI Open Water Diver Manual Revised 2010 Version


PADI - 2010
    The PADI Open Water Diver course leads to two possible certifications: PADI SCUBA Diver and PADI Open Water Diver. This book covers basic diving certification topics and techniques including:Choosing, using, maintaining, and storing equipment.Basic training, from pool to open waterDiving physiology, including buoyancy, behavior of gases, the bends, and hypothermia.Dive planning, including decompression dives.Safety and first aid.

Analyzing the Analyzers


Harlan Harris - 2013
    

Hadoop: The Definitive Guide


Tom White - 2009
    Ideal for processing large datasets, the Apache Hadoop framework is an open source implementation of the MapReduce algorithm on which Google built its empire. This comprehensive resource demonstrates how to use Hadoop to build reliable, scalable, distributed systems: programmers will find details for analyzing large datasets, and administrators will learn how to set up and run Hadoop clusters. Complete with case studies that illustrate how Hadoop solves specific problems, this book helps you:Use the Hadoop Distributed File System (HDFS) for storing large datasets, and run distributed computations over those datasets using MapReduce Become familiar with Hadoop's data and I/O building blocks for compression, data integrity, serialization, and persistence Discover common pitfalls and advanced features for writing real-world MapReduce programs Design, build, and administer a dedicated Hadoop cluster, or run Hadoop in the cloud Use Pig, a high-level query language for large-scale data processing Take advantage of HBase, Hadoop's database for structured and semi-structured data Learn ZooKeeper, a toolkit of coordination primitives for building distributed systems If you have lots of data -- whether it's gigabytes or petabytes -- Hadoop is the perfect solution. Hadoop: The Definitive Guide is the most thorough book available on the subject. "Now you have the opportunity to learn about Hadoop from a master-not only of the technology, but also of common sense and plain talk." -- Doug Cutting, Hadoop Founder, Yahoo!