Data Science at the Command Line: Facing the Future with Time-Tested Tools


Jeroen Janssens - 2014
    You'll learn how to combine small, yet powerful, command-line tools to quickly obtain, scrub, explore, and model your data.To get you started--whether you're on Windows, OS X, or Linux--author Jeroen Janssens introduces the Data Science Toolbox, an easy-to-install virtual environment packed with over 80 command-line tools.Discover why the command line is an agile, scalable, and extensible technology. Even if you're already comfortable processing data with, say, Python or R, you'll greatly improve your data science workflow by also leveraging the power of the command line.Obtain data from websites, APIs, databases, and spreadsheetsPerform scrub operations on plain text, CSV, HTML/XML, and JSONExplore data, compute descriptive statistics, and create visualizationsManage your data science workflow using DrakeCreate reusable tools from one-liners and existing Python or R codeParallelize and distribute data-intensive pipelines using GNU ParallelModel data with dimensionality reduction, clustering, regression, and classification algorithms

R in a Nutshell: A Desktop Quick Reference


Joseph Adler - 2009
    R in a Nutshell provides a quick and practical way to learn this increasingly popular open source language and environment. You'll not only learn how to program in R, but also how to find the right user-contributed R packages for statistical modeling, visualization, and bioinformatics.The author introduces you to the R environment, including the R graphical user interface and console, and takes you through the fundamentals of the object-oriented R language. Then, through a variety of practical examples from medicine, business, and sports, you'll learn how you can use this remarkable tool to solve your own data analysis problems.Understand the basics of the language, including the nature of R objectsLearn how to write R functions and build your own packagesWork with data through visualization, statistical analysis, and other methodsExplore the wealth of packages contributed by the R communityBecome familiar with the lattice graphics package for high-level data visualizationLearn about bioinformatics packages provided by Bioconductor"I am excited about this book. R in a Nutshell is a great introduction to R, as well as a comprehensive reference for using R in data analytics and visualization. Adler provides 'real world' examples, practical advice, and scripts, making it accessible to anyone working with data, not just professional statisticians."

MATLAB Programming for Engineers


Stephen J. Chapman - 1999
    The book teaches MATLAB as a technical programming language showing students how to write clean, efficient, and well-documented programs. It makes no pretense at being a complete description of all of MATLAB's hundreds of functions. Instead, it teaches students how to locate any desired function with MATLAB's extensive on line help facilities. Overall, students develop problem-solving skills and are equipped for future courses and careers using the power of MATLAB.

Finite-Dimensional Vector Spaces


Paul R. Halmos - 1947
    The presentation is never awkward or dry, as it sometimes is in other "modern" textbooks; it is as unconventional as one has come to expect from the author. The book contains about 350 well placed and instructive problems, which cover a considerable part of the subject. All in all, this is an excellent work, of equally high value for both student and teacher." Zentralblatt f�r Mathematik

Artificial Intelligence


Elaine Rich - 1983
    I. is explored and explained in this best selling text. Assuming no prior knowledge, it covers topics like neural networks and robotics. This text explores the range of problems which have been and remain to be solved using A. I. tools and techniques. The second half of this text is an excellent reference.

Network Science


Albert-László Barabási
    

Hadoop: The Definitive Guide


Tom White - 2009
    Ideal for processing large datasets, the Apache Hadoop framework is an open source implementation of the MapReduce algorithm on which Google built its empire. This comprehensive resource demonstrates how to use Hadoop to build reliable, scalable, distributed systems: programmers will find details for analyzing large datasets, and administrators will learn how to set up and run Hadoop clusters. Complete with case studies that illustrate how Hadoop solves specific problems, this book helps you:Use the Hadoop Distributed File System (HDFS) for storing large datasets, and run distributed computations over those datasets using MapReduce Become familiar with Hadoop's data and I/O building blocks for compression, data integrity, serialization, and persistence Discover common pitfalls and advanced features for writing real-world MapReduce programs Design, build, and administer a dedicated Hadoop cluster, or run Hadoop in the cloud Use Pig, a high-level query language for large-scale data processing Take advantage of HBase, Hadoop's database for structured and semi-structured data Learn ZooKeeper, a toolkit of coordination primitives for building distributed systems If you have lots of data -- whether it's gigabytes or petabytes -- Hadoop is the perfect solution. Hadoop: The Definitive Guide is the most thorough book available on the subject. "Now you have the opportunity to learn about Hadoop from a master-not only of the technology, but also of common sense and plain talk." -- Doug Cutting, Hadoop Founder, Yahoo!

Storytelling with Data: A Data Visualization Guide for Business Professionals


Cole Nussbaumer Knaflic - 2015
    You'll discover the power of storytelling and the way to make data a pivotal point in your story. The lessons in this illuminative text are grounded in theory, but made accessible through numerous real-world examples--ready for immediate application to your next graph or presentation.Storytelling is not an inherent skill, especially when it comes to data visualization, and the tools at our disposal don't make it any easier. This book demonstrates how to go beyond conventional tools to reach the root of your data, and how to use your data to create an engaging, informative, compelling story. Specifically, you'll learn how to:Understand the importance of context and audience Determine the appropriate type of graph for your situation Recognize and eliminate the clutter clouding your information Direct your audience's attention to the most important parts of your data Think like a designer and utilize concepts of design in data visualization Leverage the power of storytelling to help your message resonate with your audience Together, the lessons in this book will help you turn your data into high impact visual stories that stick with your audience. Rid your world of ineffective graphs, one exploding 3D pie chart at a time. There is a story in your data--Storytelling with Data will give you the skills and power to tell it!

Think Like a Programmer: An Introduction to Creative Problem Solving


V. Anton Spraul - 2012
    In this one-of-a-kind text, author V. Anton Spraul breaks down the ways that programmers solve problems and teaches you what other introductory books often ignore: how to Think Like a Programmer. Each chapter tackles a single programming concept, like classes, pointers, and recursion, and open-ended exercises throughout challenge you to apply your knowledge. You'll also learn how to:Split problems into discrete components to make them easier to solve Make the most of code reuse with functions, classes, and libraries Pick the perfect data structure for a particular job Master more advanced programming tools like recursion and dynamic memory Organize your thoughts and develop strategies to tackle particular types of problems Although the book's examples are written in C++, the creative problem-solving concepts they illustrate go beyond any particular language; in fact, they often reach outside the realm of computer science. As the most skillful programmers know, writing great code is a creative art—and the first step in creating your masterpiece is learning to Think Like a Programmer.

Taming Text: How to Find, Organize, and Manipulate It


Grant S. Ingersoll - 2011
    This causes real problems for everyday users who need to make sense of all the information available, and for software engineers who want to make their text-based applications more useful and user-friendly. Whether building a search engine for a corporate website, automatically organizing email, or extracting important nuggets of information from the news, dealing with unstructured text can be daunting.Taming Text is a hands-on, example-driven guide to working with unstructured text in the context of real-world applications. It explores how to automatically organize text, using approaches such as full-text search, proper name recognition, clustering, tagging, information extraction, and summarization. This book gives examples illustrating each of these topics, as well as the foundations upon which they are built.Purchase of the print book comes with an offer of a free PDF, ePub, and Kindle eBook from Manning. Also available is all code from the book.

Information Dashboard Design: The Effective Visual Communication of Data


Stephen Few - 2006
    Although dashboards are potentially powerful, this potential is rarely realized. The greatest display technology in the world won't solve this if you fail to use effective visual design. And if a dashboard fails to tell you precisely what you need to know in an instant, you'll never use it, even if it's filled with cute gauges, meters, and traffic lights. Don't let your investment in dashboard technology go to waste.This book will teach you the visual design skills you need to create dashboards that communicate clearly, rapidly, and compellingly. Information Dashboard Design will explain how to:Avoid the thirteen mistakes common to dashboard design Provide viewers with the information they need quickly and clearly Apply what we now know about visual perception to the visual presentation of information Minimize distractions, cliches, and unnecessary embellishments that create confusion Organize business information to support meaning and usability Create an aesthetically pleasing viewing experience Maintain consistency of design to provide accurate interpretation Optimize the power of dashboard technology by pairing it with visual effectiveness Stephen Few has over 20 years of experience as an IT innovator, consultant, and educator. As Principal of the consultancy Perceptual Edge, Stephen focuses on data visualization for analyzing and communicating quantitative business information. He provides consulting and training services, speaks frequently at conferences, and teaches in the MBA program at the University of California in Berkeley. He is also the author of Show Me the Numbers: Designing Tables and Graphs to Enlighten. Visit his website at www.perceptualedge.com.

Lucene in Action


Erik Hatcher - 2004
    It describes how to index your data, including types you definitely need to know such as MS Word, PDF, HTML, and XML. It introduces you to searching, sorting, filtering, and highlighting search results.Lucene powers search in surprising placesWhat's Inside- How to integrate Lucene into your applications- Ready-to-use framework for rich document handling- Case studies including Nutch, TheServerSide, jGuru, etc.- Lucene ports to Perl, Python, C#/.Net, and C++- Sorting, filtering, term vectors, multiple, and remote index searching- The new SpanQuery family, extending query parser, hit collecting- Performance testing and tuning- Lucene add-ons (hit highlighting, synonym lookup, and others)

Structure and Interpretation of Computer Programs


Harold Abelson - 1984
    This long-awaited revision contains changes throughout the text. There are new implementations of most of the major programming systems in the book, including the interpreters and compilers, and the authors have incorporated many small changes that reflect their experience teaching the course at MIT since the first edition was published. A new theme has been introduced that emphasizes the central role played by different approaches to dealing with time in computational models: objects with state, concurrent programming, functional programming and lazy evaluation, and nondeterministic programming. There are new example sections on higher-order procedures in graphics and on applications of stream processing in numerical programming, and many new exercises. In addition, all the programs have been reworked to run in any Scheme implementation that adheres to the IEEE standard.

Introduction to Automata Theory, Languages, and Computation


John E. Hopcroft - 1979
    With this long-awaited revision, the authors continue to present the theory in a concise and straightforward manner, now with an eye out for the practical applications. They have revised this book to make it more accessible to today's students, including the addition of more material on writing proofs, more figures and pictures to convey ideas, side-boxes to highlight other interesting material, and a less formal writing style. Exercises at the end of each chapter, including some new, easier exercises, help readers confirm and enhance their understanding of the material. *NEW! Completely rewritten to be less formal, providing more accessibility to todays students. *NEW! Increased usage of figures and pictures to help convey ideas. *NEW! More detail and intuition provided for definitions and proofs. *NEW! Provides special side-boxes to present supplemental material that may be of interest to readers. *NEW! Includes more exercises, including many at a lower level. *NEW! Presents program-like notation for PDAs and Turing machines. *NEW! Increas

Principles of Electronic Communication Systems


Louis E. Frenzel - 1997
    Requiring only basic algebra and trigonometry, the new edition is notable for its readability, learning features and numerous full-color photos and illustrations. A systems approach is used to cover state-of-the-art communications technologies, to best reflect current industry practice. This edition contains greatly expanded and updated material on the Internet, cell phones, and wireless technologies. Practical skills like testing and troubleshooting are integrated throughout. A brand-new Laboratory & Activities Manual provides both hands-on experiments and a variety of other activities, reflecting the variety of skills now needed by technicians. A new Online Learning Center web site is available, with a wealth of learning resources for students. An Instructor Productivity Center CD-ROM features solutions to all problems, PowerPoint lessons, and ExamView test banks for each chapter.