Data Science from Scratch: First Principles with Python


Joel Grus - 2015
    In this book, you’ll learn how many of the most fundamental data science tools and algorithms work by implementing them from scratch. If you have an aptitude for mathematics and some programming skills, author Joel Grus will help you get comfortable with the math and statistics at the core of data science, and with hacking skills you need to get started as a data scientist. Today’s messy glut of data holds answers to questions no one’s even thought to ask. This book provides you with the know-how to dig those answers out. Get a crash course in Python Learn the basics of linear algebra, statistics, and probability—and understand how and when they're used in data science Collect, explore, clean, munge, and manipulate data Dive into the fundamentals of machine learning Implement models such as k-nearest Neighbors, Naive Bayes, linear and logistic regression, decision trees, neural networks, and clustering Explore recommender systems, natural language processing, network analysis, MapReduce, and databases

REST in Practice: Hypermedia and Systems Architecture


Jim Webber - 2010
    You'll learn techniques for implementing specific Web technologies and patterns to solve the needs of a typical company as it grows from modest beginnings to become a global enterprise.Learn basic Web techniques for application integrationUse HTTP and the Web’s infrastructure to build scalable, fault-tolerant enterprise applicationsDiscover the Create, Read, Update, Delete (CRUD) pattern for manipulating resourcesBuild RESTful services that use hypermedia to model state transitions and describe business protocolsLearn how to make Web-based solutions secure and interoperableExtend integration patterns for event-driven computing with the Atom Syndication Format and implement multi-party interactions in AtomPubUnderstand how the Semantic Web will impact systems design

Hadoop: The Definitive Guide


Tom White - 2009
    Ideal for processing large datasets, the Apache Hadoop framework is an open source implementation of the MapReduce algorithm on which Google built its empire. This comprehensive resource demonstrates how to use Hadoop to build reliable, scalable, distributed systems: programmers will find details for analyzing large datasets, and administrators will learn how to set up and run Hadoop clusters. Complete with case studies that illustrate how Hadoop solves specific problems, this book helps you:Use the Hadoop Distributed File System (HDFS) for storing large datasets, and run distributed computations over those datasets using MapReduce Become familiar with Hadoop's data and I/O building blocks for compression, data integrity, serialization, and persistence Discover common pitfalls and advanced features for writing real-world MapReduce programs Design, build, and administer a dedicated Hadoop cluster, or run Hadoop in the cloud Use Pig, a high-level query language for large-scale data processing Take advantage of HBase, Hadoop's database for structured and semi-structured data Learn ZooKeeper, a toolkit of coordination primitives for building distributed systems If you have lots of data -- whether it's gigabytes or petabytes -- Hadoop is the perfect solution. Hadoop: The Definitive Guide is the most thorough book available on the subject. "Now you have the opportunity to learn about Hadoop from a master-not only of the technology, but also of common sense and plain talk." -- Doug Cutting, Hadoop Founder, Yahoo!

Introduction to Algorithms


Thomas H. Cormen - 1989
    Each chapter is relatively self-contained and can be used as a unit of study. The algorithms are described in English and in a pseudocode designed to be readable by anyone who has done a little programming. The explanations have been kept elementary without sacrificing depth of coverage or mathematical rigor.

Understanding Computation: From Simple Machines to Impossible Programs


Tom Stuart - 2013
    Understanding Computation explains theoretical computer science in a context you’ll recognize, helping you appreciate why these ideas matter and how they can inform your day-to-day programming.Rather than use mathematical notation or an unfamiliar academic programming language like Haskell or Lisp, this book uses Ruby in a reductionist manner to present formal semantics, automata theory, and functional programming with the lambda calculus. It’s ideal for programmers versed in modern languages, with little or no formal training in computer science.* Understand fundamental computing concepts, such as Turing completeness in languages* Discover how programs use dynamic semantics to communicate ideas to machines* Explore what a computer can do when reduced to its bare essentials* Learn how universal Turing machines led to today’s general-purpose computers* Perform complex calculations, using simple languages and cellular automata* Determine which programming language features are essential for computation* Examine how halting and self-referencing make some computing problems unsolvable* Analyze programs by using abstract interpretation and type systems

A Bug Hunter's Diary: A Guided Tour Through the Wilds of Software Security


Tobias Klein - 2011
    In this one-of-a-kind account, you'll see how the developers responsible for these flaws patched the bugs—or failed to respond at all. As you follow Klein on his journey, you'll gain deep technical knowledge and insight into how hackers approach difficult problems and experience the true joys (and frustrations) of bug hunting.Along the way you'll learn how to:Use field-tested techniques to find bugs, like identifying and tracing user input data and reverse engineering Exploit vulnerabilities like NULL pointer dereferences, buffer overflows, and type conversion flaws Develop proof of concept code that verifies the security flaw Report bugs to vendors or third party brokersA Bug Hunter's Diary is packed with real-world examples of vulnerable code and the custom programs used to find and test bugs. Whether you're hunting bugs for fun, for profit, or to make the world a safer place, you'll learn valuable new skills by looking over the shoulder of a professional bug hunter in action.

xUnit Test Patterns: Refactoring Test Code


Gerard Meszaros - 2003
    An effective testing strategy will deliver new functionality more aggressively, accelerate user feedback, and improve quality. However, for many developers, creating effective automated tests is a unique and unfamiliar challenge. xUnit Test Patterns is the definitive guide to writing automated tests using xUnit, the most popular unit testing framework in use today. Agile coach and test automation expert Gerard Meszaros describes 68 proven patterns for making tests easier to write, understand, and maintain. He then shows you how to make them more robust and repeatable--and far more cost-effective. Loaded with information, this book feels like three books in one. The first part is a detailed tutorial on test automation that covers everything from test strategy to in-depth test coding. The second part, a catalog of 18 frequently encountered "test smells," provides trouble-shooting guidelines to help you determine the root cause of problems and the most applicable patterns. The third part contains detailed descriptions of each pattern, including refactoring instructions illustrated by extensive code samples in multiple programming languages. Topics covered includeWriting better tests--and writing them faster The four phases of automated tests: fixture setup, exercising the system under test, result verification, and fixture teardown Improving test coverage by isolating software from its environment using Test Stubs and Mock Objects Designing software for greater testability Using test "smells" (including code smells, behavior smells, and project smells) to spot problems and know when and how to eliminate them Refactoring tests for greater simplicity, robustness, and execution speed This book will benefit developers, managers, and testers working with any agile or conventional development process, whether doing test-driven development or writing the tests last. While the patterns and smells are especially applicable to all members of the xUnit family, they also apply to next-generation behavior-driven development frameworks such as RSpec and JBehave and to other kinds of test automation tools, including recorded test tools and data-driven test tools such as Fit and FitNesse.Visual Summary of the Pattern Language Foreword Preface Acknowledgments Introduction Refactoring a Test PART I: The Narratives Chapter 1 A Brief Tour Chapter 2 Test Smells Chapter 3 Goals of Test Automation Chapter 4 Philosophy of Test Automation Chapter 5 Principles of Test Automation Chapter 6 Test Automation Strategy Chapter 7 xUnit Basics Chapter 8 Transient Fixture Management Chapter 9 Persistent Fixture Management Chapter 10 Result Verification Chapter 11 Using Test Doubles Chapter 12 Organizing Our Tests Chapter 13 Testing with Databases Chapter 14 A Roadmap to Effective Test Automation PART II: The Test Smells Chapter 15 Code Smells Chapter 16 Behavior Smells Chapter 17 Project Smells PART III: The Patterns Chapter 18 Test Strategy Patterns Chapter 19 xUnit Basics Patterns Chapter 20 Fixture Setup Patterns Chapter 21 Result Verification Patterns Chapter 22 Fixture Teardown Patterns Chapter 23 Test Double Patterns Chapter 24 Test Organization Patterns Chapter 25 Database Patterns Chapter 26 Design-for-Testability Patterns Chapter 27 Value Patterns PART IV: Appendixes Appendix A Test Refactorings Appendix B xUnit Terminology Appendix C xUnit Family Members Appendix D Tools Appendix E Goals and Principles Appendix F Smells, Aliases, and Causes Appendix G Patterns, Aliases, and Variations Glossary References Index "

Head First Java


Kathy Sierra - 2005
    You might think the problem is your brain. It seems to have a mind of its own, a mind that doesn't always want to take in the dry, technical stuff you're forced to study. The fact is your brain craves novelty. It's constantly searching, scanning, waiting for something unusual to happen. After all, that's the way it was built to help you stay alive. It takes all the routine, ordinary, dull stuff and filters it to the background so it won't interfere with your brain's real work--recording things that matter. How does your brain know what matters? It's like the creators of the Head First approach say, suppose you're out for a hike and a tiger jumps in front of you, what happens in your brain? Neurons fire. Emotions crank up. Chemicals surge. That's how your brain knows.And that's how your brain will learn Java. Head First Java combines puzzles, strong visuals, mysteries, and soul-searching interviews with famous Java objects to engage you in many different ways. It's fast, it's fun, and it's effective. And, despite its playful appearance, Head First Java is serious stuff: a complete introduction to object-oriented programming and Java. You'll learn everything from the fundamentals to advanced topics, including threads, network sockets, and distributed programming with RMI. And the new. second edition focuses on Java 5.0, the latest version of the Java language and development platform. Because Java 5.0 is a major update to the platform, with deep, code-level changes, even more careful study and implementation is required. So learning the Head First way is more important than ever. If you've read a Head First book, you know what to expect--a visually rich format designed for the way your brain works. If you haven't, you're in for a treat. You'll see why people say it's unlike any other Java book you've ever read.By exploiting how your brain works, Head First Java compresses the time it takes to learn and retain--complex information. Its unique approach not only shows you what you need to know about Java syntax, it teaches you to think like a Java programmer. If you want to be bored, buy some other book. But if you want to understand Java, this book's for you.

Continuous Integration: Improving Software Quality and Reducing Risk


Paul Duvall - 2007
    The key, as the authors show, is to integrate regularly and often using continuous integration (CI) practices and techniques. The authors first examine the concept of CI and its practices from the ground up and then move on to explore other effective processes performed by CI systems, such as database integration, testing, inspection, deployment, and feedback. Through more than forty CI-related practices using application examples in different languages, readers learn that CI leads to more rapid software development, produces deployable software at every step in the development lifecycle, and reduces the time between defect introduction and detection, saving time and lowering costs. With successful implementation of CI, developers reduce risks and repetitive manual processes, and teams receive better project visibility. The book covers How to make integration a "non-event" on your software development projects How to reduce the amount of repetitive processes you perform when building your software Practices and techniques for using CI effectively with your teams Reducing the risks of late defect discovery, low-quality software, lack of visibility, and lack of deployable software Assessments of different CI servers and related tools on the market The book's companion Web site, www.integratebutton.com, provides updates and code examples

Python for Data Analysis


Wes McKinney - 2011
    It is also a practical, modern introduction to scientific computing in Python, tailored for data-intensive applications. This is a book about the parts of the Python language and libraries you'll need to effectively solve a broad set of data analysis problems. This book is not an exposition on analytical methods using Python as the implementation language.Written by Wes McKinney, the main author of the pandas library, this hands-on book is packed with practical cases studies. It's ideal for analysts new to Python and for Python programmers new to scientific computing.Use the IPython interactive shell as your primary development environmentLearn basic and advanced NumPy (Numerical Python) featuresGet started with data analysis tools in the pandas libraryUse high-performance tools to load, clean, transform, merge, and reshape dataCreate scatter plots and static or interactive visualizations with matplotlibApply the pandas groupby facility to slice, dice, and summarize datasetsMeasure data by points in time, whether it's specific instances, fixed periods, or intervalsLearn how to solve problems in web analytics, social sciences, finance, and economics, through detailed examples

High Performance Web Sites


Steve Souders - 2007
    Author Steve Souders, in his job as Chief Performance Yahoo!, collected these best practices while optimizing some of the most-visited pages on the Web. Even sites that had already been highly optimized, such as Yahoo! Search and the Yahoo! Front Page, were able to benefit from these surprisingly simple performance guidelines.The rules in High Performance Web Sites explain how you can optimize the performance of the Ajax, CSS, JavaScript, Flash, and images that you've already built into your site -- adjustments that are critical for any rich web application. Other sources of information pay a lot of attention to tuning web servers, databases, and hardware, but the bulk of display time is taken up on the browser side and by the communication between server and browser. High Performance Web Sites covers every aspect of that process.Each performance rule is supported by specific examples, and code snippets are available on the book's companion web site. The rules include how to: Make Fewer HTTP RequestsUse a Content Delivery NetworkAdd an Expires HeaderGzip ComponentsPut Stylesheets at the TopPut Scripts at the BottomAvoid CSS ExpressionsMake JavaScript and CSS ExternalReduce DNS LookupsMinify JavaScriptAvoid RedirectsRemove Duplicates ScriptsConfigure ETagsMake Ajax CacheableIf you're building pages for high traffic destinations and want to optimize the experience of users visiting your site, this book is indispensable.If everyone would implement just 20% of Steve's guidelines, the Web would be adramatically better place. Between this book and Steve's YSlow extension, there's reallyno excuse for having a sluggish web site anymore.-Joe Hewitt, Developer of Firebug debugger and Mozilla's DOM InspectorSteve Souders has done a fantastic job of distilling a massive, semi-arcane art down to a set of concise, actionable, pragmatic engineering steps that will change the world of web performance.-Eric Lawrence, Developer of the Fiddler Web Debugger, Microsoft Corporation

The Mythical Man-Month: Essays on Software Engineering


Frederick P. Brooks Jr. - 1975
    With a blend of software engineering facts and thought-provoking opinions, Fred Brooks offers insight for anyone managing complex projects. These essays draw from his experience as project manager for the IBM System/360 computer family and then for OS/360, its massive software system. Now, 45 years after the initial publication of his book, Brooks has revisited his original ideas and added new thoughts and advice, both for readers already familiar with his work and for readers discovering it for the first time.The added chapters contain (1) a crisp condensation of all the propositions asserted in the original book, including Brooks' central argument in The Mythical Man-Month: that large programming projects suffer management problems different from small ones due to the division of labor; that the conceptual integrity of the product is therefore critical; and that it is difficult but possible to achieve this unity; (2) Brooks' view of these propositions a generation later; (3) a reprint of his classic 1986 paper "No Silver Bullet"; and (4) today's thoughts on the 1986 assertion, "There will be no silver bullet within ten years."

Database Design for Mere Mortals: A Hands-On Guide to Relational Database Design


Michael J. Hernandez - 1996
    You d be up to your neck in normal forms before you even had a chance to wade. When Michael J. Hernandez needed a database design book to teach mere mortals like himself, there were none. So he began a personal quest to learn enough to write one. And he did.Now in its Second Edition, Database Design for Mere Mortals is a miracle for today s generation of database users who don t have the background -- or the time -- to learn database design the hard way. It s also a secret pleasure for working pros who are occasionally still trying to figure out what they were taught.Drawing on 13 years of database teaching experience, Hernandez has organized database design into several key principles that are surprisingly easy to understand and remember. He illuminates those principles using examples that are generic enough to help you with virtually any application.Hernandez s goals are simple. You ll learn how to create a sound database structure as easily as possible. You ll learn how to optimize your structure for efficiency and data integrity. You ll learn how to avoid problems like missing, incorrect, mismatched, or inaccurate data. You ll learn how to relate tables together to make it possible to get whatever answers you need in the future -- even if you haven t thought of the questions yet.If -- as is often the case -- you already have a database, Hernandez explains how to analyze it -- and leverage it. You ll learn how to identify new information requirements, determine new business rules that need to be applied, and apply them.Hernandez starts with an introduction to databases, relational databases, and the idea and objectives of database design. Next, you ll walk through the key elements of the database design process: establishing table structures and relationships, assigning primary keys, setting field specifications, and setting up views. Hernandez s extensive coverage of data integrity includes a full chapter on establishing business rules and using validation tables.Hernandez surveys bad design techniques in a chapter on what not to do -- and finally, helps you identify those rare instances when it makes sense to bend or even break the conventional rules of database design.There s plenty that s new in this edition. Hernandez has gone over his text and illustrations with a fine-tooth comb to improve their already impressive clarity. You ll find updates to reflect new advances in technology, including web database applications. There are expanded and improved discussions of nulls and many-to-many relationships; multivalued fields; primary keys; and SQL data type fields. There s a new Quick Reference database design flowchart. A new glossary. New review questions at the end of every chapter.Finally, it s worth mentioning what this book isn t. It isn t a guide to any specific database platform -- so you can use it whether you re running Access, SQL Server, or Oracle, MySQL or PostgreSQL. And it isn t an SQL guide. (If that s what you need, Michael J. Hernandez has also coauthored the superb SQL Queries for Mere Mortals). But if database design is what you need to learn, this book s worth its weight in gold. Bill CamardaBill Camarda is a consultant, writer, and web/multimedia content developer. His 15 books include Special Edition Using Word 2000 and Upgrading & Fixing Networks for Dummies, Second Edition.

The Psychology of Computer Programming


Gerald M. Weinberg - 1971
    Weinberg adds new insights and highlights the similarities and differences between now and then. Using a conversational style that invites the reader to join him, Weinberg reunites with some of his most insightful writings on the human side of software engineering.Topics include egoless programming, intelligence, psychological measurement, personality factors, motivation, training, social problems on large projects, problem-solving ability, programming language design, team formation, the programming environment, and much more.Dorset House Publishing is proud to make this important text available to new generations of programmers -- and to encourage readers of the first edition to return to its valuable lessons.

Mastering Bitcoin: Unlocking Digital Cryptocurrencies


Andreas M. Antonopoulos - 2014
    Whether you're building the next killer app, investing in a startup, or simply curious about the technology, this practical book is essential reading.Bitcoin, the first successful decentralized digital currency, is still in its infancy and it's already spawned a multi-billion dollar global economy. This economy is open to anyone with the knowledge and passion to participate. Mastering Bitcoin provides you with the knowledge you need (passion not included).This book includes:A broad introduction to bitcoin--ideal for non-technical users, investors, and business executivesAn explanation of the technical foundations of bitcoin and cryptographic currencies for developers, engineers, and software and systems architectsDetails of the bitcoin decentralized network, peer-to-peer architecture, transaction lifecycle, and security principlesOffshoots of the bitcoin and blockchain inventions, including alternative chains, currencies, and applicationsUser stories, analogies, examples, and code snippets illustrating key technical concepts