Data Science from Scratch: First Principles with Python
Joel Grus - 2015
In this book, you’ll learn how many of the most fundamental data science tools and algorithms work by implementing them from scratch.
If you have an aptitude for mathematics and some programming skills, author Joel Grus will help you get comfortable with the math and statistics at the core of data science, and with hacking skills you need to get started as a data scientist. Today’s messy glut of data holds answers to questions no one’s even thought to ask. This book provides you with the know-how to dig those answers out.
Get a crash course in Python
Learn the basics of linear algebra, statistics, and probability—and understand how and when they're used in data science
Collect, explore, clean, munge, and manipulate data
Dive into the fundamentals of machine learning
Implement models such as k-nearest Neighbors, Naive Bayes, linear and logistic regression, decision trees, neural networks, and clustering
Explore recommender systems, natural language processing, network analysis, MapReduce, and databases
Ejb 3 in Action
Debu Panda - 2007
This book builds on the contributions and strengths of seminal technologies like Spring, Hibernate, and TopLink.EJB 3 is the most important innovation introduced in Java EE 5.0. EJB 3 simplifies enterprise development, abandoning the complex EJB 2.x model in favor of a lightweight POJO framework. The new API represents a fresh perspective on EJB without sacrificing the mission of enabling business application developers to create robust, scalable, standards-based solutions.EJB 3 in Action is a fast-paced tutorial, geared toward helping you learn EJB 3 and the Java Persistence API quickly and easily. For newcomers to EJB, this book provides a solid foundation in EJB. For the developer moving to EJB 3 from EJB 2, this book addresses the changes both in the EJB API and in the way the developer should approach EJB and persistence.
UNIX and Linux System Administration Handbook
Evi Nemeth - 2010
This is one of those cases. The UNIX System Administration Handbook is one of the few books we ever measured ourselves against." -From the Foreword by Tim O'Reilly, founder of O'Reilly Media "This book is fun and functional as a desktop reference. If you use UNIX and Linux systems, you need this book in your short-reach library. It covers a bit of the systems' history but doesn't bloviate. It's just straightfoward information delivered in colorful and memorable fashion." -Jason A. Nunnelley"This is a comprehensive guide to the care and feeding of UNIX and Linux systems. The authors present the facts along with seasoned advice and real-world examples. Their perspective on the variations among systems is valuable for anyone who runs a heterogeneous computing facility." -Pat Parseghian The twentieth anniversary edition of the world's best-selling UNIX system administration book has been made even better by adding coverage of the leading Linux distributions: Ubuntu, openSUSE, and RHEL. This book approaches system administration in a practical way and is an invaluable reference for both new administrators and experienced professionals. It details best practices for every facet of system administration, including storage management, network design and administration, email, web hosting, scripting, software configuration management, performance analysis, Windows interoperability, virtualization, DNS, security, management of IT service organizations, and much more. UNIX(R) and Linux(R) System Administration Handbook, Fourth Edition, reflects the current versions of these operating systems: Ubuntu(R) LinuxopenSUSE(R) LinuxRed Hat(R) Enterprise Linux(R)Oracle America(R) Solaris(TM) (formerly Sun Solaris)HP HP-UX(R)IBM AIX(R)
The Practice of System and Network Administration
Thomas A. Limoncelli - 2001
Whether you use Linux, Unix, or Windows, this newly revised edition describes the essential practices previously handed down only from mentor to protege. This wonderfully lucid, often funny cornucopia of information introduces beginners to advanced frameworks valuable for their entire career, yet is structured to help even the most advanced experts through difficult projects.The book's four major sections build your knowledge with the foundational elements of system administration. These sections guide you through better techniques for upgrades and change management, catalog best practices for IT services, and explore various management topics. Chapters are divided into The Basics and The Icing. When you get the Basics right it makes every other aspect of the job easier--such as automating the right things first. The Icing sections contain all the powerful things that can be done on top of the basics to wow customers and managers.Inside, you'll find advice on topics such asThe key elements your networks and systems need in order to make all other services run better Building and running reliable, scalable services, including web, storage, email, printing, and remote access Creating and enforcing security policies Upgrading multiple hosts at one time without creating havoc Planning for and performing flawless scheduled maintenance windows Managing superior helpdesks and customer care Avoiding the -temporary fix- trap Building data centers that improve server uptime Designing networks for speed and reliability Web scaling and security issues Why building a backup system isn't about backups Monitoring what you have and predicting what you will need How technically oriented workers can maintain their job's technical focus (and avoid an unwanted management role) Technical management issues, including morale, organization building, coaching, and maintaining positive visibility Personal skill techniques, including secrets for getting more done each day, ethical dilemmas, managing your boss, and loving your job System administration salary negotiation It's no wonder the first edition received Usenix SAGE's 2005 Outstanding Achievement Award!This eagerly anticipated second edition updates this time-proven classic:Chapters reordered for easier navigationThousands of updates and clarifications based on reader feedbackPlus three entirely new chapters: Web Services, Data Storage, and Documentation
Java Generics and Collections: Speed Up the Java Development Process
Maurice Naftalin - 2006
Generics and the greatly expanded collection libraries have tremendously increased the power of Java 5 and Java 6. But they have also confused many developers who haven't known how to take advantage of these new features.Java Generics and Collections covers everything from the most basic uses of generics to the strangest corner cases. It teaches you everything you need to know about the collections libraries, so you'll always know which collection is appropriate for any given task, and how to use it.Topics covered include:• Fundamentals of generics: type parameters and generic methods• Other new features: boxing and unboxing, foreach loops, varargs• Subtyping and wildcards• Evolution not revolution: generic libraries with legacy clients and generic clients with legacy libraries• Generics and reflection• Design patterns for generics• Sets, Queues, Lists, Maps, and their implementations• Concurrent programming and thread safety with collections• Performance implications of different collectionsGenerics and the new collection libraries they inspired take Java to a new level. If you want to take your software development practice to a new level, this book is essential reading.Philip Wadler is Professor of Theoretical Computer Science at the University of Edinburgh, where his research focuses on the design of programming languages. He is a co-designer of GJ, work that became the basis for generics in Sun's Java 5.0.Maurice Naftalin is Technical Director at Morningside Light Ltd., a software consultancy in the United Kingdom. He has most recently served as an architect and mentor at NSB Retail Systems plc, and as the leader of the client development team of a major UK government social service system."A brilliant exposition of generics. By far the best book on the topic, it provides a crystal clear tutorial that starts with the basics and ends leaving the reader with a deep understanding of both the use and design of generics." Gilad Bracha, Java Generics Lead, Sun Microsystems
A Bug Hunter's Diary: A Guided Tour Through the Wilds of Software Security
Tobias Klein - 2011
In this one-of-a-kind account, you'll see how the developers responsible for these flaws patched the bugs—or failed to respond at all. As you follow Klein on his journey, you'll gain deep technical knowledge and insight into how hackers approach difficult problems and experience the true joys (and frustrations) of bug hunting.Along the way you'll learn how to:Use field-tested techniques to find bugs, like identifying and tracing user input data and reverse engineering Exploit vulnerabilities like NULL pointer dereferences, buffer overflows, and type conversion flaws Develop proof of concept code that verifies the security flaw Report bugs to vendors or third party brokersA Bug Hunter's Diary is packed with real-world examples of vulnerable code and the custom programs used to find and test bugs. Whether you're hunting bugs for fun, for profit, or to make the world a safer place, you'll learn valuable new skills by looking over the shoulder of a professional bug hunter in action.
Tubes: A Journey to the Center of the Internet
Andrew Blum - 2012
But what is it physically? And where is it really? Our mental map of the network is as blank as the map of the ocean that Columbus carried on his first Atlantic voyage. The Internet, its material nuts and bolts, is an unexplored territory. Until now.In Tubes, journalist Andrew Blum goes inside the Internet's physical infrastructure and flips on the lights, revealing an utterly fresh look at the online world we think we know. It is a shockingly tactile realm of unmarked compounds, populated by a special caste of engineer who pieces together our networks by hand; where glass fibers pulse with light and creaky telegraph buildings, tortuously rewired, become communication hubs once again. From the room in Los Angeles where the Internet first flickered to life to the caverns beneath Manhattan where new fiber-optic cable is buried; from the coast of Portugal, where a ten-thousand-mile undersea cable just two thumbs wide connects Europe and Africa, to the wilds of the Pacific Northwest, where Google, Microsoft, and Facebook have built monumental data centers—Blum chronicles the dramatic story of the Internet's development, explains how it all works, and takes the first-ever in-depth look inside its hidden monuments.This is a book about real places on the map: their sounds and smells, their storied pasts, their physical details, and the people who live there. For all the talk of the "placelessness" of our digital age, the Internet is as fixed in real, physical spaces as the railroad or telephone. You can map it and touch it, and you can visit it. Is the Internet in fact "a series of tubes" as Ted Stevens, the late senator from Alaska, once famously described it? How can we know the Internet's possibilities if we don't know its parts?Like Tracy Kidder's classic The Soul of a New Machine or Tom Vanderbilt's recent bestseller Traffic, Tubes combines on-the-ground reporting and lucid explanation into an engaging, mind-bending narrative to help us understand the physical world that underlies our digital lives.
HBase: The Definitive Guide
Lars George - 2011
As the open source implementation of Google's BigTable architecture, HBase scales to billions of rows and millions of columns, while ensuring that write and read performance remain constant. Many IT executives are asking pointed questions about HBase. This book provides meaningful answers, whether you’re evaluating this non-relational database or planning to put it into practice right away.
Discover how tight integration with Hadoop makes scalability with HBase easier
Distribute large datasets across an inexpensive cluster of commodity servers
Access HBase with native Java clients, or with gateway servers providing REST, Avro, or Thrift APIs
Get details on HBase’s architecture, including the storage format, write-ahead log, background processes, and more
Integrate HBase with Hadoop's MapReduce framework for massively parallelized data processing jobs
Learn how to tune clusters, design schemas, copy tables, import bulk data, decommission nodes, and many other tasks
The Problem with Software: Why Smart Engineers Write Bad Code
Adam Barr - 2018
As the size and complexity of commercial software have grown, the gap between academic computer science and industry has widened. It's an open secret that there is little engineering in software engineering, which continues to rely not on codified scientific knowledge but on intuition and experience.Barr, who worked as a programmer for more than twenty years, describes how the industry has evolved, from the era of mainframes and Fortran to today's embrace of the cloud. He explains bugs and why software has so many of them, and why today's interconnected computers offer fertile ground for viruses and worms. The difference between good and bad software can be a single line of code, and Barr includes code to illustrate the consequences of seemingly inconsequential choices by programmers. Looking to the future, Barr writes that the best prospect for improving software engineering is the move to the cloud. When software is a service and not a product, companies will have more incentive to make it good rather than "good enough to ship."
Think Stats
Allen B. Downey - 2011
This concise introduction shows you how to perform statistical analysis computationally, rather than mathematically, with programs written in Python.You'll work with a case study throughout the book to help you learn the entire data analysis process—from collecting data and generating statistics to identifying patterns and testing hypotheses. Along the way, you'll become familiar with distributions, the rules of probability, visualization, and many other tools and concepts.Develop your understanding of probability and statistics by writing and testing codeRun experiments to test statistical behavior, such as generating samples from several distributionsUse simulations to understand concepts that are hard to grasp mathematicallyLearn topics not usually covered in an introductory course, such as Bayesian estimationImport data from almost any source using Python, rather than be limited to data that has been cleaned and formatted for statistics toolsUse statistical inference to answer questions about real-world data
The Developer's Code: What Real Programmers Do
Ka Wai Cheung - 2012
There are no trite superlatives here. Packed with lessons learned from more than a decade of software development experience, author Ka Wai Cheung takes you through the programming profession from nearly every angle to uncover ways of sustaining a healthy connection with your work. You'll see how to stay productive even on the longest projects. You'll create a workflow that works with you, not against you. And you'll learn how to deal with clients whose goals don't align with your own. If you don't handle them just right, issues such as these can crush even the most seasoned, motivated developer. But with the right approach, you can transcend these common problems and become the professional developer you want to be. In more than 50 nuggets of wisdom, you'll learn: Why many traditional approaches to process and development roles in this industry are wrong - and how to sniff them out. Why you must always say "no" to the software pet project and open-ended timelines. How to incorporate code generation into your development process, and why its benefits go far beyond just faster code output. What to do when your client or end user disagrees with an approach you believe in. How to pay your knowledge forward to future generations of programmers through teaching and evangelism. If you're in this industry for the long run, you'll be coming back to this book again and again.
Data Science for Business: What you need to know about data mining and data-analytic thinking
Foster Provost - 2013
This guide also helps you understand the many data-mining techniques in use today.Based on an MBA course Provost has taught at New York University over the past ten years, Data Science for Business provides examples of real-world business problems to illustrate these principles. You’ll not only learn how to improve communication between business stakeholders and data scientists, but also how participate intelligently in your company’s data science projects. You’ll also discover how to think data-analytically, and fully appreciate how data science methods can support business decision-making.Understand how data science fits in your organization—and how you can use it for competitive advantageTreat data as a business asset that requires careful investment if you’re to gain real valueApproach business problems data-analytically, using the data-mining process to gather good data in the most appropriate wayLearn general concepts for actually extracting knowledge from dataApply data science principles when interviewing data science job candidates