Book picks similar to
Feature Engineering for Machine Learning by Alice Zheng
data-science
machine-learning
data
computer-science
Working Effectively with Legacy Code
Michael C. Feathers - 2004
This book draws on material Michael created for his renowned Object Mentor seminars, techniques Michael has used in mentoring to help hundreds of developers, technical managers, and testers bring their legacy systems under control. The topics covered include:
Understanding the mechanics of software change: adding features, fixing bugs, improving design, optimizing performance
Getting legacy code into a test harness
Writing tests that protect you against introducing new problems
Techniques that can be used with any language or platform, with examples in Java, C++, C, and C#
Accurately identifying where code changes need to be made
Coping with legacy systems that aren't object-oriented
Handling applications that don't seem to have any structure
This book also includes a catalog of twenty-four dependency-breaking techniques that help you work with program elements in isolation and make safer changes.
Data Mining: Concepts and Techniques (The Morgan Kaufmann Series in Data Management Systems)
Jiawei Han - 2000
Not only are all of our business, scientific, and government transactions now computerized, but the widespread use of digital cameras, publication tools, and bar codes also generates data. On the collection side, scanned text and image platforms, satellite remote sensing systems, and the World Wide Web have flooded us with a tremendous amount of data. This explosive growth has generated an even more urgent need for new techniques and automated tools that can help us transform this data into useful information and knowledge.
Like the first edition, voted the most popular data mining book by KDnuggets readers, this book explores concepts and techniques for the discovery of patterns hidden in large data sets, focusing on issues relating to their feasibility, usefulness, effectiveness, and scalability. However, since the publication of the first edition, great progress has been made in the development of new data mining methods, systems, and applications. This new edition substantially enhances the first edition, and new chapters have been added to address recent developments on mining complex types of data, including stream data, sequence data, graph structured data, social network data, and multi-relational data.
A comprehensive, practical look at the concepts and techniques you need to know to get the most out of real business data
Updates that incorporate input from readers, changes in the field, and more material on statistics and machine learning
Dozens of algorithms and implementation examples, all in easily understood pseudo-code and suitable for use in real-world, large-scale data mining projects
Complete classroom support for instructors at the www.mkp.com/datamining2e companion site
Silence on the Wire: A Field Guide to Passive Reconnaissance and Indirect Attacks
Michal Zalewski - 2005
Silence on the Wire uncovers these silent attacks so that system administrators can defend against them, as well as better understand and monitor their systems.
Silence on the Wire dissects several unique and fascinating security and privacy problems associated with the technologies and protocols used in everyday computing, and shows how to use this knowledge to learn more about others or to better defend systems. By taking an in-depth look at modern computing, from hardware on up, the book helps the system administrator to better understand security issues, and to approach networking from a new, more creative perspective. The sys admin can apply this knowledge to network monitoring, policy enforcement, evidence analysis, IDS, honeypots, firewalls, and forensics.
Statistics Done Wrong: The Woefully Complete Guide
Alex Reinhart - 2013
Politicians and marketers present shoddy evidence for dubious claims all the time. But smart people make mistakes too, and when it comes to statistics, plenty of otherwise great scientists--yes, even those published in peer-reviewed journals--are doing statistics wrong.
"Statistics Done Wrong" comes to the rescue with cautionary tales of all-too-common statistical fallacies. It'll help you see where and why researchers often go wrong and teach you the best practices for avoiding their mistakes.
In this book, you'll learn:
- Why "statistically significant" doesn't necessarily imply practical significance
- Ideas behind hypothesis testing and regression analysis, and common misinterpretations of those ideas
- How and how not to ask questions, design experiments, and work with data
- Why many studies have too little data to detect what they're looking for, and, surprisingly, why this means published results are often overestimates
- Why false positives are much more common than "significant at the 5% level" would suggest
By walking through colorful examples of statistics gone awry, the book offers approachable lessons on proper methodology, and each chapter ends with pro tips for practicing scientists and statisticians. No matter what your level of experience, "Statistics Done Wrong" will teach you how to be a better analyst, data scientist, or researcher.
Code: The Hidden Language of Computer Hardware and Software
Charles Petzold - 1999
And through CODE, we see how this ingenuity and our very human compulsion to communicate have driven the technological innovations of the past two centuries. Using everyday objects and familiar language systems such as Braille and Morse code, author Charles Petzold weaves an illuminating narrative for anyone who’s ever wondered about the secret inner life of computers and other smart machines. It’s a cleverly illustrated and eminently comprehensible story—and along the way, you’ll discover you’ve gained a real context for understanding today’s world of PCs, digital media, and the Internet. No matter what your level of technical savvy, CODE will charm you—and perhaps even awaken the technophile within.
Bitcoin for the Befuddled
Conrad Barski - 2014
Already used by people and companies around the world, many forecast that Bitcoin could radically transform the global economy. The value of a bitcoin has soared from less than a dollar in 2011 to well over $1000 in 2013, with many spikes and crashes along the way. The rise in value has brought Bitcoin into the public eye, but the cryptocurrency still confuses many people. Bitcoin for the Befuddled covers everything you need to know about Bitcoin—what it is, how it works, and how to acquire, store, and use bitcoins safely and securely. You'll also learn about Bitcoin's history, its complex cryptography, and its potential impact on trade and commerce. The book includes a humorous, full-color comic explaining Bitcoin concepts, plus a glossary of terms for easy reference.
Bayes Theorem: A Visual Introduction For Beginners
Dan Morris - 2016
Bayesian statistics is taught in most first-year statistics classes across the nation, but there is one major problem that many students (and others who are interested in the theorem) face. The theorem is not intuitive for most people, and understanding how it works can be a challenge, especially because it is often taught without visual aids. In this guide, we unpack the various components of the theorem and provide a basic overview of how it works, with illustrations to help. Three scenarios - the flu, breathalyzer tests, and peacekeeping - are used throughout the booklet to teach how problems involving Bayes Theorem can be approached and solved. Over 60 hand-drawn visuals are included throughout to help you work through each problem as you learn by example. The illustrations are simple, hand-drawn, and in black and white.
For those interested, we have also included sections typically not found in other beginner guides to Bayes Rule. These include:
A short tutorial on how to understand problem scenarios and find P(B), P(A), and P(B|A). For many people, knowing how to approach scenarios and break them apart can be daunting. In this booklet, we provide a quick step-by-step reference on how to confidently understand scenarios.
A few examples of how to think like a Bayesian in everyday life. Bayes Rule might seem somewhat abstract, but it can be applied to many areas of life and help you make better decisions. It is a great tool that can help you with critical thinking, problem-solving, and dealing with the gray areas of life.
A concise history of Bayes Rule. Bayes Theorem has a fascinating 200+ year history, and we have summed it up for you in this booklet. From its discovery in the 1700s to its use in breaking the German Enigma code during World War II, its tale is quite phenomenal.
Fascinating real-life stories on how Bayes formula is used in everyday life. From search and rescue to spam filtering and driverless cars, Bayes is used in many areas of modern-day life. We have summed up 3 examples for you and provided an example of how Bayes could be used.
An expanded definitions, notations, and proof section. We have included an expanded definitions and notations section at the end of the booklet. In this section we define core terms more concretely, and also cover additional terms you might be confused about.
A recommended readings section. From The Theory That Would Not Die to a few other books, there are a number of recommendations we have for further reading. Take a look!
If you are a visual learner and like to learn by example, this intuitive booklet might be a good fit for you. Bayesian statistics is an incredibly fascinating topic and likely touches your life every single day. It is a very important tool that is used in data analysis throughout a wide range of industries - so take an easy dive into the theorem for yourself with a visual approach! If you are looking for a short beginner's guide packed with visual examples, this booklet is for you.
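For readers who want to see the quantities P(A), P(B), and P(B|A) from the blurb above in action, here is a minimal Python sketch of Bayes Theorem. It is not taken from the booklet: the function name `posterior` and the flu numbers are hypothetical, chosen only to illustrate the arithmetic.

```python
def posterior(p_a: float, p_b_given_a: float, p_b_given_not_a: float) -> float:
    """Return P(A|B) = P(B|A) * P(A) / P(B)."""
    # P(B) via the law of total probability over A and not-A.
    p_b = p_b_given_a * p_a + p_b_given_not_a * (1.0 - p_a)
    return p_b_given_a * p_a / p_b


# Hypothetical scenario (illustrative numbers, not from the booklet):
# A = "has the flu", B = "has a fever".
p_flu = 0.05                 # P(A): assumed base rate of flu
p_fever_given_flu = 0.90     # P(B|A): assumed
p_fever_given_no_flu = 0.10  # P(B|not A): assumed

p_flu_given_fever = posterior(p_flu, p_fever_given_flu, p_fever_given_no_flu)
print(f"P(flu | fever) = {p_flu_given_fever:.3f}")
```

Under these assumed inputs the posterior stays well below the 90% likelihood because the base rate P(A) is small, which is the kind of non-intuitive result the booklet sets out to explain.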
Jenkins: The Definitive Guide
John Ferguson Smart - 2011
This complete guide shows you how to automate your build, integration, release, and deployment processes with Jenkins—and demonstrates how CI can save you time, money, and many headaches.
Ideal for developers, software architects, and project managers, Jenkins: The Definitive Guide is both a CI tutorial and a comprehensive Jenkins reference. Through its wealth of best practices and real-world tips, you'll discover how easy it is to set up a CI service with Jenkins.
Learn how to install, configure, and secure your Jenkins server
Organize and monitor general-purpose build jobs
Integrate automated tests to verify builds, and set up code quality reporting
Establish effective team notification strategies and techniques
Configure build pipelines, parameterized jobs, matrix builds, and other advanced jobs
Manage a farm of Jenkins servers to run distributed builds
Implement automated deployment and continuous delivery
Hacker's Delight
Henry S. Warren Jr. - 2002
Aiming to tell the dark secrets of computer arithmetic, this title is suitable for library developers, compiler writers, and lovers of elegant hacks.
Algorithm Design
Jon Kleinberg - 2005
The book teaches a range of design and analysis techniques for problems that arise in computing applications. The text encourages an understanding of the algorithm design process and an appreciation of the role of algorithms in the broader field of computer science.
jQuery Pocket Reference
David Flanagan - 2010
"This book is indispensable for anyone who is serious about using jQuery for non-trivial applications." -- Raffaele Cecco, longtime developer of video games, including Cybernoid, Exolon, and Stormlord
jQuery is the "write less, do more" JavaScript library. Its powerful features and ease of use have made it the most popular client-side JavaScript framework for the Web. This book is jQuery's trusty companion: the definitive "read less, learn more" guide to the library.
jQuery Pocket Reference explains everything you need to know about jQuery, completely and comprehensively. You'll learn how to:
Select and manipulate document elements
Alter document structure
Handle and trigger events
Create visual effects and animations
Script HTTP with Ajax utilities
Use jQuery's selectors and selection methods, utilities, plugins and more
The 25-page quick reference summarizes the library, listing all jQuery methods and functions, with signatures and descriptions.
Rebooting AI: Building Artificial Intelligence We Can Trust
Gary F. Marcus - 2019
Professors Gary Marcus and Ernest Davis have spent their careers at the forefront of AI research and have witnessed some of the greatest milestones in the field, but they argue that a computer winning in games like Jeopardy and Go does not signal that we are on the doorstep of fully autonomous cars or superintelligent machines. The achievements in the field thus far have occurred in closed systems with fixed sets of rules. These approaches are too narrow to achieve genuine intelligence. The world we live in is wildly complex and open-ended. How can we bridge this gap? What will the consequences be when we do? Marcus and Davis show us what we need to first accomplish before we get there and argue that if we are wise along the way, we won't need to worry about a future of machine overlords. If we heed their advice, humanity can create an AI that we can trust in our homes, our cars, and our doctor's offices. Rebooting AI provides a lucid, clear-eyed assessment of the current science and offers an inspiring vision of what we can achieve and how AI can make our lives better.
sed & awk
Dale Dougherty - 1990
The most common operation done with sed is substitution, replacing one block of text with another.
awk is a complete programming language. Unlike many conventional languages, awk is "data driven" -- you specify what kind of data you are interested in and the operations to be performed when that data is found. awk does many things for you, including automatically opening and closing data files, reading records, breaking the records up into fields, and counting the records. While awk provides the features of most conventional programming languages, it also includes some unconventional features, such as extended regular expression matching and associative arrays. sed & awk describes both programs in detail and includes a chapter of example sed and awk scripts.
This edition covers features of sed and awk that are mandated by the POSIX standard. This most notably affects awk, where POSIX standardized a new variable, CONVFMT, and new functions, toupper() and tolower(). The CONVFMT variable specifies the conversion format to use when converting numbers to strings (awk used to use OFMT for this purpose). The toupper() and tolower() functions each take a (presumably mixed case) string argument and return a new version of the string with all letters translated to the corresponding case.
In addition, this edition covers GNU sed, newly available since the first edition. It also updates the first edition coverage of Bell Labs nawk and GNU awk (gawk), covers mawk, an additional freely available implementation of awk, and briefly discusses three commercial versions of awk: MKS awk, Thompson Automation awk (tawk), and Videosoft awk (VSAwk).
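The "data driven" model described above (read records, split them into fields, act on the records that match) can be sketched in Python as a rough analogy. This is not awk code and not an example from the book; the function `scan`, the sample log lines, and the choice to count by first field are all hypothetical.

```python
import re
from collections import defaultdict


def scan(lines, pattern):
    """Mimic an awk-style pass: match a pattern, then act on the fields."""
    counts = defaultdict(int)  # stands in for an awk associative array
    records = 0                # stands in for awk's record counter (NR)
    for line in lines:
        records += 1
        fields = line.split()  # whitespace splitting, like awk's default
        # Pattern part: only records matching the regex trigger the action.
        if fields and re.search(pattern, line):
            # Action part: upper-case the first field (compare toupper())
            # and use it as a key in the associative array.
            counts[fields[0].upper()] += 1
    return records, dict(counts)


# Hypothetical input records.
log = [
    "alice login ok",
    "bob login failed",
    "Alice logout ok",
    "carol login ok",
]
total, per_user = scan(log, r"ok$")
print(total, per_user)
```

In real awk the pattern and action would be written as a single `pattern { action }` rule, and the file handling and field splitting shown explicitly here would happen automatically.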
Cracking the Coding Interview: 150 Programming Questions and Solutions
Gayle Laakmann McDowell - 2008
This is a deeply technical book and focuses on the software engineering skills to ace your interview. The book is over 500 pages and includes 150 programming interview questions and answers, as well as other advice.
The full list of topics is as follows:
The Interview Process: This section offers an overview of how questions are selected and how you will be evaluated. What happens when you get a question wrong? When should you start preparing, and how? What language should you use? All these questions and more are answered.
Behind the Scenes: Learn what happens behind the scenes during your interview, how decisions really get made, who you interview with, and what they ask you. Companies covered include Google, Amazon, Yahoo, Microsoft, Apple and Facebook.
Special Situations: This section explains the process for experienced candidates, Program Managers, Dev Managers, Testers / SDETs, and more. Learn what your interviewers are looking for and how much code you need to know.
Before the Interview: In order to ace the interview, you first need to get an interview. This section describes what a software engineer's resume should look like and what you should be doing well before your interview.
Behavioral Preparation: Although most of a software engineering interview will be technical, behavioral questions matter too. This section covers how to prepare for behavioral questions and how to give strong, structured responses.
Technical Questions (+ 5 Algorithm Approaches): This section covers how to prepare for technical questions (without wasting your time) and teaches actionable ways to solve the trickiest algorithm problems. It also teaches you what exactly "good coding" is when it comes to an interview.
150 Programming Questions and Answers: This section forms the bulk of the book. Each section opens with a discussion of the core knowledge and strategies to tackle this type of question, diving into exactly how you break down and solve it.
Topics covered include:
• Arrays and Strings
• Linked Lists
• Stacks and Queues
• Trees and Graphs
• Bit Manipulation
• Brain Teasers
• Mathematics and Probability
• Object-Oriented Design
• Recursion and Dynamic Programming
• Sorting and Searching
• Scalability and Memory Limits
• Testing
• C and C++
• Java
• Databases
• Threads and Locks
For the widest degree of readability, the solutions are almost entirely written in Java (with the exception of C / C++ questions). A link is provided with the book so that you can download, compile, and play with the solutions yourself.
Changes from the Fourth Edition: The fifth edition includes over 200 pages of new content, bringing the book from 300 pages to over 500 pages. Major revisions were done to almost every solution, including a number of alternate solutions added. The introductory chapters were massively expanded, as was the opening of each of the chapters under Technical Questions. In addition, 24 new questions were added.
Cracking the Coding Interview, Fifth Edition is the most expansive, detailed guide on how to ace your software development / programming interviews.
Lean from the Trenches
Henrik Kniberg - 2011
Find out how the Swedish police combined XP, Scrum, and Kanban in a 60-person project. From start to finish, you'll see how to deliver a successful product using Lean principles. We start with an organization in desperate need of a new way of doing things and finish with a group of sixty, all working in sync to develop a scalable, complex system. You'll walk through the project step by step, from customer engagement, to the daily "cocktail party," version control, bug tracking, and release. In this honest look at what works--and what doesn't--you'll find out how to:
Make quality everyone's business, not just the testers'.
Keep everyone moving in the same direction without micromanagement.
Use simple and powerful metrics to aid in planning and process improvement.
Balance between low-level feature focus and high-level system focus.
You'll be ready to jump into the trenches and streamline your own development process.
Contents
Foreword
Preface
PART I: HOW WE WORK
1. About the Project
1.1 Timeline
1.2 How We Sliced the Elephant
1.3 How We Involved the Customer
2. Structuring the Teams
3. Attending the Daily Cocktail Party
3.1 First Tier: Feature Team Daily Stand-up
3.2 Second Tier: Sync Meetings per Specialty
3.3 Third Tier: Project Sync Meeting
4. The Project Board
4.1 Our Cadences
4.2 How We Handle Urgent Issues and Impediments
5. Scaling the Kanban Boards
6. Tracking the High-Level Goal
7. Defining Ready and Done
7.1 Ready for Development
7.2 Ready for System Test
7.3 How This Improved Collaboration
8. Handling Tech Stories
8.1 Example 1: System Test Bottleneck
8.2 Example 2: Day Before the Release
8.3 Example 3: The 7-Meter Class
9. Handling Bugs
9.1 Continuous System Test
9.2 Fix the Bugs Immediately
9.3 Why We Limit the Number of Bugs in the Bug Tracker
9.4 Visualizing Bugs
9.5 Preventing Recurring Bugs
10. Continuously Improving the Process
10.1 Team Retrospectives
10.2 Process Improvement Workshops
10.3 Managing the Rate of Change
11. Managing Work in Progress
11.1 Using WIP Limits
11.2 Why WIP Limits Apply Only to Features
12. Capturing and Using Process Metrics
12.1 Velocity (Features per Week)
12.2 Why We Don’t Use Story Points
12.3 Cycle Time (Weeks per Feature)
12.4 Cumulative Flow
12.5 Process Cycle Efficiency
13. Planning the Sprint and Release
13.1 Backlog Grooming
13.2 Selecting the Top Ten Features
13.3 Why We Moved Backlog Grooming Out of the Sprint Planning Meeting
13.4 Planning the Release
14. How We Do Version Control
14.1 No Junk on the Trunk
14.2 Team Branches
14.3 System Test Branch
15. Why We Use Only Physical Kanban Boards
16. What We Learned
16.1 Know Your Goal
16.2 Experiment
16.3 Embrace Failure
16.4 Solve Real Problems
16.5 Have Dedicated Change Agents
16.6 Involve People
PART II: A CLOSER LOOK AT THE TECHNIQUES
17. Agile and Lean in a Nutshell
17.1 Agile in a Nutshell
17.2 Lean in a Nutshell
17.3 Scrum in a Nutshell
17.4 XP in a Nutshell
17.5 Kanban in a Nutshell
18. Reducing the Test Automation Backlog
18.1 What to Do About It
18.2 How to Improve Test Coverage a Little Bit Each Iteration
18.3 Step 1: List Your Test Cases
18.4 Step 2: Classify Each Test
18.5 Step 3: Sort the List in Priority Order
18.6 Step 4: Automate a Few Tests Each Iteration
18.7 Does This Solve the Problem?
19. Sizing the Backlog with Planning Poker
19.1 Estimating Without Planning Poker
19.2 Estimating with Planning Poker
19.3 Special Cards
20. Cause-Effect Diagrams
20.1 Solve Problems, Not Symptoms
20.2 The Lean Problem-Solving Approach: A3 Thinking
20.3 How to Use Cause-Effect Diagrams
20.4 Example 1: Long Release Cycle
20.5 Example 2: Defects Released to Production
20.6 Example 3: Lack of Pair Programming
20.7 Example 4: Lots of Problems
20.8 Practical Issues: How to Create and Maintain the Diagrams
20.9 Pitfalls
20.10 Why Use Cause-Effect Diagrams?
21. Final Words
A1. Glossary: How We Avoid Buzzword Bingo
Index