Book picks similar to
Building Data Science Teams by D.J. Patil
data-science
data
management
business
Machine Learning: A Probabilistic Perspective
Kevin P. Murphy - 2012
Machine learning provides these, developing methods that can automatically detect patterns in data and then use the uncovered patterns to predict future data. This textbook offers a comprehensive and self-contained introduction to the field of machine learning, based on a unified, probabilistic approach.The coverage combines breadth and depth, offering necessary background material on such topics as probability, optimization, and linear algebra as well as discussion of recent developments in the field, including conditional random fields, L1 regularization, and deep learning. The book is written in an informal, accessible style, complete with pseudo-code for the most important algorithms. All topics are copiously illustrated with color images and worked examples drawn from such application domains as biology, text processing, computer vision, and robotics. Rather than providing a cookbook of different heuristic methods, the book stresses a principled model-based approach, often using the language of graphical models to specify models in a concise and intuitive way. Almost all the models described have been implemented in a MATLAB software package—PMTK (probabilistic modeling toolkit)—that is freely available online. The book is suitable for upper-level undergraduates with an introductory-level college math background and beginning graduate students.
Universal Methods of Design: 100 Ways to Research Complex Problems, Develop Innovative Ideas, and Design Effective Solutions
Bella Martin - 2012
Universal Methods of Design serves as an invaluable compendium of methods that can be easily referenced and used by cross-disciplinary teams in nearly any design project. Methods and techniques are organized alphabetically for ongoing, quick reference. Each method is presented in a two-page format. The left-hand page contains a concise description of the method, accompanied by references for further reading. On the right-hand page, images and cases studies for each method are presented visually. The relevant phases for design application are highlighted as numbered icons along the right side of the page, from phases 1 (planning) through 5 (launch and monitor).Build more meaningful products with these methods and more: A/B Testing, Affinity Diagramming, Behavioral Mapping, Bodystorming, Contextual Design, Critical Incident Technique, Directed Storytelling, Flexible Modeling, Image Boards, Graffiti Walls, Heuristic Evaluation, Parallel Prototyping, Simulation Exercises, Touchstone Tours, and Weighted Matrix. This essential guide:Dismantles the myth that user research methods are complicated, expensive, and time-consumingCreates a shared meaning for cross-disciplinary design teamsIllustrates methods with compelling visualizations and case studiesCharacterizes each method at a glanceIndicates when methods are best employed to help prioritize appropriate design research strategiesUniversal Methods of Design is an essential resource for designers of all levels and specializations.
The Fourth Paradigm: Data-Intensive Scientific Discovery
Tony Hey - 2009
Increasingly, scientific breakthroughs will be powered by advanced computing capabilities that help researchers manipulate and explore massive datasets. The speed at which any given scientific discipline advances will depend on how well its researchers collaborate with one another, and with technologists, in areas of eScience such as databases, workflow management, visualization, and cloud-computing technologies. This collection of essays expands on the vision of pioneering computer scientist Jim Gray for a new, fourth paradigm of discovery based on data-intensive science and offers insights into how it can be fully realized.
The Design of Everyday Things
Donald A. Norman - 1988
It could forever change how you experience and interact with your physical surroundings, open your eyes to the perversity of bad design and the desirability of good design, and raise your expectations about how things should be designed.B & W photographs and illustrations throughout.
Elasticsearch: The Definitive Guide: A Distributed Real-Time Search and Analytics Engine
Clinton Gormley - 2014
This practical guide not only shows you how to search, analyze, and explore data with Elasticsearch, but also helps you deal with the complexities of human language, geolocation, and relationships.If you're a newcomer to both search and distributed systems, you'll quickly learn how to integrate Elasticsearch into your application. More experienced users will pick up lots of advanced techniques. Throughout the book, you'll follow a problem-based approach to learn why, when, and how to use Elasticsearch features.Understand how Elasticsearch interprets data in your documentsIndex and query your data to take advantage of search concepts such as relevance and word proximityHandle human language through the effective use of analyzers and queriesSummarize and group data to show overall trends, with aggregations and analyticsUse geo-points and geo-shapes--Elasticsearch's approaches to geolocationModel your data to take advantage of Elasticsearch's horizontal scalabilityLearn how to configure and monitor your cluster in production
The Tyranny of Metrics
Jerry Z. Muller - 2017
But in our zeal to instill the evaluation process with scientific rigor, we've gone from measuring performance to fixating on measuring itself. The result is a tyranny of metrics that threatens the quality of our lives and most important institutions. In this timely and powerful book, Jerry Muller uncovers the damage our obsession with metrics is causing--and shows how we can begin to fix the problem.Filled with examples from education, medicine, business and finance, government, the police and military, and philanthropy and foreign aid, this brief and accessible book explains why the seemingly irresistible pressure to quantify performance distorts and distracts, whether by encouraging "gaming the stats" or "teaching to the test." That's because what can and does get measured is not always worth measuring, may not be what we really want to know, and may draw effort away from the things we care about. Along the way, we learn why paying for measured performance doesn't work, why surgical scorecards may increase deaths, and much more. But metrics can be good when used as a complement to--rather than a replacement for--judgment based on personal experience, and Muller also gives examples of when metrics have been beneficial.Complete with a checklist of when and how to use metrics, The Tyranny of Metrics is an essential corrective to a rarely questioned trend that increasingly affects us all.
Big Data for Dummies
Judith Hurwitz - 2013
Data sets such as customer transactions for a mega-retailer, weather patterns monitored by meteorologists, or social network activity can quickly outpace the capacity of traditional data management tools. If you need to develop or manage big data solutions, you'll appreciate how these four experts define, explain, and guide you through this new and often confusing concept. You'll learn what it is, why it matters, and how to choose and implement solutions that work.Effectively managing big data is an issue of growing importance to businesses, not-for-profit organizations, government, and IT professionals Authors are experts in information management, big data, and a variety of solutions Explains big data in detail and discusses how to select and implement a solution, security concerns to consider, data storage and presentation issues, analytics, and much more Provides essential information in a no-nonsense, easy-to-understand style that is empowering Big Data For Dummies cuts through the confusion and helps you take charge of big data solutions for your organization.
Natural Language Processing with Python
Steven Bird - 2009
With it, you'll learn how to write Python programs that work with large collections of unstructured text. You'll access richly annotated datasets using a comprehensive range of linguistic data structures, and you'll understand the main algorithms for analyzing the content and structure of written communication.Packed with examples and exercises, Natural Language Processing with Python will help you: Extract information from unstructured text, either to guess the topic or identify "named entities" Analyze linguistic structure in text, including parsing and semantic analysis Access popular linguistic databases, including WordNet and treebanks Integrate techniques drawn from fields as diverse as linguistics and artificial intelligenceThis book will help you gain practical skills in natural language processing using the Python programming language and the Natural Language Toolkit (NLTK) open source library. If you're interested in developing web applications, analyzing multilingual news sources, or documenting endangered languages -- or if you're simply curious to have a programmer's perspective on how human language works -- you'll find Natural Language Processing with Python both fascinating and immensely useful.
Predictive Analytics for Dummies
Anasse Bari - 2013
Predictive Analytics For Dummies explores the power of predictive analytics and how you can use it to make valuable predictions for your business, or in fields such as advertising, fraud detection, politics, and others. This practical book does not bog you down with loads of mathematical or scientific theory, but instead helps you quickly see how to use the right algorithms and tools to collect and analyze data and apply it to make predictions.Topics include using structured and unstructured data, building models, creating a predictive analysis roadmap, setting realistic goals, budgeting, and much more.Shows readers how to use Big Data and data mining to discover patterns and make predictions for tech-savvy businesses Helps readers see how to shepherd predictive analytics projects through their companies Explains just enough of the science and math, but also focuses on practical issues such as protecting project budgets, making good presentations, and more Covers nuts-and-bolts topics including predictive analytics basics, using structured and unstructured data, data mining, and algorithms and techniques for analyzing data Also covers clustering, association, and statistical models; creating a predictive analytics roadmap; and applying predictions to the web, marketing, finance, health care, and elsewhere Propose, produce, and protect predictive analytics projects through your company with Predictive Analytics For Dummies.
Java Concurrency in Practice
Brian Goetz - 2005
Now this same team provides the best explanation yet of these new features, and of concurrency in general. Concurrency is no longer a subject for advanced users only. Every Java developer should read this book."--Martin BuchholzJDK Concurrency Czar, Sun Microsystems"For the past 30 years, computer performance has been driven by Moore's Law; from now on, it will be driven by Amdahl's Law. Writing code that effectively exploits multiple processors can be very challenging. Java Concurrency in Practice provides you with the concepts and techniques needed to write safe and scalable Java programs for today's--and tomorrow's--systems."--Doron RajwanResearch Scientist, Intel Corp"This is the book you need if you're writing--or designing, or debugging, or maintaining, or contemplating--multithreaded Java programs. If you've ever had to synchronize a method and you weren't sure why, you owe it to yourself and your users to read this book, cover to cover."--Ted NewardAuthor of Effective Enterprise Java"Brian addresses the fundamental issues and complexities of concurrency with uncommon clarity. This book is a must-read for anyone who uses threads and cares about performance."--Kirk PepperdineCTO, JavaPerformanceTuning.com"This book covers a very deep and subtle topic in a very clear and concise way, making it the perfect Java Concurrency reference manual. Each page is filled with the problems (and solutions!) that programmers struggle with every day. Effectively exploiting concurrency is becoming more and more important now that Moore's Law is delivering more cores but not faster cores, and this book will show you how to do it."--Dr. Cliff ClickSenior Software Engineer, Azul Systems"I have a strong interest in concurrency, and have probably written more thread deadlocks and made more synchronization mistakes than most programmers. Brian's book is the most readable on the topic of threading and concurrency in Java, and deals with this difficult subject with a wonderful hands-on approach. This is a book I am recommending to all my readers of The Java Specialists' Newsletter, because it is interesting, useful, and relevant to the problems facing Java developers today."--Dr. Heinz KabutzThe Java Specialists' Newsletter"I've focused a career on simplifying simple problems, but this book ambitiously and effectively works to simplify a complex but critical subject: concurrency. Java Concurrency in Practice is revolutionary in its approach, smooth and easy in style, and timely in its delivery--it's destined to be a very important book."--Bruce TateAuthor of Beyond Java" Java Concurrency in Practice is an invaluable compilation of threading know-how for Java developers. I found reading this book intellectually exciting, in part because it is an excellent introduction to Java's concurrency API, but mostly because it captures in a thorough and accessible way expert knowledge on threading not easily found elsewhere."--Bill VennersAuthor of Inside the Java Virtual MachineThreads are a fundamental part of the Java platform. As multicore processors become the norm, using concurrency effectively becomes essential for building high-performance applications. Java SE 5 and 6 are a huge step forward for the development of concurrent applications, with improvements to the Java Virtual Machine to support high-performance, highly scalable concurrent classes and a rich set of new concurrency building blocks. In Java Concurrency in Practice , the creators of these new facilities explain not only how they work and how to use them, but also the motivation and design patterns behind them.However, developing, testing, and debugging multithreaded programs can still be very difficult; it is all too easy to create concurrent programs that appear to work, but fail when it matters most: in production, under heavy load. Java Concurrency in Practice arms readers with both the theoretical underpinnings and concrete techniques for building reliable, scalable, maintainable concurrent applications. Rather than simply offering an inventory of concurrency APIs and mechanisms, it provides design rules, patterns, and mental models that make it easier to build concurrent programs that are both correct and performant.This book covers:Basic concepts of concurrency and thread safety Techniques for building and composing thread-safe classes Using the concurrency building blocks in java.util.concurrent Performance optimization dos and don'ts Testing concurrent programs Advanced topics such as atomic variables, nonblocking algorithms, and the Java Memory Model
Spam Nation: The Inside Story of Organized Cybercrime — from Global Epidemic to Your Front Door
Brian Krebs - 2014
Tracing the rise, fall, and alarming resurrection of the digital mafia behind the two largest spam pharmacies and countless viruses, phishing, and spyware attacks he delivers the first definitive narrative of the global spam problem and its threat to consumers everywhere.Blending cutting-edge research, investigative reporting, and firsthand interviews, this terrifying true story reveals how we unwittingly invite these digital thieves into our lives every day. From unassuming computer programmers right next door to digital mobsters like "Cosma" who unleashed a massive malware attack that has stolen thousands of Americans' logins and passwords, Krebs uncovers the shocking lengths to which these people will go to profit from our data and our wallets.Not only are hundreds of thousands of Americans exposing themselves to fraud and dangerously toxic products from rogue online pharmacies, but even those who never open junk messages are at risk. As Krebs notes, spammers can—and do—hack into accounts through these emails, harvest personal information like usernames and passwords, and sell them on the digital black market. The fallout from this global epidemic doesn't just cost consumers and companies billions, it costs lives too.Fast-paced and utterly gripping, Spam Nation ultimately proposes concrete solutions for protecting ourselves online and stemming this tidal wave of cybercrime, before it's too late."Krebs's talent for exposing the weaknesses in online security has earned him respect in the IT business and loathing among cybercriminals. His track record of scoops has helped him become the rare blogger who supports himself on the strength of his reputation for hard-nosed reporting."
Bloomberg Businessweek
Mindstorms: Children, Computers, And Powerful Ideas
Seymour Papert - 1980
We have Mindstorms to thank for that. In this book, pioneering computer scientist Seymour Papert uses the invention of LOGO, the first child-friendly programming language, to make the case for the value of teaching children with computers. Papert argues that children are more than capable of mastering computers, and that teaching computational processes like de-bugging in the classroom can change the way we learn everything else. He also shows that schools saturated with technology can actually improve socialization and interaction among students and between students and teachers.
97 Things Every Software Architect Should Know: Collective Wisdom from the Experts
Richard Monson-Haefel - 2009
More than four dozen architects -- including Neal Ford, Michael Nygard, and Bill de hOra -- offer advice for communicating with stakeholders, eliminating complexity, empowering developers, and many more practical lessons they've learned from years of experience. Among the 97 principles in this book, you'll find useful advice such as:Don't Put Your Resume Ahead of the Requirements (Nitin Borwankar) Chances Are, Your Biggest Problem Isn't Technical (Mark Ramm) Communication Is King; Clarity and Leadership, Its Humble Servants (Mark Richards) Simplicity Before Generality, Use Before Reuse (Kevlin Henney) For the End User, the Interface Is the System (Vinayak Hegde) It's Never Too Early to Think About Performance (Rebecca Parsons) To be successful as a software architect, you need to master both business and technology. This book tells you what top software architects think is important and how they approach a project. If you want to enhance your career, 97 Things Every Software Architect Should Know is essential reading.
Designing for the Digital Age: How to Create Human-Centered Products and Services
Kim Goodwin - 2009
Designing successful products and services in the digital age requires a multi-disciplinary team with expertise in interaction design, visual design, industrial design, and other disciplines. It also takes the ability to come up with the big ideas that make a desirable product or service, as well as the skill and perseverance to execute on the thousand small ideas that get your design into the hands of users. It requires expertise in project management, user research, and consensus-building. This comprehensive, full-color volume addresses all of these and more with detailed how-to information, real-life examples, and exercises. Topics include assembling a design team, planning and conducting user research, analyzing your data and turning it into personas, using scenarios to drive requirements definition and design, collaborating in design meetings, evaluating and iterating your design, and documenting finished design in a way that works for engineers and stakeholders alike.
High Performance MySQL: Optimization, Backups, and Replication
Baron Schwartz - 2008
This guide also teaches you safe and practical ways to scale applications through replication, load balancing, high availability, and failover.
Updated to reflect recent advances in MySQL and InnoDB performance, features, and tools, this third edition not only offers specific examples of how MySQL works, it also teaches you why this system works as it does, with illustrative stories and case studies that demonstrate MySQL’s principles in action. With this book, you’ll learn how to think in MySQL.
Learn the effects of new features in MySQL 5.5, including stored procedures, partitioned databases, triggers, and views
Implement improvements in replication, high availability, and clustering
Achieve high performance when running MySQL in the cloud
Optimize advanced querying features, such as full-text searches
Take advantage of modern multi-core CPUs and solid-state disks
Explore backup and recovery strategies—including new tools for hot online backups