Book picks similar to
Analyzing Computer System Performance with Perl: : PDQ by Neil J. Gunther
data
performanceengine<br/>ering
perl
want-read
Metadata
Jeffrey Pomerantz - 2015
When "metadata" became breaking news, appearing in stories about surveillance by the National Security Agency, many members of the public encountered this once-obscure term from information science for the first time. Should people be reassured that the NSA was "only" collecting metadata about phone calls--information about the caller, the recipient, the time, the duration, the location--and not recordings of the conversations themselves? Or does phone call metadata reveal more than it seems? In this book, Jeffrey Pomerantz offers an accessible and concise introduction to metadata.In the era of ubiquitous computing, metadata has become infrastructural, like the electrical grid or the highway system. We interact with it or generate it every day. It is not, Pomerantz tell us, just "data about data." It is a means by which the complexity of an object is represented in a simpler form. For example, the title, the author, and the cover art are metadata about a book. When metadata does its job well, it fades into the background; everyone (except perhaps the NSA) takes it for granted.Pomerantz explains what metadata is, and why it exists. He distinguishes among different types of metadata--descriptive, administrative, structural, preservation, and use--and examines different users and uses of each type. He discusses the technologies that make modern metadata possible, and he speculates about metadata's future. By the end of the book, readers will see metadata everywhere. Because, Pomerantz warns us, it's metadata's world, and we are just living in it.
The Productive Programmer
Neal Ford - 2008
The Productive Programmer offers critical timesaving and productivity tools that you can adopt right away, no matter what platform you use. Master developer Neal Ford not only offers advice on the mechanics of productivity-how to work smarter, spurn interruptions, get the most out your computer, and avoid repetition-he also details valuable practices that will help you elude common traps, improve your code, and become more valuable to your team. You'll learn to:Write the test before you write the codeManage the lifecycle of your objects fastidiously Build only what you need now, not what you might need later Apply ancient philosophies to software development Question authority, rather than blindly adhere to standardsMake hard things easier and impossible things possible through meta-programming Be sure all code within a method is at the same level of abstraction Pick the right editor and assemble the best tools for the job This isn't theory, but the fruits of Ford's real-world experience as an Application Architect at the global IT consultancy ThoughtWorks. Whether you're a beginner or a pro with years of experience, you'll improve your work and your career with the simple and straightforward principles in The Productive Programmer.
Python for Data Analysis
Wes McKinney - 2011
It is also a practical, modern introduction to scientific computing in Python, tailored for data-intensive applications. This is a book about the parts of the Python language and libraries you'll need to effectively solve a broad set of data analysis problems. This book is not an exposition on analytical methods using Python as the implementation language.Written by Wes McKinney, the main author of the pandas library, this hands-on book is packed with practical cases studies. It's ideal for analysts new to Python and for Python programmers new to scientific computing.Use the IPython interactive shell as your primary development environmentLearn basic and advanced NumPy (Numerical Python) featuresGet started with data analysis tools in the pandas libraryUse high-performance tools to load, clean, transform, merge, and reshape dataCreate scatter plots and static or interactive visualizations with matplotlibApply the pandas groupby facility to slice, dice, and summarize datasetsMeasure data by points in time, whether it's specific instances, fixed periods, or intervalsLearn how to solve problems in web analytics, social sciences, finance, and economics, through detailed examples
Data Science at the Command Line: Facing the Future with Time-Tested Tools
Jeroen Janssens - 2014
You'll learn how to combine small, yet powerful, command-line tools to quickly obtain, scrub, explore, and model your data.To get you started--whether you're on Windows, OS X, or Linux--author Jeroen Janssens introduces the Data Science Toolbox, an easy-to-install virtual environment packed with over 80 command-line tools.Discover why the command line is an agile, scalable, and extensible technology. Even if you're already comfortable processing data with, say, Python or R, you'll greatly improve your data science workflow by also leveraging the power of the command line.Obtain data from websites, APIs, databases, and spreadsheetsPerform scrub operations on plain text, CSV, HTML/XML, and JSONExplore data, compute descriptive statistics, and create visualizationsManage your data science workflow using DrakeCreate reusable tools from one-liners and existing Python or R codeParallelize and distribute data-intensive pipelines using GNU ParallelModel data with dimensionality reduction, clustering, regression, and classification algorithms
The Elements of Statistical Learning: Data Mining, Inference, and Prediction
Trevor Hastie - 2001
With it has come vast amounts of data in a variety of fields such as medicine, biology, finance, and marketing. The challenge of understanding these data has led to the development of new tools in the field of statistics, and spawned new areas such as data mining, machine learning, and bioinformatics. Many of these tools have common underpinnings but are often expressed with different terminology. This book describes the important ideas in these areas in a common conceptual framework. While the approach is statistical, the emphasis is on concepts rather than mathematics. Many examples are given, with a liberal use of color graphics. It should be a valuable resource for statisticians and anyone interested in data mining in science or industry. The book's coverage is broad, from supervised learning (prediction) to unsupervised learning. The many topics include neural networks, support vector machines, classification trees and boosting—the first comprehensive treatment of this topic in any book. Trevor Hastie, Robert Tibshirani, and Jerome Friedman are professors of statistics at Stanford University. They are prominent researchers in this area: Hastie and Tibshirani developed generalized additive models and wrote a popular book of that title. Hastie wrote much of the statistical modeling software in S-PLUS and invented principal curves and surfaces. Tibshirani proposed the Lasso and is co-author of the very successful An Introduction to the Bootstrap. Friedman is the co-inventor of many data-mining tools including CART, MARS, and projection pursuit.
Security in Computing
Charles P. Pfleeger - 1988
In this newFourth Edition, Charles P. Pfleeger and Shari Lawrence Pfleeger have thoroughly updated their classic guide to reflect today's newest technologies, standards, and trends. The authors first introduce the core concepts and vocabulary of computer security, including cryptography. Next, they systematically identify and assess threats now facing programs, operating systems, databases, and networks. For each threat, they offer best-practice responses. Security in Computing, Fourth Edition, goes beyond technology, covering crucial management issues you face in protecting infrastructure and information. This edition contains an all-new chapter on the economics of cybersecurity, and making the business case for security investments. Another new chapter addresses privacy--from data mining to identity theft, to RFID and e-voting. New coverage also includes Programming mistakes that compromise security: man-in-the-middle, timing, and privilege escalation Web application threats and vulnerabilities Networks of compromised systems: bots, botnets, and drones Rootkits--including the notorious Sony XCP Wi-Fi network security challenges, standards, and techniques New malicious code attacks, including false interfaces and keystroke loggers Improving code quality: software engineering, testing, and liability approaches Biometric authentication: capabilities and limitations Using Advanced Encryption System (AES) more effectively Balancing efficiency and piracy control in music and other digital content Defending against new cryptanalytic attacks against RSA, DES, and SHA Responding to the emergence of organized attacker groups pursuing profit 0132390779B0721200 Every day, the news media giv
Becoming a Better Programmer
Pete Goodliffe - 2014
Code Craft author Pete Goodliffe presents a collection of useful techniques and approaches to the art and craft of programming that will help boost your career and your well-being.Goodliffe presents sound advice that he's learned in 15 years of professional programming. The book's standalone chapters span the range of a software developer's life--dealing with code, learning the trade, and improving performance--with no language or industry bias. Whether you're a seasoned developer, a neophyte professional, or a hobbyist, you'll find valuable tips in five independent categories:Code-level techniques for crafting lines of code, testing, debugging, and coping with complexityPractices, approaches, and attitudes: keep it simple, collaborate well, reuse, and create malleable codeTactics for learning effectively, behaving ethically, finding challenges, and avoiding stagnationPractical ways to complete things: use the right tools, know what "done" looks like, and seek help from colleaguesHabits for working well with others, and pursuing development as a social activity
Blockchain Basics: A Non-Technical Introduction in 25 Steps
Daniel Drescher - 2017
No mathematical formulas, program code, or computer science jargon are used.No previous knowledge in computer science, mathematics, programming, or cryptography is required. Terminology is explained through pictures, analogies, and metaphors.This book bridges the gap that exists between purely technical books about the blockchain and purely business-focused books. It does so by explaining both the technical concepts that make up the blockchain and their role in business-relevant applications.What You Will Learn: • What the blockchain is• Why it is needed and what problem it solves• Why there is so much excitement about the blockchain and its potential• Major components and their purpose• How various components of the blockchain work and interact• Limitations, why they exist, and what has been done to overcome them• Major application scenariosWho This Book Is For: Everyone who wants to get a general idea of what blockchain technology is, how it works, and how it will potentially change the financial system as we know it.
Bayesian Methods for Hackers: Probabilistic Programming and Bayesian Inference
Cameron Davidson-Pilon - 2014
However, most discussions of Bayesian inference rely on intensely complex mathematical analyses and artificial examples, making it inaccessible to anyone without a strong mathematical background. Now, though, Cameron Davidson-Pilon introduces Bayesian inference from a computational perspective, bridging theory to practice-freeing you to get results using computing power.
Bayesian Methods for Hackers
illuminates Bayesian inference through probabilistic programming with the powerful PyMC language and the closely related Python tools NumPy, SciPy, and Matplotlib. Using this approach, you can reach effective solutions in small increments, without extensive mathematical intervention. Davidson-Pilon begins by introducing the concepts underlying Bayesian inference, comparing it with other techniques and guiding you through building and training your first Bayesian model. Next, he introduces PyMC through a series of detailed examples and intuitive explanations that have been refined after extensive user feedback. You'll learn how to use the Markov Chain Monte Carlo algorithm, choose appropriate sample sizes and priors, work with loss functions, and apply Bayesian inference in domains ranging from finance to marketing. Once you've mastered these techniques, you'll constantly turn to this guide for the working PyMC code you need to jumpstart future projects. Coverage includes - Learning the Bayesian "state of mind" and its practical implications - Understanding how computers perform Bayesian inference - Using the PyMC Python library to program Bayesian analyses - Building and debugging models with PyMC - Testing your model's "goodness of fit" - Opening the "black box" of the Markov Chain Monte Carlo algorithm to see how and why it works - Leveraging the power of the "Law of Large Numbers" - Mastering key concepts, such as clustering, convergence, autocorrelation, and thinning - Using loss functions to measure an estimate's weaknesses based on your goals and desired outcomes - Selecting appropriate priors and understanding how their influence changes with dataset size - Overcoming the "exploration versus exploitation" dilemma: deciding when "pretty good" is good enough - Using Bayesian inference to improve A/B testing - Solving data science problems when only small amounts of data are available Cameron Davidson-Pilon has worked in many areas of applied mathematics, from the evolutionary dynamics of genes and diseases to stochastic modeling of financial prices. His contributions to the open source community include lifelines, an implementation of survival analysis in Python. Educated at the University of Waterloo and at the Independent University of Moscow, he currently works with the online commerce leader Shopify.
Refactoring Databases: Evolutionary Database Design
Scott W. Ambler - 2006
Now, for the first time, leading agile methodologist Scott Ambler and renowned consultantPramodkumar Sadalage introduce powerful refactoring techniquesspecifically designed for database systems. Ambler and Sadalagedemonstrate how small changes to table structures, data, storedprocedures, and triggers can significantly enhance virtually anydatabase design - without changing semantic
Graph Databases
Ian Robinson - 2013
With this practical book, you’ll learn how to design and implement a graph database that brings the power of graphs to bear on a broad range of problem domains. Whether you want to speed up your response to user queries or build a database that can adapt as your business evolves, this book shows you how to apply the schema-free graph model to real-world problems.Learn how different organizations are using graph databases to outperform their competitors. With this book’s data modeling, query, and code examples, you’ll quickly be able to implement your own solution.Model data with the Cypher query language and property graph modelLearn best practices and common pitfalls when modeling with graphsPlan and implement a graph database solution in test-driven fashionExplore real-world examples to learn how and why organizations use a graph databaseUnderstand common patterns and components of graph database architectureUse analytical techniques and algorithms to mine graph database information
Do Dice Play God?: The Mathematics of Uncertainty
Ian Stewart - 2019
We want to be able to figure out who will win an election, if the stock market will crash, or if a suspect definitely committed a crime. But the odds are not in our favor. Life is full of uncertainty --- indeed, scientific advances indicate that the universe might be fundamentally inexact --- and humans are terrible at guessing. When asked to predict the outcome of a chance event, we are almost always wrong.Thankfully, there is hope. As award-winning mathematician Ian Stewart reveals, over the course of history, mathematics has given us some of the tools we need to better manage the uncertainty that pervades our lives. From forecasting, to medical research, to figuring out how to win Let's Make a Deal, Do Dice Play God? is a surprising and satisfying tour of what we can know, and what we never will.
Practical SQL: A Beginner's Guide to Storytelling with Data
Anthony DeBarros - 2022
An approachable guide to programming in SQL (Structured Query Language) that will teach even beginning programmers how to build powerful databases and analyze data to find meaningful information.Practical SQL is an approachable and fast-paced guide to SQL (Structured Query Language) written by longtime professional journalist Anthony DeBarros. SQL is the primary tool that programmers, web developers, researchers, journalists, and others use to explore data in a database. DeBarros focuses on using SQL to find the story in data, with the aid of the popular open-source database PostgreSQL and the pgAdmin interface.This thoroughly revised second edition includes a new chapter describing how to set up PostgreSQL and more extensive discussion of pgAdmin's best features. The author has also added a chapter on the JSON data format that shows readers how to store and query JSON data. DeBarros has also updated the data in the book throughout, added coverage of additional topics, and perfected the book's examples.Readers love DeBarros's use of exercises and real-world examples that demonstrate how to:- Create databases and related tables using your own data - Correctly define data typesAggregate, sort, and filter data to find patterns - Clean their data and transfer data as text files - Create advanced queries and automate tasksThis book uses PostgreSQL, but the SQL syntax is applicable to many database applications, including Microsoft SQL Server and MySQL.
A Practical Guide to Linux Commands, Editors, and Shell Programming
Mark G. Sobell - 2005
The book is a complete revision of the commands section of Sobell's Practical Guide to Linux - a proven best-seller. The book is Linux distribution and release agnostic. It will appeal to users of ALL Linux distributions. Superior examples make this book the the best option on the market! System administrators, software developers, quality assurance engineers and others working on a Linux system need to work from the command line in order to be effective. Linux is famous for its huge number of command line utility programs, and the programs themselves are famous for their large numbers of options, switches, and configuration files. But the truth is that users will only use a limited (but still significant) number of these utilities on a recurring basis, and then only with a subset of the most important and useful options, switches and configuration files. This book cuts through all the noise and shows them which utilities are most useful, and which options most important. And it contains examples, lot's and lot's of examples. programmability. Utilities are designed, by default, to work wtih other utilities within shell programs as a way of automating system tasks. This book contains a superb introduction to Linux shell programming. And since shell programmers need to write their programs in text editors, this book covers the two most popular ones: vi and emacs.
Head First Data Analysis: A Learner's Guide to Big Numbers, Statistics, and Good Decisions
Michael G. Milton - 2009
If your job requires you to manage and analyze all kinds of data, turn to Head First Data Analysis, where you'll quickly learn how to collect and organize data, sort the distractions from the truth, find meaningful patterns, draw conclusions, predict the future, and present your findings to others. Whether you're a product developer researching the market viability of a new product or service, a marketing manager gauging or predicting the effectiveness of a campaign, a salesperson who needs data to support product presentations, or a lone entrepreneur responsible for all of these data-intensive functions and more, the unique approach in Head First Data Analysis is by far the most efficient way to learn what you need to know to convert raw data into a vital business tool. You'll learn how to:Determine which data sources to use for collecting information Assess data quality and distinguish signal from noise Build basic data models to illuminate patterns, and assimilate new information into the models Cope with ambiguous information Design experiments to test hypotheses and draw conclusions Use segmentation to organize your data within discrete market groups Visualize data distributions to reveal new relationships and persuade others Predict the future with sampling and probability models Clean your data to make it useful Communicate the results of your analysis to your audience Using the latest research in cognitive science and learning theory to craft a multi-sensory learning experience, Head First Data Analysis uses a visually rich format designed for the way your brain works, not a text-heavy approach that puts you to sleep.