Find a book to read

Book picks similar to
Julia for Data Science by Zacharias Voulgaris

machine-learning

programming-collection

research-a-bit-more

computing

Hands-On Machine Learning with Scikit-Learn and TensorFlow

Aurélien Géron - 2017

A series of Deep Learning breakthroughs have boosted the whole field of machine learning over the last decade.

Now that machine learning is thriving, even programmers who know close to nothing about this technology can use simple, efficient tools to implement programs capable of learning from data. This practical book shows you how.By using concrete examples, minimal theory, and two production-ready Python frameworks—Scikit-Learn and TensorFlow—author Aurélien Géron helps you gain an intuitive understanding of the concepts and tools for building intelligent systems. You’ll learn how to use a range of techniques, starting with simple Linear Regression and progressing to Deep Neural Networks. If you have some programming experience and you’re ready to code a machine learning project, this guide is for you.This hands-on book shows you how to use:Scikit-Learn, an accessible framework that implements many algorithms efficiently and serves as a great machine learning entry pointTensorFlow, a more complex library for distributed numerical computation, ideal for training and running very large neural networksPractical code examples that you can apply without learning excessive machine learning theory or algorithm details

Zero Day: The Threat In Cyberspace

Robert O'Harrow Jr. - 2013

Will the world’s next war be fought in cyberspace?   "It's going to happen," said former National Defense University Professor Dan Kuehl.  So much of the world’s activity takes place on the internet now – including commerce, banking and communications -- the Pentagon has declared war in cyberspace an inevitability.

For more than a year, Washington Post reporter Robert O'Harrow has explored the threats proliferating in our digital universe. This eBook is a compilation of that reporting. With chapters built around real people, including hackers, security researchers and corporate executives, this book will help regular people, lawmakers and businesses better understand the mind-bending challenge of keeping the internet safe from hackers and security breaches -- and all out war.

Big Data: A Revolution That Will Transform How We Live, Work, and Think

Viktor Mayer-Schönberger - 2013

A revelatory exploration of the hottest trend in technology and the dramatic impact it will have on the economy, science, and society at large.Which paint color is most likely to tell you that a used car is in good shape? How can officials identify the most dangerous New York City manholes before they explode? And how did Google searches predict the spread of the H1N1 flu outbreak?The key to answering these questions, and many more, is big data.

“Big data” refers to our burgeoning ability to crunch vast collections of information, analyze it instantly, and draw sometimes profoundly surprising conclusions from it. This emerging science can translate myriad phenomena—from the price of airline tickets to the text of millions of books—into searchable form, and uses our increasing computing power to unearth epiphanies that we never could have seen before. A revolution on par with the Internet or perhaps even the printing press, big data will change the way we think about business, health, politics, education, and innovation in the years to come. It also poses fresh threats, from the inevitable end of privacy as we know it to the prospect of being penalized for things we haven’t even done yet, based on big data’s ability to predict our future behavior.In this brilliantly clear, often surprising work, two leading experts explain what big data is, how it will change our lives, and what we can do to protect ourselves from its hazards. Big Data is the first big book about the next big thing.www.big-data-book.com

Database Internals: A deep-dive into how distributed data systems work

Alex Petrov - 2019

When it comes to choosing, using, and maintaining a database, understanding its internals is essential.

But with so many distributed databases and tools available today, it’s often difficult to understand what each one offers and how they differ. With this practical guide, Alex Petrov guides developers through the concepts behind modern database and storage engine internals.Throughout the book, you’ll explore relevant material gleaned from numerous books, papers, blog posts, and the source code of several open source databases. These resources are listed at the end of parts one and two. You’ll discover that the most significant distinctions among many modern databases reside in subsystems that determine how storage is organized and how data is distributed.This book examines:Storage engines: Explore storage classification and taxonomy, and dive into B-Tree-based and immutable log structured storage engines, with differences and use-cases for eachDistributed systems: Learn step-by-step how nodes and processes connect and build complex communication patterns, from UDP to reliable consensus protocolsDatabase clusters: Discover how to achieve consistent models for replicated data

The Joy of Clojure

Michael Fogus - 2010

About the BookIf you've seen how dozens of lines of Java or Ruby can dissolve into just a few lines of Clojure, you'll know why the authors of this book call it a "joyful language." Clojure is a dialect of Lisp that runs on the JVM.

It combines the nice features of a scripting language with the powerful features of a production environment—features like persistent data structures and clean multithreading that you'll need for industrial-strength application development.The Joy of Clojure goes beyond just syntax to show you how to write fluent and idiomatic Clojure code. You'll learn a functional approach to programming and will master Lisp techniques that make Clojure so elegant and efficient. The book gives you easy access to hard soft ware areas like concurrency, interoperability, and performance. And it shows you how great it can be to think about problems the Clojure way. Purchase of the print book comes with an offer of a free PDF, ePub, and Kindle eBook from Manning. Also available is all code from the book. What's InsideThe what and why of ClojureHow to work with macrosHow to do elegant application designFunctional programming idiomsWritten for programmers coming to Clojure from another programming background—no prior experience with Clojure or Lisp is required.

Python Machine Learning

Sebastian Raschka - 2015

Link to the GitHub Repository containing the code examples and additional material: https://github.com/rasbt/python-machi...Many of the most innovative breakthroughs and exciting new technologies can be attributed to applications of machine learning.

We are living in an age where data comes in abundance, and thanks to the self-learning algorithms from the field of machine learning, we can turn this data into knowledge. Automated speech recognition on our smart phones, web search engines, e-mail spam filters, the recommendation systems of our favorite movie streaming services – machine learning makes it all possible.Thanks to the many powerful open-source libraries that have been developed in recent years, machine learning is now right at our fingertips. Python provides the perfect environment to build machine learning systems productively.This book will teach you the fundamentals of machine learning and how to utilize these in real-world applications using Python. Step-by-step, you will expand your skill set with the best practices for transforming raw data into useful information, developing learning algorithms efficiently, and evaluating results.You will discover the different problem categories that machine learning can solve and explore how to classify objects, predict continuous outcomes with regression analysis, and find hidden structures in data via clustering. You will build your own machine learning system for sentiment analysis and finally, learn how to embed your model into a web app to share with the world

Bandit Algorithms for Website Optimization

programming

machine-learning

data-science

John Myles White - 2012

When looking for ways to improve your website, how do you decide which changes to make? And which changes to keep? This concise book shows you how to use Multiarmed Bandit algorithms to measure the real-world value of any modifications you make to your site.

Author John Myles White shows you how this powerful class of algorithms can help you boost website traffic, convert visitors to customers, and increase many other measures of success.This is the first developer-focused book on bandit algorithms, which were previously described only in research papers. You’ll quickly learn the benefits of several simple algorithms—including the epsilon-Greedy, Softmax, and Upper Confidence Bound (UCB) algorithms—by working through code examples written in Python, which you can easily adapt for deployment on your own website.Learn the basics of A/B testing—and recognize when it’s better to use bandit algorithmsDevelop a unit testing framework for debugging bandit algorithmsGet additional code examples written in Julia, Ruby, and JavaScript with supplemental online materials

Data Smart: Using Data Science to Transform Information into Insight

John W. Foreman - 2013

Data Science gets thrown around in the press like it's magic.

Major retailers are predicting everything from when their customers are pregnant to when they want a new pair of Chuck Taylors. It's a brave new world where seemingly meaningless data can be transformed into valuable insight to drive smart business decisions.But how does one exactly do data science? Do you have to hire one of these priests of the dark arts, the "data scientist," to extract this gold from your data? Nope.Data science is little more than using straight-forward steps to process raw data into actionable insight. And in Data Smart, author and data scientist John Foreman will show you how that's done within the familiar environment of a spreadsheet. Why a spreadsheet? It's comfortable! You get to look at the data every step of the way, building confidence as you learn the tricks of the trade. Plus, spreadsheets are a vendor-neutral place to learn data science without the hype. But don't let the Excel sheets fool you. This is a book for those serious about learning the analytic techniques, the math and the magic, behind big data.Each chapter will cover a different technique in a spreadsheet so you can follow along: - Mathematical optimization, including non-linear programming and genetic algorithms- Clustering via k-means, spherical k-means, and graph modularity- Data mining in graphs, such as outlier detection- Supervised AI through logistic regression, ensemble models, and bag-of-words models- Forecasting, seasonal adjustments, and prediction intervals through monte carlo simulation- Moving from spreadsheets into the R programming languageYou get your hands dirty as you work alongside John through each technique. But never fear, the topics are readily applicable and the author laces humor throughout. You'll even learn what a dead squirrel has to do with optimization modeling, which you no doubt are dying to know.

Modern CTO: Everything you need to know, to be a Modern CTO.

Joel Beasley - 2018

I highly recommend his podcast, book, and the tools he has created to improve the lives of developers and CTOs.

―Jacob Boudreau CTO of Stord | Forbes 30 Under 30 Joel's book and show provide incredible insights for young startup developers and fellow CTOs alike. Joel offers a human perspective and real practical advice on the challenges and opportunities facing every Modern CTO. ― Christian Saucier | Entrepreneur and P2P Systems Architect I've really come to respect what Joel is doing in the community. His podcast and book are filling a much needed hole and I'm excited to see what else the future has in store. ― Don Pawlowski Chief Technology Officer at University Tees Modern CTO Everything you need to know to be a Modern CTO. Developers are not CTOs, but developers can learn how to be CTOs. In Modern CTO, Joel Beasley provides readers with an in-depth road map on how to successfully navigate the unexplored and jagged transition between these two roles. Drawing from personal experience, Joel gives a refreshing take on the challenges, lessons, and things to avoid on this journey.Readers will learn how Modern CTOs: Manage deadlines Speak up Know when to abandon ship and build a better one Deal with poor code Avoid getting lost in the product and know what UX mistakes to watch out for Manage people and create momentum … plus much more Modern CTO is the ultimate book when making the leap from developer to CTO. Update: Kindle Formatting issues resolved 5/13/18. Thank you for the feedback.

Data Science at the Command Line: Facing the Future with Time-Tested Tools

Jeroen Janssens - 2014

This hands-on guide demonstrates how the flexibility of the command line can help you become a more efficient and productive data scientist.

You'll learn how to combine small, yet powerful, command-line tools to quickly obtain, scrub, explore, and model your data.To get you started--whether you're on Windows, OS X, or Linux--author Jeroen Janssens introduces the Data Science Toolbox, an easy-to-install virtual environment packed with over 80 command-line tools.Discover why the command line is an agile, scalable, and extensible technology. Even if you're already comfortable processing data with, say, Python or R, you'll greatly improve your data science workflow by also leveraging the power of the command line.Obtain data from websites, APIs, databases, and spreadsheetsPerform scrub operations on plain text, CSV, HTML/XML, and JSONExplore data, compute descriptive statistics, and create visualizationsManage your data science workflow using DrakeCreate reusable tools from one-liners and existing Python or R codeParallelize and distribute data-intensive pipelines using GNU ParallelModel data with dimensionality reduction, clustering, regression, and classification algorithms

Machine Learning

Ethem Alpaydin - 2016

A concise overview of machine learning--computer programs that learn from data--the basis of such applications as voice recognition and driverless cars.Today, machine learning underlies a range of applications we use every day, from product recommendations to voice recognition--as well as some we don't yet use everyday, including driverless cars.

It is the basis for a new approach to artificial intelligence that aims to program computers to use example data or past experience to solve a given problem. In this volume in the MIT Press Essential Knowledge series, Ethem Alpayd�n offers a concise and accessible overview of the new AI. This expanded edition offers new material on such challenges facing machine learning as privacy, security, accountability, and bias. Alpayd�n, author of a popular textbook on machine learning, explains that as Big Data has gotten bigger, the theory of machine learning--the foundation of efforts to process that data into knowledge--has also advanced. He describes the evolution of the field, explains important learning algorithms, and presents example applications. He discusses the use of machine learning algorithms for pattern recognition; artificial neural networks inspired by the human brain; algorithms that learn associations between instances; and reinforcement learning, when an autonomous agent learns to take actions to maximize reward. In a new chapter, he considers transparency, explainability, and fairness, and the ethical and legal implications of making decisions based on data.

Bayesian Methods for Hackers: Probabilistic Programming and Bayesian Inference

Cameron Davidson-Pilon - 2014

Master Bayesian Inference through Practical Examples and Computation-Without Advanced Mathematical Analysis Bayesian methods of inference are deeply natural and extremely powerful.

However, most discussions of Bayesian inference rely on intensely complex mathematical analyses and artificial examples, making it inaccessible to anyone without a strong mathematical background. Now, though, Cameron Davidson-Pilon introduces Bayesian inference from a computational perspective, bridging theory to practice-freeing you to get results using computing power. Bayesian Methods for Hackers illuminates Bayesian inference through probabilistic programming with the powerful PyMC language and the closely related Python tools NumPy, SciPy, and Matplotlib. Using this approach, you can reach effective solutions in small increments, without extensive mathematical intervention. Davidson-Pilon begins by introducing the concepts underlying Bayesian inference, comparing it with other techniques and guiding you through building and training your first Bayesian model. Next, he introduces PyMC through a series of detailed examples and intuitive explanations that have been refined after extensive user feedback. You'll learn how to use the Markov Chain Monte Carlo algorithm, choose appropriate sample sizes and priors, work with loss functions, and apply Bayesian inference in domains ranging from finance to marketing. Once you've mastered these techniques, you'll constantly turn to this guide for the working PyMC code you need to jumpstart future projects. Coverage includes - Learning the Bayesian "state of mind" and its practical implications - Understanding how computers perform Bayesian inference - Using the PyMC Python library to program Bayesian analyses - Building and debugging models with PyMC - Testing your model's "goodness of fit" - Opening the "black box" of the Markov Chain Monte Carlo algorithm to see how and why it works - Leveraging the power of the "Law of Large Numbers" - Mastering key concepts, such as clustering, convergence, autocorrelation, and thinning - Using loss functions to measure an estimate's weaknesses based on your goals and desired outcomes - Selecting appropriate priors and understanding how their influence changes with dataset size - Overcoming the "exploration versus exploitation" dilemma: deciding when "pretty good" is good enough - Using Bayesian inference to improve A/B testing - Solving data science problems when only small amounts of data are available Cameron Davidson-Pilon has worked in many areas of applied mathematics, from the evolutionary dynamics of genes and diseases to stochastic modeling of financial prices. His contributions to the open source community include lifelines, an implementation of survival analysis in Python. Educated at the University of Waterloo and at the Independent University of Moscow, he currently works with the online commerce leader Shopify.

The Elements of Statistical Learning: Data Mining, Inference, and Prediction

Trevor Hastie - 2001

During the past decade there has been an explosion in computation and information technology.

With it has come vast amounts of data in a variety of fields such as medicine, biology, finance, and marketing. The challenge of understanding these data has led to the development of new tools in the field of statistics, and spawned new areas such as data mining, machine learning, and bioinformatics. Many of these tools have common underpinnings but are often expressed with different terminology. This book describes the important ideas in these areas in a common conceptual framework. While the approach is statistical, the emphasis is on concepts rather than mathematics. Many examples are given, with a liberal use of color graphics. It should be a valuable resource for statisticians and anyone interested in data mining in science or industry. The book's coverage is broad, from supervised learning (prediction) to unsupervised learning. The many topics include neural networks, support vector machines, classification trees and boosting—the first comprehensive treatment of this topic in any book. Trevor Hastie, Robert Tibshirani, and Jerome Friedman are professors of statistics at Stanford University. They are prominent researchers in this area: Hastie and Tibshirani developed generalized additive models and wrote a popular book of that title. Hastie wrote much of the statistical modeling software in S-PLUS and invented principal curves and surfaces. Tibshirani proposed the Lasso and is co-author of the very successful An Introduction to the Bootstrap. Friedman is the co-inventor of many data-mining tools including CART, MARS, and projection pursuit.

Designing Data-Intensive Applications

Martin Kleppmann - 2015

Data is at the center of many challenges in system design today.

Difficult issues need to be figured out, such as scalability, consistency, reliability, efficiency, and maintainability. In addition, we have an overwhelming variety of tools, including relational databases, NoSQL datastores, stream or batch processors, and message brokers. What are the right choices for your application? How do you make sense of all these buzzwords?In this practical and comprehensive guide, author Martin Kleppmann helps you navigate this diverse landscape by examining the pros and cons of various technologies for processing and storing data. Software keeps changing, but the fundamental principles remain the same. With this book, software engineers and architects will learn how to apply those ideas in practice, and how to make full use of data in modern applications. Peer under the hood of the systems you already use, and learn how to use and operate them more effectively Make informed decisions by identifying the strengths and weaknesses of different tools Navigate the trade-offs around consistency, scalability, fault tolerance, and complexity Understand the distributed systems research upon which modern databases are built Peek behind the scenes of major online services, and learn from their architectures

Kafka: The Definitive Guide: Real-Time Data and Stream Processing at Scale

Neha Narkhede - 2017

Every enterprise application creates data, whether it� s log messages, metrics, user activity, outgoing messages, or something else.

And how to move all of this data becomes nearly as important as the data itself. If you� re an application architect, developer, or production engineer new to Apache Kafka, this practical guide shows you how to use this open source streaming platform to handle real-time data feeds.Engineers from Confluent and LinkedIn who are responsible for developing Kafka explain how to deploy production Kafka clusters, write reliable event-driven microservices, and build scalable stream-processing applications with this platform. Through detailed examples, you� ll learn Kafka� s design principles, reliability guarantees, key APIs, and architecture details, including the replication protocol, the controller, and the storage layer.Understand publish-subscribe messaging and how it fits in the big data ecosystem.Explore Kafka producers and consumers for writing and reading messagesUnderstand Kafka patterns and use-case requirements to ensure reliable data deliveryGet best practices for building data pipelines and applications with KafkaManage Kafka in production, and learn to perform monitoring, tuning, and maintenance tasksLearn the most critical metrics among Kafka� s operational measurementsExplore how Kafka� s stream delivery capabilities make it a perfect source for stream processing systems

Book picks similar toJulia for Data Science by Zacharias Voulgaris

Hands-On Machine Learning with Scikit-Learn and TensorFlow

Zero Day: The Threat In Cyberspace

Big Data: A Revolution That Will Transform How We Live, Work, and Think

Database Internals: A deep-dive into how distributed data systems work

The Joy of Clojure

Python Machine Learning

Bandit Algorithms for Website Optimization

Data Smart: Using Data Science to Transform Information into Insight

Modern CTO: Everything you need to know, to be a Modern CTO.

Data Science at the Command Line: Facing the Future with Time-Tested Tools

Machine Learning

Bayesian Methods for Hackers: Probabilistic Programming and Bayesian Inference

The Elements of Statistical Learning: Data Mining, Inference, and Prediction

Designing Data-Intensive Applications

Kafka: The Definitive Guide: Real-Time Data and Stream Processing at Scale

Book picks similar to
Julia for Data Science by Zacharias Voulgaris