Planning for Big Data


Edd Wilder-James - 2004
    From creating new data-driven products through to increasing operational efficiency, big data has the potential to make your organization both more competitive and more innovative. As this emerging field transitions from the bleeding edge to enterprise infrastructure, it's vital to understand not only the technologies involved, but the organizational and cultural demands of being data-driven. Written by O'Reilly Radar's experts on big data, this anthology describes:
    - The broad industry changes heralded by the big data era
    - What big data is, what it means to your business, and how to start solving data problems
    - The software that makes up the Hadoop big data stack, and the major enterprise vendors' Hadoop solutions
    - The landscape of NoSQL databases and their relative merits
    - How visualization plays an important part in data work

Mastering Regular Expressions


Jeffrey E.F. Friedl - 1997
    Regular expressions are now standard features in a wide range of languages and popular tools, including Perl, Python, Ruby, Java, VB.NET and C# (and any language using the .NET Framework), PHP, and MySQL. If you don't use regular expressions yet, you will discover in this book a whole new world of mastery over your data. If you already use them, you'll appreciate this book's unprecedented detail and breadth of coverage. If you think you know all you need to know about regular expressions, this book is a stunning eye-opener. As this book shows, a command of regular expressions is an invaluable skill. Regular expressions allow you to code complex and subtle text processing that you never imagined could be automated. Regular expressions can save you time and aggravation. They can be used to craft elegant solutions to a wide range of problems. Once you've mastered regular expressions, they'll become an invaluable part of your toolkit. You will wonder how you ever got by without them. Despite their wide availability, flexibility, and unparalleled power, regular expressions are frequently underutilized. Yet what is power in the hands of an expert can be fraught with peril for the unwary. Mastering Regular Expressions will help you navigate the minefield to becoming an expert and help you optimize your use of regular expressions. Mastering Regular Expressions, Third Edition, now includes a full chapter devoted to PHP and its powerful and expressive suite of regular expression functions, in addition to enhanced PHP coverage in the central "core" chapters. Furthermore, this edition has been updated throughout to reflect advances in other languages, including expanded in-depth coverage of Sun's java.util.regex package, which has emerged as the standard Java regex implementation. Topics include:
    - A comparison of features among different versions of many languages and tools
    - How the regular expression engine works
    - Optimization (major savings available here!)
    - Matching just what you want, but not what you don't want
    - Sections and chapters on individual languages
    Written in the lucid, entertaining tone that makes a complex, dry topic become crystal-clear to programmers, and sprinkled with solutions to complex real-world problems, Mastering Regular Expressions, Third Edition offers a wealth of information that you can put to immediate use. Reviews of this new edition and the second edition:
    "There isn't a better (or more useful) book available on regular expressions." --Zak Greant, Managing Director, eZ Systems
    "A real tour-de-force of a book which not only covers the mechanics of regexes in extraordinary detail but also talks about efficiency and the use of regexes in Perl, Java, and .NET...If you use regular expressions as part of your professional work (even if you already have a good book on whatever language you're programming in) I would strongly recommend this book to you." --Dr. Chris Brown, Linux Format
    "The author does an outstanding job leading the reader from regex novice to master. The book is extremely easy to read and chock full of useful and relevant examples...Regular expressions are valuable tools that every developer should have in their toolbox. Mastering Regular Expressions is the definitive guide to the subject, and an outstanding resource that belongs on every programmer's bookshelf. Ten out of Ten Horseshoes." --Jason Menard, Java Ranch

The Man Who Saved the V-8: The Untold Stories of Some of the Most Important Product Decisions in the History of Ford Motor Company


Chase Morsey Jr. - 2014
    When Chase Morsey Jr. joins Ford Motor Co. in 1948, he has no idea the part he'll play in automotive history. Morsey's arrival comes as Henry Ford II and other titans in the industry are about to kill the vaunted V-8 engine. He sees it as his sole mission to talk them out of it. In The Man Who Saved the V-8, he shares the never-before-told story of how his crusade saved the engine that would go on to power iconic cars like the Ford Thunderbird and Mustang. "To this day, I have no idea how a young, newly hired manager like myself...had the nerve to challenge the most powerful men inside Ford Motor Company and tell them they were wrong," Morsey says. "But that is exactly what I did." The twenty-nine-year-old executive embarks on massive market research. He works with manufacturing experts to find ways to produce the V-8 engine more efficiently. After finding success, he goes on to play a central role in some of the most pivotal decisions that would ensure Ford remains one of the powerhouses in the automotive industry. The Man Who Saved the V-8 tells the story of his successes and lessons learned.

The Filter Bubble: What the Internet is Hiding From You


Eli Pariser - 2011
    Instead of giving you the most broadly popular result, Google now tries to predict what you are most likely to click on. According to MoveOn.org board president Eli Pariser, Google's change in policy is symptomatic of the most significant shift to take place on the Web in recent years - the rise of personalization. In this groundbreaking investigation of the new hidden Web, Pariser uncovers how this growing trend threatens to control how we consume and share information as a society, and reveals what we can do about it. Though the phenomenon has gone largely undetected until now, personalized filters are sweeping the Web, creating individual universes of information for each of us. Facebook - the primary news source for an increasing number of Americans - prioritizes the links it believes will appeal to you so that if you are a liberal, you can expect to see only progressive links. Even an old-media bastion like "The Washington Post" devotes the top of its home page to a news feed with the links your Facebook friends are sharing. Behind the scenes a burgeoning industry of data companies is tracking your personal information to sell to advertisers, from your political leanings to the color you painted your living room to the hiking boots you just browsed on Zappos. In a personalized world, we will increasingly be typed and fed only news that is pleasant, familiar, and confirms our beliefs - and because these filters are invisible, we won't know what is being hidden from us. Our past interests will determine what we are exposed to in the future, leaving less room for the unexpected encounters that spark creativity, innovation, and the democratic exchange of ideas. While we all worry that the Internet is eroding privacy or shrinking our attention spans, Pariser uncovers a more pernicious and far-reaching trend on the Internet and shows how we can - and must - change course.
With vivid detail and remarkable scope, The Filter Bubble reveals how personalization undermines the Internet's original purpose as an open platform for the spread of ideas and could leave us all in an isolated, echoing world.

The Data Warehouse ETL Toolkit: Practical Techniques for Extracting, Cleaning, Conforming, and Delivering Data


Ralph Kimball - 2004
    Delivers real-world solutions for the most time- and labor-intensive portion of data warehousing: data staging, or the extract, transform, load (ETL) process. Delineates best practices for extracting data from scattered sources, removing redundant and inaccurate data, transforming the remaining data into correctly formatted data structures, and then loading the end product into the data warehouse. Offers proven time-saving ETL techniques, comprehensive guidance on building dimensional structures, and crucial advice on ensuring data quality.

Naked Statistics: Stripping the Dread from the Data


Charles Wheelan - 2012
    How can we catch schools that cheat on standardized tests? How does Netflix know which movies you’ll like? What is causing the rising incidence of autism? As best-selling author Charles Wheelan shows us in Naked Statistics, the right data and a few well-chosen statistical tools can help us answer these questions and more. For those who slept through Stats 101, this book is a lifesaver. Wheelan strips away the arcane and technical details and focuses on the underlying intuition that drives statistical analysis. He clarifies key concepts such as inference, correlation, and regression analysis, reveals how biased or careless parties can manipulate or misrepresent data, and shows us how brilliant and creative researchers are exploiting the valuable data from natural experiments to tackle thorny questions. And in Wheelan’s trademark style, there’s not a dull page in sight. You’ll encounter clever Schlitz Beer marketers leveraging basic probability, an International Sausage Festival illuminating the tenets of the central limit theorem, and a head-scratching choice from the famous game show Let’s Make a Deal, and you’ll come away with insights each time. With the wit, accessibility, and sheer fun that turned Naked Economics into a bestseller, Wheelan defies the odds yet again by bringing another essential, formerly unglamorous discipline to life.

Introduction to Computation and Programming Using Python


John V. Guttag - 2013
    It provides students with skills that will enable them to make productive use of computational techniques, including some of the tools and techniques of "data science" for using computation to model and interpret data. The book is based on an MIT course (which became the most popular course offered through MIT's OpenCourseWare) and was developed for use not only in a conventional classroom but in a massive open online course (or MOOC) offered by the pioneering MIT--Harvard collaboration edX. Students are introduced to Python and the basics of programming in the context of such computational concepts and techniques as exhaustive enumeration, bisection search, and efficient approximation algorithms. The book does not require knowledge of mathematics beyond high school algebra, but does assume that readers are comfortable with rigorous thinking and not intimidated by mathematical concepts. Although it covers such traditional topics as computational complexity and simple algorithms, the book focuses on a wide range of topics not found in most introductory texts, including information visualization, simulations to model randomness, computational techniques to understand data, and statistical techniques that inform (and misinform), as well as two related but relatively advanced topics: optimization problems and dynamic programming. Introduction to Computation and Programming Using Python can serve as a stepping-stone to more advanced computer science courses, or as a basic grounding in computational problem solving for students in other disciplines.

Getting Started with SQL: A Hands-On Approach for Beginners


Thomas Nield - 2016
    If you're a business or IT professional, this short hands-on guide teaches you how to pull and transform data with SQL in significant ways. You will quickly master the fundamentals of SQL and learn how to create your own databases. Author Thomas Nield provides exercises throughout the book to help you practice your newfound SQL skills at home, without having to use a database server environment. Not only will you learn how to use key SQL statements to find and manipulate your data, but you'll also discover how to efficiently design and manage databases to meet your needs. You'll also learn how to:
    - Explore relational databases, including lightweight and centralized models
    - Use SQLite and SQLiteStudio to create lightweight databases in minutes
    - Query and transform data in meaningful ways by using SELECT, WHERE, GROUP BY, and ORDER BY
    - Join tables to get a more complete view of your business data
    - Build your own tables and centralized databases by using normalized design principles
    - Manage data by learning how to INSERT, DELETE, and UPDATE records

Preparing to Teach in the Lifelong Learning Sector


Ann Gravells - 2008
    This includes further education, adult and community learning, work-based learning, the forces and offender learning and skills. It is easy to read with plenty of practical activities and examples throughout and the content is fully linked to the Teacher Training Standards. Please note: This book has since been updated to reflect the new title of the qualification: The Award in Education and Training. The qualification unit content contained in the appendices has since changed, and some legislation mentioned in the book has been updated.

Reinforcement Learning: An Introduction


Richard S. Sutton - 1998
    Reinforcement learning, one of the most active research areas in artificial intelligence, is a computational approach to learning whereby an agent tries to maximize the total amount of reward it receives when interacting with a complex, uncertain environment. In Reinforcement Learning, Richard Sutton and Andrew Barto provide a clear and simple account of the key ideas and algorithms of reinforcement learning. Their discussion ranges from the history of the field's intellectual foundations to the most recent developments and applications. The only necessary mathematical background is familiarity with elementary concepts of probability. The book is divided into three parts. Part I defines the reinforcement learning problem in terms of Markov decision processes. Part II provides basic solution methods: dynamic programming, Monte Carlo methods, and temporal-difference learning. Part III presents a unified view of the solution methods and incorporates artificial neural networks, eligibility traces, and planning; the two final chapters present case studies and consider the future of reinforcement learning.

Coders: The Making of a New Tribe and the Remaking of the World


Clive Thompson - 2019
    And this may sound weirdly obvious, but every single one of those pieces of software was written by a programmer. Programmers are thus among the most quietly influential people on the planet. As we live in a world made of software, they're the architects. The decisions they make guide our behavior. When they make something newly easy to do, we do a lot more of it. If they make it hard or impossible to do something, we do less of it. If we want to understand how today's world works, we ought to understand something about coders. Who exactly are the people that are building today's world? What makes them tick? What type of personality is drawn to writing software? And perhaps most interestingly -- what does it do to them? One of the first pieces of coding a newbie learns is the program to make the computer say "Hello, world!" Like that piece of code, Clive Thompson's book is a delightful place to begin to understand this vocation, which is both a profession and a way of life, and which essentially didn't exist little more than a generation ago, but now is considered just about the only safe bet we can make about what the future holds. Thompson takes us close to some of the great coders of our time, and unpacks the surprising history of the field, beginning with the first great coders, who were women. Ironically, if we're going to traffic in stereotypes, women are arguably "naturally" better at coding than men, but they were written out of the history, and shoved out of the seats, for reasons that are illuminating. Now programming is indeed, if not a pure brotopia, at least an awfully homogeneous community, which attracts people from a very narrow band of backgrounds and personality types. As Thompson learns, the consequences of that are significant - not least being a fetish for disruption at scale that doesn't leave much time for pondering larger moral issues of collateral damage.
At the same time, coding is a marvelous new art form that has improved the world in innumerable ways, and Thompson reckons deeply, as no one before him has, with what great coding in fact looks like, who creates it, and where they come from. To get as close to his subject as he can, he picks up the thread of his own long-abandoned coding practice, and tries his mightiest to up his game, with some surprising results. More and more, any serious engagement with the world demands an engagement with code and its consequences, and to understand code, we must understand coders. In that regard, Clive Thompson's Hello, World! is a marvelous and delightful master class.

Footballistics


James Coventry - 2018
    The nature of football continually changes, which means its analysis must also keep pace. This book is for students, thinkers, and theorists of the game.' Ted Hopkins - Carlton premiership player, author, and co-founder of Champion Data. Australian Rules football has been described as the most data-rich sport on Earth. Every time and everywhere an AFL side takes to the field, it is shadowed by an army of statisticians and number crunchers. The information they gather has become the sport's new language and currency. ABC journalist James Coventry, author of the acclaimed Time and Space, has joined forces with a group of razor-sharp analysts to decipher the data, and to use it to question some of football's long-held truisms. Do umpires really favour the home side? Has goal kicking accuracy deteriorated? Is Geelong the true master of the draft? Are blonds unfairly favoured in Brownlow medal voting? And are Victorians the most passionate fans? Through a blend of entertaining storytelling and expert analysis, this book will answer more questions about footy than you ever thought to ask. Praise for Time and Space:
    'Brilliant, masterful' - The Guardian
    'Arguably one of the most important books yet written on Australian Rules football.' - Inside History
    'Should find its way into the hands of every coach.' - AFL Record

Microsoft Excel Data Analysis and Business Modeling


Wayne L. Winston - 2004
    For more than a decade, well-known consultant and business professor Wayne Winston has been teaching corporate clients and MBA students the most effective ways to use Microsoft Excel for data analysis, modeling, and decision making. Now this award-winning educator shares the best of his classroom experience in this practical, business-focused guide. Each chapter advances your data analysis and modeling expertise using real-world examples and learn-by-doing exercises. You also get all the book’s problem-and-solution files on CD, for all the practice you need to solve complex problems and work smarter with Excel. Learn how to solve real business problems with Excel!
    - Create best, worst, and most-likely scenarios for sales
    - Calculate how long it would take to recoup a project’s startup costs
    - Plan personal finances, such as computing loan terms or saving for retirement
    - Estimate a product’s demand curve
    - Simulate stock performance over a year
    - Determine which product mix will yield the greatest profits
    - Interpret the effects of price and advertising on sales
    - Assign a dollar value to customer loyalty
    - Manage inventory and order quantities with precision
    - Create customer service queues with short wait times
    - Estimate the probabilities of equipment failure
    - Model business uncertainties
    - Get new perspectives on data with PivotTable dynamic views
    - Help predict quarterly revenue, outcomes of sporting events, presidential elections, and more!
    On the CD:
    - Practice files for all the book’s exercises
    - Solutions for problem sets
    - Fully searchable eBook
    A Note Regarding the CD or DVD
    The print version of this book ships with a CD or DVD. For those customers purchasing one of the digital formats in which this book is available, we are pleased to offer the CD/DVD content as a free download via O'Reilly Media's Digital Distribution services.
To download this content, please visit O'Reilly's web site, search for the title of this book to find its catalog page, and click on the link below the cover image (Examples, Companion Content, or Practice Files). Note that while we provide as much of the media content as we are able via free download, we are sometimes limited by licensing restrictions. Please direct any questions or concerns to booktech@oreilly.com.

Deep Learning with Python


François Chollet - 2017
    It is the technology behind photo tagging systems at Facebook and Google, self-driving cars, speech recognition systems on your smartphone, and much more. In particular, deep learning excels at solving machine perception problems: understanding the content of image data, video data, or sound data. Here's a simple example: say you have a large collection of images, and that you want tags associated with each image, for example, "dog," "cat," etc. Deep learning can allow you to create a system that understands how to map such tags to images, learning only from examples. This system can then be applied to new images, automating the task of photo tagging. A deep learning model only has to be fed examples of a task to start generating useful results on new data.

97 Things Every Programmer Should Know: Collective Wisdom from the Experts


Kevlin Henney - 2010
    With the 97 short and extremely useful tips for programmers in this book, you'll expand your skills by adopting new approaches to old problems, learning appropriate best practices, and honing your craft through sound advice. With contributions from some of the most experienced and respected practitioners in the industry--including Michael Feathers, Pete Goodliffe, Diomidis Spinellis, Cay Horstmann, Verity Stob, and many more--this book contains practical knowledge and principles that you can apply to all kinds of projects. A few of the 97 things you should know:
    - "Code in the Language of the Domain" by Dan North
    - "Write Tests for People" by Gerard Meszaros
    - "Convenience Is Not an -ility" by Gregor Hohpe
    - "Know Your IDE" by Heinz Kabutz
    - "A Message to the Future" by Linda Rising
    - "The Boy Scout Rule" by Robert C. Martin (Uncle Bob)
    - "Beware the Share" by Udi Dahan