Mining the Social Web: Analyzing Data from Facebook, Twitter, LinkedIn, and Other Social Media Sites
Matthew A. Russell - 2011
You’ll learn how to combine social web data, analysis techniques, and visualization to find what you’ve been looking for in the social haystack, as well as useful information you didn’t know existed. Each standalone chapter introduces techniques for mining data in different areas of the social Web, including blogs and email. All you need to get started is a programming background and a willingness to learn basic Python tools. Get a straightforward synopsis of the social web landscape. Use adaptable scripts on GitHub to harvest data from social network APIs such as Twitter, Facebook, LinkedIn, and Google+. Learn how to employ easy-to-use Python tools to slice and dice the data you collect. Explore social connections in microformats with the XHTML Friends Network. Apply advanced mining techniques such as TF-IDF, cosine similarity, collocation analysis, document summarization, and clique detection. Build interactive visualizations with web technologies based upon HTML5 and JavaScript toolkits. "A rich, compact, useful, practical introduction to a galaxy of tools, techniques, and theories for exploring structured and unstructured data." --Alex Martelli, Senior Staff Engineer, Google
Data Analytics Made Accessible
Anil Maheshwari - 2014
It is a conversational book that feels easy and informative. This short and lucid book covers everything important, with concrete examples, and invites the reader to join this field. The chapters in the book are organized for a typical one-semester course. The book contains case-lets from real-world stories at the beginning of every chapter. There is a running case study across the chapters as exercises. This book is designed to provide a student with the intuition behind this evolving area, along with a solid toolset of the major data mining techniques and platforms. Students across a variety of academic disciplines, including business, computer science, statistics, engineering, and others, are attracted to the idea of discovering new insights and ideas from data. This book can also be gainfully used by executives, managers, analysts, professors, doctors, accountants, and other professionals to learn how to make sense of the data coming their way. This is a lucid, flowing book that one can finish in one sitting or return to again and again for insights and techniques. Table of Contents: Chapter 1: Wholeness of Business Intelligence and Data Mining; Chapter 2: Business Intelligence Concepts & Applications; Chapter 3: Data Warehousing; Chapter 4: Data Mining; Chapter 5: Decision Trees; Chapter 6: Regression Models; Chapter 7: Artificial Neural Networks; Chapter 8: Cluster Analysis; Chapter 9: Association Rule Mining; Chapter 10: Text Mining; Chapter 11: Web Mining; Chapter 12: Big Data; Chapter 13: Data Modeling Primer; Appendix: Data Mining Tutorial using Weka
Excel 2013 Bible
John Walkenbach - 2013
Known as Mr. Spreadsheet, Walkenbach shows you how to maximize the power of Excel 2013 while bringing you up to speed on the latest features. This perennial bestseller is fully updated to cover all the new features of Excel 2013, including how to navigate the user interface, take advantage of various file formats, master formulas, analyze data with PivotTables, and more. Whether you're an Excel beginner who is looking to get more savvy or an advanced user looking to become a power user, this latest edition provides you with comprehensive coverage as well as helpful tips, tricks, and techniques that you won't find anywhere else. - Shares the invaluable insight of Excel guru and bestselling author Mr. Spreadsheet John Walkenbach as he guides you through every aspect of Excel 2013 - Provides essential coverage of all the newest features of Excel 2013 - Presents material in a clear, concise, logical format that is ideal for all levels of Excel experience - Features a website that includes downloadable templates and worksheets from the book. Chart your path to fantastic formulas and stellar spreadsheets with Excel 2013 Bible!
Programming Windows 8 Apps with HTML, CSS, and JavaScript
Kraig Brockschmidt - 2012
Agile Data Warehouse Design: Collaborative Dimensional Modeling, from Whiteboard to Star Schema
Lawrence Corr - 2011
This book describes BEAM✲, an agile approach to dimensional modeling, for improving communication between data warehouse designers, BI stakeholders and the whole DW/BI development team. BEAM✲ provides tools and techniques that will encourage DW/BI designers and developers to move away from their keyboards and entity relationship based tools and model interactively with their colleagues. The result is everyone thinks dimensionally from the outset! Developers understand how to efficiently implement dimensional modeling solutions. Business stakeholders feel ownership of the data warehouse they have created, and can already imagine how they will use it to answer their business questions. Within this book, you will learn: ✲ Agile dimensional modeling using Business Event Analysis & Modeling (BEAM✲) ✲ Modelstorming: data modeling that is quicker, more inclusive, more productive, and frankly more fun! ✲ Telling dimensional data stories using the 7Ws (who, what, when, where, how many, why and how) ✲ Modeling by example not abstraction; using data story themes, not crow's feet, to describe detail ✲ Storyboarding the data warehouse to discover conformed dimensions and plan iterative development ✲ Visual modeling: sketching timelines, charts and grids to model complex process measurement - simply ✲ Agile design documentation: enhancing star schemas with BEAM✲ dimensional shorthand notation ✲ Solving difficult DW/BI performance and usability problems with proven dimensional design patterns Lawrence Corr is a data warehouse designer and educator. As Principal of DecisionOne Consulting, he helps clients to review and simplify their data warehouse designs, and advises vendors on visual data modeling techniques. He regularly teaches agile dimensional modeling courses worldwide and has taught dimensional DW/BI skills to thousands of students. 
Jim Stagnitto is a data warehouse and master data management architect specializing in the healthcare, financial services, and information service industries. He is the founder of the data warehousing and data mining consulting firm Llumino.
Pattern Recognition and Machine Learning
Christopher M. Bishop - 2006
However, these activities can be viewed as two facets of the same field, and together they have undergone substantial development over the past ten years. In particular, Bayesian methods have grown from a specialist niche to become mainstream, while graphical models have emerged as a general framework for describing and applying probabilistic models. Also, the practical applicability of Bayesian methods has been greatly enhanced through the development of a range of approximate inference algorithms such as variational Bayes and expectation propagation. Similarly, new models based on kernels have had a significant impact on both algorithms and applications. This new textbook reflects these recent developments while providing a comprehensive introduction to the fields of pattern recognition and machine learning. It is aimed at advanced undergraduates or first-year PhD students, as well as researchers and practitioners, and assumes no previous knowledge of pattern recognition or machine learning concepts. Knowledge of multivariate calculus and basic linear algebra is required, and some familiarity with probabilities would be helpful though not essential as the book includes a self-contained introduction to basic probability theory.
Excel Dashboards & Reports
Michael Alexander - 2010
Offering a comprehensive review of a wide array of technical and analytical concepts, Excel Dashboards and Reports helps Excel users go from reporting data with simple tables full of dull numbers to presenting key information through the use of high-impact, meaningful reports and dashboards that will wow management both visually and substantively. - Details how to analyze large amounts of data and report the results in a meaningful, eye-catching visualization - Describes how to use different perspectives to achieve better visibility into data, as well as how to slice data into various views on the fly - Shows how to automate redundant reporting and analyses. Part technical manual, part analytical guidebook, Excel Dashboards and Reports is the latest addition to the Mr. Spreadsheet's Bookshelf series and is the leading resource for learning to create dashboard reports in an easy-to-use format that's both visually attractive and effective.
Deep Learning with Python
François Chollet - 2017
It is the technology behind photo tagging systems at Facebook and Google, self-driving cars, speech recognition systems on your smartphone, and much more. In particular, deep learning excels at solving machine perception problems: understanding the content of image data, video data, or sound data. Here's a simple example: say you have a large collection of images, and that you want tags associated with each image, for example, "dog," "cat," etc. Deep learning can allow you to create a system that understands how to map such tags to images, learning only from examples. This system can then be applied to new images, automating the task of photo tagging. A deep learning model only has to be fed examples of a task to start generating useful results on new data.
Training Kit (Exam 70-461): Querying Microsoft SQL Server 2012
Itzik Ben-Gan - 2012
Work at your own pace through a series of lessons and practical exercises, and then assess your skills with practice tests on CD, featuring multiple, customizable testing options. Maximize your performance on the exam by learning how to create database objects, work with data, modify data, and troubleshoot and optimize queries. You also get an exam discount voucher, making this book an exceptional value and a great career investment.
Designing Data-Intensive Applications
Martin Kleppmann - 2015
Difficult issues need to be figured out, such as scalability, consistency, reliability, efficiency, and maintainability. In addition, we have an overwhelming variety of tools, including relational databases, NoSQL datastores, stream or batch processors, and message brokers. What are the right choices for your application? How do you make sense of all these buzzwords? In this practical and comprehensive guide, author Martin Kleppmann helps you navigate this diverse landscape by examining the pros and cons of various technologies for processing and storing data. Software keeps changing, but the fundamental principles remain the same. With this book, software engineers and architects will learn how to apply those ideas in practice, and how to make full use of data in modern applications. Peer under the hood of the systems you already use, and learn how to use and operate them more effectively. Make informed decisions by identifying the strengths and weaknesses of different tools. Navigate the trade-offs around consistency, scalability, fault tolerance, and complexity. Understand the distributed systems research upon which modern databases are built. Peek behind the scenes of major online services, and learn from their architectures.
Windows Presentation Foundation Unleashed
Adam Nathan - 2006
Windows Presentation Foundation (WPF) is a key component of the .NET Framework 3.0, giving you the power to create richer and more compelling applications than you dreamed possible. Whether you want to develop traditional user interfaces or integrate 3D graphics, audio/video, animation, dynamic skinning, rich document support, speech recognition, or more, WPF enables you to do so in a seamless, resolution-independent manner. Windows Presentation Foundation Unleashed is the authoritative book that covers it all, in a practical and approachable fashion, authored by .NET guru and Microsoft developer Adam Nathan. - Covers everything you need to know about Extensible Application Markup Language (XAML) - Examines the WPF feature areas in incredible depth: controls, layout, resources, data binding, styling, graphics, animation, and more - Features a chapter on 3D graphics by Daniel Lehenbauer, lead developer responsible for WPF 3D - Delves into non-mainstream topics: speech, audio/video, documents, bitmap effects, and more - Shows how to create popular UI elements, such as features introduced in the 2007 Microsoft Office System: Galleries, ScreenTips, custom control layouts, and more - Demonstrates how to create sophisticated UI mechanisms, such as Visual Studio-like collapsible/dockable panes - Explains how to develop and deploy all types of applications, including navigation-based applications, applications hosted in a Web browser, and applications with great-looking non-rectangular windows - Explains how to create first-class custom controls for WPF - Demonstrates how to create hybrid WPF software that leverages Windows Forms, ActiveX, or other non-WPF technologies - Explains how to exploit new Windows Vista features in WPF applications
Programming Collective Intelligence: Building Smart Web 2.0 Applications
Toby Segaran - 2007
With the sophisticated algorithms in this book, you can write smart programs to access interesting datasets from other web sites, collect data from users of your own applications, and analyze and understand the data once you've found it. Programming Collective Intelligence takes you into the world of machine learning and statistics, and explains how to draw conclusions about user experience, marketing, personal tastes, and human behavior in general -- all from information that you and others collect every day. Each algorithm is described clearly and concisely with code that can immediately be used on your web site, blog, Wiki, or specialized application. This book explains: - Collaborative filtering techniques that enable online retailers to recommend products or media - Methods of clustering to detect groups of similar items in a large dataset - Search engine features -- crawlers, indexers, query engines, and the PageRank algorithm - Optimization algorithms that search millions of possible solutions to a problem and choose the best one - Bayesian filtering, used in spam filters for classifying documents based on word types and other features - Using decision trees not only to make predictions, but to model the way decisions are made - Predicting numerical values rather than classifications to build price models - Support vector machines to match people in online dating sites - Non-negative matrix factorization to find the independent features in a dataset - Evolving intelligence for problem solving -- how a computer develops its skill by improving its own code the more it plays a game. Each chapter includes exercises for extending the algorithms to make them more powerful. Go beyond simple database-backed applications and put the wealth of Internet data to work for you. "Bravo! I cannot think of a better way for a developer to first learn these algorithms and methods, nor can I think of a better way for me (an old AI dog) to reinvigorate my knowledge of the details." -- Dan Russell, Google "Toby's book does a great job of breaking down the complex subject matter of machine-learning algorithms into practical, easy-to-understand examples that can be directly applied to analysis of social interaction across the Web today. If I had this book two years ago, it would have saved precious time going down some fruitless paths." -- Tim Wolters, CTO, Collective Intellect
Naked Statistics: Stripping the Dread from the Data
Charles Wheelan - 2012
How can we catch schools that cheat on standardized tests? How does Netflix know which movies you’ll like? What is causing the rising incidence of autism? As best-selling author Charles Wheelan shows us in Naked Statistics, the right data and a few well-chosen statistical tools can help us answer these questions and more. For those who slept through Stats 101, this book is a lifesaver. Wheelan strips away the arcane and technical details and focuses on the underlying intuition that drives statistical analysis. He clarifies key concepts such as inference, correlation, and regression analysis, reveals how biased or careless parties can manipulate or misrepresent data, and shows us how brilliant and creative researchers are exploiting the valuable data from natural experiments to tackle thorny questions. And in Wheelan’s trademark style, there’s not a dull page in sight. You’ll encounter clever Schlitz Beer marketers leveraging basic probability, an International Sausage Festival illuminating the tenets of the central limit theorem, and a head-scratching choice from the famous game show Let’s Make a Deal, and you’ll come away with insights each time. With the wit, accessibility, and sheer fun that turned Naked Economics into a bestseller, Wheelan defies the odds yet again by bringing another essential, formerly unglamorous discipline to life.
Advanced R
Hadley Wickham - 2014
With more than ten years of experience programming in R, the author illustrates the elegance, beauty, and flexibility at the heart of R. The book develops the necessary skills to produce quality code that can be used in a variety of circumstances. You will learn: - The fundamentals of R, including standard data types and functions - Functional programming as a useful framework for solving wide classes of problems - The positives and negatives of metaprogramming - How to write fast, memory-efficient code. This book not only helps current R users become R programmers but also shows existing programmers what's special about R. Intermediate R programmers can dive deeper into R and learn new strategies for solving diverse problems, while programmers from other languages can learn the details of R and understand why R works the way it does.
The Elements of Statistical Learning: Data Mining, Inference, and Prediction
Trevor Hastie - 2001
With it has come vast amounts of data in a variety of fields such as medicine, biology, finance, and marketing. The challenge of understanding these data has led to the development of new tools in the field of statistics, and spawned new areas such as data mining, machine learning, and bioinformatics. Many of these tools have common underpinnings but are often expressed with different terminology. This book describes the important ideas in these areas in a common conceptual framework. While the approach is statistical, the emphasis is on concepts rather than mathematics. Many examples are given, with a liberal use of color graphics. It should be a valuable resource for statisticians and anyone interested in data mining in science or industry. The book's coverage is broad, from supervised learning (prediction) to unsupervised learning. The many topics include neural networks, support vector machines, classification trees and boosting—the first comprehensive treatment of this topic in any book. Trevor Hastie, Robert Tibshirani, and Jerome Friedman are professors of statistics at Stanford University. They are prominent researchers in this area: Hastie and Tibshirani developed generalized additive models and wrote a popular book of that title. Hastie wrote much of the statistical modeling software in S-PLUS and invented principal curves and surfaces. Tibshirani proposed the Lasso and is co-author of the very successful An Introduction to the Bootstrap. Friedman is the co-inventor of many data-mining tools including CART, MARS, and projection pursuit.