Practical Data Science with Hadoop and Spark: Designing and Building Effective Analytics at Scale, Rough Cuts

Premium Item

  • Sorry, this book is no longer in print.
Not for Sale


Description

  • Copyright 2017
  • Edition: 1st
  • ISBN-10: 0-13-402977-1
  • ISBN-13: 978-0-13-402977-1

This is the Rough Cut version of the printed book.


As adoption of Hadoop accelerates in the enterprise and beyond, there's soaring demand for those who can solve real-world problems by applying advanced data science techniques in Hadoop environments. Practical Data Science with Hadoop® and Spark provides a complete and up-to-date guide to data science with Hadoop: high-level concepts, deep-dive techniques, practical applications, hands-on tutorials, and real-world use cases. Drawing on their immense experience with Hadoop in enterprise Big Data environments, the authors bring together the practical knowledge you'll need to do real, useful data science with Hadoop. Coverage includes:

  • What data science is, what data scientists do, and how to build or join a data science team
  • Core data science applications in retail, healthcare, insurance, banking, education, and beyond
  • How Hadoop has evolved into an outstanding environment for doing data science
  • A day in the life of a data scientist: exploration, iteration, and more
  • Getting your data into Hadoop: data lakes, Sqoop, Flume, Falcon, and more
  • Preparing your data, from start to finish
  • Data modeling and machine learning
  • Visualization: how (and how not) to use it
  • Start-to-finish case studies: recommender systems, customer segmentation, sentiment analysis, and predictive risk modeling
  • The future: Storm online scoring, Giraph graph algorithms, Solr/Elasticsearch, and more

Sample Content

Table of Contents

Foreword

Preface

Acknowledgments

About the Authors

Part I: Data Science with Hadoop—An Overview

Chapter 1: Introduction to Data Science

Chapter 2: Use Cases for Data Science

Chapter 3: Hadoop and Data Science

Part II: Preparing and Visualizing Data with Hadoop

Chapter 4: Getting the Data into Hadoop

Chapter 5: Data Munging with Hadoop

Chapter 6: Exploring and Visualizing Data

Part III: Applying Data Modeling with Hadoop

Chapter 7: Machine Learning with Hadoop

Chapter 8: Predictive Modeling

Chapter 9: Clustering

Chapter 10: Anomaly Detection with Hadoop

Chapter 11: Natural Language Processing

Chapter 12: Data Science—The Next Frontier

Appendix A: Book Webpage and Code Download

Appendix B: HDFS Quick Start

Appendix C: Additional Background on Data Science and Apache Hadoop

Index
