Rough Cuts are manuscripts that are developed but not yet published, available through Safari. Rough Cuts give you access to the very latest information on a given topic and the opportunity to interact with the author to influence the final publication.
This is the Rough Cut version of the printed book.
As adoption of Hadoop accelerates in the enterprise and beyond, there's soaring demand for people who can solve real-world problems by applying advanced data science techniques in Hadoop environments. Now, Practical Data Science with Hadoop® and Spark provides a complete and up-to-date guide to data science with Hadoop: high-level concepts, deep-dive techniques, practical applications, hands-on tutorials, and real-world use cases. Drawing on their extensive experience with Hadoop in enterprise Big Data environments, the authors bring together the practical knowledge you'll need to do real, useful data science with Hadoop. Coverage includes
About the Authors
Part I: Data Science with Hadoop—An Overview
Chapter 1: Introduction to Data Science
Chapter 2: Use Cases for Data Science
Chapter 3: Hadoop and Data Science
Part II: Preparing and Visualizing Data with Hadoop
Chapter 4: Getting the Data into Hadoop
Chapter 5: Data Munging with Hadoop
Chapter 6: Exploring and Visualizing Data
Part III: Applying Data Modeling with Hadoop
Chapter 7: Machine Learning with Hadoop
Chapter 8: Predictive Modeling
Chapter 9: Clustering
Chapter 10: Anomaly Detection with Hadoop
Chapter 11: Natural Language Processing
Chapter 12: Data Science—The Next Frontier
Appendix A: Book Webpage and Code Download
Appendix B: HDFS Quick Start
Appendix C: Additional Background on Data Science and Apache Hadoop