Expert Hadoop Administration: Managing, Tuning, and Securing Spark, YARN, and HDFS, Rough Cuts

Premium Item

  • Sorry, this book is no longer in print.
Not for Sale



  • Copyright 2017
  • Edition: 1st
  • ISBN-10: 0-13-459812-1
  • ISBN-13: 978-0-13-459812-3

This is the Rough Cut version of the printed book.

Stop searching the web for out-of-date, fragmentary, and unreliable information about running Hadoop! Now, there's a single source for all the authoritative knowledge and trustworthy procedures you need: Expert Hadoop® Administration: Managing, Tuning, and Securing Spark, YARN, and HDFS.

Pioneering Hadoop/Big Data administrator Sam R. Alapati shares step-by-step procedures for confidently performing every important task involved in creating, configuring, securing, managing, and optimizing production Hadoop clusters. The only Hadoop administration guide written by a working Hadoop administrator, Expert Hadoop® Administration covers an unmatched range of topics and offers an unparalleled collection of realistic examples. Alapati shares proven answers to the complex configuration, management, and performance-tuning problems that Hadoop administrators constantly encounter, along with expert guidance for customizing Hadoop 2's intensely complex environment. Throughout, he integrates action-oriented advice with carefully researched explanations of both problems and solutions. Coverage includes:

  • Indispensable Hadoop concepts, including architecture, clusters, and application frameworks
  • Configuring high-reliability, high-performance Hadoop environments
  • Managing and protecting Hadoop data and ensuring high availability, including HDFS management, compression, data formats, and NameNode operations
  • Moving data, allocating resources, and scheduling jobs with YARN, and managing job workflows with Oozie and Hue
  • Hadoop security, monitoring, logging, and benchmarking
  • Troubleshooting root causes of severe performance slowdowns
  • Preventing trouble by proactively maintaining healthy Hadoop environments
  • Installing Hadoop virtual environments, and more

Sample Content

Table of Contents

Part I: Introduction to Hadoop 2—Architecture and Hadoop Clusters

Chapter 1: Introduction to Hadoop 2 and Its Environment

Chapter 2: An Introduction to the Architecture of Hadoop 2

Chapter 3: Creating and Configuring a Simple Hadoop 2 Cluster

Chapter 4: Planning for and Creating a Fully Distributed Cluster

Part II: Hadoop Application Frameworks

Chapter 5: Running Applications in a Cluster—The MapReduce Framework (and Pig, Hive)

Chapter 6: Running Applications in a Cluster—The Spark Framework

Chapter 7: Running Applications in a Cluster—The Spark Framework (Second Part)

Part III: Managing and Protecting Hadoop Data and High Availability

Chapter 8: The Role of the NameNode and How HDFS Works

Chapter 9: HDFS Commands, File Permissions, and HDFS Storage Management

Chapter 10: Data Protection, Compression, and Hadoop Data Formats

Chapter 11: NameNode Operations and High Availability

Part IV: Moving Data, Allocating Resources, Scheduling Jobs, and Security

Chapter 12: Moving Data Into and Out of Hadoop

Chapter 13: YARN, and Resource Allocation in a Hadoop Cluster

Chapter 14: Working with Oozie and Hue to Manage Workflows

Chapter 15: Securing Hadoop

Part V: Monitoring, Optimization, and Troubleshooting

Chapter 16: Managing Jobs, Using Hue, and Performing Routine Tasks

Chapter 17: Monitoring, Metrics, and Hadoop Logging

Chapter 18: Benchmarking, Optimization, and Performance Tuning

Chapter 19: Configuring and Tuning Apache Spark on YARN

Chapter 20: Optimizing Spark Applications

Chapter 21: Troubleshooting Hadoop—A Sampler

