
Expert Hadoop Administration: Managing, Tuning, and Securing Spark, YARN, and HDFS


eBook (Watermarked)

  • Your Price: $31.99
  • List Price: $39.99
  • Estimated Release: Nov 29, 2016
  • Includes EPUB, MOBI, and PDF
  • This eBook includes the following formats, accessible from your Account page after purchase:

    EPUB: The open industry format known for its reflowable content and usability on supported mobile devices.

    MOBI: The eBook format compatible with the Amazon Kindle and Amazon Kindle applications.

    PDF: The popular standard, used most often with the free Adobe® Reader® software.

    This eBook requires no passwords or activation to read. We customize your eBook by discreetly watermarking it with your name, making it uniquely yours.



  • Copyright 2017
  • Dimensions: 7" x 9-1/8"
  • Pages: 832
  • Edition: 1st
  • eBook (Watermarked)
  • ISBN-10: 0-13-459813-X
  • ISBN-13: 978-0-13-459813-0

The Comprehensive, Up-to-Date Apache Hadoop Administration Handbook and Reference

In Expert Hadoop® Administration, leading Hadoop administrator Sam R. Alapati brings together authoritative knowledge for creating, configuring, securing, managing, and optimizing production Hadoop clusters in any environment. Drawing on his experience with large-scale Hadoop administration, Alapati integrates action-oriented advice with carefully researched explanations of both problems and solutions. He covers an unmatched range of topics and offers an unparalleled collection of realistic examples.

Alapati demystifies complex Hadoop environments, helping you understand exactly what happens behind the scenes when you administer your cluster. You’ll gain unprecedented insight as you walk through building clusters from scratch and configuring high availability, performance, security, encryption, and other key attributes. The high-value administration skills you learn here will be indispensable no matter what Hadoop distribution you use or what Hadoop applications you run.

Coverage includes

  • Understanding Hadoop’s architecture from an administrator’s standpoint
  • Creating simple and fully distributed clusters
  • Running MapReduce and Spark applications in a Hadoop cluster
  • Managing and protecting Hadoop data and high availability
  • Working with HDFS commands, file permissions, and storage management
  • Moving data, and using YARN to allocate resources and schedule jobs
  • Managing job workflows with Oozie and Hue
  • Securing, monitoring, logging, and optimizing Hadoop
  • Benchmarking, optimizing, and troubleshooting Hadoop

Sample Content

Table of Contents




About the Author

Part I: Introduction to Hadoop 2—Architecture and Hadoop Clusters

Chapter 1: Introduction to Hadoop 2 and Its Environment

Chapter 2: An Introduction to the Architecture of Hadoop 2

Chapter 3: Creating and Configuring a Simple Hadoop 2 Cluster

Chapter 4: Planning for and Creating a Fully Distributed Cluster

Part II: Hadoop Application Frameworks

Chapter 5: Running Applications in a Cluster—The MapReduce Framework (and Pig, Hive)

Chapter 6: Running Applications in a Cluster—The Spark Framework

Chapter 7: Running Applications in a Cluster—The Spark Framework (Second Part)

Part III: Managing and Protecting Hadoop Data and High Availability

Chapter 8: The Role of the NameNode and How HDFS Works

Chapter 9: HDFS Commands, File Permissions, and HDFS Storage Management

Chapter 10: Data Protection, Compression, and Hadoop Data Formats

Chapter 11: NameNode Operations and High Availability

Part IV: Moving Data, Allocating Resources, Scheduling Jobs, and Security

Chapter 12: Moving Data Into and Out of Hadoop

Chapter 13: YARN, and Resource Allocation in a Hadoop Cluster

Chapter 14: Working with Oozie and Hue to Manage Workflows

Chapter 15: Securing Hadoop

Part V: Monitoring, Optimization, and Troubleshooting

Chapter 16: Managing Jobs, Using Hue, and Performing Routine Tasks

Chapter 17: Monitoring, Metrics, and Hadoop Logging

Chapter 18: Benchmarking, Optimization, and Performance Tuning

Chapter 19: Configuring and Tuning Apache Spark on YARN

Chapter 20: Optimizing Spark Applications

Chapter 21: Troubleshooting Hadoop—A Sampler


