7+ Hours of Video Instruction
Apache Hadoop YARN Fundamentals LiveLessons is the first complete video training course on the basics of Apache Hadoop version 2 with YARN. The tutorial begins with MapReduce and Big Data fundamentals and moves to YARN design, installation (laptop, cluster, and cloud), administration, running applications (MapReduce2, Pig and Hive), writing new applications, and useful frameworks. Additional coverage of Ambari, Ganglia, Nagios and the Hortonworks HDP is provided.
About the Instructor
Douglas Eadline is the instructor for this essential Big Data LiveLessons. Eadline is a practitioner and writer in the Linux Cluster community and previously authored Hadoop Fundamentals LiveLessons, which focused on many of the basic aspects of Hadoop processing. Eadline is a co-author of the companion book Apache Hadoop YARN: Moving Beyond MapReduce and Batch Processing with Apache Hadoop 2. Doug provides a clear and complete introduction to many of the important Apache Hadoop YARN tasks and topics. In addition to a sound background on Hadoop design, he provides several installation scenarios including a laptop, cluster, and cloud. The Ambari graphical installation tool is explained and used for both installation and administration. You will also learn how to write your own YARN applications and what is needed to run Hadoop version 1 applications. Some familiarity with Java, Hadoop MapReduce, and the Hadoop Distributed File System is helpful but not necessary. All example code and lesson notes are available for download.
Who Should Take This Course
Table of Contents
Lesson 1: Background Concepts
Apache Hadoop version 1 has become a popular tool for working with big data. One of the limitations of Hadoop, however, has been the single MapReduce computational paradigm. Apache Hadoop YARN addresses this and other issues. In this lesson you learn the fundamental differences between Hadoop version 1 and Hadoop 2 with YARN and the five clear advantages of the new YARN design.
Lesson 2: Running Hadoop YARN on Personal Systems
A production Hadoop installation, whether on a local cluster or in the cloud, can be difficult to configure and costly to operate. This lesson presents several installation scenarios including a single laptop, a desktop, a small cluster, and the cloud. Both Apache Hadoop source and the Hortonworks HDP Sandbox are used for local systems, and Apache Whirr is demonstrated for installing in the cloud. These environments can be used to try most of the examples in this tutorial.
Lesson 3: Functional Description of YARN Components
Apache Hadoop YARN introduces new components to the Hadoop ecosystem. This lesson explains what each of these components does and how they interact with each other. In addition, the various YARN scheduler options and a complete application life cycle walk-through are explained.
Lesson 4: Apache Hadoop YARN Cluster Installation Methods
Installing Hadoop is not as hard as it used to be. In this lesson, both shell script and graphical installation methods are described. The graphical installation employs the new open source Ambari tool. In addition, the steps to install and configure the Ganglia and Nagios cluster monitoring tools are provided.
Lesson 5: Apache Hadoop YARN Cluster Administration Methods
In this lesson, monitoring and administering an Apache Hadoop YARN cluster are described. As in the installation lesson, both shell scripts and the Ambari GUI tool are presented. Several essential administration tips for Apache Hadoop YARN are also provided.
Lesson 6: Running Existing Applications with Apache Hadoop YARN
One of the goals achieved by Hadoop version 2 was compatibility with version 1 applications. In this lesson, the new MapReduce framework that runs under YARN is explained. Almost all existing applications are compatible, and any important differences are presented. In addition, job tracking using the new YARN web GUI is demonstrated.
Lesson 7: Using YARN Distributed Shell and Introduction to Applications
Apache Hadoop YARN includes an application called distributed shell that enables shell commands to be run within YARN Containers on cluster nodes. In this lesson, a distributed shell example is presented and then expanded into a blueprint for other YARN applications.
Lesson 8: Exploring Apache Hadoop YARN Application Frameworks
The Apache Hadoop YARN architecture enables non-MapReduce applications to operate on Hadoop clusters. This capability has spawned a new set of applications that can take advantage of Hadoop's big data capabilities. In this lesson, some of these application frameworks are introduced, along with how they differ from MapReduce processing.
The LiveLessons Video Training series publishes hundreds of hands-on, expert-led video tutorials covering a wide selection of technology topics designed to teach you the skills you need to succeed. This professional and personal technology video series features world-leading author instructors published by your trusted technology brands: Addison-Wesley, Cisco Press, IBM Press, Pearson IT Certification, Prentice Hall, Sams, and Que. Topics include: IT Certification, Programming, Web Development, Mobile Development, Home and Office Technologies, Business and Management, and more. View All LiveLessons http://www.informit.com/livelessons
Download supplemental materials from the author's website: Hadoop Fundamentals Code and Notes