4+ Hours of Video Instruction
Apache Hadoop is a freely available open source toolset that enables big data analysis. This Hadoop Fundamentals LiveLessons tutorial demonstrates the core components of Hadoop, including the Hadoop Distributed File System (HDFS) and MapReduce. In addition, the tutorial demonstrates how to use Hadoop at several levels, including the native Java interface, C++ Pipes, and the universal Streaming program interface. Examples of how to use high-level tools include the Pig scripting language and the Hive "SQL-like" interface. Finally, the steps for installing Hadoop on a desktop virtual machine, in a cloud environment, and on a local stand-alone cluster are presented. Topics covered in this tutorial apply to Apache Hadoop versions 1 and 2 (the latter also known as MR2 or YARN).
Douglas Eadline, PhD, began his career as a practitioner and a chronicler of the Linux Cluster HPC revolution and now documents big data analytics. Starting with the first Beowulf How To document, Dr. Eadline has written hundreds of articles, white papers, and instructional documents covering virtually all aspects of HPC. Prior to starting and editing the popular ClusterMonkey.net web site in 2005, he served as Editor-in-Chief for ClusterWorld Magazine and was Senior HPC Editor for Linux Magazine. Currently, he is a consultant to the HPC industry and writes a monthly column in HPC Admin Magazine. Both clients and readers have recognized Dr. Eadline's ability to present a "technological value proposition" in a clear and accurate style. He has practical hands-on experience in many aspects of HPC, including hardware and software design, benchmarking, storage, GPU, cloud, and parallel computing.
Lesson 1, “Background Concepts,” covers important Hadoop and Big Data fundamentals. You learn Hadoop history and design principles, along with an introduction to the MapReduce paradigm and the components of the Hadoop ecosystem.
Lesson 2, “Running Hadoop on a Desktop or Laptop,” shows you how to create a real Hadoop working installation in a virtual Linux sandbox. All software is freely available, can be easily installed to a desktop or laptop computer, and can be used for many of the examples in this tutorial.
Lesson 3, “The Hadoop Distributed File System,” introduces you to the distributed storage system of Hadoop. In this lesson, you learn HDFS design basics, how to perform basic file operations, and how to use HDFS in programs.
Lesson 4, “Hadoop MapReduce,” presents Hadoop MapReduce in more detail using simple command line examples. You also learn how to run a Java MapReduce application on a Hadoop cluster and then learn each step of the full Hadoop MapReduce process.
Lesson 5, “Hadoop Examples,” teaches you how to write MapReduce programs in almost any language using the Streaming and Pipes interfaces. You also learn how to run a “grep”-like Hadoop application and use some basic debugging techniques.
Lesson 6, “Higher Level Tools,” shows you how to use Pig and Hive, two high-level Hadoop applications. For each tool, you learn the execution modes and commands needed to use it.
Lesson 7, “Setting Up Hadoop in the Cloud,” demonstrates the simple steps needed to start a Hadoop cluster in the cloud using a tool called Whirr.
Lesson 8, “Setting Up Hadoop on a Local Cluster,” teaches you how to install Hadoop on a basic four-node cluster. You learn the steps needed to configure, install, start, test, and monitor a fully functional Hadoop cluster.
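As a rough illustration of the MapReduce paradigm that Lessons 1, 4, and 5 cover, the following Python sketch counts words with a separate map step and reduce step. It is not taken from the course materials: the function names (`map_words`, `reduce_counts`, `word_count`) and the sample input are hypothetical, and the code runs locally in a single process rather than on a Hadoop cluster. With Hadoop Streaming, the mapper and reducer would typically be separate scripts that read lines from standard input and write tab-separated key/value lines to standard output, with Hadoop performing the sort between them.

```python
from itertools import groupby


def map_words(lines):
    """Map step: emit one (word, 1) pair for every word in the input."""
    for line in lines:
        for word in line.split():
            yield word.lower(), 1


def reduce_counts(pairs):
    """Reduce step: sum the counts for each word.

    Sorting the pairs locally stands in for the shuffle/sort phase
    that Hadoop performs between the map and reduce steps.
    """
    for word, group in groupby(sorted(pairs), key=lambda kv: kv[0]):
        yield word, sum(count for _, count in group)


def word_count(lines):
    """Run the map and reduce steps in sequence on an iterable of lines."""
    return dict(reduce_counts(map_words(lines)))


if __name__ == "__main__":
    # Hypothetical sample input; a real job would read files from HDFS.
    sample = [
        "Hadoop stores data in HDFS",
        "MapReduce processes data in Hadoop",
    ]
    for word, total in sorted(word_count(sample).items()):
        print(f"{word}\t{total}")
```

The map step knows nothing about totals and the reduce step knows nothing about the raw text, which is the separation that lets a framework such as Hadoop run many mappers and reducers in parallel.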
The LiveLessons Video Training series publishes hundreds of hands-on, expert-led video tutorials covering a wide selection of technology topics designed to teach you the skills you need to succeed. This professional and personal technology video series features world-leading author-instructors published by your trusted technology brands: Addison-Wesley, Cisco Press, IBM Press, Pearson IT Certification, Prentice Hall, Sams, and Que. Topics include IT Certification, Programming, Web Development, Mobile Development, Home & Office Technologies, Business & Management, and more. View all LiveLessons: http://www.informit.com/imprint/series_detail.aspx?ser=2185116
Download supplemental materials from the author's website:
Hadoop Fundamentals Code and Notes