How Applications are Executed on a Spark Cluster

By Jeffrey Aven
Sep 3, 2018


This chapter is from the book Data Analytics with Spark Using Python

Learn More Buy

Summary

In this chapter, you have learned about the Spark runtime application and cluster architecture, the components of a Spark application, and the functions of these components. The components of a Spark application include the Driver, Master, Cluster Manager, and Executors. The Driver is the process that the client interacts with when launching a Spark application, either through one of the interactive shells or through the spark-submit script. The Driver is responsible for creating the SparkSession object (the entry point for any Spark application) and for planning an application by creating a DAG consisting of tasks and stages. The Driver communicates with a Master, which in turn communicates with a Cluster Manager to allocate application runtime resources (containers) on which Executors will run. Executors are specific to a given application and run all tasks for the application; they also store output data from completed tasks. Spark’s runtime architecture is essentially the same regardless of the cluster resource scheduler used (Standalone, YARN, Mesos, and so on).
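
As a rough illustration of the flow summarized above, the following is a toy model in plain Python. It is not Spark's API and does not use PySpark: the class names Driver, Master, ClusterManager, and Executor are borrowed from the text, while their methods (allocate, request_executors, run_task, run), the round-robin task placement, and the sample data are assumptions made only for this sketch. The plan is also simplified to a linear chain of stages rather than a general DAG.

```python
from dataclasses import dataclass, field


@dataclass
class Executor:
    """Toy stand-in for an Executor: runs tasks and keeps their output."""
    executor_id: int
    results: dict = field(default_factory=dict)

    def run_task(self, task_id, func, partition):
        output = func(partition)
        self.results[task_id] = output  # output of a completed task stays here
        return output


class ClusterManager:
    """Toy stand-in for a Cluster Manager: hands out a fixed pool of containers."""

    def __init__(self, total_containers):
        self.free_containers = total_containers

    def allocate(self, requested):
        # Grant as many containers as are free, up to the number requested.
        granted = min(requested, self.free_containers)
        self.free_containers -= granted
        return granted


class Master:
    """Toy stand-in for the Master: asks the Cluster Manager for resources."""

    def __init__(self, cluster_manager):
        self.cluster_manager = cluster_manager

    def request_executors(self, count):
        granted = self.cluster_manager.allocate(count)
        # One Executor per granted container (an assumption of this sketch).
        return [Executor(executor_id=i) for i in range(granted)]


class Driver:
    """Toy stand-in for the Driver: plans stages of tasks and schedules them."""

    def __init__(self, master):
        self.master = master

    def run(self, partitions, stage_funcs, num_executors):
        executors = self.master.request_executors(num_executors)
        if not executors:
            raise RuntimeError("no containers available for Executors")
        data = partitions
        for stage_no, func in enumerate(stage_funcs):
            # Each stage is one task per partition; stages run in order.
            next_data = []
            for part_no, partition in enumerate(data):
                executor = executors[part_no % len(executors)]  # round-robin
                task_id = (stage_no, part_no)
                next_data.append(executor.run_task(task_id, func, partition))
            data = next_data
        return data, executors


# Hypothetical application: scale every value, then sum each partition.
driver = Driver(Master(ClusterManager(total_containers=2)))
partitions = [[1, 2], [3, 4], [5, 6]]
stages = [lambda p: [x * 10 for x in p], lambda p: sum(p)]
result, executors = driver.run(partitions, stages, num_executors=4)
print(result)          # [30, 70, 110]
print(len(executors))  # 2, because only two containers were available
```

The sketch asks for four Executors, but the Cluster Manager only has two containers, so the Driver schedules all six tasks (three partitions across two stages) on the two Executors it was granted. Each Executor's results dictionary plays the role of the stored task output mentioned in the summary.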

Now that we have explored Spark’s cluster architecture, it’s time to put the concepts into action starting in the next chapter.
