The workshop contains quiz questions and exercises to help you solidify your understanding of the material covered. Try to answer all questions before looking at the “Answers” section that follows.
True or false: A Spark Standalone cluster consists of a single node.
Which component is not a prerequisite for installing Spark?
Which of the following subdirectories of the Spark installation contains the scripts to start and stop the Spark services on master and slave nodes?
Which of the following environment variables are required to run Spark on Hadoop/YARN?
Either HADOOP_CONF_DIR or YARN_CONF_DIR will work.
Answers
False. Standalone refers to Spark's own independent process scheduler, which can be deployed on a cluster of one or more nodes.
A. The Scala assembly is included with Spark; however, Java and Python must exist on the system prior to installation.
B. sbin contains administrative scripts to start and stop Spark services.
C. Either the HADOOP_CONF_DIR or YARN_CONF_DIR environment variable must be set for Spark to use YARN.
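The last answer can be checked before submitting a job to YARN. The following is a minimal sketch in Python, not part of Spark itself: the function name `yarn_conf_dir` is hypothetical, and the order in which the two variables are inspected is an assumption made for illustration only.

```python
import os


def yarn_conf_dir(env=None):
    """Return (variable_name, value) for the first of HADOOP_CONF_DIR or
    YARN_CONF_DIR that is set to a non-empty value, or None if neither is.

    `env` defaults to the current process environment; passing a dict
    makes the check easy to try out without changing real settings.
    """
    env = os.environ if env is None else env
    # Checking HADOOP_CONF_DIR first is an arbitrary choice in this sketch.
    for name in ("HADOOP_CONF_DIR", "YARN_CONF_DIR"):
        value = env.get(name)
        if value:
            return name, value
    return None


if __name__ == "__main__":
    found = yarn_conf_dir()
    if found is None:
        print("Neither HADOOP_CONF_DIR nor YARN_CONF_DIR is set.")
    else:
        print(f"{found[0]} is set to {found[1]}")
```

Run as a script, it reports which of the two variables, if any, is set in the current shell.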