Configuring Spark Applications
You can specify Spark application configuration properties in any of the following ways:
- Pass properties using the --conf command-line switch; for example:
    spark-submit \
      --class com.cloudera.example.YarnExample \
      --master yarn \
      --deploy-mode cluster \
      --conf "spark.eventLog.dir=hdfs:///user/spark/eventlog" \
      lib/yarn-example.jar \
      10
- Specify properties in spark-defaults.conf. See Configuring Spark Application Properties in spark-defaults.conf.
- Pass properties directly to the SparkConf used to create the SparkContext in your Spark application; for example:
- Scala:
    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf().set("spark.dynamicAllocation.initialExecutors", "5")
    val sc = new SparkContext(conf)
- Python:
    from pyspark import SparkConf, SparkContext
    from pyspark.sql import SQLContext

    conf = SparkConf().setAppName('Application name')
    conf.set('spark.hadoop.avro.mapred.ignore.inputs.without.extension', 'false')
    sc = SparkContext(conf=conf)
    sqlContext = SQLContext(sc)
The order of precedence in configuration properties, from highest to lowest, is:
- Properties passed to SparkConf.
- Arguments passed to spark-submit, spark-shell, or pyspark.
- Properties set in spark-defaults.conf.
For more information, see Spark Configuration.
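The precedence order listed above can be modeled as a layered merge. The following is a minimal sketch in plain Python, not Spark's own resolution code: the function name resolve_properties and the sample values are hypothetical and exist only to show which source wins when the same property is set in more than one place.

```python
# Simplified model of the precedence order; this is NOT Spark's implementation.
def resolve_properties(defaults_file, submit_args, spark_conf):
    """Merge three property sources; each later update overrides the earlier ones."""
    effective = {}
    effective.update(defaults_file)  # lowest precedence: spark-defaults.conf
    effective.update(submit_args)    # middle: spark-submit, spark-shell, or pyspark arguments
    effective.update(spark_conf)     # highest: properties set on SparkConf in the application
    return effective


# Hypothetical values chosen only for illustration.
defaults_file = {"spark.executor.memory": "2g", "spark.eventLog.enabled": "true"}
submit_args = {"spark.executor.memory": "4g"}
spark_conf = {"spark.dynamicAllocation.initialExecutors": "5"}

effective = resolve_properties(defaults_file, submit_args, spark_conf)
print(effective["spark.executor.memory"])
```

In this sketch the command-line value for spark.executor.memory replaces the one from spark-defaults.conf, while properties that appear in only one source pass through unchanged.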
Configuring Spark Application Properties in spark-defaults.conf
Specify properties in the spark-defaults.conf file, one per line, in the form property value, with the name and value separated by whitespace.
You create a comment by adding a hash mark ( # ) at the beginning of a line. You cannot add comments to the end or middle of a line.
    spark.master            spark://mysparkmaster.acme.com:7077
    spark.eventLog.enabled  true
    spark.eventLog.dir      hdfs:///user/spark/eventlog
    # Set spark executor memory
    spark.executor.memory   2g
    spark.logConf           true

Cloudera recommends placing configuration properties that you want to use for every application in spark-defaults.conf. See Application Properties for more information.
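To make the property value format and the whole-line comment rule concrete, here is a small illustrative parser in plain Python. It is a simplified sketch and not the loader that Spark itself uses; the function name parse_spark_defaults is hypothetical, and the sample text reuses values from the example in this section.

```python
def parse_spark_defaults(text):
    """Parse 'property value' lines, skipping blank lines and whole-line # comments."""
    props = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # a comment must occupy the entire line
        parts = line.split(None, 1)  # split on the first run of whitespace
        key = parts[0]
        value = parts[1].strip() if len(parts) > 1 else ""
        props[key] = value
    return props


sample = """\
spark.master            spark://mysparkmaster.acme.com:7077
spark.eventLog.enabled  true
# Set spark executor memory
spark.executor.memory   2g
"""

props = parse_spark_defaults(sample)
print(props["spark.executor.memory"])
```

Because only whole-line comments are recognized, a trailing "# note" after a value would be kept as part of the value in this sketch, which mirrors the restriction described above.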
Configuring Properties in spark-defaults.conf Using Cloudera Manager
Minimum Required Role: Configurator (also provided by Cluster Administrator, Full Administrator)
Configure properties for all Spark applications in spark-defaults.conf as follows:
- Go to the Spark service.
- Click the Configuration tab.
- Select .
- Select .
- Locate the Spark Client Advanced Configuration Snippet (Safety Valve) for spark-conf/spark-defaults.conf property.
- Specify properties described in Application Properties.
If more than one role group applies to this configuration, edit the value for the appropriate role group. See Modifying Configuration Properties Using Cloudera Manager.
- Click Save Changes to commit the changes.
- Deploy the client configuration.
Configuring Properties in spark-defaults.conf Using the Command Line
- Follow these command-line instructions on systems that do not use Cloudera Manager.
- This information applies specifically to CDH 5.15.0. See Cloudera Documentation for information specific to other releases.
To configure properties for all Spark applications using the command line, edit the file SPARK_HOME/conf/spark-defaults.conf.
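If you script this edit rather than making it by hand, the change amounts to replacing an existing property line or appending a new one. The sketch below works on the file contents as a string; the function name set_property is hypothetical, and reading from and writing back to SPARK_HOME/conf/spark-defaults.conf is left to the caller.

```python
def set_property(conf_text, key, value):
    """Return conf_text with `key` set to `value`, replacing or appending a line."""
    lines = conf_text.splitlines()
    new_line = f"{key} {value}"
    for i, line in enumerate(lines):
        stripped = line.strip()
        # Only touch real property lines, never comments or blank lines.
        if stripped and not stripped.startswith("#") and stripped.split(None, 1)[0] == key:
            lines[i] = new_line
            break
    else:
        lines.append(new_line)  # property not present yet
    return "\n".join(lines) + "\n"


# Hypothetical starting contents, for illustration only.
original = "# Set spark executor memory\nspark.executor.memory 2g\n"
updated = set_property(original, "spark.executor.memory", "4g")
appended = set_property(updated, "spark.logConf", "true")
print(appended)
```

Existing comments and unrelated properties are preserved, so the function can be applied repeatedly for several properties before the result is written back to the file.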
Configuring Spark Application Logging Properties
You configure Spark application logging properties in a log4j.properties file.
Configuring Logging Properties Using Cloudera Manager
Minimum Required Role: Configurator (also provided by Cluster Administrator, Full Administrator)
To configure only the logging threshold level, follow the procedure in Configuring Logging Thresholds. To configure any other logging property, do the following:
- Go to the Spark service.
- Click the Configuration tab.
- Select .
- Select .
- Locate the Spark Client Advanced Configuration Snippet (Safety Valve) for spark-conf/log4j.properties property.
- Specify log4j properties.
If more than one role group applies to this configuration, edit the value for the appropriate role group. See Modifying Configuration Properties Using Cloudera Manager.
- Click Save Changes to commit the changes.
- Deploy the client configuration.
Configuring Logging Properties Using the Command Line
- Follow these command-line instructions on systems that do not use Cloudera Manager.
- This information applies specifically to CDH 5.15.0. See Cloudera Documentation for information specific to other releases.
To specify logging properties for all users on a machine by using the command line, edit the file SPARK_HOME/conf/log4j.properties. To set it just for yourself or for a specific application, copy SPARK_HOME/conf/log4j.properties.template to log4j.properties in your working directory or any directory in your application's classpath.
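The per-user or per-application copy step can be sketched in Python as follows. This is an illustration under stated assumptions: the function name install_log4j_template is hypothetical, and the demo runs against a temporary directory that stands in for SPARK_HOME, with a one-line placeholder template rather than the real file.

```python
import shutil
import tempfile
from pathlib import Path


def install_log4j_template(spark_home, dest_dir):
    """Copy conf/log4j.properties.template to dest_dir/log4j.properties."""
    template = Path(spark_home) / "conf" / "log4j.properties.template"
    target = Path(dest_dir) / "log4j.properties"
    shutil.copyfile(template, target)
    return target


# Demo with a throwaway directory tree standing in for SPARK_HOME.
with tempfile.TemporaryDirectory() as tmp:
    fake_home = Path(tmp) / "spark"
    (fake_home / "conf").mkdir(parents=True)
    # Placeholder template contents, for illustration only.
    (fake_home / "conf" / "log4j.properties.template").write_text(
        "log4j.rootCategory=INFO, console\n"
    )
    workdir = Path(tmp) / "app"
    workdir.mkdir()
    copied = install_log4j_template(fake_home, workdir)
    copied_text = copied.read_text()
    print(copied.name)
```

On a real system you would pass the actual SPARK_HOME path and your working directory, or a directory on the application's classpath, and then edit the copied log4j.properties file.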
©2016 Cloudera, Inc. All rights reserved