For related topics, see Spark Operations.
Number of instances and cores allocated
To set the number of cores allocated for a job, add the following parameter keys and values in the Spark Settings field within the advanced job properties in the Fusion UI, or in the sparkConfig object if defining the job via the Fusion API.
If spark.kubernetes.executor.request.cores is not set (the default configuration), Spark sets the number of CPUs requested for the executor pod to the value of spark.executor.cores. In that case, if spark.executor.cores is 3, Spark allocates 3 CPUs for the executor pod and runs 3 tasks in parallel. To under-allocate CPUs for the executor pod while still running multiple tasks in parallel, set spark.kubernetes.executor.request.cores to a lower value than spark.executor.cores.
The appropriate ratio of spark.kubernetes.executor.request.cores to spark.executor.cores depends on whether the job is CPU-bound or I/O-bound: an I/O-bound job can typically run more tasks in parallel than the CPUs it requests, while a CPU-bound job benefits from a ratio closer to 1:1. Allocate more memory to the executor if more tasks run in parallel on a single executor pod.
| Parameter Key | Example Value |
| --- | --- |
| spark.executor.instances | 3 |
| spark.kubernetes.executor.request.cores | 3 |
| spark.executor.cores | 6 |
| spark.driver.cores | 1 |
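When defining the job through the Fusion API, the same properties go in the sparkConfig object of the job's JSON definition. The snippet below is a minimal sketch only: the job id is a placeholder, and it assumes sparkConfig accepts Spark properties as a list of key/value pairs; the exact JSON shape can vary by job type and Fusion version.

```json
{
  "id": "example-spark-job",
  "sparkConfig": [
    { "key": "spark.executor.instances", "value": "3" },
    { "key": "spark.kubernetes.executor.request.cores", "value": "3" },
    { "key": "spark.executor.cores", "value": "6" },
    { "key": "spark.driver.cores", "value": "1" }
  ]
}
```

With these example values, each executor pod requests 3 CPUs from Kubernetes but runs 6 tasks in parallel, which suits I/O-bound workloads.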
If these settings are left unspecified, then the job launches with a driver using one core and 3GB of memory plus two executors, each using one core with 1GB of memory.
Memory allocation
The amount of memory allocated to the driver and executors is controlled on a per-job basis using the spark.executor.memory and spark.driver.memory parameters in the Spark Settings section of the job definition in the Fusion UI, or within the sparkConfig object in the JSON definition of the job.
| Parameter Key | Example Value |
| --- | --- |
| spark.executor.memory | 6g |
| spark.driver.memory | 2g |
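These memory settings can be supplied through the API in the same way as the core settings above. A minimal sketch, again assuming sparkConfig is a list of key/value pairs and using a placeholder job id:

```json
{
  "id": "example-spark-job",
  "sparkConfig": [
    { "key": "spark.executor.memory", "value": "6g" },
    { "key": "spark.driver.memory", "value": "2g" }
  ]
}
```

Driver memory can usually remain smaller than executor memory, since the executors hold the task data; consider increasing spark.executor.memory when a single executor pod runs many tasks in parallel, as noted above.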