Difference between revisions of "Submitting Array Jobs"

From UFRC
Latest revision as of 16:04, 5 May 2023


A job array can be submitted simply by adding

#SBATCH --array=x-y

to the job script where x and y are the array bounds. A job array can also be specified at the command line with

sbatch --array=x-y job_script.sbatch

A job array is then created, consisting of independent jobs (also called array tasks), one for each index in the specified array.
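As an illustration of how the independent tasks can each do different work, here is a minimal sketch of a job script. It relies on the SLURM_ARRAY_TASK_ID environment variable, which Slurm sets to the task's index in each array task. The input_N.txt naming scheme and the input_for_task helper are hypothetical and only serve the example.

```shell
#!/bin/bash
#SBATCH --array=1-10

# Map a task ID to an input file name (hypothetical naming scheme).
input_for_task() {
    printf 'input_%s.txt\n' "$1"
}

# Slurm sets SLURM_ARRAY_TASK_ID in each array task.
# Fall back to 1 so the script can also be tried outside Slurm.
task_id=${SLURM_ARRAY_TASK_ID:-1}

echo "Task ${task_id} would process $(input_for_task "$task_id")"
```

Replace the echo line with the actual command for your workload, passing the per-task input file to it.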

SLURM's job array handling is very versatile. Instead of a task range, you can provide a comma-separated list of task numbers, as in

sbatch --array=4,8,15,16,23,42 job_script.sbatch

This is a quick way to rerun only the tasks that failed or were lost in a previous job array. Command line options override the options in the script, so the script itself can be left unchanged.
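If the list of tasks to rerun is long, the comma-separated value can be assembled rather than typed by hand. The sketch below assumes you already know which task IDs failed; the failed_tasks list and the join_task_ids helper are hypothetical. It only prints the sbatch command so that you can review it before running it.

```shell
#!/bin/bash
# Join task IDs with commas, e.g. "4 8 15" -> "4,8,15".
join_task_ids() {
    local IFS=,
    echo "$*"
}

failed_tasks=(4 8 15 16 23 42)   # hypothetical list of failed task IDs
array_spec=$(join_task_ids "${failed_tasks[@]}")

# Print the command instead of running it, so it can be checked first.
echo "sbatch --array=${array_spec} job_script.sbatch"
```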

Limiting the number of tasks that run at once

To throttle a job array so that only a certain number of tasks are active at a time, append the %N suffix to the array specification, where N is the maximum number of active tasks. For example

#SBATCH -a 1-200%5

will produce a 200-task job array with only 5 tasks active at any given time.

Note that although the symbol used is the % sign, N is not a percentage: it is the actual number of tasks allowed to run at once.
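To make the meaning of the suffix concrete, the following sketch splits a specification of the simple form x-y%N into its parts and reports the task count and the limit. The describe_array_spec helper is hypothetical and handles only this one form, not comma-separated lists or step sizes.

```shell
#!/bin/bash
# Split an array spec such as "1-200%5" into its range and its throttle.
describe_array_spec() {
    local spec=$1 range limit first last total
    IFS='%' read -r range limit <<< "$spec"   # "1-200" and "5"
    IFS='-' read -r first last <<< "$range"   # "1" and "200"
    total=$(( last - first + 1 ))
    echo "${total} tasks in total, at most ${limit} running at once"
}

describe_array_spec "1-200%5"
```

For 1-200%5 the limit is 5 tasks, not 5 percent of 200 (which would be 10).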

Using scontrol to modify throttling of running array jobs

Reducing the "ArrayTaskThrottle" count on a running job array will not affect tasks that have already entered the "RUNNING" state. It will only prevent new tasks from starting until the number of running tasks drops below the new, lower threshold.

If you want to change the number of simultaneous tasks of an active job, you can use scontrol:

scontrol update ArrayTaskThrottle=<count> JobId=<jobID>

For example:

scontrol update ArrayTaskThrottle=50 JobId=12345

Set ArrayTaskThrottle=0 to eliminate any limit.
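Because a mistyped job ID or count changes a live job, it can help to build and review the command before running it. The sketch below is one possible way to do that; the throttle_command helper is hypothetical and only prints the scontrol command shown above after checking that both arguments are non-negative integers.

```shell
#!/bin/bash
# Build (but do not run) the scontrol command for a new throttle value.
throttle_command() {
    local job_id=$1 count=$2
    # Both values must be non-negative integers; a count of 0 removes the limit.
    if ! [[ $job_id =~ ^[0-9]+$ && $count =~ ^[0-9]+$ ]]; then
        echo "usage: throttle_command <jobID> <count>" >&2
        return 1
    fi
    echo "scontrol update ArrayTaskThrottle=${count} JobId=${job_id}"
}

throttle_command 12345 50
```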

Naming output and error files

SLURM uses the %A and %a replacement strings for the master job ID and task ID, respectively.

For example:

#SBATCH --output=Array_test.%A_%a.out
#SBATCH --error=Array_test.%A_%a.error

The error log is optional, because both output and error messages can be written to the 'output' log.

Note: if you use only %A in the log file name, all array tasks will try to write to a single file, and the performance of the run will approach zero asymptotically. Make sure to use both %A and %a in the log file name specification.

#SBATCH --output=Array_test.%A_%a.log
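The effect of leaving out %a can be seen with a toy imitation of the substitution. This is not Slurm's own code: the expand_pattern helper is hypothetical and simply replaces %A and %a with the given values, using 12345 as an example master job ID.

```shell
#!/bin/bash
# Toy imitation of the %A / %a substitution (not Slurm's own code).
expand_pattern() {
    local pattern=$1 array_job_id=$2 task_id=$3
    printf '%s\n' "$pattern" |
        sed -e "s/%A/${array_job_id}/g" -e "s/%a/${task_id}/g"
}

for task_id in 1 2 3; do
    echo "with %a:    $(expand_pattern 'Array_test.%A_%a.log' 12345 "$task_id")"
    echo "without %a: $(expand_pattern 'Array_test.%A.log' 12345 "$task_id")"
done
```

With both replacement strings each task gets its own file name, while the pattern without %a yields the same name for every task.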