
Using Slurm to Submit Jobs


#SBATCH Directives

To create a batch script, use your favorite text editor to create a text file that details job requirements and instructions on how to run your job. 

All job requirements are passed to Slurm through #SBATCH directives, which describe the computational requirements of your job; Slurm uses these to determine which resources to allocate to it.

The main #SBATCH directives for your Slurm job are:

#SBATCH --account=<youraccount>

Specifies the allocation account (often tied to a research group or class) that will be charged for the resources consumed by this job.

If your group has owner nodes, the account is usually <unix_group>-<cluster_abbreviation> (where the cluster abbreviation is np, kp, lp, rw, or ash).

There are other types of accounts, typically named for specific partitions. These can include owner-guest, <cluster>-gpu, notchpeak-shared-short, and smithp-guest.

#SBATCH --partition=<yourpartition>

Designates the cluster queue or partition where the job will run (e.g., lonepeak is a specific set of nodes/resources).

Naming mechanisms include cluster, cluster-shared, cluster-gpu, cluster-gpu-guest, cluster-guest, cluster-shared-guest, and pi-cl, where cluster is the full name of the cluster and cl is the abbreviated form.

We have our partition names described here.

 

  Try our tool that describes accounts and partitions for help finding the accounts, partitions, and qualities of service you can use when submitting jobs on Center for High Performance Computing systems.

 

#SBATCH --time=DD-HH:MM:SS

Sets the maximum wall-clock time the job is allowed to run. If the job exceeds this limit it will be automatically terminated.

DD - Days, HH - Hours, MM - Minutes, SS - Seconds
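Slurm also accepts shorter forms in which leading fields are omitted; a few illustrative values (you would use only one --time line per script):

```shell
#SBATCH --time=02:00:00     # 2 hours (HH:MM:SS)
#SBATCH --time=1-12:00:00   # 1 day and 12 hours (DD-HH:MM:SS)
#SBATCH --time=30:00        # 30 minutes (MM:SS when no hours field is given)
```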

Note: There is a walltime limit of 72 hours for jobs on general cluster nodes and 14 days on owner cluster nodes. If your job requires more time than these hard limits, email the CHPC at helpdesk@chpc.utah.edu with the job ID, the cluster, and the length of time you would like the job extended.

#SBATCH --ntasks=<number-of-cpus>

Requests the total number of tasks (processes) the job will use. This is commonly used for parallel jobs and is often related to the number of CPUs requested.

#SBATCH --mem=<size>[units]

Requests the total amount of physical memory (RAM) required for the entire job.
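For example (use only one --mem line per script; the values here are arbitrary):

```shell
#SBATCH --mem=32G      # 32 gigabytes of RAM for the whole job
#SBATCH --mem=8000M    # or in megabytes; megabytes are the default when no unit is given
```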

 

#SBATCH --nodes=<number-of-nodes>

This directive specifies the minimum and maximum number of compute nodes required for the job. It tells the scheduler to allocate resources across the specified number of individual physical computers in the cluster.

 

#SBATCH -o slurmjob-%j.out-%N

Specifies the file path for the job's standard output (stdout). The %j and %N variables are replaced with the job ID and the name of the first node used, respectively.

#SBATCH -e slurmjob-%j.err-%N

Specifies the file path for the job's standard error (stderr). Any error messages will be directed to this file, using the job ID and node name for clarity.
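For instance, with the two directives above, a job with ID 1234567 whose first node is lp015 (both values hypothetical) would produce:

```shell
#SBATCH -o slurmjob-%j.out-%N
#SBATCH -e slurmjob-%j.err-%N
# with job ID 1234567 on first node lp015, these expand to:
#   slurmjob-1234567.out-lp015
#   slurmjob-1234567.err-lp015
```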

 

Find out which Slurm Accounts you are in

The easiest method to find the accounts and partitions you have access to at the CHPC is to use the mychpc batch command. This command will output the cluster, the applicable account and partition for that cluster, and your allocation status for that partition.

An example would look like the below:

GENERAL
CPU --partition=kingspeak-shared --qos=kingspeak --account=baggins [21% idle] 

The above shows a general (i.e. non-preemptable) allocation on the kingspeak cluster under the baggins account within the kingspeak-shared partition. It also indicates how much of the partition is available without a wait - in this example, 21% of the CPUs within the kingspeak-shared partition are idle and available for jobs.

If you notice anything incorrect in the output from the mychpc batch command that you feel should be changed, please let us know.
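mychpc is a CHPC-specific wrapper; if you prefer the underlying Slurm tooling, sacctmgr can list your raw account associations directly (the format fields shown here are one reasonable choice, not the only one):

```shell
# list your Slurm cluster/account/partition/QOS associations
sacctmgr show associations user=$USER format=Cluster,Account,Partition,QOS
```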

Where to Run Your Slurm Job

There are three main places you can run your job: your home directory, /scratch spaces, or group spaces (available if your group has purchased group storage). This choice determines where I/O happens while your job runs. Each has its own benefits, outlined below:

Home:
  • Free
  • Automatically provisioned per user
  • 50 GB soft limit

Scratch:
  • Free
  • 60-day automatic deletion of untouched files
  • Two file systems: vast and nfs1

Group Space:
  • $150/TB without backups, $450/TB with backups
  • Shared among your group

Due to the storage limits on each user's home directory, we recommend setting up your jobs to run in our scratch file systems. Note that files in the CHPC's scratch file systems are deleted if left untouched for 60 days.

To run jobs in the CHPC scratch file systems (vast or nfs1), place the following commands in your Slurm batch script. The commands that you use depend on what Linux shell you have.

Unsure? Type echo $SHELL in your terminal.

Bash


SCRDIR=/scratch/general/<file-system>/$USER/$SLURM_JOB_ID

mkdir -p $SCRDIR

cp <input-files> $SCRDIR

cd $SCRDIR

TCSH


set SCRDIR = /scratch/general/<file-system>/$USER/$SLURM_JOB_ID

mkdir -p $SCRDIR

cp <input-files> $SCRDIR

cd $SCRDIR

  • Replace <file-system> with either vast or nfs1.
  • $USER points to your uNID and $SLURM_JOB_ID points to the job ID that Slurm assigned your job.
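For Bash users, one optional safeguard (a sketch, not part of the standard CHPC recipe above) is to register a trap that removes the scratch directory whenever the script exits, even if it fails partway through. Be sure to copy any output files home before the script exits, since the trap deletes everything in the scratch directory:

```shell
# create the scratch directory, then register a cleanup that runs on any exit
SCRDIR=/scratch/general/vast/$USER/$SLURM_JOB_ID
mkdir -p "$SCRDIR"
trap 'cd "$HOME" && rm -rf "$SCRDIR"' EXIT
```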

Putting it all Together: An Example Slurm Script

Below is an example job script that combines all of the information above. In this example, suppose your PI is Frodo Baggins (group ID baggins) and you are requesting general user access to one lonepeak node with at least 8 CPUs and 32 GB of memory. The job will run for up to two hours.

#!/bin/bash
#SBATCH --account=baggins
#SBATCH --partition=lonepeak
#SBATCH --time=02:00:00
#SBATCH --ntasks=8
#SBATCH --mem=32G
#SBATCH -o slurmjob-%j.out-%N
#SBATCH -e slurmjob-%j.err-%N

#set up scratch directory
SCRDIR=/scratch/general/vast/$USER/$SLURM_JOB_ID
mkdir -p $SCRDIR

#copy input files and move over to the scratch directory
cp inputfile.csv myscript.r $SCRDIR
cd $SCRDIR

#load your module
module load R/4.4.0

#run your script
Rscript myscript.r inputfile.csv

#copy output to your home directory and clean up
cp outputfile.csv $HOME
cd $HOME
rm -rf $SCRDIR

 

NOTE: When specifying an account or partition, you may use either an equals sign or a space before the account or partition name, but not both in the same line. For example, "#SBATCH --account=kingspeak-gpu" and "#SBATCH --account kingspeak-gpu" are acceptable, but "#SBATCH --account = kingspeak-gpu" is not.

For more examples of Slurm job scripts, see the CHPC MyJobs templates.

Submitting your Job to Slurm

To submit a job, you must be logged onto the CHPC systems. Once logged on, submit the job with Slurm's sbatch command.

For example, to submit a script named SlurmScript.sh, type:

sbatch SlurmScript.sh
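If you want to capture the new job's ID for scripting, sbatch's --parsable flag prints just the ID; this is handy for chaining jobs with dependencies (PostProcess.sh below is a hypothetical second script):

```shell
# submit and keep the job ID for later use
jobid=$(sbatch --parsable SlurmScript.sh)

# hypothetical follow-up job that runs only if the first job succeeds
sbatch --dependency=afterok:$jobid PostProcess.sh
```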

 

NOTE: sbatch by default passes all environment variables to the compute node, which differs from the behavior in PBS (which started with a clean shell). If you need to start with a clean environment, you will need to use the following directive in your batch script:

  • #SBATCH --export=NONE

This will still execute .bashrc/.tcshrc scripts, but any changes you make in your interactive environment will not be present in the compute session. As an additional precaution, if you are using modules, you should use  module purge to guarantee a fresh environment.

Checking the Status of your Job

To check the status of your job, use the squeue command. On its own, squeue lists all jobs currently submitted to the cluster you are logged onto. You can filter its output to jobs that pertain to you in a number of ways:

squeue --me

squeue -u uNID

squeue -j job#

 

Adding -l (for "long" output) gives more details in the squeue output.
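If the default columns are too narrow (job names in particular are often truncated), squeue also accepts a format string; the fields and widths below are arbitrary choices:

```shell
# customized columns: job ID, name, state, elapsed time, reason/node list
squeue --me -o "%.10i %.30j %.8T %.12M %R"
```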

Slurm Job Arrays

Slurm arrays enable quick submission of many related jobs. Slurm provides an environment variable, SLURM_ARRAY_TASK_ID, which differentiates the tasks within an array by an index number.

For example, if we need to run the same program against 30 different samples, we can utilize Slurm arrays to run the program across the 30 different samples with a naming convention such as sample_[1-30].data using the following script:

#!/bin/bash
#SBATCH -n 1 # Number of tasks
#SBATCH -N 1 # All tasks on one machine
#SBATCH -p PARTITION # Partition on some cluster
#SBATCH -A ACCOUNT # The account associated with the above partition
#SBATCH -t 02:00:00 # 2 hours (HH:MM:SS)
#SBATCH -o myprog%A%a.out # Standard output
#SBATCH -e myprog%A%a.err # Standard error
#SBATCH --array=1-30

./myprogram input_$SLURM_ARRAY_TASK_ID.data

You can also limit the number of jobs that can be running simultaneously to "n" by adding a %n after the end of the array range:

#SBATCH --array=1-30%5

Apart from $SLURM_ARRAY_TASK_ID, Slurm provides a few other placeholders and environment variables relevant to job arrays. These include:

  • %A and %a, which represent the job ID and the job array index, respectively. These can be used in the #SBATCH parameters to generate unique names.
  • SLURM_ARRAY_TASK_COUNT is the number of tasks in the array.
  • SLURM_ARRAY_TASK_MAX is the highest job array index value.
  • SLURM_ARRAY_TASK_MIN is the lowest job array index value.
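A quick way to see these values is to echo them near the top of the array script, which makes each task's log self-identifying when debugging:

```shell
# log where this task sits within the array
echo "Task $SLURM_ARRAY_TASK_ID of $SLURM_ARRAY_TASK_COUNT (IDs $SLURM_ARRAY_TASK_MIN-$SLURM_ARRAY_TASK_MAX)"
```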

When submitting jobs that use less than the full CPU count per node, use the shared partitions to allow multiple array jobs on one node. For more information, see the Node Sharing page. 

Depending on the characteristics of your job, there may be a number of other solutions you could use, detailed on the running multiple serial jobs page.

Last Updated: 12/11/25