The three most important commands in Slurm are sbatch, srun and scancel. sbatch submits a job script, such as the one below (called example.sbatch), to the queue. srun runs parallel jobs on compute nodes. Jobs can be canceled with scancel.
    #!/bin/bash
    #
    #SBATCH -N 1 # Ensure that all cores are on one machine
    #SBATCH -t 0-01:00 # Runtime in D-HH:MM
    #SBATCH --cpus-per-task=4 # Request that ncpus be allocated per process.
    #SBATCH --mem=10g # Memory pool for all cores (see also --mem-per-cpu)

    # This job requires 4 CPUs (4 CPUs per task). Allocate 4 CPUs from 1 node in the default partition.

    # Change to the directory that the script was launched from. This is the default for SLURM.

    module load hisat2/2.1.0
    hisat2-build ./reference/22_20-21M.fa 22_20-21M_hisat
    hisat2 -p $SLURM_CPUS_PER_TASK -x 22_20-21M_hisat -U ./reads/reads_1.fq -S eg1.sam
    hisat2 -p $SLURM_CPUS_PER_TASK -x 22_20-21M_hisat -1 ./reads/reads_1.fq -2 ./reads/reads_2.fq -S eg2.sam
- NOTE: requests for walltime extensions will not be granted
This is an example job script that runs the HISAT examples. To run it, copy the HISAT example folder with

    cp -r /ihome/sam/apps/HISAT/hisat-0.1.6-beta/example .
    cd example

and create a text file named example.sbatch with contents like those of the script above.
This job is submitted with the command sbatch example.sbatch. By default, standard output is redirected to slurm-<jobid>.out.
    [fangping@login0a example]$ sbatch example.sbatch
    Submitted batch job 389675
    [fangping@login0a example]$ head slurm-389675.out
    Settings:
      Output files: "22_20-21M_hisat.*.ht2"
      Line rate: 6 (line is 64 bytes)
      Lines per side: 1 (side is 64 bytes)
      Offset rate: 4 (one in 16)
      FTable chars: 10
      Strings: unpacked
      Local offset rate: 3 (one in 8)
      Local fTable chars: 6
      Local sequence length: 57344
- Note: By default the working directory of your job is the directory from which the batch script was submitted. See below for more information about job environments.
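The job ID that sbatch prints (389675 in the session above) is what you pass to scancel, and it also names the default output file. As a minimal sketch, the helper below pulls the ID out of the confirmation line. The function name jobid_from_message is hypothetical and not part of Slurm, and the sketch assumes the "Submitted batch job <id>" message format shown above.

```shell
#!/bin/bash
# Hypothetical helper (not a Slurm command): extract the job ID from
# sbatch's confirmation line, assuming the "Submitted batch job <id>" format.
jobid_from_message() {
    # The ID is the last whitespace-separated field of the message.
    awk '{print $NF}' <<< "$1"
}

# Sample confirmation line copied from the session above.
msg="Submitted batch job 389675"
jobid=$(jobid_from_message "$msg")

# Print the follow-up commands rather than running them.
echo "Cancel the job with: scancel ${jobid}"
echo "Default output file: slurm-${jobid}.out"
```

On the cluster, msg would instead be captured from a real submission, for example msg=$(sbatch example.sbatch).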
The sbatch arguments here are the minimal subset required to accurately specify a job on the htc cluster. Please refer to man sbatch for more options.
|-N --nodes||Maximum number of nodes to be used by each Job Step.|
|--tasks-per-node||Specify the number of tasks to be launched per node.|
|--cpus-per-task||Advise the SLURM controller that ensuing job steps will require ncpus number of processors per task.|
|-e --error||File to redirect standard error.|
|-J --job-name||The job name.|
|-t --time||Define the total time required for the job. The format is days-hh:mm:ss.|
|--qos||Declare the Quality of Service to be used. The default is normal.|
|--partition||Select the partition to submit the job to. The only and default partition is htc.|
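To show how several of the options in the table fit together, here is a sketch of a batch script header. The job name and error file name are hypothetical; the qos and partition values are the defaults stated in the table, and %j is the sbatch filename pattern for the job ID.

```shell
#!/bin/bash
#SBATCH --job-name=table_demo          # -J: hypothetical job name
#SBATCH --nodes=1                      # -N: use a single node
#SBATCH --cpus-per-task=2              # CPUs for each task
#SBATCH --time=0-00:30:00              # -t: days-hh:mm:ss
#SBATCH --error=table_demo.%j.err      # -e: %j expands to the job ID
#SBATCH --qos=normal                   # default QOS, per the table
#SBATCH --partition=htc                # default partition, per the table

# SLURM_JOB_ID is set by Slurm inside a job; fall back to a label when
# the script is run by hand outside the scheduler.
job_label="${SLURM_JOB_ID:-not-under-slurm}"
echo "Job ${job_label} running on $(hostname)"
```

Because the #SBATCH lines are ordinary comments to bash, the script can also be run directly to check the commands before submitting it.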
The above arguments can be provided in a batch script by preceding them with #SBATCH. Note that the shebang (#!) line must be present. The shebang line can call any shell or scripting language available on the cluster. For example, #!/bin/bash, #!/bin/tcsh, #!/bin/env python or #!/bin/env perl.
srun also accepts the --nodes, --tasks-per-node and --cpus-per-task arguments, so each job step can use different resources, but a step cannot exceed the resources given to sbatch.
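The following sketch illustrates job steps that stay within the sbatch allocation. The wrapper name run_step is hypothetical, and hostname stands in for a real program. When the script is run outside a Slurm job it only prints the srun command it would have used.

```shell
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=4
#SBATCH --cpus-per-task=1
#SBATCH --time=0-00:10

# Hypothetical wrapper: launch one job step with a given number of tasks.
run_step() {
    local ntasks="$1"; shift
    if [ -n "${SLURM_JOB_ID:-}" ]; then
        # Inside an allocation: start a real job step.
        srun --ntasks="$ntasks" --cpus-per-task=1 "$@"
    else
        # Outside Slurm: only show the command that would run.
        echo "srun --ntasks=$ntasks --cpus-per-task=1 $*"
    fi
}

run_step 4 hostname   # step using all 4 allocated tasks
run_step 2 hostname   # smaller step, still within the allocation
```

Both steps request no more than the 4 tasks given to sbatch, which is the constraint described above.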
Slurm is very explicit in how one requests cores and nodes. While extremely powerful, the three flags --nodes, --ntasks and --cpus-per-task can be a bit confusing at first.
--ntasks vs --cpus-per-task
The term “task” in this context can be thought of as a “process”. A multi-process program (e.g. MPI) therefore consists of multiple tasks. In Slurm, tasks are requested with the --ntasks flag. A multi-threaded program consists of a single task, which can in turn use multiple CPUs. CPUs for multi-threaded programs are requested with the --cpus-per-task flag. An individual task cannot be split across multiple compute nodes, so requesting CPUs with the --cpus-per-task flag always results in all of that task's CPUs being allocated on the same compute node.
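As a sketch of the distinction, the script below requests one task with four CPUs, which suits a multi-threaded program. The commented-out alternative would suit a multi-process program. The total CPU count is the product of the two values. The fallbacks to 1 are an assumption added so the script also runs outside Slurm, where the SLURM_* variables are not set.

```shell
#!/bin/bash
# Multi-threaded program: one task (process) that uses 4 CPUs.
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
# Multi-process alternative (e.g. MPI): 4 tasks with 1 CPU each.
# To use it, replace the two directives above with:
#   --ntasks=4
#   --cpus-per-task=1

# Slurm exports these inside a job when the matching option was given;
# default to 1 so the script still works when run by hand.
ntasks="${SLURM_NTASKS:-1}"
cpus_per_task="${SLURM_CPUS_PER_TASK:-1}"

# Total CPUs requested = tasks x CPUs per task.
total_cpus=$(( ntasks * cpus_per_task ))
echo "tasks=${ntasks} cpus_per_task=${cpus_per_task} total_cpus=${total_cpus}"
```

Both variants request four CPUs in total, but only the first guarantees that all four are on the same node, because they belong to a single task.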
Example batch scripts and NGS data analysis pipelines
Scripts to perform RNASeq data analysis using HISAT2 + Stringtie are available under /ihome/sam/fangping/example/RNASeq_HISAT2_Stringtie. You can follow the readme file to go through the steps.
Examples of NGS data analysis pipelines are available at NGS Data Analysis. If you need personalized consultation for NGS data analysis workflow and selection of better pipelines, please contact Fangping Mu, PhD.
Submitting multiple Jobs to HTC cluster
Examples of submitting multiple jobs to the HTC cluster are available at this link.
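One possible approach, which is an assumption here and not necessarily what the linked examples do, is a shell loop that calls sbatch once per input. In this sketch the sample names, the SAMPLE variable and the DRY_RUN switch are all hypothetical, and example.sbatch would have to read $SAMPLE for the variable to have any effect.

```shell
#!/bin/bash
# Sketch: submit example.sbatch once per sample.
# Sample names and the SAMPLE variable are hypothetical placeholders.
samples=(sampleA sampleB sampleC)
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = really submit
planned=()                # record of every command built by the loop

for s in "${samples[@]}"; do
    # --export passes SAMPLE into the job's environment along with
    # the rest of the submitting shell's environment (ALL).
    cmd=(sbatch --job-name="hisat_${s}" --export="ALL,SAMPLE=${s}" example.sbatch)
    planned+=("${cmd[*]}")
    if [ "$DRY_RUN" = "1" ]; then
        echo "${cmd[*]}"
    else
        "${cmd[@]}"
    fi
done
```

By default the loop only prints the three sbatch commands; setting DRY_RUN=0 on a login node would submit them.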