HTC Cluster

The HTC cluster is designed to run high-throughput computing jobs efficiently and to support bioinformatics and health science research.

Table of contents:

  • Access to HTC
    • Off-campus access (setting up the VPN)
    • On-campus access
  • Node configuration
  • Application environment
    • Installed packages
  • Slurm Workload Manager
    • Slurm jobs
    • Service unit
    • Fairshare and priority
    • Example batch scripts
    • PBS to Slurm commands
  • CRC wrappers

The table below maps each existing HTC module to the corresponding H2P-style module load command, or a note where the package was removed or replaced.

HTC modules H2P modules
abyss/2.0.2 module load gcc/8.2.0 abyss/2.1.0
admixmap/3.8.3103 module load admixmap/3.8.3103
AFNI/17.2.07 module load afni/18.0.22
ancestrymap/2.0 module load ancestrymap/2.0
annovar/2015Jun17 module load annovar/2018Apr16
ATACSeq_harvard/0.1 module load atacseq_harvard/0.1
augustus/2.5.5 module load augustus/3.3
bamtools/2.4.0 module load gcc/8.2.0 bamtools/2.5.1
bcftools/1.2-gcc5.2.0 remove
bcftools/1.3-gcc5.2.0 (D) module load gcc/8.2.0 bcftools/1.9-40
bcl2fastq/2.19.0 module load bcl2fastq2/2.19.0
bedops/2.4.14 module load bedops/2.4.35
bedtools/2.27.1-gcc5.2.0 2.25.0-gcc5.2.0 2.26.0-gcc5.2.0 module load gcc/8.2.0 bedtools/2.27.1
bioawk/1.0 module load bioawk/1.0
biobambam2/2.0.54 module load biobambam2/2.0.87
biopython/1.68 module load biopython/1.73
biotoolbox/1.44 module load biotoolbox/1.63
bismark/0.18.1 0.14.5 module load bismark/0.20.0
blast+/2.2.31 2.2.26 module load blast+/2.2.31
blast+/2.6.0 module load blast+/2.7.1
bowtie/1.1.2-gcc5.2.0 module load bowtie/1.2.2
bowtie2/2.3.2-gcc5.2.0 2.2.6-gcc5.2.0 module load gcc/8.2.0 bowtie2/2.3.4.2
breseq/0.33.1 0.27.0 module load gcc/8.2.0 breseq/0.33.1
busco/1.1b1 module load busco/3.0.2
bwa/0.7.15-gcc5.2.0 0.7.12-gcc5.2.0 0.6.2 module load gcc/8.2.0 bwa/0.7.17
canu/1.7 module load gcc/8.2.0 canu/1.7.1
CAP-miRSeq/2015-2 delete
CAP3/cap3 module load cap3/cap3
CellRanger/2.1.0 2.0.2 2.0.0 module load cellranger/2.1.0 ; module load cellranger/2.2.0;
CellRanger/3.0.0 module load cellranger/3.0.0
circos/0.68 module load circos/0.69-6
cite-seq-count/1.3.2 module load cite-seq-count/1.2
clcassemblycell/5.0.5 module load clc-assembly-cell/5.1.1
clinker/1.32 module load clinker/1.32
cnvkit/0.8.4 module load cnvkit/0.9.5
cnvnator/0.3.3 module load cnvnator/0.3.3
compiler/gcc/5.2.0 (D) gcc/8.2.0
compiler/java/1.8.0_181-oracle 1.7.0_79-oracle 1.8.0_144-oracle 1.8.0_65-oracle module load java/1.8.0_181-oracle
compiler/perl/5.22.0 module load perl/5.28.0
compiler/python/2.7-bioconda module load python/bioconda-2.7-5.2.0
compiler/python/2.7.13-Anaconda-4.3.0 2.7.10-Anaconda-2.3.0 module load python/anaconda2.7-5.2.0
compiler/python/3.5.2-Anaconda-4.2.0 3.4.3-Anaconda-2.3.0 module load python/bioconda-3.6-5.2.0
compiler/python/3.6.1-Anaconda-4.3.0 module load python/anaconda3.6-5.2.0
cpat/1.2.4 module load cpat/1.2.4
CREST/1.0 old package, requires a very old version of samtools (< 0.1.17); we will consider building a Docker image if users request this package
cufflinks/2.2.1 module load cufflinks/2.2.1
cutadapt/1.12 1.8.3 module load cutadapt/1.17
cytoscape/3.5.1 2.8.1 module load cytoscape/3.7.0
danpos2/2.2.2 module load danpos2/2.2.2
dapars/0.9.1 module load dapars/0.9.1
deeptools/3.0.1 2.3.4 module load deeptools/3.1.2
defuse/0.8.1 module load defuse/0.8.1
delly/0.7.8 module load delly/0.7.8
detonate/1.10 module load detonate/1.11
diamond/0.7.9 module load diamond/0.9.22
discovardenovo/52488 module load discovardenovo/52488
eagle/2.3.4 module load eagle/2.4.1
EDirect/6.40 module load edirect/10.6
emboss/6.5.7-gcc5.2.0 module load gcc/8.2.0 emboss/6.6.0
EPACTS/3.2.6 module load gcc/8.2.0 epacts/3.3.0
esATAC/1.2.1 module load gcc/8.2.0 r/3.5.1
F-seq/1.85 module load f-seq/1.85
fastq_pair/fastq_pair module load fastq_pair/fastq_pair
fastq_screen/0.11.4 module load fastq_screen/0.13.0
FastQC/0.11.5 0.11.4 module load fastqc/0.11.7
fasttree/2.1.8 module load fasttree/2.1.10
FASTX-Toolkit/0.0.14 0.0.13 module load fastx-toolkit/0.0.14
FlashPCA2/2.0 module load flashpca/2.0
freebayes/1.1.0 module load freebayes/1.2.0
freesurfer/6.0.0 module load freesurfer/6.0.0
fusioncatcher/1.00 0.99.7b 0.99.7c module load fusioncatcher/1.00
FusionInspector/1.1.0 module load fusion-inspector/1.3.1
gatb-core/1.3.0 module load gatb/1.4.1
GATK/3.6-6-ngs.1.3.0 module load gatk/3.6-6-ngs.2.9.2
GATK/3.8.1 3.4-46 3.7.0 module load gatk/3.8.1
GATK/4.0.0.0 (D) module load gatk/4.0.8.1
gdc-tsv-tool/1.0 module load gdc-tsv-tool/2.1
gistic/2.0 module load gistic/2.0
glimmer/3.02 module load glimmer/3.02
gmap_gsnap/2015-09-29-gcc5.2.0 module load gcc/8.2.0 gmap-gsnap/2018-07-04
gnuplot/5.0.4 module load gcc/8.2.0 gnuplot/5.2.4
hifive/1.2.2 module load hifive/1.5.6
HISAT2/2.0.5-64-ngs.1.3.0 module load hisat2/2.1.0-ngs.2.9.2.lua
HISAT2/2.1.0 2.0.3-beta 2.0.5 0.1.6b module load hisat2/2.1.0
HMMER/3.1b2-gcc5.2.0 module load gcc/8.2.0 hmmer/3.2.1
homer/4.9 4.7.2 module load gcc/8.2.0 homer/4.10.3
HTSeq/0.9.0 module load htseq/0.10.0
htslib/1.5 module load gcc/8.2.0 htslib/1.9
IDP-fusion/1.1.1 module load idp-fusion/1.1.1
IDP/0.1.9 module load idp/0.1.9
idr/2.0.3 module load idr/2.0.3
IGV/2.3.65 module load igv/2.4.16
ipyrad/0.7.15 module load ipyrad/0.7.28
jags/4.2.0 module load gcc/8.2.0 jags/4.3.0
jellyfish/2.2.10 module load gcc/8.2.0 jellyfish/2.2.10
joe/4.2 module load joe/4.6
kallisto/0.43.0 module load kallisto/0.44.0
lncFunTK/1.0 module load lncfuntk/1.0
lohhla/2018-02-06 module load lohhla/v20171108
MACS/2.1.1.20160309 2.1.0.20151222 module load macs/2.1.1.20160309
magic-blast/1.2.0 module load magicblast/1.4.0
MapSplice2/2.2.1 module load mapsplice2/2.2.1
MeDUSA/2.1 Removed; outdated MeDIP-seq analysis pipeline.
Meerkat/0.189 module load meerkat/0.189
megahit/1.1.2 module load gcc/8.2.0 megahit/1.1.3
meme/4.12.0 module load meme/5.0.3
merlin/1.1.2 module load merlin/1.1.2
Minimac3/2.0.1 module load minimac4/1.0.0
mira/4.0.2 module load mira/4.0.2
miRDeep2/2.0.0.8 module load mirdeep2/0.1.0
miRge/2.0.6 module load mirge/2.0.6
miso/0.5.3 module load miso/0.5.4
MMseqs2/2.0 module load mmseqs2/5-9375b
multiqc/1.2 module load multiqc/1.6
mummer/4.0.0beta2 module load gcc/8.2.0 mummer/4.0.0beta2
muscle/3.8.1551 module load gcc/8.2.0 muscle/3.8.1551
mutect/1.1.4 MuTect2 is included within GATK
MutSig/1.4 module load mutsig/1.41
nextflow/0.27.6 module load nextflow/18.10.1.5003
NextGenMap/0.5.0-gcc5.2.0 module load gcc/8.2.0 nextgenmap/0.5.0
ngmerge/0.3 module load gcc/8.2.0 ngmerge/0.3
ngsplot/2.63 module load ngsplot/2.63
NucleoATAC/0.3.1 module load nucleoatac/0.3.4
oncotator/1.9.2.0 Specific Python package versions are required; use a conda virtual environment
OptiType/1.3.1 module load optitype/1.3.2
paml/4.8 module load paml/4.9h
pandoc/2.2.1 module load pandoc/2.5
PeakDEck/1-1-pl module load peakdeck/1-1-pl
PeakSeq/1.31 module load peakseq/1.31
PeakSplitter/1.0 module load peaksplitter/1.0
phylip/3.697 module load phylip/3.697
picard/2.11.0 1.140 2.6.0 module load picard/2.18.12
pilon/1.22 module load pilon/1.23
plink/1.90-beta-160315 1.90b3x module load plink/1.90b6.7
primer3/2.3.7 module load primer3/2.4.0
prinseq/0.20.4 module load prinseq/0.20.4
proovread/2.14.1 module load proovread/2.14.1
provirus/0.3-devel 0.3 module load provirus/0.3-devel
pybedtools/0.7.10 module load pybedtools/0.7.10
qapa/1.1.1 module load qapa/1.2.1
qiime2/2018.6 2017.11 module load qiime2/2018.8
QoRTs/1.3.0 1.1.8 module load gcc/8.2.0 qorts/1.3.0
QuickMIRSeq/2016 module load quickmirseq/2016
R/3.5.1-gcc5.2.0 3.5.0-gcc5.2.0 3.4.1-gcc5.2.0 3.3.1-gcc5.2.0 3.2.2-gcc5.2.0 3.1.1-gcc5.2.0 3.0.0-gcc5.2.0 2.15.3-gcc5.2.0 module load gcc/8.2.0 r/3.5.1
racon/1.3.1 module load racon/1.3.1
RAPSearch2/2.24 module load rapsearch/2.24
rgt/0.11.4 module load rgt/0.11.4
rna-seqc/1.1.8 module load rna-seqc/1.1.8
RnaChipIntegrator/1.1.0 module load rnachipintegrator/1.1.0
RNACocktail/0.2.1 module load rnacocktail/0.2.2 installed as singularity image at /ihome/crc/install/rnacocktail/rnacocktail.simg
rnaseqmut/0.7 module load rnaseqmut/0.7
rosetta/3.9 module load rosetta/3.9
RSEM/1.3.0 1.2.23 module load gcc/8.2.0 rsem/1.3.1
RSeQC/2.6.4 module load rseqc/2.6.5
RTG/3.5.2 module load rtg-core/3.10
sailfish/0.10.0 0.7.6 superseded by salmon
salmon/0.11.2 0.6.0 0.8.2 0.9.1 module load salmon/0.11.3
samblaster/0.1.24 module load gcc/8.2.0 samblaster/0.1.24
samtools/0.1.19-gcc5.2.0 module load gcc/8.2.0 samtools/0.1.19
samtools/1.7-gcc5.2.0 1.1-gcc5.2.0 1.2-gcc5.2.0 1.3.1-gcc5.2.0 1.5-gcc5.2.0 module load gcc/8.2.0 samtools/1.9
scanpy/1.3.2 module load scanpy/1.3.6
seqtk/1.2-r94 module load seqtk/1.3
SNAP/1.0beta.18-gcc5.2.0 module load snap-aligner/1.0beta.23
snpeff/4.3s module load snpeff/4.3t
snptest/2.5.2 module load snptest/2.5.2
SNVer/0.5.3 module load snver/0.5.3
SNVMix2/0.11.8-r5 module load snvmix2/0.11.8-r5
SOAPindel/2.1.7.17 module load soapindel/2.1.7.17
solar/8.1.4 module load solar/8.1.4
spades/3.11.0 module load spades/3.12.0
squid/1.4 module load squid/1.5
SRAToolkit/2.9.1 2.8.0 2.8.2 module load sra-toolkit/2.9.2
sRNAnalyzer/2017 module load srnanalyzer/2017
STAR-Fusion/1.4.0 1.3.2 1.1.0 0.8.0 module load star-fusion/1.5.0
STAR/2.6.0c 2.4.2a 2.5.2a 2.5.2b 2.5.4b 2.5.4b_gcc5.2.0 module load star/2.6.1a
strelka/4.10.2 module load strelka/2.9.8
StringTie/1.3.2c 1.3.1c module load stringtie/1.3.4d
subread/1.6.2 1.5.0-p2 module load gcc/8.2.0 subread/1.6.2
svaba/0.2.1 module load gcc/8.2.0 svaba/0.2.1
SVDetect/0.8b module load svdetect/0.8b
svviz/2.0a3 1.6.2 module load svviz2/2.0a3
tophat/2.1.1 2.1.0 module load tophat/2.1.1
TransDecoder/5.3.0 2.0.1 module load gcc/8.2.0 transdecoder/5.3.0
TrimGalore/0.4.5 module load trimgalore/0.5.0
Trimmomatic/0.33 module load trimmomatic/0.38
trinity/2.4.0 2.1.1 module load gcc/8.2.0 trinity/2.8.3
Trinotate/2.0.2 Removed. File locking problems on cluster
ucsc/kentUtils/v331 module load kentutils/v370
UMI-tools/0.5.3 module load umi-tools/0.5.4
unceqr/0.2 module load unceqr/0.2
VarDictJava/1.4.10 1.4.5 module load vardictjava/1.5.8
VarScan/2.4.2 module load varscan/2.4.2
vcf2maf/1.6.12 module load vcf2maf/1.6.16
vcftools/0.1.14-gcc5.2.0 module load gcc/8.2.0 vcftools/0.1.16
velvet/1.2.10-gcc5.2.0 module load gcc/8.2.0 velvet/1.2.10
VEP/88 module load gcc/8.2.0 vep/93
vfs/2016-08-17.r2 module load vfs/2016-08-17.r2
Virus_Detection_SRA/0.2 module load virus_detection_sra/0.2
virusfinder/2.0 module load virusfinder/2.0
VirVarSeq/2015-04-28 module load virvarseq/2015-04-28
weaver/0.21 https://github.com/ma-compbio/Weaver
XHMM/2016-01-04 module load xhmm/2016-01-04

Access to HTC

CRC computational resources are housed off-campus at the University's main datacenter at RIDC Park. The CRC clusters are firewalled, so you cannot access them directly when you are off-campus. You must first pass the firewall using the VPN service, and then connect to the CRC clusters.

Off-campus access

If you are off-campus, the cluster can be accessed securely from anywhere in the world via Virtual Private Networking (VPN), a service of CSSD. The VPN requires client software on your system, and multiple alternatives are available to cover almost all systems and configurations.

Windows and Mac

Download and install Pulse VPN and follow its installation instructions.

Linux

VPNC is a command-line VPN application that may be the most convenient option for some Linux users.

Most distributions provide prebuilt binaries, or you can get the source and build your own.

Once installed, download the configuration file here (requires login) and move the file to /etc/vpnc/pitt.conf. Then:

  • Add your username and password, replacing YourPittUsername_HERE and YourPittPassword_HERE (see the sketch after this list).
  • Run sudo vpnc pitt; to stop, run sudo vpnc-disconnect
    • vpnc-disconnect kills the most recent vpnc session
    • Kill all of them with sudo killall vpnc
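
For reference, a minimal sketch of what the credential lines in /etc/vpnc/pitt.conf might look like, assuming the standard vpnc Xauth keys (the gateway and IPSec settings come from the downloaded configuration file and are omitted here; replace the values with your own credentials):

Xauth username YourPittUsername
Xauth password YourPittPassword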

On-campus access

To use CRC resources, users must first have a valid Pitt ID, and then formally request an account. Once you have valid login credentials, the clusters can be accessed via SSH. For example, to connect to the HTC cluster:

$ ssh pittID@htc.crc.pitt.edu

Your username is your PittID and your password is the same as your campus-wide Pitt password.

To check whether you can use the cluster, use sacctmgr list user PittID

[fangping@login0a example]$ sacctmgr list user fangping
      User   Def Acct     Admin 
---------- ---------- --------- 
  fangping        sam      None 

 

If you do not see your ID listed, you have not been granted access to this cluster. If you believe you should have access, please submit a ticket.

Windows

Download and install Xming or PuTTY (use the Windows MSI installer package), then configure PuTTY to connect to htc.crc.pitt.edu with your Pitt credentials.

Mac and Linux

Open your favorite terminal emulator and connect with the ssh command shown above.

Node configuration

There are 20 compute nodes in total with the following configuration:

  • 16 E5-2660 v3 (Haswell) nodes
    • 2.40GHz, 16 cores
    • 256 GB RAM 2133 MHz
    • 256 GB SSD
    • 56 Gb/s FDR InfiniBand
  • 4 E5-2643v4 (Broadwell) nodes
    • 3.40 GHz, 16 cores
    • 256 GB RAM
    • 256 GB SSD
    • 56 Gb/s FDR InfiniBand
  • 4 Xeon Gold 6126 (Skylake) nodes
    • 2.60 GHz, 24 cores
    • 377 GB RAM
    • 256 GB SSD & 500 GB SSD
    • 56 Gb/s FDR InfiniBand

There are two login nodes that can be used for compilation.

  • E5-2620 v3 (Haswell)
  • 2.40GHz, 12 cores (24 hyperthreads)
  • 64 GB RAM 1867 MHz
  • 56 Gb/s FDR Infiniband

For performance reasons, the following operating system has been chosen for both compute and login nodes:

  • Red Hat Enterprise Linux 6.6

Filesystems

All nodes in the HTC cluster mount the following file servers.

It is important to note the $HOME directories are shared with other clusters and configuration files may not be compatible. Please check through your .bashrc, .bash_profile and all other dotfiles if you encounter problems.

Filesystem Mount
ihome (backed up) /ihome
MobyDisk (not backed up) /mnt/mobydisk
ZFS (not backed up) /zfs1
Scratch (compute nodes only) /scratch
Filesystem Mount
ihome /ihome
MobyDisk /mnt/mobydisk
Home 0 /home
Home 1 /home1
Home 2 /home2
Gscratch2 /gscratch2
Scratch (compute only) /scratch

Compilers

GNU 4.4.7 compilers are available in your path when you log in. Newer GNU 5.2.0 compilers are available as module environments.

Compiler Version executable name AVX2 support
GNU C 5.2.0* gcc Yes
GNU C++ 5.2.0* g++ Yes
GNU Fortran 5.2.0* gfortran Yes
---- ---- ---- ----
GNU C 4.4.7 gcc No
GNU C++ 4.4.7 g++ No
GNU Fortran 4.4.7 gfortran No

See the man pages man <executable> for more information about flags.

  • GCC 5.2.0 is available through the Lmod Application Environment. See below.
  • Currently, the HTC cluster does not support distributed parallel MPI jobs. Only shared-memory parallel jobs are supported.

Instruction sets

The Haswell CPUs support AVX2 instructions. The GCC 5.2.0 compiler supports AVX2 with the -march=core-avx2 flag. You may also need binutils 2.25 to utilize these instructions and to ensure that your executables and libraries are optimized. The login nodes have the same architecture as the compute nodes.
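
For example, a minimal sketch of compiling a small C program with AVX2 optimizations on a login node (hello.c is a placeholder file name; the module names are taken from the module listing below):

$ module load compiler/gcc/5.2.0 binutils/2.25-gcc5.2.0
$ gcc -O2 -march=core-avx2 -o hello hello.c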

Application environment

Lmod is used by cluster administrators to provide optimized builds of commonly used software. Applications are made available to users through the Lmod modular environment commands. No default modules are loaded when you log in.

Installed packages

Use the command module avail to list all installed applications. The architecture for the HTC Cluster is called haswell, which means that codes have been compiled to make the best possible use of the AVX2 instruction set.

[fangping@login0a ~]$ module avail
 
--------------------------------- /ihome/sam/apps/modulefiles/Linux ----------------------------------
   CAP-miRSeq/2015-2                breseq/0.27.0
   CAP3/cap3                        busco/1.1b1
   CREST/1.0                        bwa/0.6.2
   EDirect/6.40                     bwa/0.7.12-gcc5.2.0                   (D)
   EPACTS/3.2.6                     circos/0.68
   FastQC/0.11.4                    clcassemblycell/5.0.3
   FastQC/0.11.5             (D)    cnvkit/0.8.4
   GATK/3.4-46                      compiler/gcc/5.2.0
   GATK/3.6-6-ngs.1.3.0      (D)    compiler/java/1.6.0_45-oracle
   GeneTorrent/3.8.7                compiler/java/1.7.0_79-oracle
   HISAT/0.1.6b                     compiler/java/1.8.0_65-oracle         (D)
   HISAT2/2.0.3-beta                compiler/perl/5.22.0
   HISAT2/2.0.5                     compiler/python/2.7.10-Anaconda-2.3.0 (D)
   HISAT2/2.0.5-64-ngs.1.3.0 (D)    compiler/python/3.4.3-Anaconda-2.3.0
   HMMER/3.1b2-gcc5.2.0             cufflinks/2.2.1
   HTSeq/0.6.1p1                    cutadapt/1.8.3
   IGV/2.3.65                       cutadapt/1.12                         (D)
   MACS/2.1.0.20151222              deeptools/2.2.2
   Meerkat/0.189                    detonate/1.10
   MutSig/1.4                       diamond/0.7.9
   PeakSeq/1.31                     emboss/6.5.7-gcc5.2.0
   PeakSplitter/1.0                 fasttree/2.1.8
   PetaSuite/1.0.4                  fusioncatcher/0.99.7b
   R/3.2.2-gcc5.2.0          (D)    genomics/phoenix
   R/3.3.1-gcc5.2.0                 gmap_gsnap/2015-09-29-gcc5.2.0
   RAPSearch2/2.24                  gnuplot/5.0.4
   RSEM/1.2.23                      gsl/2.2
   RSEM/1.3.0                (D)    hifive/1.2.2
   RStudio/0.98                     homer/4.7.2
   RTG/3.5.2                        jags/4.2.0
   SNAP/1.0beta.18-gcc5.2.0         joe/4.2
   SNVMix2/0.11.8-r5                kallisto/0.43.0
   SNVer/0.5.3                      magic-blast/1.2.0
   SOAPindel/2.1.7.17               merlin/1.1.2
   SRAToolkit/2.8.0                 mira/4.0.2
   SRAToolkit/2.8.2          (D)    miso/0.5.3
   STAR/2.4.2a                      mutect/1.1.4
   STAR/2.5.2a                      paml/4.8
   STAR/2.5.2b               (D)    picard-tools/1.140
   SVDetect/0.8b                    plink/1.90b3x
   StringTie/1.3.1c                 plink/1.90-beta-160315                (D)
   StringTie/1.3.2c          (D)    primer3/2.3.7
   TransDecoder/2.0.1               prinseq/0.20.4
   TrimGalore/0.4.2                 rna-seqc/1.1.8
   Trimmomatic/0.33                 rnaseqmut/0.7
   Trinotate/2.0.2                  sailfish/0.7.6
   VarDictJava/1.4.5                sailfish/0.10.0                       (D)
   VarDictJava/1.4.10        (D)    salmon/0.6.0
   VarScan/2.4.2                    samtools/0.1.19-gcc5.2.0
   XHMM/2016-01-04                  samtools/1.1-gcc5.2.0
   _runtime/gcc-5.2.0               samtools/1.2-gcc5.2.0
   admixmap/3.8.3103                samtools/1.3.1-gcc5.2.0               (D)
   ancestrymap/2.0                  seqtk/1.2-r94
   annovar/2015Jun17                snptest/2.5.2
   bamtools/2.4.0                   solar/8.1.4
   bcbio-nextgen/1.0.2              spark/1.6.1
   bcftools/1.2-gcc5.2.0            strelka/4.10.2
   bedops/2.4.14                    stubl/0.0.9
   bedtools/2.25.0-gcc5.2.0         subread/1.5.0-p2
   binutils/2.25-gcc5.2.0           tensorflow/1.0.0-rc1
   bismark/0.14.5                   tophat/2.1.0
   blast+/2.2.26                    trinity/2.1.1
   blast+/2.2.31             (D)    ucsc/kentUtils/v331
   blast+/2.6.0                     unceqr/0.2
   boost/1.59.0-gcc5.2.0            vcftools/0.1.14-gcc5.2.0
   bowtie/1.1.2-gcc5.2.0            velvet/1.2.10-gcc5.2.0
   bowtie2/2.2.6-gcc5.2.0           zlib/1.2.8-gcc5.2.0
   bowtie2/2.3.0             (D)

Module environment files have been created for each of these packages and can be easily loaded into your shell with module load <packagename>.

In the example below, I have loaded the HISAT2 package into my environment. The executables, such as hisat2 and hisat2-build, are now in my PATH.

[fangping@login0a ~]$ module load HISAT2/2.0.5
[fangping@login0a ~]$ which hisat2
/ihome/sam/apps/HISAT2/hisat2-2.0.5/hisat2
[fangping@login0a ~]$ which hisat2-build
/ihome/sam/apps/HISAT2/hisat2-2.0.5/hisat2-build

You can check which modules are “loaded” in your environment by using the command module list

[fangping@login0a ~]$ module list
 
Currently Loaded Modules:
  1) HISAT2/2.0.5

To unload or remove a module, just use the unload option with the module command, but you have to specify the complete name of the environment module:

[fangping@login0a ~]$ module unload HISAT2/2.0.5
[fangping@login0a ~]$ module list
No modules loaded

Alternatively, you can unload all loaded environment modules using module purge.

  • Several reference genome datasets are available at /mnt/mobydisk/pan/genomics/refs

Slurm Workload Manager

The HTC cluster uses Slurm for batch job queuing. The compute nodes belong to the htc partition, which is the default partition. The sinfo command provides an overview of the state of the nodes within the cluster.

[fangping@login0a ~]$ sinfo
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
htc*         up 6-00:00:00      4    mix n[410,413,417,427]
htc*         up 6-00:00:00     16  alloc n[409,411-412,414-416,418-426,428]

Nodes in the alloc state have jobs running on them. The asterisk next to the htc partition means that it is the default partition for all jobs.

squeue shows the list of running and queued jobs.

The most common states for jobs in squeue are described below. See man squeue for more details.

Abbreviation State Description
CA CANCELLED Job was explicitly cancelled by the user or system administrator. The job may or may not have been initiated.
CD COMPLETED Job has terminated all processes on all nodes.
CG COMPLETING Job is in the process of completing. Some processes on some nodes may still be active.
F FAILED Job terminated with non-zero exit code or other failure condition.
PD PENDING Job is awaiting resource allocation.
R RUNNING Job currently has an allocation.
TO TIMEOUT Job terminated upon reaching its time limit.

See man squeue for a complete description of the possible REASON codes for pending jobs.

To see when all jobs are expected to start run squeue --start.

The scontrol command shows detailed information about a job:

scontrol show job <jobid>
  • Note: not all jobs have a definite start time.

Slurm jobs

The three most important commands in Slurm are sbatch, srun and scancel. sbatch is used to submit a job script to the queue, like the one below called example.sbatch. srun is used to run parallel jobs on compute nodes. Jobs can be canceled with scancel.

#!/bin/bash
#
#SBATCH -N 1 # Ensure that all cores are on one machine
#SBATCH -t 0-01:00 # Runtime in D-HH:MM
 
#SBATCH --cpus-per-task=4 # Request that ncpus be allocated per process.
#SBATCH --mem=10g # Memory pool for all cores (see also --mem-per-cpu)
 
# This job requires 4 CPUs (4 CPUs per task). Allocate 4 CPUs from 1 node in the default partition.
 
# Change to the directory that the script was launched from. This is the default for SLURM.
 
module load HISAT2/2.0.5
 
hisat2-build ./reference/22_20-21M.fa 22_20-21M_hisat
hisat2 -p $SLURM_CPUS_PER_TASK -x 22_20-21M_hisat -U ./reads/reads_1.fq -S eg1.sam
hisat2 -p $SLURM_CPUS_PER_TASK -x 22_20-21M_hisat -1 ./reads/reads_1.fq -2 ./reads/reads_2.fq -S eg2.sam
  • NOTE: requests for walltime extensions will not be granted

This is an example job script to run the hisat examples. To run it, copy the hisat example folder with cp -r /ihome/sam/apps/HISAT/hisat-0.1.6-beta/example .; cd example, and generate a text file named example.sbatch with contents like the one above. The job is submitted with the command sbatch example.sbatch. By default, standard output is redirected to slurm-<jobid>.out.

[fangping@login0a example]$ sbatch example.sbatch
Submitted batch job 389675
[fangping@login0a example]$ head slurm-389675.out
Settings:
  Output files: "22_20-21M_hisat.*.ht2"
  Line rate: 6 (line is 64 bytes)
  Lines per side: 1 (side is 64 bytes)
  Offset rate: 4 (one in 16)
  FTable chars: 10
  Strings: unpacked
  Local offset rate: 3 (one in 8)
  Local fTable chars: 6
  Local sequence length: 57344
  • Note: By default the working directory of your job is the directory from which the batch script was submitted. See below for more information about job environments.

The sbatch arguments here are the minimal subset required to accurately specify a job on the htc cluster. Please refer to man sbatch for more options.

sbatch argument Description
-N --nodes Maximum number of nodes to be used by each Job Step.
--tasks-per-node Specify the number of tasks to be launched per node.
--cpus-per-task Advise the SLURM controller that ensuing job steps will require ncpus number of processors per task.
-e --error File to redirect standard error.
-J --job-name The job name.
-t --time Define the total time required for the job
The format is days-hh:mm:ss.
--qos Declare the Quality of Service to be used.
The default is normal.
--partition Select the partition to submit the job to.
The only and default partition is htc.

The above arguments can be provided in a batch script by preceding them with #SBATCH. Note that the shebang (#!) line must be present. The shebang line can call any shell or scripting language available on the cluster. For example, #!/bin/bash, #!/bin/tcsh, #!/bin/env python or #!/bin/env perl.

srun also takes the --nodes, --tasks-per-node and --cpus-per-task arguments to allow each job step to change the utilized resources but they cannot exceed those given to sbatch.

Slurm is very explicit in how one requests cores and nodes. While extremely powerful, the three flags, --nodes, --ntasks, and --cpus-per-task can be a bit confusing at first.

--ntasks vs --cpus-per-task

The term “task” in this context can be thought of as a “process”. Therefore, a multi-process program (e.g., MPI) is comprised of multiple tasks. In Slurm, tasks are requested with the --ntasks flag. A multi-threaded program is comprised of a single task, which can in turn use multiple CPUs. For multi-threaded programs, CPUs are requested with the --cpus-per-task flag. Individual tasks cannot be split across multiple compute nodes, so requesting a number of CPUs with the --cpus-per-task flag will always result in all your CPUs being allocated on the same compute node.
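
For example, a minimal sketch of requesting resources for a single multi-threaded task (the numbers are arbitrary):

#SBATCH --nodes=1          # a single task cannot span nodes
#SBATCH --ntasks=1         # one multi-threaded program = one task
#SBATCH --cpus-per-task=4  # that task uses 4 CPUs on the same node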

Example batch scripts and NGS data analysis pipelines

Scripts to perform RNASeq data analysis using HISAT2 + Stringtie are available under /ihome/sam/fangping/example/RNASeq_HISAT2_Stringtie. You can follow the readme file to go through the steps.

Examples of NGS data analysis pipelines are available at NGS Data Analysis. If you need personalized consultation for NGS data analysis workflow and selection of better pipelines, please contact me (fangping@pitt.edu).

Submitting multiple jobs to the HTC cluster

Examples of submitting multiple jobs to the HTC cluster are available.
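
One common approach is a Slurm job array. Below is a minimal sketch that reuses the HISAT2 example above and assumes hypothetical input files reads/sample_1.fq through reads/sample_10.fq:

#!/bin/bash
#SBATCH -N 1
#SBATCH -t 0-01:00
#SBATCH --cpus-per-task=1
#SBATCH --array=1-10   # launch 10 array tasks, indexed 1 through 10

module load HISAT2/2.0.5

# Each array task aligns one input file, selected by $SLURM_ARRAY_TASK_ID
hisat2 -p $SLURM_CPUS_PER_TASK -x 22_20-21M_hisat \
  -U ./reads/sample_${SLURM_ARRAY_TASK_ID}.fq \
  -S sample_${SLURM_ARRAY_TASK_ID}.sam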

PBS to Slurm commands

PBS Torque and Slurm scripts are two frameworks for specifying the resource requirements and settings of the job you want to run. The Frank cluster used PBS Torque for specifying resource requirements. For the most part, there are equivalent settings in each framework. The following table lists examples of equivalent options for PBS and Slurm job scripts.

Command PBS/Torque Slurm
Job submission qsub -q <queue> -l nodes=1:ppn=16 -l mem=64g <job script> sbatch -p <queue> -N 1 -c 16 --mem=64g <job script>
Job submission qsub <job script> sbatch <job script>
Node count -l nodes=<count> -N <min[-max]>
Cores per node -l ppn=<count> -c <count>
Memory size -l mem=16384 --mem=16g
Wall clock limit -l walltime=<hh:mm:ss> -t <days-hh:mm:ss>
Job name -N <name> --job-name=<name>

A complete comparison of PBS Torque and SLURM script commands is available here.
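
As a minimal sketch based on the table above, a Frank-era PBS header and its Slurm equivalent might look like this (the job name and resource numbers are arbitrary):

# PBS/Torque header (Frank)
#PBS -N myjob
#PBS -l nodes=1:ppn=16
#PBS -l mem=64g
#PBS -l walltime=24:00:00

# Equivalent Slurm header (HTC)
#SBATCH --job-name=myjob
#SBATCH -N 1
#SBATCH -c 16
#SBATCH --mem=64g
#SBATCH -t 1-00:00:00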

Interactive jobs

To submit an interactive job, use the srun command instead of sbatch. This command:

srun -n1 -t02:00:00 --pty bash

will start an interactive job. When the interactive job starts, you will notice that you are no longer on a login node, but rather one of the compute nodes.

[fangping@login0a ~]$ srun -n1 -t02:00:00 --pty bash
[fangping@n409 ~]$

This will give you 1 core for 2 hours.

Interactive jobs with x11 forwarding

If you would like to run applications that have a GUI and therefore require X11, you must forward an authenticated X11 session from the login node to your interactive session on a compute node. Follow these steps:

Login from Linux or a Mac terminal:

ssh -X htc.crc.pitt.edu

Then initiate an interactive session with --x11 options.

srun -n1 -t02:00:00 --x11=first --pty bash

This will initiate an X11 tunnel to the first node on your list. --x11 has additional options of batch, first, last, and all.

Once in your interactive session, you can launch software that has a GUI from the command line. For example, you can run RStudio as follows:

[fangping@login0a ~]$ srun -n1 -t02:00:00 --x11=first --pty bash
[fangping@n409 ~]$ module load RStudio/0.98
[fangping@n409 ~]$ module load R/3.2.2-gcc5.2.0
[fangping@n409 ~]$ rstudio

Quality of Service

All jobs submitted to Slurm must be assigned a Quality of Service (QoS). QoS levels define resource limitations. The default QoS is normal.

Quality of Service Max Walltime Priority factor
short 12:00:00 1.0
normal 3-00:00:00 0.75
long 6-00:00:00 0.5
  • Walltime is specified in days-hh:mm:ss

If your job does not meet these requirements, it will not be accepted.
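
For example, a minimal sketch of submitting the earlier example.sbatch script to the short QoS with a 10-hour walltime:

$ sbatch --qos=short -t 10:00:00 example.sbatch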

Job priorities

Jobs on the htc cluster are executed in order of priority. The priority function has four components: Age, FairShare, QoS and JobSize. Each component has a value between 0 and 1, and each is weighted separately in the total job priority. Only the Age factor increases as the job waits.

  • NOTE: The priority weights are intended to favor jobs that use more nodes for shorter wall times.
Priority factor Description Weight
Age Total time queued. Factor reaches 1 at 14 days. 2000
QoS Priority factor from QoS levels above. 2000
JobSize Factor approaches 1 as more nodes are requested. 4000
FairShare FairShare factor described below. 2000
  • The maximum priority value is 10000 for any job.

Backfill

Even though jobs are expected to run in order of decreasing priority, backfill allows jobs with lower priority to fit in the gaps. A job will be allowed to run through backfill if its execution does not delay the start of higher-priority jobs. To use backfill effectively, users are encouraged to submit jobs with as short a walltime as possible.

Fairshare policies

FairShare has been enabled, which adjusts job priorities based on historical usage of the cluster. The FairShare priority factor is explained on the Slurm website.

To see the current FairShare priority factor run sshare. Several options are available, please refer to man sshare for more details.

The FairShare factors for all users are listed with sshare -a.

On the HTC cluster, all users are given equal shares. We may change this policy based on usage of the HTC cluster.

Local Scratch directory

Each node in the HTC Cluster has a single scratch disk for temporary data generated by the job. Local scratch directories are created on each node in the following location at the start of a job or allocation.

/scratch/slurm-$SLURM_JOB_ID

The $SLURM_SCRATCH environment variable is then set in the job's environment to the above scratch directory.

  • The $SLURM_SCRATCH directories are removed from each node at the completion of the job

To copy files to the $SLURM_SCRATCH scratch disk on the master compute node, just use cp or rsync. Remember, the initial working directory for the job is the directory from which the job was submitted. To have srun run a job step from the $SLURM_SCRATCH directory, add --chdir, as in the sketch below.
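
As a minimal sketch, the batch script below reuses the HISAT2 example above and assumes the index files 22_20-21M_hisat.*.ht2 were built in the submission directory:

#!/bin/bash
#SBATCH -N 1
#SBATCH -t 0-01:00
#SBATCH --cpus-per-task=4

module load HISAT2/2.0.5

# Copy the input reads to the node-local scratch disk
cp -r ./reads $SLURM_SCRATCH/

# Run the alignment with $SLURM_SCRATCH as the working directory;
# the index is referenced via the submission directory
srun --chdir=$SLURM_SCRATCH hisat2 -p $SLURM_CPUS_PER_TASK \
  -x $SLURM_SUBMIT_DIR/22_20-21M_hisat \
  -U reads/reads_1.fq -S eg1.sam

# Copy results back before the job ends; $SLURM_SCRATCH is removed at job completion
cp $SLURM_SCRATCH/eg1.sam $SLURM_SUBMIT_DIR/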