CRCD Upgrading OS from RHEL7 to RHEL9

Since CRCD began over seven years ago, we have used Red Hat Enterprise Linux (RHEL) 7 as the base operating system (OS) for the login and compute nodes. RHEL 7 no longer meets our needs. Among other issues, the regular security updates it needs now come at additional cost, and the latest software used by Pitt researchers requires newer libraries.

We will be upgrading to RHEL 9. This upgrade will be disruptive: the software stack is being rebuilt on the new OS, and existing Slurm jobs will need to be resubmitted with adjusted submission scripts. To mitigate the impact, a staging environment has been created where CRCD team members and users can test job submission scripts on the new OS.

RHEL9 Staging Environment Specifications

Cluster  Partition  Nodes  Architecture    Compute/Node  Memory
rh9      cpu        1      AMD EPYC 9575F  128 Cores     ~1.5 TB
rh9      mpi        1      AMD EPYC 9575F  128 Cores     ~1.5 TB
rh9      gpu        2      L40s            4 GPUs        ~48 GB per GPU

Access Methods

Access Method 1: Command-line ssh access

  • The access point for using the rh9 cluster is the following host: login2.crc.pitt.edu
  • This login node can be accessed in two steps:
1. ssh into H2P or HTC as usual:

user@my-local-pc:~$ ssh user@h2p.crc.pitt.edu
user@h2p.crc.pitt.edu's password: 
...

2. ssh from there into login2:

[user@login1 ~] : ssh login2.crc.pitt.edu
...

Access Method 2: Open OnDemand

  • Portal URL: seed.crc.pitt.edu.
  • As usual, you must first connect to PittNet via the GlobalProtect VPN to access it.
  • Some functionality is still being updated; please report anything that is missing.

Checking Software Availability

Try searching for relevant modules by name from the command line first, outside the context of a job submission.

For a broad overview of available packages, use module avail

[user@login2 ~] : module avail

---------------------------------------------------- /software/rhel9/spack/modules/Core ----------------------------------------------------
   amdlibflame/5.0-qnlh6c              gatk/4.1.2.0-ryihs7                           openslide/3.4.1-lq34bt
   anaconda3/2021.11-fqibig            gatk/4.1.4.0-dwdsjt                           p7zip/17.05-nn5izq
   anaconda3/2022.10-x3avg2            gatk/4.1.4.0-t6lr77                           packmol/20.0.0-oujfey
   anaconda3/2023.09-0-amgrwv   (D)    gatk/4.2.2.0-m57fnu                           paml/4.10.3-bwkrsj
   ants/2.4.3-bor5ky                   gatk/4.4.0.0-m7dcto                           pandoc/2.19.2-do5ryv
   apptainer/1.1.9-zjyglv              gatk/4.5.0.0-sl3qs3                    (D)    pangolin/master-hivha4
   aspera-cli/3.7.7-jkl6wp             gblocks/0.91b-c5xybu                          parallel/20220522-wbuewg
   augustus/3.5.0-k2zdam               gcc/7.5.0-xp7vam                              pcre/8.40-2v6t57

...

You can find the available versions of a given package with module spider package_name

[user@login2 ~] : module spider python

----------------------------------------------------------------------------------------------------------------------------------------
  python:
----------------------------------------------------------------------------------------------------------------------------------------
     Versions:
        python/ondemand-jupyter-python3.13.5
        python/pytorch_251_311_cu124
        python/tensorflow_218_311
        python/3.7.0-52qxuq
        python/3.7.17-d53vam
        python/3.7.17-4pgmy5
        python/3.8.18-5jdolp
        python/3.9.18-oe3og4
        python/3.10.13-riplp2
        python/3.11.6-wnbaq7
        python/3.11.9-k3zy4r
        python/3.12.0-ig3l6e
        python/3.12.8-ydargp

...

Attempt to load the package into your environment with module load

[user@login2 ~] : module load python/3.12.8
[user@login2 ~] : module list

Currently Loaded Modules:
  1) python/3.12.8-ydargp

You may find newer versions of the packages your workflow uses. This is a good time to update your workflow to use them, or at least to check whether it works as is with a newer version.

If a module does not have your required version, fails to load, or is missing outright:

Submit a Support Ticket with the exact module name/version you need and any error messages encountered. We will work with you to install missing modules and dependencies, prioritizing reported issues.
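If your workflow loads several modules, the checks above can be batched. The following is a minimal sketch, not an official CRCD tool: `check_modules` is a hypothetical helper name, and it assumes the `module` shell function is already defined in your session (as it is at the login2 prompt) and that the installed Lmod provides the `is-avail` subcommand.

```shell
# Report which of the given modules can be loaded in the current environment.
# check_modules is a hypothetical helper, not part of Lmod.
check_modules() {
    missing=0
    for mod in "$@"; do
        # `module is-avail` returns a true status when the module can be loaded.
        if module is-avail "$mod" 2>/dev/null; then
            echo "ok: $mod"
        else
            echo "missing: $mod"
            missing=$((missing + 1))
        fi
    done
    # Return a non-zero status if any module was missing.
    [ "$missing" -eq 0 ]
}
```

For example, `check_modules python/3.12.8-ydargp gcc/7.5.0-xp7vam` prints one line per module, and the `missing:` lines are the ones worth including in a support ticket.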

Updating your Job Submission Script

To adapt a job submission script to work on the rh9 cluster, you will need to adjust the following items:

  • Cluster specification: #SBATCH --cluster=rh9
  • Partition specification: #SBATCH --partition=cpu (or gpu, mpi)
  • Module commands: You should only attempt to load module names/versions that you have verified exist in the rh9 environment.
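The first two adjustments listed above can be applied mechanically. The sketch below is one possible approach and not an official CRCD tool: `retarget_to_rh9` is a hypothetical helper name, and it assumes your existing script uses the long `--cluster=` (or `--clusters=`) and `--partition=` directive forms. It does not touch short options such as `-M` or `-p`, and it leaves `module` lines unchanged, so those still need to be checked by hand.

```shell
# Print a copy of a Slurm script with its cluster and partition directives
# pointed at the rh9 staging environment.
# retarget_to_rh9 is a hypothetical helper; only the long "=" forms are handled.
# Usage: retarget_to_rh9 old_job.slurm gpu > new_job.slurm
retarget_to_rh9() {
    script=$1
    partition=${2:-cpu}   # one of: cpu, gpu, mpi (defaults to cpu)
    sed -E \
        -e 's/^(#SBATCH[[:space:]]+--clusters?=).*/\1rh9/' \
        -e "s/^(#SBATCH[[:space:]]+--partition=).*/\\1${partition}/" \
        "$script"
}
```

The function writes the result to standard output instead of editing in place, so the original script is kept for comparison until the new one has been tested.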

Frequently Asked Questions

Q1: When will the remaining hardware from the RHEL 7 environment be brought over to RHEL 9?
We intend to transition the rest of the hardware during the quarterly cluster maintenance on Tuesday, August 19th.

Q2: I compiled my own software under the RHEL 7 environment; will I need to recompile it under RHEL 9?
Yes. The software stack we've made available has mostly been reinstalled under RHEL 9, so it is best to assume your software will need to be recompiled as well.

Q3: Do I need to move data into this cluster context for testing?
No, the usual filesystems (iX, ihome, etc.) are mounted and available to the RHEL 9 login and compute nodes.

Q4: What is the string of characters in the module names after the name and version?
Many of the packages were reinstalled using an HPC package management tool called Spack. Spack automatically generates modulefiles compatible with Lmod, and appends characters from each installation's unique hash to prevent naming conflicts between module files.
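To illustrate the naming scheme from Q4, the sketch below splits a module name into its package name, version, and hash suffix using shell parameter expansion. `split_module_name` is a hypothetical helper, and the rule it applies is a heuristic inferred from the listings on this page: a suffix counts as a hash only if it is six lowercase alphanumeric characters after the last hyphen. Names without such a suffix, such as python/tensorflow_218_311, are reported with "-" in place of the hash.

```shell
# Split a module name of the form name/version-hash into its three parts.
# Assumption: the hash suffix is six lowercase alphanumeric characters.
split_module_name() {
    full=$1
    name=${full%%/*}      # text before the first "/"
    rest=${full#*/}       # text after the first "/"
    case $rest in
        *-[a-z0-9][a-z0-9][a-z0-9][a-z0-9][a-z0-9][a-z0-9])
            version=${rest%-*}    # drop the final "-hash"
            hash=${rest##*-}      # keep only the text after the last "-"
            ;;
        *)
            version=$rest         # no recognizable hash suffix
            hash=-
            ;;
    esac
    printf '%s %s %s\n' "$name" "$version" "$hash"
}
```

For example, `split_module_name anaconda3/2023.09-0-amgrwv` prints `anaconda3 2023.09-0 amgrwv`.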