Cluster Hardware Overview

The Pitt Center for Research Computing provides different types of hardware for various advanced computing needs.

The CRC organizes its hardware into "clusters". A cluster is a set of computers built around a specialized High Performance Computing (HPC) capability, such as the parallelism of graphics processing units or a message-passing architecture. The HPC clusters available to CRC users are:

  • Message Passing Interface (MPI) for highly parallel computing across many machines.
  • High Throughput Computing (HTC) for processing data in large quantities or for long periods of time.
  • Shared Memory Processing (SMP) for efficient exchange and access to data with a common memory space.
  • Graphics Processing Units (GPU) for accelerated computing with GPU applications.
  • Visualization and Interactive Desktop (VIZ) for projects requiring a graphical user interface.

These clusters can be further split into "partitions" of machines with similar hardware specifications (processors, memory, etc.). The individual machines that make up the clusters and that users can submit their jobs to are called "compute nodes".
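
For example, the scheduler can display how these clusters break down into partitions and compute nodes. The sketch below uses Slurm's sinfo command; the cluster names passed to --clusters are assumed to match the cluster headings in this document:

    # List partitions and node counts on each cluster
    sinfo --clusters=mpi,htc,smp,gpu

    # Show per-node CPU count, memory, and advertised Features for one partition
    sinfo --clusters=gpu --partition=a100 --Node --format="%N %c %m %f"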

Below, you will find the hardware specifications for each cluster and the partitions that compose it.

       

MPI Cluster

The MPI nodes are intended for tightly coupled codes that are parallelized with the Message Passing Interface (MPI) and benefit from low-latency communication over an InfiniBand or Omni-Path network.

 

Partition: mpi

This partition supports multi-node MPI jobs across a 200 Gb/sec InfiniBand network. Your job must request a minimum of 2 nodes; a minimal job script is sketched after the hardware list.

  • 136 nodes with dual socket Intel Xeon Gold 6342 CPU (Ice Lake, 24C, 2.80GHz base, up to 3.50GHz max boost)
    • 48 cores/node
    • 512GB RAM/node (10.6GB/core)
    • 480GB NVMe for OS and 1.6TB NVMe for local scratch
    • HDR200 InfiniBand (200 Gb/sec)
    • 10/25 GbE
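
A minimal job script for this partition might look like the following sketch. The cluster and partition names come from this page, while the module names, task counts, and executable are placeholder assumptions to adapt to your own workflow:

    #!/bin/bash
    #SBATCH --clusters=mpi          # MPI cluster
    #SBATCH --partition=mpi         # 48-core Ice Lake nodes, HDR200 InfiniBand
    #SBATCH --nodes=2               # this partition requires at least 2 nodes
    #SBATCH --ntasks-per-node=48    # one MPI rank per core
    #SBATCH --time=01:00:00

    # Load an MPI toolchain (module names are site-specific assumptions)
    module load gcc openmpi

    # Launch the MPI ranks across all allocated nodes
    srun ./my_mpi_program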

 

Partition: opa-high-mem

This default partition supports multi-node MPI jobs across a 100 Gb/sec Intel Omni-Path network. Your job must request a minimum of 2 nodes.

  • 36 nodes with dual socket Intel Xeon Gold 6132 CPU (Skylake, 14C, 2.60GHz base, up to 3.70GHz max boost)
    • 28 cores/node
    • 192GB RAM/node (6.8GB/core)
    • 256GB SSD for OS and 500GB SSD for local scratch
    • Omni-Path (100 Gb/sec)

 

HTC Cluster

These nodes are designed for High Throughput Computing workflows such as gene sequence analysis, neuroimaging data processing, and other data-intensive analytics.

 

Partition: htc

  • 18 nodes with dual socket Intel Xeon Platinum 8352Y CPU (Ice Lake, 32C, 2.20GHz base, up to 3.40GHz Max Turbo)
    • 64 cores/node
    • 512GB RAM/node (8GB/core)
    • 2TB NVMe drive for local scratch
    • 10 GbE
  • 4 nodes with dual socket Intel Xeon Platinum 8352Y CPU (Ice Lake, 32C, 2.20GHz base, up to 3.40GHz Max Turbo)
    • 64 cores/node
    • 1024GB RAM/node (16GB/core)
    • 2TB NVMe drive for local scratch
    • 10 GbE
  • 8 nodes with dual socket Intel Xeon Gold 6248R (Cascade Lake Refresh, 24C, 3.00GHz base, up to 4.00GHz Max Turbo)
    • 48 cores/node
    • 768GB RAM/node (16 GB/core)
    • 480GB SSD for OS and 960 GB SSD for local scratch
    • 10 GbE

 

SMP Cluster

Nodes that allow for shared memory processing. SMP nodes are appropriate for programs that are parallelized using a shared-memory framework such as OpenMP. They are also a good fit for users who want to move up to a supercomputer without changing the programming style they use on a laptop or workstation, for example to run MATLAB on more cores and memory.

 

Partition: smp

  • 58 nodes with dual socket AMD EPYC 7302 CPU (Rome, 16C, 3.0GHz base, up to 3.3GHz Max Boost)
    • 32 cores/node
    • 256GB RAM/node (8GB/core)
    • 256GB SSD for OS and 1TB SSD for local scratch
    • 10 GbE
  • 132 nodes with dual socket Intel Xeon Gold 6126 CPU (Skylake, 12C, 2.60GHz base, up to 3.70GHz Max Turbo)
    • 24 cores/node
    • 192GB RAM/node (8GB/core)
    • 256GB SSD for OS and 500GB SSD for local scratch
    • 10 GbE

 

Partition: high-mem

  • 8 nodes with dual socket Intel Xeon Platinum 8352Y CPU (Ice Lake, 32C, 2.20GHz base, up to 3.40GHz Max Turbo) 
    • 64 cores/node
    • 1024GB RAM/node (16GB/core)
    • 480GB SSD for OS and 10TB NVMe for local scratch
    • 10/25 GbE
  • 2 nodes with dual socket Intel Xeon Platinum 8352Y CPU (Ice Lake, 32C, 2.20GHz base, up to 3.40GHz Max Turbo) 
    • 64 cores/node
    • 2048GB RAM/node (32GB/core)
    • 480GB SSD for OS and 10TB NVMe for local scratch
    • 10/25 GbE
  • 1 node with dual socket AMD EPYC 7351 (Naples, 16C, 2.4GHz base, up to 2.9GHz Max Boost)
    • 32 cores/node
    • 1024GB RAM/node (32GB/core)
    • 256GB SSD for OS and 1TB NVMe for local scratch
    • 10 GbE
  • 4 nodes with quad socket Intel Xeon E7-8870v4 CPU (Broadwell, 20C, 2.10GHz base, up to 3.00GHz Max Turbo) 
    • 80 cores/node
    • 3072GB RAM/node (38GB/core)
    • 256GB SSD for OS and 5TB SSD for local scratch
    • 10 GbE
  • 24 nodes with dual socket Intel Xeon E5-2643v4 CPU (Broadwell, 6C, 3.40GHz base, up to 3.70GHz Max Turbo) 
    • 12 cores/node
    • 256GB RAM/node (21GB/core)
    • 256GB SSD for OS and 1TB SSD for local scratch
    • 10 GbE
  • 2 nodes with dual socket Intel Xeon E5-2643v4 CPU (Broadwell, 6C, 3.40GHz base, up to 3.70GHz Max Turbo)  
    • 12 cores/node
    • 256GB RAM/node (21GB/core)
    • 256GB SSD for OS and 3TB SSD for local scratch
    • 10 GbE
  • 2 nodes with dual socket Intel Xeon E5-2643v4 CPU (Broadwell, 6C, 3.40GHz base, up to 3.70GHz Max Turbo) 
    • 12 cores/node
    • 512GB RAM/node (42GB/core)
    • 256GB SSD for OS and 3TB SSD for local scratch
    • 10 GbE
  • 1 node with dual socket Intel Xeon E5-2643v4 CPU (Broadwell, 6C, 3.40GHz base, up to 3.70GHz Max Turbo) 
    • 12 cores/node
    • 256GB RAM/node (21GB/core)
    • 256GB SSD for OS and 6TB NVMe for local scratch
    • 10 GbE
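
A sketch of a single-node, large-memory job on this partition is shown below. It assumes the partition is scheduled under the smp cluster (as the section grouping suggests); the memory request, core count, and program name are placeholders:

    #!/bin/bash
    #SBATCH --clusters=smp
    #SBATCH --partition=high-mem
    #SBATCH --nodes=1
    #SBATCH --ntasks-per-node=12
    #SBATCH --mem=240G              # per-node memory; must fit on the chosen node type
    #SBATCH --time=04:00:00

    # Run a serial or threaded analysis that needs a large memory footprint
    ./my_large_memory_analysis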

 

GPU Cluster

Nodes that enable accelerated computing using Graphics Processing Units (GPUs). The GPU nodes are targeted at applications specifically written to take advantage of the inherent parallelism of general-purpose GPU architectures. For small problems, any of the GPUs below will suffice.

 

Partition: a100

This is the default partition in the gpu cluster and consists of the hardware below; a complete example job script is sketched after the hardware list. To request a particular Feature (such as an Intel host CPU), add the following directive to your job script: #SBATCH --constraint=intel

  • 10 nodes with a single socket 2nd-Gen AMD EPYC 7742 host CPU (Rome, 64C, 2.25GHz base, up to 3.4GHz max boost)
    • 64 cores/node
    • 512GB RAM/node (8GB/core)
    • 4 NVIDIA A100 40GB PCIe
    • Max of 16 CPUs per GPU
    • Feature=amd,40g
  • 2 nodes with dual socket Intel Xeon Gold 5220R (Cascade Lake Refresh, 24C, 2.20GHz base, up to 4.00GHz max boost)
    • 48 cores/node
    • 384GB RAM/node (8GB/core)
    • 4 NVIDIA A100 40GB PCIe
    • Max of 12 CPUs per GPU
    • Feature=intel,40g
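
Putting the constraint together with a GPU request, a job script for this partition might look like the sketch below; the GPU count, CPU count, module name, and executable are placeholder assumptions:

    #!/bin/bash
    #SBATCH --clusters=gpu
    #SBATCH --partition=a100
    #SBATCH --gres=gpu:1            # one A100 in this sketch
    #SBATCH --cpus-per-task=12      # within the 12-CPUs-per-GPU limit of the Intel hosts
    #SBATCH --constraint=intel      # request an Intel host CPU (Feature=intel,40g)
    #SBATCH --time=02:00:00

    # Load a CUDA toolkit (module name is a site-specific assumption)
    module load cuda

    ./my_gpu_application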

 

Partition: a100_multi

This partition supports multi-node GPU workflows. Your job must request a minimum of 2 nodes and 8 GPUs in total; a sketch follows the hardware list.

  • 10 nodes with a single socket 2nd-Gen AMD EPYC 7742 host CPU (Rome, 64C, 2.25GHz base, up to 3.4GHz max boost)
    • 64 cores/node
    • 512GB RAM/node (8GB/core)
    • 4 NVIDIA A100 40GB PCIe
    • Max of 16 CPUs per GPU
    • Feature=amd,40g
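
A sketch that satisfies the 2-node, 8-GPU minimum is shown below; the task layout and executable are placeholder assumptions:

    #!/bin/bash
    #SBATCH --clusters=gpu
    #SBATCH --partition=a100_multi
    #SBATCH --nodes=2               # partition minimum: 2 nodes
    #SBATCH --gpus-per-node=4       # 4 GPUs per node, 8 GPUs total (the partition minimum)
    #SBATCH --ntasks-per-node=4     # e.g., one task per GPU
    #SBATCH --time=08:00:00

    # Launch one task per GPU across both nodes
    srun ./my_multi_node_gpu_program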

 

Partition: a100_nvlink

This partition supports multi-GPU computation through 8-way A100s that are tightly coupled through an NVLink switch. The details of our NVIDIA HGX platform are described below. To request a particular Feature (such as an A100 with 80GB of GPU memory), add the following directive to your job script: #SBATCH --constraint=80g

  • 2 nodes with dual socket 2nd-Gen AMD EPYC 7742 host CPU (Rome, 64C, 2.25GHz base, up to 3.4GHz max boost)
    • 128 cores/node
    • 1024GB RAM/node (8GB/core)
    • 8 NVIDIA A100 80GB SXM
    • Max of 16 CPUs per GPU
    • Feature=amd,80g
  • 3 nodes with dual socket 2nd-Gen AMD EPYC 7742 host CPU (Rome, 64C, 2.25GHz base, up to 3.4GHz max boost)
    • 128 cores/node
    • 1024GB RAM/node (8GB/core)
    • 8 NVIDIA A100 40GB SXM
    • Max of 16 CPUs per GPU
    • Feature=amd,40g

 

Partition: gtx1080

9 nodes with dual socket Intel Xeon Silver 4112 (Skylake, 4C, 2.60GHz base, up to 3.00GHz max boost)

  • 8 cores/node
  • 96GB RAM/node
  • 4 NVIDIA GeForce GTX 1080 Ti 11GB
  • Max of 2 CPUs per GPU

 

Partition: v100

A single node with dual socket Intel Xeon Gold 6126 (Skylake, 12C, 2.60GHz base, up to 3.70GHz max boost)

  • 24 cores/node
  • 192GB RAM/node
  • 4 NVIDIA V100 32GB PCIe
  • Max of 6 CPUs per GPU

 

Partition: power9

4 nodes of IBM Power System AC922: dual-socket Power9 (16C, 2.7GHz base, up to 3.3GHz turbo). Code must be compiled for the Power9 (ppc64le) platform in order to run.

  • 128 threads/node (32 cores x 4 hardware threads)
  • 512GB RAM/node (16GB/core, 4GB/thread)
  • 4 NVIDIA V100 32GB SXM with NVLink
  • Max of 16 CPUs per GPU

 

VIZ Nodes

Nodes equipped with a graphical user interface (GUI), intended for visualization projects and other work that requires an interactive desktop.

dual 14-core Broadwell CPU (Intel Xeon E5-2680 v4 2.4 GHz)

  • 1 node
  • 28 cores/node
  • 256 GB RAM/node
  • 240 GB for OS and 1.6 TB SSD for local scratch
  • 2 NVIDIA GeForce GTX 1080 graphics cards
  • 10 GbE

dual 12-core Cascade Lake CPU (Intel Xeon Gold 6226, 2.7 GHz)

  • 1 node
  • 24 cores/node
  • 192 GB RAM/node
  • 240 GB for OS and 1.9 TB SSD for local scratch
  • 2 NVIDIA GeForce RTX 2080 Ti graphics cards
  • 10 GbE