Computational Facilities (Supporting Document for Writing Grant Proposal)

Abbreviated version

The Center for Research Computing (CRC) provides researchers at the University with access to shared computing hardware, software, and personalized project consultation. CRC offers different types of hardware for different advanced computing needs, including several clusters serving Message Passing Interface (MPI) jobs, symmetric multiprocessing (SMP) on a single node, high-throughput computing (HTC) workflows, and non-traditional computation using GPUs and emerging manycore CPUs.

Details of each kind of hardware are as follows: 104 nodes of 24-core Xeon Gold 6126 2.60 GHz (Skylake), 29 nodes of 12-core Xeon E5-2643v4 3.40 GHz (Broadwell), 20 nodes of 16-core Intel Xeon E5-2630v3 2.4 GHz (Haswell-EP), 96 nodes of 28-core Intel Xeon E5-2690 2.60 GHz (Broadwell), 32 nodes of 20-core Intel Xeon E5-2660 v3 2.6 GHz (Haswell), and 24 nodes of non-traditional architecture equipped with a variety of high-end GPUs and Intel Knights Landing manycore CPUs. The SMP nodes provide up to 512 GB of shared memory per node.

The systems are housed at an enterprise-level data center and are administered jointly with central IT, Computing Services and Systems Development (CSSD). CSSD maintains the critical environmental infrastructure (power, cooling, networking) and administers the cluster operating systems and storage backups. Connectivity between the data center and main campus is via two 100 Gbps fibers, and to Internet2 via 100 Gbps. The global storage comprises a 130 TB Isilon home space (which is backed up), a 450 TB Lustre parallel filesystem, and a 1 PB ZFS filesystem for archival storage.

 

Detailed version

Access to computing hardware, software, and research consulting is provided through the Center for Research Computing (CRC) (www.crc.pitt.edu). CRC provides in-house high performance computing (HPC) resources allocated for shared use by campus researchers.

CRC provides different types of hardware for different advanced computing needs. First, we describe the characteristics of each compute cluster. Details of each kind of hardware are listed further below.

  • SMP nodes are appropriate for programs parallelized using a shared-memory framework. They also suit researchers who want to move up to the cluster while keeping the programming style of their laptops, e.g., running MATLAB. These nodes provide up to 512 GB of shared memory per node (see the shared-memory sketch after this list).
  • HTC nodes are designed for high-throughput computing workflows such as sequence analysis and some data-intensive analytics.
  • MPI nodes are for tightly coupled codes that are parallelized using the Message Passing Interface (MPI) and benefit from the low-latency Omni-Path (OP) or InfiniBand (IB) interconnect fabrics (see the MPI sketch after this list).
  • NTA (non-traditional architecture) nodes are for applications written to take advantage of architectures such as NVIDIA GPUs and Intel Knights Landing manycore CPUs.
  • All of the above styles are also supported on the Legacy cluster, which consists of older, heterogeneous architectures.
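To illustrate the shared-memory style that the SMP nodes target, the following is a minimal OpenMP sketch in C. It is illustrative only, not CRC-provided code, and the build command in the comment assumes a typical GCC toolchain rather than a documented CRC build recipe.

    /* Minimal OpenMP example (illustrative sketch, not CRC-provided code).
       A parallel sum of the kind that maps well onto a single SMP node,
       where all threads share one node's memory.
       Assumed build command: gcc -fopenmp -O2 smp_sum.c -o smp_sum */
    #include <omp.h>
    #include <stdio.h>

    int main(void) {
        const long n = 100000000L;   /* number of terms */
        double sum = 0.0;

        /* Each thread accumulates a private partial sum; the reduction
           clause combines the partial sums at the end of the loop. */
        #pragma omp parallel for reduction(+:sum)
        for (long i = 1; i <= n; i++) {
            sum += 1.0 / (double)i;
        }

        printf("threads available: %d, harmonic sum: %f\n",
               omp_get_max_threads(), sum);
        return 0;
    }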
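For the MPI nodes, a minimal message-passing sketch in C is shown below. Again, this is only an illustration of the tightly coupled style that benefits from the OP/IB fabrics; the mpicc and mpirun commands in the comment are assumptions about a generic MPI toolchain, and the exact compiler wrappers or launch commands on the CRC clusters may differ.

    /* Minimal MPI example (illustrative sketch, not CRC-provided code).
       Each rank computes a partial sum over a strided range and the
       partial results are combined on rank 0 with MPI_Reduce.
       Assumed build/run commands: mpicc mpi_sum.c -o mpi_sum
                                   mpirun -np 56 ./mpi_sum */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv) {
        MPI_Init(&argc, &argv);

        int rank, size;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        /* Stride the 1..n range across the ranks. */
        const long n = 100000000L;
        double local = 0.0;
        for (long i = rank + 1; i <= n; i += size)
            local += 1.0 / (double)i;

        /* Combine the per-rank partial sums on rank 0. */
        double total = 0.0;
        MPI_Reduce(&local, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

        if (rank == 0)
            printf("ranks: %d, harmonic sum: %f\n", size, total);

        MPI_Finalize();
        return 0;
    }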

Detailed hardware specification


SMP-Standard
100 nodes of 24-core Xeon Gold 6126 2.60 GHz (Skylake)
192 GB RAM
256 GB SSD & 500 GB SSD
10GigE

SMP-Specialty
24 nodes of 12-core Xeon E5-2643v4 3.40 GHz (Broadwell)
256 GB RAM
256 GB SSD & 1 TB SSD
10GigE

2 nodes of 12-core Xeon E5-2643v4 3.40 GHz (Broadwell)
256 GB RAM
256 GB SSD & 3 TB SSD
10GigE

2 nodes of 12-core Xeon E5-2643v4 3.40 GHz (Broadwell)
512 GB RAM
256 GB SSD & 3 TB SSD
10GigE

1 node of 12-core Xeon E5-2643v4 3.40 GHz (Broadwell)
256 GB RAM
256 GB SSD & 6 TB NVMe
10 GigE

HTC
4 nodes of 24-core Xeon Gold 6126 2.60 GHz (Skylake)
384 GB RAM
256 GB SSD & 500 GB SSD
FDR InfiniBand

20 nodes of 16-core Intel Xeon E5-2630v3 2.4 GHz (Haswell-EP)
256 GB RAM
256 GB SSD
FDR InfiniBand

MPI-OP
96 nodes of 28-core Intel Xeon E5-2690 2.60 GHz (Broadwell)
64 GB RAM/node
256 GB SSD
100 Gb Omni-Path

MPI-IB
32 nodes of 20-core Intel Xeon E5-2660 v3 2.6 GHz (Haswell)
128 GB RAM/node
256 GB SSD
FDR InfiniBand

NTA
7 nodes with 4 NVIDIA Titan X GPGPUs/node
8 nodes with 4 NVIDIA GeForce GTX 1080 GPGPUs/node
1 node with 2 NVIDIA K40 GPGPUs
8 nodes of Intel KNL

Legacy
20 nodes with 16-core Intel Ivy Bridge (E5-2650v2) 2.6 GHz, 64 GB of memory, 1 TB HDD, and FDR IB
24 nodes with 16-core Intel Sandy Bridge (E5-2650) 2.6 GHz, 128 GB of memory, 1 TB HDD, and FDR IB
82 nodes with 16-core Intel Sandy Bridge (E5-2670) 2.6 GHz: 36 with 32 GB of RAM, 1 TB HDD, and FDR IB; 36 with 64 GB of RAM, 1 TB HDD, and FDR IB; 8 with 64 GB of RAM, 2 TB HDD, and GigE; 2 with 128 GB of RAM, 3 TB HDD, and FDR IB
54 nodes with 64-core AMD Interlagos (Opteron 6276) 2.3 GHz, 2 TB HDD, and QDR IB: 18 with 256 GB of RAM and 36 with 128 GB of RAM
1 node with 8-core Intel Sandy Bridge (E5-2643), 128 GB RAM, 3 TB SSD
1 node with 12-core Haswell (E5-2620 v3) 2.4 GHz, 128 GB RAM, 2 × 250 GB HDD, 2 × 800 GB SSD

The cluster compute nodes were purchased with funds provided by the University, by faculty researchers, and by an NSF Major Research Instrumentation grant. The systems are housed at the enterprise-level Network Operations Center (NOC) and are administered jointly with Computing Services and Systems Development (CSSD). CSSD maintains the critical environmental infrastructure (power, cooling, networking) and administers the cluster operating systems and storage backups. CRC interfaces directly with researchers and provides software installation services, training workshops, and personalized consultation on improving software design/performance and computational workflows. The road map for research computing infrastructure is developed jointly by CRC and CSSD to meet the emerging needs of researchers at the University.
Connectivity between the NOC and main campus is via two 100 Gbps fibers, and to Internet2 via 100 Gbps. The global storage comprises a 130 TB Isilon home space (which is backed up), 80 TB of standard NFS home space, a 450 TB Lustre parallel filesystem, and a 1 PB ZFS filesystem for archival storage.
This infrastructure is designed for future scaling via additional resources funded by research instrumentation grants, internal University funds, or faculty contributions from grants or start-up funds.
The cluster operating systems are Red Hat Enterprise Linux 6 and 7. A wide range of major software packages is licensed and installed on the cluster, ranging from quantum mechanics (e.g., Gaussian, Molpro, VASP, CP2K, QMC) and classical mechanics (e.g., NAMD, LAMMPS, Amber) to continuum mechanics (e.g., Abaqus, ANSYS, COMSOL, Lumerical) and genomics analysis suites (e.g., Tophat/Bowtie, CLC Genomics Server).