Multi-Node Scalability

SPEC HPC 2021 Small Benchmark Results

These benchmarks demonstrate the scaling characteristics of our primary compute platform (Dell R6625 with AMD EPYC 9754) using hybrid MPI+OpenMP parallelization across multiple nodes.

Scaling Results by Node Count

Scheduling   Parallelization   Nodes   Ranks   Threads per Rank   Base Score   Full Report
----------   ---------------   -----   -----   ----------------   ----------   -----------
SLURM        MPI+OpenMP            1      32                  8         1.47   PDF
SLURM        MPI+OpenMP            2      64                  8         2.84   PDF
SLURM        MPI+OpenMP            4     128                  8         5.53   PDF
SLURM        MPI+OpenMP            8     256                  8         11.6   PDF
SLURM        MPI+OpenMP           16     512                  8         21.2   PDF
SLURM        MPI+OpenMP           32    1024                  8         33.9   PDF
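The base scores can be turned into speedup and parallel efficiency relative to the single-node run. The following is a minimal sketch, assuming the ratio of two base scores is a fair proxy for speedup; the names `BASE_SCORES` and `scaling_table` are illustrative and do not come from the reports.

```python
# Base scores copied from the results table: node count -> base score.
BASE_SCORES = {1: 1.47, 2: 2.84, 4: 5.53, 8: 11.6, 16: 21.2, 32: 33.9}


def scaling_table(scores):
    """Return (nodes, speedup, efficiency) tuples relative to the smallest run."""
    ref_nodes = min(scores)
    ref_score = scores[ref_nodes]
    rows = []
    for nodes in sorted(scores):
        # Speedup: how many times higher the score is than the reference run.
        speedup = scores[nodes] / ref_score
        # Efficiency: achieved speedup divided by the ideal (linear) speedup.
        efficiency = speedup / (nodes / ref_nodes)
        rows.append((nodes, speedup, efficiency))
    return rows


if __name__ == "__main__":
    for nodes, speedup, eff in scaling_table(BASE_SCORES):
        print(f"{nodes:>2} nodes: speedup {speedup:5.2f}x, efficiency {eff:6.1%}")
```

With the scores listed above, efficiency computed this way stays above 90% through 16 nodes and drops to roughly 72% at 32 nodes.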

Scalability Analysis

The following plots compare our system’s scalability against published results from other HPC centers.

Core Count Scaling

[Figure: SPEC HPC 2021 Scaling by Core Count]

This plot shows how performance scales as the number of CPU cores increases within our system, compared to published results from other facilities.

Node Count Scaling

[Figure: SPEC HPC 2021 Scaling by Node Count]

This plot demonstrates scaling efficiency as additional compute nodes are added to the job, illustrating the effectiveness of our interconnect and parallel I/O infrastructure.

Hardware and Software Specifications

The multi-node benchmarks were conducted on the following system configuration:

Hardware

  • System: Dell PowerEdge R6625 with Immersion Cooling (Direct Liquid Cooling)

  • Processor: 2x AMD EPYC 9754 128-Core (256 cores/node, 2.25-3.10 GHz, SMT Disabled)

  • Memory: 768 GB DDR5-4800 per node (24x 32GB DIMMs)

  • Interconnect: Mellanox ConnectX-6 HDR (200 Gbit/s RoCE v2)

  • Storage: Dell OneFS via NFS v3
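The hardware figures can be cross-checked against the rank and thread counts in the results table. This is a small sketch, assuming MPI ranks are spread evenly across nodes (the reports would confirm the actual placement); `node_layout` and the constant names are illustrative only.

```python
# Values copied from the hardware list and the results table.
SOCKETS_PER_NODE = 2
CORES_PER_SOCKET = 128
DIMMS_PER_NODE = 24
DIMM_SIZE_GB = 32
THREADS_PER_RANK = 8
# (nodes, total MPI ranks) pairs, one per benchmark run.
RUNS = [(1, 32), (2, 64), (4, 128), (8, 256), (16, 512), (32, 1024)]


def node_layout(nodes, total_ranks):
    """Derive per-node figures for one run; assumes ranks are spread evenly."""
    cores_per_node = SOCKETS_PER_NODE * CORES_PER_SOCKET
    memory_gb = DIMMS_PER_NODE * DIMM_SIZE_GB
    ranks_per_node = total_ranks // nodes
    return {
        "cores_per_node": cores_per_node,
        "ranks_per_node": ranks_per_node,
        "threads_per_node": ranks_per_node * THREADS_PER_RANK,
        "memory_gb_per_node": memory_gb,
        "memory_gb_per_core": memory_gb / cores_per_node,
        "memory_gb_per_rank": memory_gb / ranks_per_node,
    }


if __name__ == "__main__":
    for nodes, ranks in RUNS:
        layout = node_layout(nodes, ranks)
        # Each run should use exactly one OpenMP thread per physical core.
        assert layout["threads_per_node"] == layout["cores_per_node"]
        print(nodes, layout)
```

Under that assumption every run places 32 ranks of 8 threads on each node, which fills all 256 cores with one thread per core and leaves about 3 GB of memory per core.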

Software Stack

  • Compiler: Intel oneAPI DPC++/C++ Compiler 2025.0.4

  • MPI Library: Open MPI 5.0.6

  • Operating System: Rocky Linux 9.5 (kernel 5.14.0-503.40.1.el9_5)

  • Job Scheduler: SLURM Workload Manager
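As an illustration of how one row of the results table might map onto a SLURM resource request, the sketch below builds an `srun` command line using the standard `--nodes`, `--ntasks`, and `--cpus-per-task` options and sets `OMP_NUM_THREADS` for the OpenMP runtime. It is an assumption, not the launch configuration used for these runs: the actual commands are recorded in the PDF reports, and both `slurm_launch` and the `./benchmark_binary` path are hypothetical placeholders.

```python
import shlex


def slurm_launch(nodes, total_ranks, threads_per_rank, binary="./benchmark_binary"):
    """Build an srun command and OpenMP environment for one table row.

    `binary` is a hypothetical placeholder, not a path taken from the reports.
    """
    command = [
        "srun",
        f"--nodes={nodes}",                     # number of compute nodes
        f"--ntasks={total_ranks}",              # total MPI ranks in the job
        f"--cpus-per-task={threads_per_rank}",  # cores reserved for each rank
        binary,
    ]
    # One OpenMP thread per reserved core.
    environment = {"OMP_NUM_THREADS": str(threads_per_rank)}
    return command, environment


if __name__ == "__main__":
    # Example: the 4-node row (128 ranks, 8 threads per rank).
    command, environment = slurm_launch(4, 128, 8)
    exports = " ".join(f"{key}={value}" for key, value in environment.items())
    print(exports, shlex.join(command))
```

Keeping the command as a list and joining it only for display avoids quoting problems if the binary path or its arguments contain spaces.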

Compiler Flags

  • C/C++: -march=common-avx512 -Ofast -flto -ffast-math -mprefer-vector-width=512 -qopenmp -ansi-alias

  • Fortran: -march=common-avx512 -Ofast -flto -ffast-math -mprefer-vector-width=512 -qopenmp -nostandard-realloc-lhs -align array64byte

Detailed configuration, including runtime environment variables, build logs, and tuning parameters, is available in the PDF reports.