AMD 9004 CPU

The AMD EPYC 9004 series (Genoa/Bergamo) introduces several key features that benefit HPC workloads:

  • Zen 4 cores: Up to 96 Zen 4 cores (Genoa) or 128 Zen 4c cores (Bergamo) per socket, with improved IPC (Instructions Per Clock)

  • DDR5 support: Up to DDR5-4800 with 12 memory channels per socket

  • Higher PCIe lane count: 128 lanes of PCIe 5.0 for high-speed I/O and accelerators

  • Advanced Vector Extensions (AVX-512): AVX-512 instruction support for vectorized workloads, executed on 256-bit data paths

  • 3D V-Cache (select models): Additional L3 cache for memory-intensive applications

  • Improved NUMA topology: Better memory locality with configurable NUMA domains

  • Enhanced security: Hardware-level security features without performance penalty

BIOS Config

Optimal BIOS configuration is crucial for achieving maximum performance from AMD EPYC processors in HPC environments.

The settings below were tuned against a variety of HPC/AI benchmarks and should serve as a good starting point for most workloads. For optimal performance on a specific workload, fine-tune them further using an application-specific benchmark.

After making changes, always perform a full power cycle to ensure all BIOS settings are applied, then confirm that the number of cores reported in the BIOS and in the OS is correct.

Warning

There may be a BIOS bug with some setting combinations that causes an incorrect number of cores to be reported after a full power cycle.
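The OS-side core check can be scripted. The following is a minimal sketch, assuming a Linux host that exposes the online CPU list in /sys/devices/system/cpu/online; the helper names count_cpus and online_cpus are hypothetical, and the expected count must come from your own CPU model and SMT setting.

```python
def count_cpus(cpu_list: str) -> int:
    """Count CPUs in a Linux CPU list string such as '0-95,192-287'."""
    total = 0
    for part in cpu_list.strip().split(","):
        if not part:
            continue
        if "-" in part:
            # A range like '0-95' covers both endpoints.
            low, high = part.split("-")
            total += int(high) - int(low) + 1
        else:
            # A single CPU id like '7'.
            total += 1
    return total


def online_cpus(path: str = "/sys/devices/system/cpu/online") -> int:
    """Return the number of logical CPUs the kernel reports as online."""
    with open(path) as f:
        return count_cpus(f.read())
```

After a power cycle, compare online_cpus() with the logical CPU count you expect (sockets × cores per socket × threads per core) and treat any mismatch as a sign that the BIOS settings were not applied cleanly.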

Dell OpenManage BIOS

Dell OpenManage provides XML-based configuration for Dell PowerEdge servers. For AMD EPYC systems, you can use the following XML snippet to configure BIOS settings:

<root>
  <Attribute Name="ApbDis">Enabled</Attribute>
  <Attribute Name="CcxAsNumaDomain">Enabled</Attribute>
  <Attribute Name="DeterminismSlider">PerformanceDeterminism</Attribute>
  <Attribute Name="DfCState">Enabled</Attribute>
  <Attribute Name="DfPstateFreqOptimizer">Enabled</Attribute>
  <Attribute Name="DfPstateLatencyOptimizer">Enabled</Attribute>
  <Attribute Name="DlwmForcedWidth">x16</Attribute>
  <Attribute Name="DramRefreshDelay">Performance</Attribute>
  <Attribute Name="DynamicLinkWidthManagement">Force</Attribute>
  <Attribute Name="FixedSocPstate">FixedSocPstate0</Attribute>
  <Attribute Name="Hsmp">Enabled</Attribute>
  <Attribute Name="IommuSupport">Enabled</Attribute>
  <Attribute Name="MemFrequency">MaxPerf</Attribute>
  <Attribute Name="MemPatrolScrub">Standard</Attribute>
  <Attribute Name="MemRefreshRate">1x</Attribute>
  <Attribute Name="NumaNodesPerSocket">2</Attribute>
  <Attribute Name="PcieAspmL1">Disabled</Attribute>
  <Attribute Name="PcieSpeedPmmControl">StaticLinkSpeedGen5</Attribute>
  <Attribute Name="PowerProfileSelect">HighPerformanceMode</Attribute>
  <Attribute Name="ProcCStates">Enabled</Attribute>
  <Attribute Name="ProcPwrPerf">OsDbpm</Attribute>
  <Attribute Name="ProcTurboMode">Enabled</Attribute>
  <Attribute Name="SysProfile">Custom</Attribute>
</root>
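To review a profile like the one above before applying it, the attributes can be loaded into a dictionary and compared with the values currently set on a server. This is a minimal sketch using only the Python standard library; the function names are hypothetical, and how the current values are exported from the server is left out and depends on your tooling.

```python
import xml.etree.ElementTree as ET


def load_attributes(xml_text: str) -> dict:
    """Map each <Attribute Name="..."> element to its text value."""
    root = ET.fromstring(xml_text)
    return {
        el.get("Name"): (el.text or "").strip()
        for el in root.iter("Attribute")
    }


def diff_attributes(desired: dict, current: dict) -> dict:
    """Return {name: (current_value, desired_value)} for every mismatch.

    Attributes missing from `current` show up with a current value of None.
    """
    return {
        name: (current.get(name), value)
        for name, value in desired.items()
        if current.get(name) != value
    }
```

An empty result from diff_attributes means the server already matches the profile; anything else lists the settings that still need to change.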

AMI Aptio BIOS

Note

Some settings shown as Auto may resolve to the same value as the explicit setting (for example, Enabled). We list explicit values to ensure consistency across different BIOS versions.

AMD CBS (Custom BIOS Settings):

Setting Name                           Value
-------------------------------------  -------------------
SMU Common Options
  Power Policy Quick Setting           Best Performance
  Determinism                          Performance
  APBDIS                               1 (Enabled)
  DfPstateMin                          0
  DfPstateMax                          2
  DF PState Frequency Optimizer        Enabled
  DF Cstates                           Enabled
  CPPC                                 Disabled
  HSMP Support                         Enabled
NBIO Common Options
  IOMMU                                Enabled
DF Common Options
  NUMA Nodes Per Socket                2 (or NPS2)
  ACPI SRAT L3 Cache As NUMA Domain    Enabled
  Memory interleaving                  Auto
CPU Common Options
  Prefetcher settings                  All enabled
  Streaming Stores Control             Enabled
  Local APIC Mode                      x2APIC
  Fast Short REP MOVSB                 Enabled
  Enhanced REP MOVSB/STOSB             Enabled
  AVX512                               Enabled
  MONITOR and MWAIT disable            Disabled
  Corrector Branch Predictor           Enabled
  PAUSE Delay                          16 cycles (minimal)
  CPU Speculative Store Modes          More Speculative
  Prefetch/Request Throttle            Enabled
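The NUMA settings above change how many NUMA nodes the OS reports, which is a quick way to confirm they took effect. The sketch below estimates that number under a stated assumption: with ACPI SRAT L3 Cache As NUMA Domain enabled, each CCX (one L3 cache) is exposed as its own node; otherwise each socket exposes as many nodes as the NUMA Nodes Per Socket setting. The function name and parameters are hypothetical, and the CCX count per socket depends on the CPU model.

```python
def expected_numa_nodes(sockets: int, nps: int,
                        ccx_per_socket: int, l3_as_numa: bool) -> int:
    """Estimate how many NUMA nodes the OS should report.

    Assumption: when L3-as-NUMA is enabled, every CCX becomes a NUMA
    node; when it is disabled, each socket contributes `nps` nodes.
    """
    per_socket = ccx_per_socket if l3_as_numa else nps
    return sockets * per_socket
```

For example, a dual-socket system with 12 CCXs per socket would be expected to report 24 nodes with these settings; compare the estimate with the node count shown by lscpu.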

Kernel Parameters

To optimize the performance of AMD EPYC processors, you can use specific kernel parameters. These parameters can be added to the kernel command line in your bootloader configuration (e.g., GRUB).

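amd_pstate=active iommu=pt

  • amd_pstate=active: Enables the AMD P-State driver in active mode, in which the OS supplies an energy-performance preference and the hardware selects the CPU frequency.

  • iommu=pt: Enables IOMMU pass-through mode, so that DMA from host-owned devices bypasses IOMMU translation. This reduces I/O overhead on the host while keeping the IOMMU available for devices assigned to virtual machines and containers.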
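After rebooting, it is worth confirming that the parameters actually reached the kernel. A minimal sketch, assuming the running command line is read from /proc/cmdline on Linux; the function name missing_kernel_params is hypothetical.

```python
def missing_kernel_params(cmdline: str,
                          required=("amd_pstate=active", "iommu=pt")) -> list:
    """Return the required parameters that are absent from a kernel command line."""
    # The kernel command line is a whitespace-separated list of tokens.
    present = set(cmdline.split())
    return [param for param in required if param not in present]
```

Calling missing_kernel_params(open("/proc/cmdline").read()) returns an empty list when both parameters are active, and otherwise names the ones that still need to be added to the bootloader configuration.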

AMD-specific Kernel Modules

Some kernel versions provide additional AMD-specific modules that enhance performance and functionality. The modules are described below, followed by the kernel versions in which they are available.

  • amd_atl: AMD Address Translation Library, which translates memory addresses reported in hardware error records into system physical addresses for RAS and error reporting

  • ptdma: AMD PassThru DMA (PTDMA) engine driver for offloading data movement

  • ae4dma: Advanced Enhanced DMA driver for next-generation AMD platforms

AMD Kernel Modules   Required Kernel Version   RHEL 9 Backport (Kernel 5.14)
------------------   -----------------------   -----------------------------
amd_atl              6.1                       el9_4
ptdma                6.8 (TBC)                 el9_7 (TBC)
ae4dma               6.14                      Unknown
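The upstream version column can be turned into a quick check against the running kernel. This is a minimal sketch that copies the minimum versions from the table as-is, including the value marked TBC; the names MIN_KERNEL, kernel_version, and modules_expected are hypothetical. It compares upstream version numbers only, so it does not account for the RHEL 9 backports, where a 5.14-based kernel may still ship a module.

```python
import re

# Minimum upstream kernel versions, copied from the table above.
# The ptdma entry is marked TBC in the table and may change.
MIN_KERNEL = {
    "amd_atl": (6, 1),
    "ptdma": (6, 8),
    "ae4dma": (6, 14),
}


def kernel_version(release: str) -> tuple:
    """Extract (major, minor) from a release string such as '6.8.0-45-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def modules_expected(release: str) -> list:
    """List the modules whose minimum upstream version is met by `release`."""
    version = kernel_version(release)
    return sorted(name for name, minimum in MIN_KERNEL.items()
                  if version >= minimum)
```

Passing the output of uname -r (or platform.release() in Python) gives the modules you can expect to find on a mainline-based kernel.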

References