AMD 9004 CPU
============

AMD EPYC 9004 series (Genoa/Bergamo) introduces several key features that benefit HPC workloads:

- **Zen 4 cores**: Up to 128 cores per socket with improved IPC (Instructions Per Clock)
- **DDR5 support**: Up to DDR5-4800 with 12 memory channels per socket
- **Higher PCIe lane count**: 128 lanes of PCIe 5.0 for high-speed I/O and accelerators
- **Advanced Vector Extensions (AVX-512)**: Full AVX-512 support for vectorized workloads
- **3D V-Cache** (select models): Additional L3 cache for memory-intensive applications
- **Improved NUMA topology**: Better memory locality with configurable NUMA domains
- **Enhanced security**: Hardware-level security features without performance penalty

BIOS Config
-----------

Optimal BIOS configuration is crucial for achieving maximum performance from AMD EPYC processors in HPC environments. The settings below were tuned against a variety of HPC/AI benchmarks and should serve as a good starting point for most workloads. For the best performance on a specific workload, fine-tune further using an application-specific benchmark.

After making changes, always perform a full power cycle to ensure that all BIOS settings are applied, then confirm the number of cores reported in the BIOS and the OS.

.. warning::

   There may be a BIOS bug with some setting combinations where the reported number of cores is incorrect after **performing a full power cycle**.

Dell OpenManage BIOS
~~~~~~~~~~~~~~~~~~~~

Dell OpenManage provides XML-based configuration for Dell PowerEdge servers. For AMD EPYC systems, you can use the following XML snippet to configure BIOS settings:

.. code-block:: xml

   Enabled Enabled PerformanceDeterminism Enabled Enabled Enabled x16 Performance Force FixedSocPstate0 Enabled Enabled MaxPerf Standard 1x 2 Disabled StaticLinkSpeedGen5 HighPerformanceMode Enabled OsDbpm Enabled Custom

AMI Aptio BIOS
~~~~~~~~~~~~~~

.. note::

   For some settings, ``Auto`` may resolve to the same value as the explicit setting ``Enabled``. We list explicit values to ensure consistency across different BIOS versions.

**AMD CBS (Custom BIOS Settings)**:

.. list-table::
   :header-rows: 1
   :widths: 70 30

   - - Setting Name
     - Value
   - - **SMU Common Options**
     -
   - - Power Policy Quick Setting
     - Best Performance
   - - Determinism
     - Performance
   - - APBDIS
     - 1 (Enabled)
   - - DfPstateMin
     - 0
   - - DfPstateMax
     - 2
   - - DF PState Frequency Optimizer
     - Enabled
   - - DF Cstates
     - Enabled
   - - CPPC
     - Disabled
   - - HSMP Support
     - Enabled
   - - **NBIO Common Options**
     -
   - - IOMMU
     - Enabled
   - - **DF Common Options**
     -
   - - NUMA Nodes Per Socket
     - 2 (or NPS2)
   - - ACPI SRAT L3 Cache As NUMA Domain
     - Enabled
   - - Memory interleaving
     - Auto
   - - **CPU Common Options**
     -
   - - Prefetcher settings
     - All enabled
   - - Streaming Stores Control
     - Enabled
   - - Local APIC Mode
     - x2APIC
   - - Fast Short REP MOVSB
     - Enabled
   - - Enhanced REP MOVSB/STOSB
     - Enabled
   - - AVX512
     - Enabled
   - - MONITOR and MWAIT disable
     - Disabled
   - - Corrector Branch Predictor
     - Enabled
   - - PAUSE Delay
     - 16 cycles (minimal)
   - - CPU Speculative Store Modes
     - More Speculative
   - - Prefetch/Request Throttle
     - Enabled

Kernel Parameters
-----------------

To optimize the performance of AMD EPYC processors, you can use specific kernel parameters. These parameters can be added to the kernel command line in your bootloader configuration (e.g., GRUB).

.. code-block:: bash

   amd_pstate=active iommu=pt

- ``amd_pstate=active``: Enables the AMD P-State driver in active mode, which handles CPU frequency and power management.
- ``iommu=pt``: Puts the IOMMU in pass-through mode, so that DMA from host-owned devices skips address translation while device assignment to virtual machines and containers remains available.

AMD-specific Kernel Modules
---------------------------

Newer kernel versions provide additional AMD-specific modules that enhance performance and functionality. The modules and the kernel versions in which they are available are listed below.

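
As a quick check on a given host, the following sketch compares the running kernel against the minimum versions given in this section's table and asks ``modinfo`` whether each module is installed. The helper names ``version_ge`` and ``min_kernel_for`` are hypothetical, the version numbers are copied from the table (the ``ptdma`` entry is marked TBC there), and the version comparison assumes a ``sort`` that supports ``-V``. A kernel below the listed version may still ship the module through a distribution backport.

```bash
#!/usr/bin/env bash
# Sketch: report whether the running kernel meets the minimum version for each
# AMD module named in this section, and whether the module is installed.

# Succeed when version $1 >= version $2 (uses version sort, so 6.14 > 6.8).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Minimum upstream kernel per module, copied from the table in this section.
# The ptdma value is marked TBC in the table, so treat it as provisional.
min_kernel_for() {
    case "$1" in
        amd_atl) echo "6.1" ;;
        ptdma)   echo "6.8" ;;
        ae4dma)  echo "6.14" ;;
        *)       return 1 ;;
    esac
}

# Drop the distro suffix, e.g. "5.14.0-427.el9.x86_64" -> "5.14.0".
running="$(uname -r | cut -d- -f1)"

for mod in amd_atl ptdma ae4dma; do
    min="$(min_kernel_for "$mod")"
    if version_ge "$running" "$min"; then
        verdict="kernel >= $min"
    else
        verdict="kernel < $min (needs a distro backport)"
    fi
    # modinfo exits non-zero when it cannot find the module for this kernel.
    if command -v modinfo >/dev/null 2>&1 && modinfo "$mod" >/dev/null 2>&1; then
        present="installed"
    else
        present="not found"
    fi
    printf '%-8s %-40s %s\n' "$mod" "$verdict" "$present"
done
```

The script only reads system state, so it can be run as an unprivileged user. A module reported as ``not found`` on a sufficiently new kernel usually means the distribution did not build it.
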

- ``amd_atl``: AMD Address Translation Library for enhanced memory management
- ``ptdma``: Platform DMA driver for improved data movement
- ``ae4dma``: Advanced Enhanced DMA driver for next-generation AMD platforms

.. list-table::
   :header-rows: 1
   :widths: 25 25 50

   - - AMD Kernel Modules
     - Required Kernel Version
     - RHEL 9 Backport (Kernel 5.14)
   - - ``amd_atl``
     - 6.1
     - el9_4
   - - ``ptdma``
     - 6.8 (TBC)
     - el9_7 (TBC)
   - - ``ae4dma``
     - 6.14
     - Unknown

References
----------

- AMD EPYC 9004 Tuning Guide
- AMD EPYC 9004 HPC Tuning Guide
- NVIDIA NGC Multi-node Performance Tuning