Running Interactive Container Sessions on HPC

Last updated: 2025-06-13
Keywords: container, interactive, development, nvidia, enroot
Solution verified

Environment

  • Slurm workload manager

  • GPU-enabled nodes

  • Enroot/Pyxis container runtime

Issue

An interactive container session is needed to:

  • Test or debug container environments

  • Install additional software

  • Develop container-based applications

  • Customize existing containers

Resolution

#. Basic Interactive Container Session

Start an interactive container session using the following command:

$ srun --account=[YOUR_ACCOUNT] \
    --partition=normal \
    --nodes=1 \
    --ntasks-per-node=1 \
    --gpus-per-node=1 \
    --cpus-per-task=28 \
    --container-writable \
    --container-remap-root \
    --no-container-mount-home \
    --container-image nvcr.io#nvidia/nvhpc:24.3-devel-cuda12.3-ubuntu22.04 \
    --container-save $HOME/my-container.sqsh \
    --pty bash

Note

  • Changes made inside the container are lost unless --container-save is given. See "Saving Enroot container failed" for more details.

  • Root access inside the container requires both --container-remap-root and --container-writable.

  • Interactive sessions have a maximum walltime of 4 hours on HPC4 and 2 hours on SuperPOD.

  • If you save to a subdirectory such as $HOME/containers, create it before starting the session: mkdir -p $HOME/containers
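
The last note can be folded into a small shell helper so the save directory always exists before the session starts. This is a minimal sketch, not part of Slurm or Pyxis: the function name prepare_save_path and the example image path are hypothetical.

```shell
# Hypothetical helper (not a Slurm or Pyxis command): create the directory
# that will hold the saved image, then print the image path unchanged.
prepare_save_path() {
    image_path="$1"
    # mkdir -p succeeds silently if the directory already exists
    mkdir -p "$(dirname "$image_path")" || return 1
    printf '%s\n' "$image_path"
}
```

With the function defined in the login shell, the save option in the command above could be written as --container-save "$(prepare_save_path "$HOME/containers/my-container.sqsh")".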

#. Container Customization and Package Installation

Once inside the container, refresh the package index and install the packages you need:

root@node:/# apt update
root@node:/# apt install -y [package-name]

Common packages for development:

root@node:/# apt install -y vim git wget curl build-essential python3-pip
root@node:/# pip3 install numpy matplotlib jupyter
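
Before saving the container, it can be worth confirming that the tools you installed are actually on the PATH. The sketch below is one way to do that. The function name missing_tools is hypothetical, and the command names in the usage line assume that build-essential provides gcc and make and that python3-pip provides pip3.

```shell
# Hypothetical helper: print each command from the argument list that is
# not found on PATH. Returns 1 if at least one command is missing, else 0.
missing_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf 'missing: %s\n' "$tool"
            missing=1
        fi
    done
    return "$missing"
}
```

Inside the container this could be run as: missing_tools vim git wget curl gcc make pip3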

#. Using Previously Saved Containers

To start from a previously saved container, point --container-image at the local .sqsh file instead of a registry image. The example below also saves the modified container under a new name:

$ srun --account=[YOUR_ACCOUNT] \
    --partition=normal \
    --nodes=1 \
    --ntasks-per-node=1 \
    --gpus-per-node=1 \
    --cpus-per-task=28 \
    --container-writable \
    --container-remap-root \
    --no-container-mount-home \
    --container-image $HOME/my-container.sqsh \
    --container-save $HOME/my-container-updated.sqsh \
    --pty bash
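
Because the session above depends on a local image file, a quick check before submitting can save a queue wait. The following is a sketch only: require_image is a hypothetical helper, and it assumes that a failed save may leave the file missing or empty.

```shell
# Hypothetical helper: fail early if the saved image is missing or empty.
# Assumption: an unusable save shows up as an absent or zero-size file.
require_image() {
    image="$1"
    if [ ! -f "$image" ]; then
        printf 'image not found: %s\n' "$image" >&2
        return 1
    fi
    if [ ! -s "$image" ]; then
        printf 'image is empty: %s\n' "$image" >&2
        return 1
    fi
    return 0
}
```

A possible use is to run require_image "$HOME/my-container.sqsh" and submit the srun command only if it returns successfully.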

Best Practices

  • Container Storage: Keep container images together in $HOME/containers/ so they are easy to find.

  • Naming Convention: Use descriptive names, for example pytorch-24.03-custom.sqsh.

  • Version Control: Save incremental versions during development rather than overwriting a single image.

  • Resource Planning: Request CPU, GPU, and memory resources appropriate to the workload.
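
The storage, naming, and versioning practices above can be combined in a small helper that picks the next unused file name. This is an illustrative sketch: the function name next_image_path and the -vN suffix scheme are assumptions, not a site convention.

```shell
# Hypothetical helper: print the first path of the form
# <dir>/<name>-vN.sqsh (N = 1, 2, 3, ...) that does not exist yet.
next_image_path() {
    dir="$1"
    name="$2"
    n=1
    while [ -e "$dir/$name-v$n.sqsh" ]; do
        n=$((n + 1))
    done
    printf '%s\n' "$dir/$name-v$n.sqsh"
}
```

For example, --container-save "$(next_image_path "$HOME/containers" pytorch-24.03-custom)" would write to a new versioned file on each session, provided $HOME/containers already exists.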

References