Saving Enroot container failed
Environment
Slurm workload manager
Enroot container runtime
NVIDIA GPU compute nodes
Container-enabled cluster
Issue
When saving a container with pyxis/Enroot, the export fails with errors such as:
slurmstepd: error: pyxis: [ERROR] No such file or directory: /home/username/example/nvhpc:24.3.sqsh
slurmstepd: error: pyxis: failed to export container pyxis_174632.0 to /home/username/example/nvhpc:24.3.sqsh
Resolution
Create container directory:
$ mkdir -p $HOME/containers
Run container with correct save path:
$ srun --account=YOUR_ACCOUNT \
--nodes=1 \
--gpus-per-node=1 \
--container-writable \
--container-save $HOME/containers/nvhpc.sqsh \
--container-image nvcr.io#nvidia/nvhpc:24.3-devel-cuda12.3-ubuntu22.04 \
--pty bash
Warning
Ensure sufficient disk quota before saving large containers
Container file names should not contain special characters (for example, the ':' in nvhpc:24.3.sqsh above)
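A free-space check before saving could look like the sketch below. The helper name has_free_kb and the threshold are hypothetical, chosen only for illustration. Note that df reports free space on the file system, not a per-user quota, so check your quota separately with whatever tool your site provides.

```shell
# Hypothetical helper: has_free_kb is an illustration, not part of pyxis or Enroot.
# Succeeds if the file system holding directory $1 has at least $2 KiB available.
has_free_kb() {
    dir=$1
    need_kb=$2
    # df -P prints one POSIX-format line per file system;
    # with -k, column 4 is the available space in KiB.
    avail_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    [ -n "$avail_kb" ] && [ "$avail_kb" -ge "$need_kb" ]
}
```

For example, `has_free_kb "$HOME/containers" $((20 * 1024 * 1024)) || echo "less than 20 GiB free"` warns before a large save. The 20 GiB figure is an arbitrary placeholder; size it to the image you expect to export.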
Verify saved container:
$ ls -l $HOME/containers/nvhpc.sqsh
Note
Parent directory must exist before running container
Use absolute paths for --container-save
Saved container can be used with --container-image /path/to/container.sqsh
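One way to make sure the save path is always absolute is a small wrapper like the sketch below. The function name abs_path is hypothetical; it only prefixes the current directory to relative paths and does not resolve symlinks or ".." components.

```shell
# Hypothetical helper: abs_path is an illustration, not a pyxis feature.
# Prints $1 unchanged if it is already absolute, otherwise prefixes $PWD.
abs_path() {
    case $1 in
        /*) printf '%s\n' "$1" ;;
        *)  printf '%s\n' "$PWD/$1" ;;
    esac
}
```

It could then be used as `--container-save "$(abs_path containers/nvhpc.sqsh)"` on the srun command line, and the same resolved path can later be passed to --container-image.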
Root Cause
Export can fail when:
Target directory doesn’t exist
Path contains illegal characters
Insufficient permissions or disk space / quota
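The causes above can be checked before submitting the job. The following is a minimal sketch, not an official tool: the function name check_save_path and the set of characters treated as safe (letters, digits, '.', '_' and '-') are assumptions made for this example.

```shell
# Hypothetical helper: check_save_path is not part of pyxis or Enroot.
# Returns 0 if the target looks safe to pass to --container-save.
check_save_path() {
    target=$1
    dir=$(dirname -- "$target")
    name=$(basename -- "$target")

    # Target directory must already exist.
    if [ ! -d "$dir" ]; then
        echo "missing directory: $dir" >&2
        return 1
    fi

    # Assumed safe set for the file name: letters, digits, '.', '_' and '-'.
    case $name in
        *[!A-Za-z0-9._-]*)
            echo "unsafe characters in file name: $name" >&2
            return 2
            ;;
    esac

    # Directory must be writable by the current user.
    if [ ! -w "$dir" ]; then
        echo "directory not writable: $dir" >&2
        return 3
    fi
    return 0
}
```

Running `check_save_path "$HOME/containers/nvhpc.sqsh"` before srun reports which condition fails, if any. Disk space and quota are not covered by this check.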