Memory & Resource Estimation
Requesting the right amount of memory matters. Request too little and your job is killed with an out-of-memory (OOM) error. Request too much and you tie up resources that other users could be using.
Memory Directives
#SBATCH --mem=4G # Total memory per node (the whole job when it runs on a single node)
#SBATCH --mem-per-cpu=2G # Memory per CPU core (use one or the other, not both)
Prefer --mem-per-cpu when your job's memory needs grow with the number of CPU cores. Use --mem when the memory requirement is fixed regardless of core count.
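As a rough illustration of how the two directives relate for a single-task job, the total granted under --mem-per-cpu is the per-core value multiplied by the core count. The variable names and values below are examples only, not site defaults.

```shell
#!/bin/bash
# Illustrative arithmetic only; the values are examples, not site defaults.
cpus_per_task=4      # would match --cpus-per-task=4
mem_per_cpu_gb=2     # would match --mem-per-cpu=2G

# Total memory a single task is granted under --mem-per-cpu
total_gb=$(( cpus_per_task * mem_per_cpu_gb ))
echo "--mem-per-cpu=${mem_per_cpu_gb}G with ${cpus_per_task} cores is ${total_gb}G in total"
```

If the same job would need 8G no matter how many cores it uses, --mem=8G expresses that more directly.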
Checking Memory Usage of Past Jobs
sacct -j job_id --format=JobID,JobName,MaxRSS,Elapsed
MaxRSS is the peak resident memory recorded for each job step. Use it to tune future submissions.
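One way to turn a MaxRSS reading into a new request is to convert it to megabytes and add some headroom. The sketch below assumes the value carries a K, M, or G suffix (for example 3500000K). The function name suggest_mem and the 20% default margin are illustrative choices, not Slurm features.

```shell
#!/bin/bash
# Convert a MaxRSS value such as 3500000K into a --mem request in MB.
# The default 20% headroom is an assumption; pass a second argument to change it.
suggest_mem() {
    awk -v rss="$1" -v margin="${2:-1.2}" 'BEGIN {
        unit  = substr(rss, length(rss))   # last character is the unit suffix
        value = rss + 0                    # leading numeric part of the string
        if      (unit == "K") mb = value / 1024
        else if (unit == "M") mb = value
        else if (unit == "G") mb = value * 1024
        else { print "unrecognised MaxRSS value: " rss > "/dev/stderr"; exit 1 }
        want    = mb * margin
        rounded = int(want)
        if (rounded < want) rounded++      # round up to a whole MB
        printf "--mem=%dM\n", rounded
    }'
}

suggest_mem 3500000K
```

On the cluster, the input would come from the MaxRSS column of the sacct command shown above.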
Monitoring a Running Job
#!/bin/bash
#SBATCH --job-name=memory_test
#SBATCH --account=public-users_v2
#SBATCH --partition=power-general-shared-pool
#SBATCH --qos=public
#SBATCH --time=01:00:00
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --mem=4G
#SBATCH --output=memory_test.out
echo "Memory before:"
free -m
./your_application
echo "Memory after:"
free -m
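Note that free -m reports memory for the whole node, including other jobs on a shared partition, so the before and after numbers are only a coarse signal. A minimal sketch of a per-process alternative, assuming a Linux compute node: poll the kernel's VmHWM field (peak resident set size, in kB) in /proc/<pid>/status while the application runs. The helper name watch_peak_rss is hypothetical, and sleep 2 stands in for ./your_application.

```shell
#!/bin/bash
# Report the peak resident set size (kB) of one process, read from the
# kernel's VmHWM counter. Only the given PID is tracked, not its children.
watch_peak_rss() {
    local pid=$1 interval=${2:-1} peak=0 hwm
    # The loop ends once the status file is gone or no longer lists VmHWM
    while hwm=$(awk '/^VmHWM:/ {print $2}' "/proc/$pid/status" 2>/dev/null) \
          && [ -n "$hwm" ]; do
        peak=$hwm
        sleep "$interval"
    done
    echo "$peak"
}

# Stand-in for ./your_application so the sketch runs outside the cluster
sleep 2 &
app_pid=$!

peak_kb=$(watch_peak_rss "$app_pid" 1)
echo "Peak RSS: ${peak_kb} kB"
```

In a real batch script, the application would be started in the background in place of sleep 2, followed by wait "$app_pid" to collect its exit status.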
Tips for Estimating Memory
- Start conservative, check MaxRSS via sacct, tune upward
- Check application documentation for memory recommendations
- Run a small test job first before scaling up
- Use free -m, top, or htop inside an interactive job to observe live usage
- Plan for peak usage: memory spikes during data loading or processing
OOM Error
If your job fails because it ran out of memory, sacct reports the state OUT_OF_MEMORY:
sacct -j job_id -o JobID,JobName,State%20
JobID    JobName              State
-------- -------------------- --------------------
71       my_job               OUT_OF_MEMORY
Resubmit with a higher --mem or --mem-per-cpu value.
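To check several jobs or steps at once, the state column can be filtered in a script. The sketch below assumes pipe-separated "JobID|State" lines, the shape sacct produces with its -n (no header) and -P (parsable) options. The function name oom_steps is hypothetical and the sample input is made up.

```shell
#!/bin/bash
# Print the IDs of jobs or steps whose state is OUT_OF_MEMORY.
# Expects "JobID|State" lines on stdin, for example from:
#   sacct -j <job_id> -n -P -o JobID,State
oom_steps() {
    awk -F'|' '$2 == "OUT_OF_MEMORY" { print $1 }'
}

# Sample input standing in for real sacct output (values are made up)
printf '%s\n' '71|OUT_OF_MEMORY' '71.batch|OUT_OF_MEMORY' '72|COMPLETED' | oom_steps
```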