Memory & Resource Estimation

Requesting the right amount of memory matters. Request too little and the job is killed with an out-of-memory (OOM) error. Request too much and you hold memory that other users' jobs could be running on.

Memory Directives

#SBATCH --mem=4G           # Memory per node (the job total when it runs on one node)
#SBATCH --mem-per-cpu=2G   # Memory per allocated CPU core (use one or the other, not both)

Prefer --mem-per-cpu when memory use grows with the number of CPU cores. Use --mem when the requirement is fixed regardless of core count.
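To compare the two directives, it helps to work out the total that a --mem-per-cpu request implies. The sketch below assumes a single-task job whose cores all sit on one node. total_mem_mb is a hypothetical helper name, not a Slurm command.

```shell
#!/bin/bash
# Hypothetical helper: total memory (in MB) implied by a --mem-per-cpu request.
# Assumes every allocated core is on the same node.
total_mem_mb() {
    local mem_per_cpu_mb=$1   # value given to --mem-per-cpu, in MB
    local cpus=$2             # value given to --cpus-per-task
    echo $(( mem_per_cpu_mb * cpus ))
}

# --mem-per-cpu=2G with --cpus-per-task=4 asks for the same total as --mem=8G
total_mem_mb 2048 4
```

Raising --cpus-per-task raises the total automatically with --mem-per-cpu, while --mem stays where you set it.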

Checking Memory Usage of Past Jobs

sacct -j job_id --format=JobID,JobName,MaxRSS,Elapsed

MaxRSS is the peak resident memory (RAM) used by any task in a job step, so sacct prints one value per step. Use the largest value to tune future submissions.
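Because sacct reports MaxRSS per step and with a unit suffix, a small filter can pull out the largest value in MB. This is a sketch rather than a site-provided tool: maxrss_to_mb and peak_maxrss_mb are hypothetical helper names, and treating a value without a suffix as plain bytes is an assumption.

```shell
#!/bin/bash
# Hypothetical helper: convert a MaxRSS value such as "524288K", "512M" or
# "1.5G" to whole megabytes, rounding up.
maxrss_to_mb() {
    awk -v v="$1" 'BEGIN {
        unit = substr(v, length(v), 1)
        num  = v + 0                        # numeric prefix of the value
        if      (unit == "K") mb = num / 1024
        else if (unit == "M") mb = num
        else if (unit == "G") mb = num * 1024
        else if (unit == "T") mb = num * 1024 * 1024
        else                  mb = num / (1024 * 1024)   # assumption: bytes
        rounded = int(mb)
        if (rounded < mb) rounded++
        print rounded
    }'
}

# Hypothetical helper: read "JobID|MaxRSS" lines on stdin and print the
# largest MaxRSS in MB. Steps with an empty MaxRSS field are skipped.
peak_maxrss_mb() {
    local peak=0 mb value
    while IFS='|' read -r _ value; do
        if [ -n "$value" ]; then
            mb=$(maxrss_to_mb "$value")
            if [ "$mb" -gt "$peak" ]; then
                peak=$mb
            fi
        fi
    done
    echo "$peak"
}

# Usage on a cluster (job_id is a placeholder):
#   sacct -j job_id --noheader --parsable2 --format=JobID,MaxRSS | peak_maxrss_mb
```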

Monitoring a Running Job

#!/bin/bash
#SBATCH --job-name=memory_test
#SBATCH --account=public-users_v2
#SBATCH --partition=power-general-shared-pool
#SBATCH --qos=public
#SBATCH --time=01:00:00
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --mem=4G
#SBATCH --output=memory_test.out

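# Note: free reports memory for the whole node, not just this job,
# so on a shared node the numbers include other users' processes.
echo "Memory before:"
free -m

# Replace with your own program
./your_application

echo "Memory after:"
free -m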
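To watch the application itself while it runs, one option is to sample its resident set size with ps. The sketch below is an illustration under stated assumptions: watch_peak_rss is a hypothetical helper, it samples once per second by default, and it measures only the main process, not any child processes it starts.

```shell
#!/bin/bash
# Hypothetical helper: run a command, sample the RSS of its main process,
# and report the highest value seen (in KiB) on stderr.
# Short spikes between samples can be missed.
watch_peak_rss() {
    "$@" &
    local pid=$!
    local peak=0 rss
    while kill -0 "$pid" 2>/dev/null; do
        # ps prints RSS in KiB; the field is empty if the process just exited
        rss=$(ps -o rss= -p "$pid" 2>/dev/null | tr -d ' ')
        if [ -n "$rss" ] && [ "$rss" -gt "$peak" ]; then
            peak=$rss
        fi
        sleep "${RSS_INTERVAL:-1}"
    done
    wait "$pid"
    local status=$?
    echo "peak_rss_kb=${peak}" >&2
    return "$status"
}

# Usage inside the batch script, in place of the bare application call:
#   watch_peak_rss ./your_application
```

The helper returns the application's own exit status, so a failure still marks the job as failed.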

Tips for Estimating Memory

  • Start with a modest request, check MaxRSS via sacct, and raise it as needed
  • Check application documentation for memory recommendations
  • Run a small test job first before scaling up
  • Use free -m, top, or htop inside an interactive job to observe live usage
  • Plan for peak usage: memory often spikes during data loading or processing
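Planning for peak usage usually means requesting somewhat more than the highest value you observed. A minimal sketch of that arithmetic follows. suggest_mem_mb is a hypothetical helper, and the 20% default headroom is an assumption, not a site policy.

```shell
#!/bin/bash
# Hypothetical helper: add percentage headroom to an observed peak (in MB),
# rounding up to a whole megabyte.
suggest_mem_mb() {
    local peak_mb=$1
    local headroom_pct=${2:-20}   # assumed default, adjust to your workload
    echo $(( (peak_mb * (100 + headroom_pct) + 99) / 100 ))
}

# An observed peak of 3000 MB with the default headroom
suggest_mem_mb 3000
```

The result can go straight into the directive, for example #SBATCH --mem=3600M.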

OOM Error

If your job runs out of memory, sacct reports its state as OUT_OF_MEMORY:

sacct -j job_id -o JobID,JobName,State%20

JobID    JobName              State
-------- -------------------- --------------------
71       my_job               OUT_OF_MEMORY

Resubmit with a higher --mem or --mem-per-cpu value.
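The resubmission step can be scripted along these lines. This is a sketch under assumptions: next_mem_request is a hypothetical helper, doubling the request is one common heuristic rather than a rule, and job.sh stands in for your own batch script.

```shell
#!/bin/bash
# Hypothetical helper: given a job state and the current request in GB,
# print the --mem value to use for the next submission.
next_mem_request() {
    local state=$1 current_gb=$2
    case "$state" in
        OUT_OF_MEMORY*) echo "$(( current_gb * 2 ))G" ;;  # assumed: double it
        *)              echo "${current_gb}G" ;;          # otherwise unchanged
    esac
}

# Usage on a cluster (job_id and job.sh are placeholders):
#   state=$(sacct -j job_id -X --noheader -o State%20 | head -n 1 | tr -d ' ')
#   sbatch --mem="$(next_mem_request "$state" 4)" job.sh
```

Options given on the sbatch command line take precedence over the #SBATCH lines in the script, so the script itself does not need editing.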