<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://hpcguide.tau.ac.il/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Dvory</id>
	<title>HPC Guide - User contributions [en]</title>
	<link rel="self" type="application/atom+xml" href="https://hpcguide.tau.ac.il/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Dvory"/>
	<link rel="alternate" type="text/html" href="https://hpcguide.tau.ac.il/index.php?title=Special:Contributions/Dvory"/>
	<updated>2026-04-19T02:40:23Z</updated>
	<subtitle>User contributions</subtitle>
	<generator>MediaWiki 1.35.5</generator>
	<entry>
		<id>https://hpcguide.tau.ac.il/index.php?title=Submitting_a_job_to_a_slurm_queue&amp;diff=1553</id>
		<title>Submitting a job to a slurm queue</title>
		<link rel="alternate" type="text/html" href="https://hpcguide.tau.ac.il/index.php?title=Submitting_a_job_to_a_slurm_queue&amp;diff=1553"/>
		<updated>2026-01-19T14:47:30Z</updated>

		<summary type="html">&lt;p&gt;Dvory: /* Feature */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Accessing the System ==&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;A ChatGPT page is available for the new QOS configuration; see [https://chatgpt.com/g/g-68be7f9acfb88191978615c1693e2cff-hpc-helper-toolkit HPC-helper-toolkit].&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
To submit jobs to SLURM at Tel Aviv University, you need to access the system through one of the following login nodes:&lt;br /&gt;
&lt;br /&gt;
* slurmlogin.tau.ac.il&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Requirements for Access ===&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Group Membership&amp;#039;&amp;#039;&amp;#039;: You must be part of the &amp;quot;power&amp;quot; group to access the resources.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;University Credentials&amp;#039;&amp;#039;&amp;#039;: Use your Tel Aviv University username and password to log in.&lt;br /&gt;
&lt;br /&gt;
These login nodes are your starting point for submitting jobs, checking job status, and managing your SLURM tasks.&lt;br /&gt;
&lt;br /&gt;
=== SSH Example ===&lt;br /&gt;
&lt;br /&gt;
To access the system using SSH, use the following example:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Replace &amp;#039;your_username&amp;#039; with your actual Tel Aviv University username&lt;br /&gt;
ssh your_username@slurmlogin.tau.ac.il&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Your connection will be automatically routed to one of the login nodes:&lt;br /&gt;
powerslurm-login, powerslurm-login2, or powerslurm-login3.&lt;br /&gt;
&lt;br /&gt;
If you have an SSH key set up for password-less login, you can specify it like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Replace &amp;#039;your_username&amp;#039; and &amp;#039;/path/to/your/private_key&amp;#039; accordingly&lt;br /&gt;
ssh -i /path/to/your/private_key your_username@slurmlogin.tau.ac.il&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Environment Modules ==&lt;br /&gt;
&lt;br /&gt;
Environment Modules allow users to dynamically modify their shell environment, providing an easy way to load and unload different software applications, libraries, and their dependencies. This system helps avoid conflicts between software versions and ensures the correct environment for running specific applications.&lt;br /&gt;
&lt;br /&gt;
Here are some common commands to work with environment modules:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# List Available Modules: To see all the modules available on the system, use:&lt;br /&gt;
module avail&lt;br /&gt;
&lt;br /&gt;
# To search for a specific module by name (e.g., gcc), use:&lt;br /&gt;
module avail gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
# Get Detailed Information About a Module: The module spider command provides detailed information about a module, including versions, dependencies, and descriptions:&lt;br /&gt;
module spider gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
# View Module Settings: To see what environment variables and settings will be modified by a module, use:&lt;br /&gt;
module show gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
# Load a Module: To set up the environment for a specific software, use the module load command. For example, to load GCC version 12.1.0:&lt;br /&gt;
module load gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
# List Loaded Modules: To view all currently loaded modules in your session, use:&lt;br /&gt;
module list&lt;br /&gt;
&lt;br /&gt;
# Unload a Module: To unload a specific module from your environment, use:&lt;br /&gt;
module unload gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
# Unload All Modules: If you need to clear your environment of all loaded modules, use:&lt;br /&gt;
module purge&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;By using these commands, you can easily manage the software environments needed for different tasks, ensuring compatibility and reducing potential conflicts between software versions.&lt;br /&gt;
&lt;br /&gt;
== Basic Job Submission Commands ==&lt;br /&gt;
&lt;br /&gt;
=== Finding Your Account and Partition ===&lt;br /&gt;
&lt;br /&gt;
Before submitting a job, you need to know which partitions you have permission to use.&lt;br /&gt;
&lt;br /&gt;
Run the command &amp;lt;code&amp;gt;check_my_partitions&amp;lt;/code&amp;gt; to view a list of all the partitions you have permission to send jobs to.&lt;br /&gt;
&lt;br /&gt;
== Submitting Jobs==&lt;br /&gt;
sbatch: Submits a job script for batch processing.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example&amp;#039;&amp;#039;&amp;#039;:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Submit pre_process.bash to the power-general-shared-pool partition for 10 minutes:&lt;br /&gt;
sbatch --ntasks=1 --time=10 -p power-general-shared-pool -A public-users_v2 --qos=public pre_process.bash&lt;br /&gt;
&lt;br /&gt;
# With 1 GPU:&lt;br /&gt;
sbatch --gres=gpu:1 -p gpu-general-pool -A public-users_v2 --qos=public gpu_job.sh&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Submitting Multiple Jobs ===&lt;br /&gt;
&lt;br /&gt;
If you need to submit many similar jobs (hundreds or more), you should use a &amp;#039;&amp;#039;&amp;#039;Slurm job array&amp;#039;&amp;#039;&amp;#039;. Submitting each job individually with separate &amp;lt;code&amp;gt;sbatch&amp;lt;/code&amp;gt; commands places a heavy load on the scheduler, slowing down job processing across the cluster. Job arrays allow you to bundle many related jobs together as a single submission, which is more efficient and easier to manage.&lt;br /&gt;
&lt;br /&gt;
Each task in the array runs independently like a separate job, but the array is submitted as a single job ID for scheduling and tracking purposes.&lt;br /&gt;
You can customize the behavior of each task using the environment variable &amp;lt;code&amp;gt;SLURM_ARRAY_TASK_ID&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
==== Script Example: Job Array ====&lt;br /&gt;
&lt;br /&gt;
This script submits a job array with 100 tasks, each processing a different input file. The array reduces scheduler load and simplifies job tracking.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --job-name=array_job            # Job name&lt;br /&gt;
#SBATCH --account=public-users_v2       # Account name&lt;br /&gt;
#SBATCH --partition=power-general-shared-pool       # Partition name&lt;br /&gt;
#SBATCH --qos=public                    # qos type&lt;br /&gt;
#SBATCH --time=02:00:00                 # Max run time (hh:mm:ss)&lt;br /&gt;
#SBATCH --ntasks=1                      # Number of tasks per array job&lt;br /&gt;
#SBATCH --nodes=1                       # Number of nodes&lt;br /&gt;
#SBATCH --cpus-per-task=1               # CPUs per task&lt;br /&gt;
#SBATCH --mem-per-cpu=4G                # Memory per CPU&lt;br /&gt;
#SBATCH --array=1-100                   # Array range: 100 tasks&lt;br /&gt;
#SBATCH --output=array_job_%A_%a.out    # Output file: Job ID and array task ID&lt;br /&gt;
#SBATCH --error=array_job_%A_%a.err     # Error file: Job ID and array task ID&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Starting SLURM array task&amp;quot;&lt;br /&gt;
echo &amp;quot;Job ID: $SLURM_JOB_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Array Task ID: $SLURM_ARRAY_TASK_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Running on node(s): $SLURM_JOB_NODELIST&amp;quot;&lt;br /&gt;
echo &amp;quot;Allocated CPUs: $SLURM_JOB_CPUS_PER_NODE&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Your application commands go here&lt;br /&gt;
# You can use $SLURM_ARRAY_TASK_ID to customize behavior per task&lt;br /&gt;
# ./my_program input_${SLURM_ARRAY_TASK_ID}.txt&lt;br /&gt;
echo &amp;quot;Task completed&amp;quot;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
In this example:&lt;br /&gt;
* The job array consists of 100 tasks.&lt;br /&gt;
* Each task runs the same script but with a different input file.&lt;br /&gt;
* You access the task ID using the environment variable &amp;lt;code&amp;gt;SLURM_ARRAY_TASK_ID&amp;lt;/code&amp;gt;.&lt;br /&gt;
* The output and error logs are separated per task using &amp;lt;code&amp;gt;%A&amp;lt;/code&amp;gt; (job ID) and &amp;lt;code&amp;gt;%a&amp;lt;/code&amp;gt; (array task ID).&lt;br /&gt;
&lt;br /&gt;
==== Script Example: Job Array with different parameters per task ====&lt;br /&gt;
&lt;br /&gt;
This script submits a job array with 3 tasks. Each task runs the same program with a different input file: `data1.txt`, `data2.txt`, and `data3.txt`.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --job-name=array_job            # Job name&lt;br /&gt;
#SBATCH --account=public-users_v2       # Account name&lt;br /&gt;
#SBATCH --partition=power-general-shared-pool       # Partition name&lt;br /&gt;
#SBATCH --qos=public                    # qos type&lt;br /&gt;
#SBATCH --time=01:00:00                 # Max run time (hh:mm:ss)&lt;br /&gt;
#SBATCH --ntasks=1                      # Number of tasks per array job&lt;br /&gt;
#SBATCH --nodes=1                       # Number of nodes&lt;br /&gt;
#SBATCH --cpus-per-task=1               # CPUs per task&lt;br /&gt;
#SBATCH --mem-per-cpu=2G                # Memory per CPU&lt;br /&gt;
#SBATCH --array=1-3                     # Run 3 tasks with IDs 1, 2, 3&lt;br /&gt;
#SBATCH --output=array_%A_%a.out        # Output file: Job ID and task ID&lt;br /&gt;
#SBATCH --error=array_%A_%a.err         # Error file: Job ID and task ID&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Starting SLURM array task&amp;quot;&lt;br /&gt;
echo &amp;quot;Job ID: $SLURM_JOB_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Array Task ID: $SLURM_ARRAY_TASK_ID&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Each task runs the program with a different input file&lt;br /&gt;
./my_program data${SLURM_ARRAY_TASK_ID}.txt&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Task completed&amp;quot;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
===Writing Single SLURM Job Scripts===&lt;br /&gt;
&lt;br /&gt;
Here is a simple job script example:&lt;br /&gt;
&lt;br /&gt;
==== Basic Script====&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --job-name=my_job             # Job name&lt;br /&gt;
#SBATCH --account=public-users_v2     # Account name&lt;br /&gt;
#SBATCH --partition=power-general-shared-pool       # Partition name&lt;br /&gt;
#SBATCH --qos=public                  # qos type&lt;br /&gt;
#SBATCH --time=02:00:00               # Max run time (hh:mm:ss)&lt;br /&gt;
#SBATCH --ntasks=1                    # Number of tasks&lt;br /&gt;
#SBATCH --nodes=1                     # Number of nodes&lt;br /&gt;
#SBATCH --cpus-per-task=1             # CPUs per task&lt;br /&gt;
#SBATCH --mem-per-cpu=4G              # Memory per CPU&lt;br /&gt;
#SBATCH --output=my_job_%j.out        # Output file&lt;br /&gt;
#SBATCH --error=my_job_%j.err         # Error file&lt;br /&gt;
#SBATCH --mail-user=&amp;lt;your email&amp;gt;      # Your mail address to receive an email&lt;br /&gt;
#SBATCH --mail-type=END,FAIL          # The mail will be sent upon ending the script successfully or not&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Starting my SLURM job&amp;quot;&lt;br /&gt;
echo &amp;quot;Job ID: $SLURM_JOB_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Running on nodes: $SLURM_JOB_NODELIST&amp;quot;&lt;br /&gt;
echo &amp;quot;Allocated CPUs: $SLURM_JOB_CPUS_PER_NODE&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Your application commands go here&lt;br /&gt;
# ./my_program&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Job completed&amp;quot;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To ask for x cores interactively:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
srun --ntasks=1 --cpus-per-task=x  --partition=power-general-public-pool --account=public-users_v2 --qos=public --nodes=1 --pty bash&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
However, for now you also need to set the following SLURM variables inside the script or within the interactive job (set them to match the number of cores you requested):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
export SLURM_TASKS_PER_NODE=48&lt;br /&gt;
export SLURM_CPUS_ON_NODE=48&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Script for 1 GPU ====&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --job-name=gpu_job             # Job name&lt;br /&gt;
#SBATCH --account=my_account           # Account name&lt;br /&gt;
#SBATCH --partition=gpu-general-pool   # Partition name&lt;br /&gt;
#SBATCH --qos=my_qos                   # qos type&lt;br /&gt;
#SBATCH --time=02:00:00                # Max run time&lt;br /&gt;
#SBATCH --ntasks=1                     # Number of tasks&lt;br /&gt;
#SBATCH --nodes=1                      # Number of nodes&lt;br /&gt;
#SBATCH --cpus-per-task=1              # CPUs per task&lt;br /&gt;
#SBATCH --gres=gpu:1                   # Number of GPUs&lt;br /&gt;
#SBATCH --mem-per-cpu=4G               # Memory per CPU&lt;br /&gt;
#SBATCH --output=my_job_%j.out         # Output file&lt;br /&gt;
#SBATCH --error=my_job_%j.err          # Error file&lt;br /&gt;
&lt;br /&gt;
module load python/python-3.8&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Starting GPU job&amp;quot;&lt;br /&gt;
echo &amp;quot;Job ID: $SLURM_JOB_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Running on nodes: $SLURM_JOB_NODELIST&amp;quot;&lt;br /&gt;
echo &amp;quot;Allocated CPUs: $SLURM_JOB_CPUS_PER_NODE&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Your GPU commands go here&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Job completed&amp;quot;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
To exclude specific nodes, add the following directive:&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#SBATCH --exclude=compute-0-[100-103],compute-0-67&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Importance of Correct RAM Usage in Jobs===&lt;br /&gt;
&lt;br /&gt;
When writing SLURM job scripts, it&amp;#039;s crucial to understand and correctly specify the memory requirements for your job. &lt;br /&gt;
&lt;br /&gt;
Proper memory allocation ensures efficient resource usage and prevents job failures due to out-of-memory (OOM) errors.&lt;br /&gt;
&lt;br /&gt;
==== Why Correct RAM Usage Matters ====&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Resource Efficiency&amp;#039;&amp;#039;&amp;#039;: Allocating the right amount of memory helps in optimal resource utilization, allowing more jobs to run simultaneously on the cluster.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Job Stability&amp;#039;&amp;#039;&amp;#039;: Underestimating memory requirements can lead to OOM errors, causing your job to fail and waste computational resources.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Performance&amp;#039;&amp;#039;&amp;#039;: Overestimating memory needs can lead to underutilization of resources, potentially delaying other jobs in the queue.&lt;br /&gt;
&lt;br /&gt;
==== How to Specify Memory in SLURM ====&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;--mem&amp;#039;&amp;#039;&amp;#039;: Specifies the total memory required for the job.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;--mem-per-cpu&amp;#039;&amp;#039;&amp;#039;: Specifies the memory required per CPU.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example&amp;#039;&amp;#039;&amp;#039;:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#SBATCH --mem=4G              # Total memory for the job&lt;br /&gt;
#SBATCH --mem-per-cpu=2G      # Memory per CPU&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Interactive Jobs===&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#Start an interactive session:&lt;br /&gt;
srun --ntasks=1 -p power-general-shared-pool -A public-users_v2 --qos=public --pty bash&lt;br /&gt;
&lt;br /&gt;
#Specify a compute node:&lt;br /&gt;
srun --ntasks=1 -p power-general-shared-pool -A public-users_v2 --qos=public --nodelist=&amp;quot;compute-0-12&amp;quot; --pty bash&lt;br /&gt;
&lt;br /&gt;
#Using GUI:&lt;br /&gt;
srun --ntasks=1 -p power-general-shared-pool -A public-users_v2 --qos=public --x11 /bin/bash&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Submitting RELION Jobs===&lt;br /&gt;
&lt;br /&gt;
To submit a RELION job interactively on the &amp;lt;code&amp;gt;gpu-relion&amp;lt;/code&amp;gt; queue with X11 forwarding, use the following steps:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#Start an interactive session with X11:&lt;br /&gt;
srun --ntasks=1 -p gpu-relion-pool -A gpu-relion-users_v2 --qos=owner --x11 --pty bash&lt;br /&gt;
#Load the RELION module:&lt;br /&gt;
module load relion/relion-4.0.1&lt;br /&gt;
#Launch RELION:&lt;br /&gt;
relion&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Using special resources==&lt;br /&gt;
There are several defined resources that can be used in addition to the regular resources (e.g. memory, CPUs).&lt;br /&gt;
Those resources are:&lt;br /&gt;
GresTypes=gpu,amd,af3,intel,disk700g&lt;br /&gt;
&lt;br /&gt;
When your job needs one or more of the above resources, add &amp;quot;--constraint=&amp;lt;comma-separated resource list&amp;gt;&amp;quot;.&lt;br /&gt;
For example, a job submitted with the parameter --constraint=disk700g,amd&lt;br /&gt;
requires a node with an AMD-family processor and up to 700 GB of cache disk space during the job&amp;#039;s run.&lt;br /&gt;
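&lt;br /&gt;
The constraint can be combined with a normal submission. A minimal sketch (the partition, account, and script name are illustrative, following the examples earlier on this page):&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Request a node with an AMD CPU and at least 700 GB of local /tmp space&lt;br /&gt;
sbatch --ntasks=1 --time=60 -p power-general-shared-pool -A public-users_v2 --qos=public --constraint=disk700g,amd my_job.bash&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;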
&lt;br /&gt;
If you use the local disk as cache, you need to remove your files when the script ends, whether it ends normally or abnormally.&lt;br /&gt;
&lt;br /&gt;
This can be achieved by adding the following to your script:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
export CACHEDIR=/tmp/dvory_${SLURM_JOB_ID}&lt;br /&gt;
mkdir -p $CACHEDIR&lt;br /&gt;
&lt;br /&gt;
# Register the cleanup handler before doing any work, so the cache&lt;br /&gt;
# is removed even if the job is interrupted&lt;br /&gt;
cleanup() {&lt;br /&gt;
  rm -rf -- &amp;quot;$CACHEDIR&amp;quot; || true&lt;br /&gt;
}&lt;br /&gt;
trap cleanup EXIT INT TERM HUP&lt;br /&gt;
&lt;br /&gt;
# Your job commands go here (sleep as a placeholder)&lt;br /&gt;
sleep 200&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Feature===&lt;br /&gt;
You may see all the features, which are defined resources (a.k.a. constraints), by typing the command &amp;#039;&amp;#039;&amp;#039;features&amp;#039;&amp;#039;&amp;#039; on the login node.&lt;br /&gt;
The following features are available:&lt;br /&gt;
Af3, AMD, avx, disk700g, ib_bh, ib_dj, Intel&lt;br /&gt;
&lt;br /&gt;
Use them by adding the parameter --constraint=&amp;lt;feature&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The meaning of the features:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
1. Af3 - Nodes which may run AlphaFold3&lt;br /&gt;
&lt;br /&gt;
2. AMD - Nodes with CPUs of the AMD family&lt;br /&gt;
&lt;br /&gt;
3. avx - Nodes with AVX CPU capabilities&lt;br /&gt;
&lt;br /&gt;
4. disk700g - Nodes with at least 700 GB in their local /tmp partition&lt;br /&gt;
&lt;br /&gt;
5. ib_bh - Nodes with InfiniBand network #1&lt;br /&gt;
&lt;br /&gt;
6. ib_dj - Nodes with InfiniBand network #2&lt;br /&gt;
&lt;br /&gt;
7. Intel - Nodes with CPUs of the Intel family&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==AlphaFold==&lt;br /&gt;
&lt;br /&gt;
AlphaFold is a deep learning tool designed for predicting protein structures.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Guides:&amp;#039;&amp;#039;&amp;#039;  &lt;br /&gt;
&lt;br /&gt;
[https://hpcguide.tau.ac.il/index.php?title=Alphafold AlphaFold Guide]&lt;br /&gt;
&lt;br /&gt;
[https://hpcguide.tau.ac.il/index.php?title=Alphafold3 AlphaFold3 Guide]&lt;br /&gt;
&lt;br /&gt;
==Common SLURM Commands==&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#View all queues (partitions):&lt;br /&gt;
sinfo&lt;br /&gt;
#View all jobs:&lt;br /&gt;
squeue&lt;br /&gt;
#View details of a specific job:&lt;br /&gt;
scontrol show job &amp;lt;job_number&amp;gt;&lt;br /&gt;
#Get information about partitions:&lt;br /&gt;
scontrol show partition&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Troubleshooting &amp;amp; Tips ==&lt;br /&gt;
&lt;br /&gt;
=== Common Errors ===&lt;br /&gt;
&lt;br /&gt;
# &amp;lt;code&amp;gt;srun: error: Unable to allocate resources: No partition specified or system default partition&amp;lt;/code&amp;gt;  &amp;lt;br /&amp;gt;&amp;#039;&amp;#039;&amp;#039;Solution:&amp;#039;&amp;#039;&amp;#039; Always specify a partition. Example:  &amp;lt;code&amp;gt;srun --pty -c 1 --mem=2G -p power-general /bin/bash&amp;lt;/code&amp;gt;&lt;br /&gt;
# A job failed, and running &amp;lt;code&amp;gt;scontrol show job job_id&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;sacct -j job_id -o JobID,JobName,State%20&amp;lt;/code&amp;gt; &amp;lt;br /&amp;gt;shows &amp;lt;code&amp;gt;JobState=OUT_OF_MEMORY Reason=OutOfMemory&amp;lt;/code&amp;gt; or:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
JobID           JobName                State &lt;br /&gt;
------------ ---------- -------------------- &lt;br /&gt;
71             oom_test        OUT_OF_MEMORY &lt;br /&gt;
71.batch          batch        OUT_OF_MEMORY &lt;br /&gt;
71.extern        extern            COMPLETED &lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;This means the RAM requested for the job was not enough; please resubmit the job with more RAM. See [https://wikihpc.tau.ac.il/index.php?title=Slurm_user_guide#Estimating_RAM_Usage below] for help estimating how much RAM your job may need.&lt;br /&gt;
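&lt;br /&gt;
Before resubmitting, you can check how much memory the job actually peaked at. A sketch using standard Slurm accounting fields (MaxRSS is the peak resident memory, ReqMem is what was requested):&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sacct -j job_id --format=JobID,JobName,MaxRSS,ReqMem,State&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;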
&lt;br /&gt;
=== Chain Jobs ===&lt;br /&gt;
Use the &amp;lt;code&amp;gt;--depend&amp;lt;/code&amp;gt; flag to set job dependencies.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Start do_work.bash only after job 45001 completes successfully&lt;br /&gt;
sbatch --ntasks=1 --time=60 -p power-general -A power-general-users --depend=afterok:45001 do_work.bash&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Always Specify Resources ===&lt;br /&gt;
When submitting jobs, ensure you include all required resources like partition, memory, and CPUs to avoid job failures.&lt;br /&gt;
&lt;br /&gt;
=== Attaching to Running Jobs ===&lt;br /&gt;
If you need to monitor or interact with a running job, use &amp;lt;code&amp;gt;sattach&amp;lt;/code&amp;gt;. This command allows you to attach to a job&amp;#039;s input, output, and error streams in real-time.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sattach &amp;lt;job_id&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To view job steps of a specific job, use the following command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
scontrol show job &amp;lt;job_id&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Look for sections labeled &amp;quot;StepId&amp;quot; within the output. &lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;For specific job steps, use:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sattach &amp;lt;job_id.step_id&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Note:&amp;#039;&amp;#039;&amp;#039; &amp;lt;code&amp;gt;sattach&amp;lt;/code&amp;gt; is particularly useful for interactive jobs, where you can provide input directly. For non-interactive jobs, it acts like &amp;lt;code&amp;gt;tail -f&amp;lt;/code&amp;gt;, allowing you to monitor the output stream.&lt;br /&gt;
&lt;br /&gt;
=== Estimating RAM Usage ===&lt;br /&gt;
&lt;br /&gt;
When writing SLURM job scripts, it&amp;#039;s crucial to understand and correctly specify the memory requirements for your job. Proper memory allocation ensures efficient resource usage and prevents job failures due to out-of-memory (OOM) errors.&lt;br /&gt;
&lt;br /&gt;
==== Tips for Estimating RAM Usage ====&lt;br /&gt;
&lt;br /&gt;
* Check Application Documentation: Refer to the official documentation or user guides for memory-related information.&lt;br /&gt;
* Run a Small Test Job: Submit a smaller version of your job and monitor its memory usage using commands like `free -m`, `top`, or `htop`.&lt;br /&gt;
* Use Profiling Tools: Tools like `valgrind`, `gprof`, or built-in profilers can help you understand memory usage.&lt;br /&gt;
* Analyze Previous Jobs: Review SLURM logs and job statistics for insights into memory consumption of past jobs.&lt;br /&gt;
* Consult with Peers or Experts: Ask colleagues or experts who have experience with similar workloads.&lt;br /&gt;
&lt;br /&gt;
==== Example: Monitoring Memory Usage ====&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
&lt;br /&gt;
#SBATCH --job-name=memory_test&lt;br /&gt;
#SBATCH --account=your_account&lt;br /&gt;
#SBATCH --partition=your_partition&lt;br /&gt;
#SBATCH --qos=your_qos&lt;br /&gt;
#SBATCH --time=01:00:00&lt;br /&gt;
#SBATCH --ntasks=1&lt;br /&gt;
#SBATCH --cpus-per-task=1&lt;br /&gt;
#SBATCH --mem=4G&lt;br /&gt;
#SBATCH --output=memory_test.out&lt;br /&gt;
#SBATCH --error=memory_test.err&lt;br /&gt;
&lt;br /&gt;
# Monitor memory usage&lt;br /&gt;
echo &amp;quot;Memory usage before running the job:&amp;quot;&lt;br /&gt;
free -m&lt;br /&gt;
&lt;br /&gt;
# Your application commands go here&lt;br /&gt;
# ./your_application&lt;br /&gt;
&lt;br /&gt;
# Monitor memory usage after running the job&lt;br /&gt;
echo &amp;quot;Memory usage after running the job:&amp;quot;&lt;br /&gt;
free -m&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== General Tips ====&lt;br /&gt;
&lt;br /&gt;
* Start Small: Begin with a conservative memory request and increase it based on observed usage.&lt;br /&gt;
* Consider Peak Usage: Plan for peak memory usage to avoid OOM errors.&lt;br /&gt;
* Use SLURM&amp;#039;s Memory Reporting: Use `sacct` to view memory usage statistics.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sacct -j &amp;lt;job_id&amp;gt; --format=JobID,JobName,MaxRSS,Elapsed&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;/div&gt;</summary>
		<author><name>Dvory</name></author>
	</entry>
	<entry>
		<id>https://hpcguide.tau.ac.il/index.php?title=Submitting_a_job_to_a_slurm_queue&amp;diff=1552</id>
		<title>Submitting a job to a slurm queue</title>
		<link rel="alternate" type="text/html" href="https://hpcguide.tau.ac.il/index.php?title=Submitting_a_job_to_a_slurm_queue&amp;diff=1552"/>
		<updated>2026-01-19T14:47:02Z</updated>

		<summary type="html">&lt;p&gt;Dvory: /* Feature */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Accessing the System ==&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;A ChatGPT page is available for the new QOS configuration; see [https://chatgpt.com/g/g-68be7f9acfb88191978615c1693e2cff-hpc-helper-toolkit HPC-helper-toolkit].&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
To submit jobs to SLURM at Tel Aviv University, you need to access the system through one of the following login nodes:&lt;br /&gt;
&lt;br /&gt;
* slurmlogin.tau.ac.il&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Requirements for Access ===&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Group Membership&amp;#039;&amp;#039;&amp;#039;: You must be part of the &amp;quot;power&amp;quot; group to access the resources.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;University Credentials&amp;#039;&amp;#039;&amp;#039;: Use your Tel Aviv University username and password to log in.&lt;br /&gt;
&lt;br /&gt;
These login nodes are your starting point for submitting jobs, checking job status, and managing your SLURM tasks.&lt;br /&gt;
&lt;br /&gt;
=== SSH Example ===&lt;br /&gt;
&lt;br /&gt;
To access the system using SSH, use the following example:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Replace &amp;#039;your_username&amp;#039; with your actual Tel Aviv University username&lt;br /&gt;
ssh your_username@slurmlogin.tau.ac.il&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Your connection will be automatically routed to one of the login nodes:&lt;br /&gt;
powerslurm-login, powerslurm-login2, or powerslurm-login3.&lt;br /&gt;
&lt;br /&gt;
If you have an SSH key set up for password-less login, you can specify it like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Replace &amp;#039;your_username&amp;#039; and &amp;#039;/path/to/your/private_key&amp;#039; accordingly&lt;br /&gt;
ssh -i /path/to/your/private_key your_username@slurmlogin.tau.ac.il&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Environment Modules ==&lt;br /&gt;
&lt;br /&gt;
Environment Modules allow users to dynamically modify their shell environment, providing an easy way to load and unload different software applications, libraries, and their dependencies. This system helps avoid conflicts between software versions and ensures the correct environment for running specific applications.&lt;br /&gt;
&lt;br /&gt;
Here are some common commands to work with environment modules:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# List Available Modules: To see all the modules available on the system, use:&lt;br /&gt;
module avail&lt;br /&gt;
&lt;br /&gt;
# To search for a specific module by name (e.g., gcc), use:&lt;br /&gt;
module avail gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
# Get Detailed Information About a Module: The module spider command provides detailed information about a module, including versions, dependencies, and descriptions:&lt;br /&gt;
module spider gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
# View Module Settings: To see what environment variables and settings will be modified by a module, use:&lt;br /&gt;
module show gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
# Load a Module: To set up the environment for a specific software, use the module load command. For example, to load GCC version 12.1.0:&lt;br /&gt;
module load gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
# List Loaded Modules: To view all currently loaded modules in your session, use:&lt;br /&gt;
module list&lt;br /&gt;
&lt;br /&gt;
# Unload a Module: To unload a specific module from your environment, use:&lt;br /&gt;
module unload gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
# Unload All Modules: If you need to clear your environment of all loaded modules, use:&lt;br /&gt;
module purge&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;By using these commands, you can easily manage the software environments needed for different tasks, ensuring compatibility and reducing potential conflicts between software versions.&lt;br /&gt;
&lt;br /&gt;
== Basic Job Submission Commands ==&lt;br /&gt;
&lt;br /&gt;
=== Finding Your Account and Partition ===&lt;br /&gt;
&lt;br /&gt;
Before submitting a job, you need to know which partitions you have permission to use.&lt;br /&gt;
&lt;br /&gt;
Run the command &amp;lt;code&amp;gt;check_my_partitions&amp;lt;/code&amp;gt; to view a list of all the partitions you have permission to submit jobs to.&lt;br /&gt;
&lt;br /&gt;
== Submitting Jobs==&lt;br /&gt;
sbatch: Submits a job script for batch processing.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example&amp;#039;&amp;#039;&amp;#039;:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
    sbatch --ntasks=1 --time=10 -p power-general-shared-pool -A public-users_v2 --qos=public pre_process.bash&lt;br /&gt;
    # This command submits pre_process.bash to the power-general-shared-pool partition with a 10-minute time limit.&lt;br /&gt;
    # With 1 GPU:&lt;br /&gt;
    sbatch --gres=gpu:1 -p gpu-general-pool -A public-users_v2 --qos=public gpu_job.sh&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Submitting Multiple Jobs ===&lt;br /&gt;
&lt;br /&gt;
If you need to submit many similar jobs (hundreds or more), use a &amp;#039;&amp;#039;&amp;#039;Slurm job array&amp;#039;&amp;#039;&amp;#039;. Submitting each job individually with a separate `sbatch` command places a heavy load on the scheduler, slowing down job processing across the cluster. A job array bundles many related jobs into a single submission, which is more efficient and easier to manage.&lt;br /&gt;
&lt;br /&gt;
Each task in the array runs independently like a separate job, but the array is submitted as a single job ID for scheduling and tracking purposes.&lt;br /&gt;
You can customize the behavior of each task using the environment variable &amp;lt;code&amp;gt;SLURM_ARRAY_TASK_ID&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
==== Script Example: Job Array ====&lt;br /&gt;
&lt;br /&gt;
This script submits a job array with 100 tasks, each processing a different input file. The array reduces scheduler load and simplifies job tracking.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --job-name=array_job            # Job name&lt;br /&gt;
#SBATCH --account=public-users_v2       # Account name&lt;br /&gt;
#SBATCH --partition=power-general-shared-pool       # Partition name&lt;br /&gt;
#SBATCH --qos=public                    # qos type&lt;br /&gt;
#SBATCH --time=02:00:00                 # Max run time (hh:mm:ss)&lt;br /&gt;
#SBATCH --ntasks=1                      # Number of tasks per array job&lt;br /&gt;
#SBATCH --nodes=1                       # Number of nodes&lt;br /&gt;
#SBATCH --cpus-per-task=1               # CPUs per task&lt;br /&gt;
#SBATCH --mem-per-cpu=4G                # Memory per CPU&lt;br /&gt;
#SBATCH --array=1-100                   # Array range: 100 tasks&lt;br /&gt;
#SBATCH --output=array_job_%A_%a.out    # Output file: Job ID and array task ID&lt;br /&gt;
#SBATCH --error=array_job_%A_%a.err     # Error file: Job ID and array task ID&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Starting SLURM array task&amp;quot;&lt;br /&gt;
echo &amp;quot;Job ID: $SLURM_JOB_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Array Task ID: $SLURM_ARRAY_TASK_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Running on node(s): $SLURM_JOB_NODELIST&amp;quot;&lt;br /&gt;
echo &amp;quot;Allocated CPUs: $SLURM_JOB_CPUS_PER_NODE&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Your application commands go here&lt;br /&gt;
# You can use $SLURM_ARRAY_TASK_ID to customize behavior per task&lt;br /&gt;
# ./my_program input_${SLURM_ARRAY_TASK_ID}.txt&lt;br /&gt;
echo &amp;quot;Task completed&amp;quot;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
In this example:&lt;br /&gt;
* The job array consists of 100 tasks.&lt;br /&gt;
* Each task runs the same script but with a different input file.&lt;br /&gt;
* You access the task ID using the environment variable &amp;lt;code&amp;gt;SLURM_ARRAY_TASK_ID&amp;lt;/code&amp;gt;.&lt;br /&gt;
* The output and error logs are separated per task using &amp;lt;code&amp;gt;%A&amp;lt;/code&amp;gt; (job ID) and &amp;lt;code&amp;gt;%a&amp;lt;/code&amp;gt; (array task ID).&lt;br /&gt;
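A common pattern for giving each array task its own parameters is to read line number &amp;lt;code&amp;gt;SLURM_ARRAY_TASK_ID&amp;lt;/code&amp;gt; from a parameter file. The sketch below is illustrative (the file name and values are made up) and runs in plain bash by simulating the variable that SLURM would set for each task:&lt;br /&gt;

```shell
#!/bin/bash
# Sketch: pick this task's parameter from a line-per-task file.
# SLURM sets SLURM_ARRAY_TASK_ID for each array task; we default it
# here only so the snippet can run outside SLURM for illustration.
SLURM_ARRAY_TASK_ID=${SLURM_ARRAY_TASK_ID:-2}

# Hypothetical parameter file: one parameter per line, line N for task N.
printf 'alpha\nbeta\ngamma\n' > params.txt

# sed -n "Np" prints only line N (1-based, matching array IDs 1-3).
PARAM=$(sed -n "${SLURM_ARRAY_TASK_ID}p" params.txt)
echo "Task ${SLURM_ARRAY_TASK_ID} runs with parameter: ${PARAM}"
```

With &amp;lt;code&amp;gt;#SBATCH --array=1-3&amp;lt;/code&amp;gt;, each task would read its own line automatically.&lt;br /&gt;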
&lt;br /&gt;
==== Script Example: Job Array with different parameters per task ====&lt;br /&gt;
&lt;br /&gt;
This script submits a job array with 3 tasks. Each task runs the same program with a different input file: `data1.txt`, `data2.txt`, and `data3.txt`.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --job-name=array_job            # Job name&lt;br /&gt;
#SBATCH --account=public-users_v2       # Account name&lt;br /&gt;
#SBATCH --partition=power-general-shared-pool       # Partition name&lt;br /&gt;
#SBATCH --qos=public                    # qos type&lt;br /&gt;
#SBATCH --time=01:00:00                 # Max run time (hh:mm:ss)&lt;br /&gt;
#SBATCH --ntasks=1                      # Number of tasks per array job&lt;br /&gt;
#SBATCH --nodes=1                       # Number of nodes&lt;br /&gt;
#SBATCH --cpus-per-task=1               # CPUs per task&lt;br /&gt;
#SBATCH --mem-per-cpu=2G                # Memory per CPU&lt;br /&gt;
#SBATCH --array=1-3                     # Run 3 tasks with IDs 1, 2, 3&lt;br /&gt;
#SBATCH --output=array_%A_%a.out        # Output file: Job ID and task ID&lt;br /&gt;
#SBATCH --error=array_%A_%a.err         # Error file: Job ID and task ID&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Starting SLURM array task&amp;quot;&lt;br /&gt;
echo &amp;quot;Job ID: $SLURM_JOB_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Array Task ID: $SLURM_ARRAY_TASK_ID&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Each task runs the program with a different input file&lt;br /&gt;
./my_program data${SLURM_ARRAY_TASK_ID}.txt&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Task completed&amp;quot;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
===Writing Single SLURM Job Scripts===&lt;br /&gt;
&lt;br /&gt;
Here is a simple job script example:&lt;br /&gt;
&lt;br /&gt;
==== Basic Script====&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --job-name=my_job             # Job name&lt;br /&gt;
#SBATCH --account=public-users_v2     # Account name&lt;br /&gt;
#SBATCH --partition=power-general-shared-pool       # Partition name&lt;br /&gt;
#SBATCH --qos=public                  # qos type&lt;br /&gt;
#SBATCH --time=02:00:00               # Max run time (hh:mm:ss)&lt;br /&gt;
#SBATCH --ntasks=1                    # Number of tasks&lt;br /&gt;
#SBATCH --nodes=1                     # Number of nodes&lt;br /&gt;
#SBATCH --cpus-per-task=1             # CPUs per task&lt;br /&gt;
#SBATCH --mem-per-cpu=4G              # Memory per CPU&lt;br /&gt;
#SBATCH --output=my_job_%j.out        # Output file&lt;br /&gt;
#SBATCH --error=my_job_%j.err         # Error file&lt;br /&gt;
#SBATCH --mail-user=&amp;lt;your email&amp;gt;      # Your mail address to receive an email&lt;br /&gt;
#SBATCH --mail-type=END,FAIL          # The mail will be sent upon ending the script successfully or not&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Starting my SLURM job&amp;quot;&lt;br /&gt;
echo &amp;quot;Job ID: $SLURM_JOB_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Running on nodes: $SLURM_JOB_NODELIST&amp;quot;&lt;br /&gt;
echo &amp;quot;Allocated CPUs: $SLURM_JOB_CPUS_PER_NODE&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Your application commands go here&lt;br /&gt;
# ./my_program&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Job completed&amp;quot;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To request &amp;lt;code&amp;gt;x&amp;lt;/code&amp;gt; cores interactively:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
srun --ntasks=1 --cpus-per-task=x  --partition=power-general-public-pool --account=public-users_v2 --qos=public --nodes=1 --pty bash&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
For now, you also need to set the following SLURM variables inside the script or within the interactive job (adjust the values to the number of cores you requested):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
export SLURM_TASKS_PER_NODE=48&lt;br /&gt;
export SLURM_CPUS_ON_NODE=48&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Script for 1 GPU ====&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --job-name=gpu_job             # Job name&lt;br /&gt;
#SBATCH --account=my_account           # Account name&lt;br /&gt;
#SBATCH --partition=gpu-general-pool   # Partition name&lt;br /&gt;
#SBATCH --qos=my_qos                   # qos type&lt;br /&gt;
#SBATCH --time=02:00:00                # Max run time&lt;br /&gt;
#SBATCH --ntasks=1                     # Number of tasks&lt;br /&gt;
#SBATCH --nodes=1                      # Number of nodes&lt;br /&gt;
#SBATCH --cpus-per-task=1              # CPUs per task&lt;br /&gt;
#SBATCH --gres=gpu:1                   # Number of GPUs&lt;br /&gt;
#SBATCH --mem-per-cpu=4G               # Memory per CPU&lt;br /&gt;
#SBATCH --output=my_job_%j.out         # Output file&lt;br /&gt;
#SBATCH --error=my_job_%j.err          # Error file&lt;br /&gt;
&lt;br /&gt;
module load python/python-3.8&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Starting GPU job&amp;quot;&lt;br /&gt;
echo &amp;quot;Job ID: $SLURM_JOB_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Running on nodes: $SLURM_JOB_NODELIST&amp;quot;&lt;br /&gt;
echo &amp;quot;Allocated CPUs: $SLURM_JOB_CPUS_PER_NODE&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Your GPU commands go here&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Job completed&amp;quot;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
To exclude specific nodes, add the following directive:&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#SBATCH --exclude=compute-0-[100-103],compute-0-67&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Importance of Correct RAM Usage in Jobs===&lt;br /&gt;
&lt;br /&gt;
When writing SLURM job scripts, it&amp;#039;s crucial to understand and correctly specify the memory requirements for your job. &lt;br /&gt;
&lt;br /&gt;
Proper memory allocation ensures efficient resource usage and prevents job failures due to out-of-memory (OOM) errors.&lt;br /&gt;
&lt;br /&gt;
==== Why Correct RAM Usage Matters ====&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Resource Efficiency&amp;#039;&amp;#039;&amp;#039;: Allocating the right amount of memory helps in optimal resource utilization, allowing more jobs to run simultaneously on the cluster.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Job Stability&amp;#039;&amp;#039;&amp;#039;: Underestimating memory requirements can lead to OOM errors, causing your job to fail and waste computational resources.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Performance&amp;#039;&amp;#039;&amp;#039;: Overestimating memory needs can lead to underutilization of resources, potentially delaying other jobs in the queue.&lt;br /&gt;
&lt;br /&gt;
==== How to Specify Memory in SLURM ====&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;--mem&amp;#039;&amp;#039;&amp;#039;: Specifies the total memory required per node.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;--mem-per-cpu&amp;#039;&amp;#039;&amp;#039;: Specifies the memory required per allocated CPU. The two options are mutually exclusive; use one or the other, not both.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example&amp;#039;&amp;#039;&amp;#039;:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#SBATCH --mem=4G              # Total memory for the job (per node)&lt;br /&gt;
# Or, alternatively:&lt;br /&gt;
#SBATCH --mem-per-cpu=2G      # Memory per CPU&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
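As a sanity check before submitting, note that the total memory implied by a per-CPU request is the CPU count multiplied by --mem-per-cpu. A minimal sketch with illustrative values:&lt;br /&gt;

```shell
#!/bin/bash
# Sketch: total memory implied by --cpus-per-task and --mem-per-cpu.
CPUS_PER_TASK=4      # would correspond to #SBATCH --cpus-per-task=4
MEM_PER_CPU_G=2      # would correspond to #SBATCH --mem-per-cpu=2G
TOTAL_G=$((CPUS_PER_TASK * MEM_PER_CPU_G))
echo "Implied total memory: ${TOTAL_G}G"
```

Requesting 4 CPUs at 2G each therefore reserves 8G in total.&lt;br /&gt;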
&lt;br /&gt;
===Interactive Jobs===&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#Start an interactive session:&lt;br /&gt;
srun --ntasks=1 -p power-general-shared-pool -A public-users_v2 --qos=public --pty bash&lt;br /&gt;
&lt;br /&gt;
#Specify a compute node:&lt;br /&gt;
srun --ntasks=1 -p power-general-shared-pool -A public-users_v2 --qos=public --nodelist=&amp;quot;compute-0-12&amp;quot; --pty bash&lt;br /&gt;
&lt;br /&gt;
#Using GUI:&lt;br /&gt;
srun --ntasks=1 -p power-general-shared-pool -A public-users_v2 --qos=public --x11 /bin/bash&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Submitting RELION Jobs===&lt;br /&gt;
&lt;br /&gt;
To submit a RELION job interactively on the &amp;lt;code&amp;gt;gpu-relion&amp;lt;/code&amp;gt; queue with X11 forwarding, use the following steps:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#Start an interactive session with X11:&lt;br /&gt;
srun --ntasks=1 -p gpu-relion-pool -A gpu-relion-users_v2 --qos=owner --x11 --pty bash&lt;br /&gt;
#Load the RELION module:&lt;br /&gt;
module load relion/relion-4.0.1&lt;br /&gt;
#Launch RELION:&lt;br /&gt;
relion&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Using special resources==&lt;br /&gt;
Several defined resources can be requested in addition to the regular resources (e.g. memory, CPUs).&lt;br /&gt;
Those resources are:&lt;br /&gt;
GresTypes=gpu,amd,af3,intel,disk700g&lt;br /&gt;
&lt;br /&gt;
When your job needs one or more of the above resources, add &amp;quot;--constraint=&amp;lt;comma-separated resource list&amp;gt;&amp;quot; to your submission.&lt;br /&gt;
For example, a job submitted with the parameter --constraint=disk700g,amd&lt;br /&gt;
requires the AMD processor family and up to 700 GB of cache disk space during its run.&lt;br /&gt;
&lt;br /&gt;
If you use the local disk as cache, you must remove your files when the script ends, whether it exits normally or abnormally.&lt;br /&gt;
&lt;br /&gt;
This can be achieved by adding the following to your script:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
export CACHEDIR=/tmp/dvory_${SLURM_JOB_ID}&lt;br /&gt;
 &lt;br /&gt;
cleanup() {&lt;br /&gt;
  rm -rf -- &amp;quot;$CACHEDIR&amp;quot; || true&lt;br /&gt;
}&lt;br /&gt;
trap cleanup EXIT INT TERM HUP&lt;br /&gt;
 &lt;br /&gt;
mkdir -p &amp;quot;$CACHEDIR&amp;quot;&lt;br /&gt;
# ... the job&amp;#039;s work goes here ...&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
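The cleanup pattern above can be exercised in plain bash without SLURM: run the work in a subshell and confirm the cache directory is gone afterwards. A minimal sketch (paths are illustrative):&lt;br /&gt;

```shell
#!/bin/bash
# Sketch: verify that a trap-based cleanup removes the cache directory.
BASE=$(mktemp -d)                 # stand-in for /tmp on a compute node
CACHEDIR="$BASE/demo_cache"       # stand-in for /tmp/<user>_${SLURM_JOB_ID}

(
  cleanup() { rm -rf -- "$CACHEDIR" || true; }
  trap cleanup EXIT INT TERM HUP  # register cleanup before doing any work
  mkdir -p "$CACHEDIR"
  : # the job's real work would go here
)   # subshell exits -> EXIT trap fires -> cache removed

if [ -d "$CACHEDIR" ]; then
  echo "cache still present"
else
  echo "cache removed"
fi
```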
&lt;br /&gt;
===Feature===&lt;br /&gt;
You may list all the features, which are defined resources (a.k.a. constraints), by typing the command &amp;#039;features&amp;#039; on the login node.&lt;br /&gt;
The following features are available:&lt;br /&gt;
Af3, AMD, avx, disk700g, ib_bh, ib_dj, Intel&lt;br /&gt;
&lt;br /&gt;
To use a feature, add the parameter --constraint=&amp;lt;feature&amp;gt; to your submission.&lt;br /&gt;
&lt;br /&gt;
The meaning of the features:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
1. Af3 - nodes which may run AlphaFold3&lt;br /&gt;
&lt;br /&gt;
2. AMD - nodes with CPUs of the AMD family&lt;br /&gt;
&lt;br /&gt;
3. avx - nodes with AVX CPU capabilities&lt;br /&gt;
&lt;br /&gt;
4. disk700g - nodes with at least 700 GB in their local /tmp partition&lt;br /&gt;
&lt;br /&gt;
5. ib_bh - nodes on InfiniBand network #1&lt;br /&gt;
&lt;br /&gt;
6. ib_dj - nodes on InfiniBand network #2&lt;br /&gt;
&lt;br /&gt;
7. Intel - nodes with CPUs of the Intel family&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==AlphaFold==&lt;br /&gt;
&lt;br /&gt;
AlphaFold is a deep learning tool designed for predicting protein structures.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Guides:&amp;#039;&amp;#039;&amp;#039;  &lt;br /&gt;
&lt;br /&gt;
[https://hpcguide.tau.ac.il/index.php?title=Alphafold AlphaFold Guide]&lt;br /&gt;
&lt;br /&gt;
[https://hpcguide.tau.ac.il/index.php?title=Alphafold3 AlphaFold3 Guide]&lt;br /&gt;
&lt;br /&gt;
==Common SLURM Commands==&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#View all queues (partitions):&lt;br /&gt;
sinfo&lt;br /&gt;
#View all jobs:&lt;br /&gt;
squeue&lt;br /&gt;
#View details of a specific job:&lt;br /&gt;
scontrol show job &amp;lt;job_number&amp;gt;&lt;br /&gt;
#Get information about partitions:&lt;br /&gt;
scontrol show partition&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Troubleshooting &amp;amp; Tips ==&lt;br /&gt;
&lt;br /&gt;
=== Common Errors ===&lt;br /&gt;
&lt;br /&gt;
# &amp;lt;code&amp;gt;srun: error: Unable to allocate resources: No partition specified or system default partition&amp;lt;/code&amp;gt;  &amp;lt;br /&amp;gt;&amp;#039;&amp;#039;&amp;#039;Solution:&amp;#039;&amp;#039;&amp;#039; Always specify a partition. Example:  &amp;lt;code&amp;gt;srun --pty -c 1 --mem=2G -p power-general-shared-pool -A public-users_v2 --qos=public /bin/bash&amp;lt;/code&amp;gt;&lt;br /&gt;
# Job failed, and when running &amp;lt;code&amp;gt;scontrol show job &amp;lt;job_id&amp;gt;&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;sacct -j &amp;lt;job_id&amp;gt; -o JobID,JobName,State%20&amp;lt;/code&amp;gt;, &amp;lt;br /&amp;gt;you see &amp;lt;code&amp;gt;JobState=OUT_OF_MEMORY Reason=OutOfMemory&amp;lt;/code&amp;gt; or:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
JobID           JobName                State &lt;br /&gt;
------------ ---------- -------------------- &lt;br /&gt;
71             oom_test        OUT_OF_MEMORY &lt;br /&gt;
71.batch          batch        OUT_OF_MEMORY &lt;br /&gt;
71.extern        extern            COMPLETED &lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;This means the RAM requested for the job was not enough; resubmit the job with more RAM. See [https://wikihpc.tau.ac.il/index.php?title=Slurm_user_guide#Estimating_RAM_Usage below] for help estimating how much RAM your job may need.&lt;br /&gt;
&lt;br /&gt;
=== Chain Jobs ===&lt;br /&gt;
Use the &amp;lt;code&amp;gt;--dependency&amp;lt;/code&amp;gt; flag to set job dependencies.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sbatch --ntasks=1 --time=60 -p power-general -A power-general-users --dependency=afterok:45001 do_work.bash&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Always Specify Resources ===&lt;br /&gt;
When submitting jobs, ensure you include all required resources like partition, memory, and CPUs to avoid job failures.&lt;br /&gt;
&lt;br /&gt;
=== Attaching to Running Jobs ===&lt;br /&gt;
If you need to monitor or interact with a running job, use &amp;lt;code&amp;gt;sattach&amp;lt;/code&amp;gt;. This command allows you to attach to a job&amp;#039;s input, output, and error streams in real-time.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sattach &amp;lt;job_id&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To view job steps of a specific job, use the following command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
scontrol show job &amp;lt;job_id&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Look for sections labeled &amp;quot;StepId&amp;quot; within the output. &lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;For specific job steps, use:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sattach &amp;lt;job_id.step_id&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Note:&amp;#039;&amp;#039;&amp;#039; &amp;lt;code&amp;gt;sattach&amp;lt;/code&amp;gt; is particularly useful for interactive jobs, where you can provide input directly. For non-interactive jobs, it acts like &amp;lt;code&amp;gt;tail -f&amp;lt;/code&amp;gt;, allowing you to monitor the output stream.&lt;br /&gt;
&lt;br /&gt;
=== Estimating RAM Usage ===&lt;br /&gt;
&lt;br /&gt;
When writing SLURM job scripts, it&amp;#039;s crucial to understand and correctly specify the memory requirements for your job. Proper memory allocation ensures efficient resource usage and prevents job failures due to out-of-memory (OOM) errors.&lt;br /&gt;
&lt;br /&gt;
==== Tips for Estimating RAM Usage ====&lt;br /&gt;
&lt;br /&gt;
* Check Application Documentation: Refer to the official documentation or user guides for memory-related information.&lt;br /&gt;
* Run a Small Test Job: Submit a smaller version of your job and monitor its memory usage using commands like `free -m`, `top`, or `htop`.&lt;br /&gt;
* Use Profiling Tools: Tools like `valgrind`, `gprof`, or built-in profilers can help you understand memory usage.&lt;br /&gt;
* Analyze Previous Jobs: Review SLURM logs and job statistics for insights into memory consumption of past jobs.&lt;br /&gt;
* Consult with Peers or Experts: Ask colleagues or experts who have experience with similar workloads.&lt;br /&gt;
&lt;br /&gt;
==== Example: Monitoring Memory Usage ====&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
&lt;br /&gt;
#SBATCH --job-name=memory_test&lt;br /&gt;
#SBATCH --account=your_account&lt;br /&gt;
#SBATCH --partition=your_partition&lt;br /&gt;
#SBATCH --qos=your_qos&lt;br /&gt;
#SBATCH --time=01:00:00&lt;br /&gt;
#SBATCH --ntasks=1&lt;br /&gt;
#SBATCH --cpus-per-task=1&lt;br /&gt;
#SBATCH --mem=4G&lt;br /&gt;
#SBATCH --output=memory_test.out&lt;br /&gt;
#SBATCH --error=memory_test.err&lt;br /&gt;
&lt;br /&gt;
# Monitor memory usage&lt;br /&gt;
echo &amp;quot;Memory usage before running the job:&amp;quot;&lt;br /&gt;
free -m&lt;br /&gt;
&lt;br /&gt;
# Your application commands go here&lt;br /&gt;
# ./your_application&lt;br /&gt;
&lt;br /&gt;
# Monitor memory usage after running the job&lt;br /&gt;
echo &amp;quot;Memory usage after running the job:&amp;quot;&lt;br /&gt;
free -m&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== General Tips ====&lt;br /&gt;
&lt;br /&gt;
* Start Small: Begin with a conservative memory request and increase it based on observed usage.&lt;br /&gt;
* Consider Peak Usage: Plan for peak memory usage to avoid OOM errors.&lt;br /&gt;
* Use SLURM&amp;#039;s Memory Reporting: Use `sacct` to view memory usage statistics.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sacct -j &amp;lt;job_id&amp;gt; --format=JobID,JobName,MaxRSS,Elapsed&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;/div&gt;</summary>
		<author><name>Dvory</name></author>
	</entry>
	<entry>
		<id>https://hpcguide.tau.ac.il/index.php?title=Submitting_a_job_to_a_slurm_queue&amp;diff=1551</id>
		<title>Submitting a job to a slurm queue</title>
		<link rel="alternate" type="text/html" href="https://hpcguide.tau.ac.il/index.php?title=Submitting_a_job_to_a_slurm_queue&amp;diff=1551"/>
		<updated>2026-01-19T14:45:26Z</updated>

		<summary type="html">&lt;p&gt;Dvory: /* Feature */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Accessing the System ==&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;We have chatgpt page for the new qos configuration, please look in [https://chatgpt.com/g/g-68be7f9acfb88191978615c1693e2cff-hpc-helper-toolkit HPC-helper-toolkit]&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
To submit jobs to SLURM at Tel Aviv University, you need to access the system through one of the following login nodes:&lt;br /&gt;
&lt;br /&gt;
* slurmlogin.tau.ac.il&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Requirements for Access ===&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Group Membership&amp;#039;&amp;#039;&amp;#039;: You must be part of the &amp;quot;power&amp;quot; group to access the resources.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;University Credentials&amp;#039;&amp;#039;&amp;#039;: Use your Tel Aviv University username and password to log in.&lt;br /&gt;
&lt;br /&gt;
These login nodes are your starting point for submitting jobs, checking job status, and managing your SLURM tasks.&lt;br /&gt;
&lt;br /&gt;
=== SSH Example ===&lt;br /&gt;
&lt;br /&gt;
To access the system using SSH, use the following example:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Replace &amp;#039;your_username&amp;#039; with your actual Tel Aviv University username&lt;br /&gt;
ssh your_username@slurmlogin.tau.ac.il&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Your connection will be automatically routed to one of the login nodes:&lt;br /&gt;
powerslurm-login, powerslurm-login2, or powerslurm-login3.&lt;br /&gt;
&lt;br /&gt;
If you have an SSH key set up for password-less login, you can specify it like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Replace &amp;#039;your_username&amp;#039; and &amp;#039;/path/to/your/private_key&amp;#039; accordingly&lt;br /&gt;
ssh -i /path/to/your/private_key your_username@slurmlogin.tau.ac.il&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Environment Modules ==&lt;br /&gt;
&lt;br /&gt;
Environment Modules in SLURM allow users to dynamically modify their shell environment, providing an easy way to load and unload different software applications, libraries, and their dependencies. This system helps avoid conflicts between software versions and ensures the correct environment for running specific applications.&lt;br /&gt;
&lt;br /&gt;
Here are some common commands to work with environment modules:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#List Available Modules: To see all the modules available on the system, use:&lt;br /&gt;
module avail&lt;br /&gt;
&lt;br /&gt;
#To search for a specific module by name (e.g., `gcc`), use:&lt;br /&gt;
module avail gcc&lt;br /&gt;
&lt;br /&gt;
#Get Detailed Information About a Module: The `module spider` command provides detailed information about a module, including versions, dependencies, and descriptions:&lt;br /&gt;
module spider gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#View Module Settings: To see what environment variables and settings will be modified by a module, use:&lt;br /&gt;
module show gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#Load a Module: To set up the environment for a specific software, use the `module load` command. For example, to load GCC version 12.1.0:&lt;br /&gt;
module load gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#List Loaded Modules: To view all currently loaded modules in your session, use:&lt;br /&gt;
module list&lt;br /&gt;
&lt;br /&gt;
#Unload a Module: To unload a specific module from your environment, use:&lt;br /&gt;
module unload gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#Unload All Modules: If you need to clear your environment of all loaded modules, use:&lt;br /&gt;
module purge&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;By using these commands, you can easily manage the software environments needed for different tasks, ensuring compatibility and reducing potential conflicts between software versions.&lt;br /&gt;
&lt;br /&gt;
== Basic Job Submission Commands ==&lt;br /&gt;
&lt;br /&gt;
=== Finding Your Account and Partition ===&lt;br /&gt;
&lt;br /&gt;
Before submitting a job, you need to know which partitions you have permission to use.&lt;br /&gt;
&lt;br /&gt;
Run the command &amp;lt;code&amp;gt;check_my_partitions&amp;lt;/code&amp;gt; to view a list of all the partitions you have permission to submit jobs to.&lt;br /&gt;
&lt;br /&gt;
== Submitting Jobs==&lt;br /&gt;
sbatch: Submits a job script for batch processing.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example&amp;#039;&amp;#039;&amp;#039;:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
    sbatch --ntasks=1 --time=10 -p power-general-shared-pool -A public-users_v2 --qos=public pre_process.bash&lt;br /&gt;
    # This command submits pre_process.bash to the power-general-shared-pool partition with a 10-minute time limit.&lt;br /&gt;
    # With 1 GPU:&lt;br /&gt;
    sbatch --gres=gpu:1 -p gpu-general-pool -A public-users_v2 --qos=public gpu_job.sh&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Submitting Multiple Jobs ===&lt;br /&gt;
&lt;br /&gt;
If you need to submit many similar jobs (hundreds or more), use a &amp;#039;&amp;#039;&amp;#039;Slurm job array&amp;#039;&amp;#039;&amp;#039;. Submitting each job individually with a separate `sbatch` command places a heavy load on the scheduler, slowing down job processing across the cluster. A job array bundles many related jobs into a single submission, which is more efficient and easier to manage.&lt;br /&gt;
&lt;br /&gt;
Each task in the array runs independently like a separate job, but the array is submitted as a single job ID for scheduling and tracking purposes.&lt;br /&gt;
You can customize the behavior of each task using the environment variable &amp;lt;code&amp;gt;SLURM_ARRAY_TASK_ID&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
==== Script Example: Job Array ====&lt;br /&gt;
&lt;br /&gt;
This script submits a job array with 100 tasks, each processing a different input file. The array reduces scheduler load and simplifies job tracking.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --job-name=array_job            # Job name&lt;br /&gt;
#SBATCH --account=public-users_v2       # Account name&lt;br /&gt;
#SBATCH --partition=power-general-shared-pool       # Partition name&lt;br /&gt;
#SBATCH --qos=public                    # qos type&lt;br /&gt;
#SBATCH --time=02:00:00                 # Max run time (hh:mm:ss)&lt;br /&gt;
#SBATCH --ntasks=1                      # Number of tasks per array job&lt;br /&gt;
#SBATCH --nodes=1                       # Number of nodes&lt;br /&gt;
#SBATCH --cpus-per-task=1               # CPUs per task&lt;br /&gt;
#SBATCH --mem-per-cpu=4G                # Memory per CPU&lt;br /&gt;
#SBATCH --array=1-100                   # Array range: 100 tasks&lt;br /&gt;
#SBATCH --output=array_job_%A_%a.out    # Output file: Job ID and array task ID&lt;br /&gt;
#SBATCH --error=array_job_%A_%a.err     # Error file: Job ID and array task ID&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Starting SLURM array task&amp;quot;&lt;br /&gt;
echo &amp;quot;Job ID: $SLURM_JOB_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Array Task ID: $SLURM_ARRAY_TASK_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Running on node(s): $SLURM_JOB_NODELIST&amp;quot;&lt;br /&gt;
echo &amp;quot;Allocated CPUs: $SLURM_JOB_CPUS_PER_NODE&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Your application commands go here&lt;br /&gt;
# You can use $SLURM_ARRAY_TASK_ID to customize behavior per task&lt;br /&gt;
# ./my_program input_${SLURM_ARRAY_TASK_ID}.txt&lt;br /&gt;
echo &amp;quot;Task completed&amp;quot;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
In this example:&lt;br /&gt;
* The job array consists of 100 tasks.&lt;br /&gt;
* Each task runs the same script but with a different input file.&lt;br /&gt;
* You access the task ID using the environment variable &amp;lt;code&amp;gt;SLURM_ARRAY_TASK_ID&amp;lt;/code&amp;gt;.&lt;br /&gt;
* The output and error logs are separated per task using &amp;lt;code&amp;gt;%A&amp;lt;/code&amp;gt; (job ID) and &amp;lt;code&amp;gt;%a&amp;lt;/code&amp;gt; (array task ID).&lt;br /&gt;
&lt;br /&gt;
==== Script Example: Job Array with different parameters per task ====&lt;br /&gt;
&lt;br /&gt;
This script submits a job array with 3 tasks. Each task runs the same program with a different input file: `data1.txt`, `data2.txt`, and `data3.txt`.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --job-name=array_job            # Job name&lt;br /&gt;
#SBATCH --account=public-users_v2       # Account name&lt;br /&gt;
#SBATCH --partition=power-general-shared-pool       # Partition name&lt;br /&gt;
#SBATCH --qos=public                    # qos type&lt;br /&gt;
#SBATCH --time=01:00:00                 # Max run time (hh:mm:ss)&lt;br /&gt;
#SBATCH --ntasks=1                      # Number of tasks per array job&lt;br /&gt;
#SBATCH --nodes=1                       # Number of nodes&lt;br /&gt;
#SBATCH --cpus-per-task=1               # CPUs per task&lt;br /&gt;
#SBATCH --mem-per-cpu=2G                # Memory per CPU&lt;br /&gt;
#SBATCH --array=1-3                     # Run 3 tasks with IDs 1, 2, 3&lt;br /&gt;
#SBATCH --output=array_%A_%a.out        # Output file: Job ID and task ID&lt;br /&gt;
#SBATCH --error=array_%A_%a.err         # Error file: Job ID and task ID&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Starting SLURM array task&amp;quot;&lt;br /&gt;
echo &amp;quot;Job ID: $SLURM_JOB_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Array Task ID: $SLURM_ARRAY_TASK_ID&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Each task runs the program with a different input file&lt;br /&gt;
./my_program data${SLURM_ARRAY_TASK_ID}.txt&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Task completed&amp;quot;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
===Writing Single SLURM Job Scripts===&lt;br /&gt;
&lt;br /&gt;
Here is a simple job script example:&lt;br /&gt;
&lt;br /&gt;
==== Basic Script====&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --job-name=my_job             # Job name&lt;br /&gt;
#SBATCH --account=public-users_v2     # Account name&lt;br /&gt;
#SBATCH --partition=power-general-shared-pool       # Partition name&lt;br /&gt;
#SBATCH --qos=public                  # qos type&lt;br /&gt;
#SBATCH --time=02:00:00               # Max run time (hh:mm:ss)&lt;br /&gt;
#SBATCH --ntasks=1                    # Number of tasks&lt;br /&gt;
#SBATCH --nodes=1                     # Number of nodes&lt;br /&gt;
#SBATCH --cpus-per-task=1             # CPUs per task&lt;br /&gt;
#SBATCH --mem-per-cpu=4G              # Memory per CPU&lt;br /&gt;
#SBATCH --output=my_job_%j.out        # Output file&lt;br /&gt;
#SBATCH --error=my_job_%j.err         # Error file&lt;br /&gt;
#SBATCH --mail-user=&amp;lt;your email&amp;gt;      # Your mail address to receive an email&lt;br /&gt;
#SBATCH --mail-type=END,FAIL          # Send mail when the job ends, whether it succeeds or fails&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Starting my SLURM job&amp;quot;&lt;br /&gt;
echo &amp;quot;Job ID: $SLURM_JOB_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Running on nodes: $SLURM_JOB_NODELIST&amp;quot;&lt;br /&gt;
echo &amp;quot;Allocated CPUs: $SLURM_JOB_CPUS_PER_NODE&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Your application commands go here&lt;br /&gt;
# ./my_program&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Job completed&amp;quot;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To ask for x cores interactively:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
srun --ntasks=1 --cpus-per-task=x  --partition=power-general-shared-pool --account=public-users_v2 --qos=public --nodes=1 --pty bash&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
For now, you also need to set the following SLURM environment variables inside the script, or within the interactive session, matching the number of cores you requested (48 in this example):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
export SLURM_TASKS_PER_NODE=48&lt;br /&gt;
export SLURM_CPUS_ON_NODE=48&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Script for 1 GPU ====&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --job-name=gpu_job             # Job name&lt;br /&gt;
#SBATCH --account=my_account           # Account name&lt;br /&gt;
#SBATCH --partition=gpu-general-pool   # Partition name&lt;br /&gt;
#SBATCH --qos=my_qos                   # qos type&lt;br /&gt;
#SBATCH --time=02:00:00                # Max run time&lt;br /&gt;
#SBATCH --ntasks=1                     # Number of tasks&lt;br /&gt;
#SBATCH --nodes=1                      # Number of nodes&lt;br /&gt;
#SBATCH --cpus-per-task=1              # CPUs per task&lt;br /&gt;
#SBATCH --gres=gpu:1                   # Number of GPUs&lt;br /&gt;
#SBATCH --mem-per-cpu=4G               # Memory per CPU&lt;br /&gt;
#SBATCH --output=my_job_%j.out         # Output file&lt;br /&gt;
#SBATCH --error=my_job_%j.err          # Error file&lt;br /&gt;
&lt;br /&gt;
module load python/python-3.8&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Starting GPU job&amp;quot;&lt;br /&gt;
echo &amp;quot;Job ID: $SLURM_JOB_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Running on nodes: $SLURM_JOB_NODELIST&amp;quot;&lt;br /&gt;
echo &amp;quot;Allocated CPUs: $SLURM_JOB_CPUS_PER_NODE&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Your GPU commands go here&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Job completed&amp;quot;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
To exclude specific nodes, add the following:&lt;br /&gt;
&amp;lt;syntaxhighlight&amp;gt;&lt;br /&gt;
#SBATCH --exclude=compute-0-[100-103],compute-0-67&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Importance of Correct RAM Usage in Jobs===&lt;br /&gt;
&lt;br /&gt;
When writing SLURM job scripts, it&amp;#039;s crucial to understand and correctly specify the memory requirements for your job. &lt;br /&gt;
&lt;br /&gt;
Proper memory allocation ensures efficient resource usage and prevents job failures due to out-of-memory (OOM) errors.&lt;br /&gt;
&lt;br /&gt;
==== Why Correct RAM Usage Matters ====&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Resource Efficiency&amp;#039;&amp;#039;&amp;#039;: Allocating the right amount of memory helps in optimal resource utilization, allowing more jobs to run simultaneously on the cluster.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Job Stability&amp;#039;&amp;#039;&amp;#039;: Underestimating memory requirements can lead to OOM errors, causing your job to fail and waste computational resources.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Performance&amp;#039;&amp;#039;&amp;#039;: Overestimating memory needs can lead to underutilization of resources, potentially delaying other jobs in the queue.&lt;br /&gt;
&lt;br /&gt;
==== How to Specify Memory in SLURM ====&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;--mem&amp;#039;&amp;#039;&amp;#039;: Specifies the total memory required for the job.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;--mem-per-cpu&amp;#039;&amp;#039;&amp;#039;: Specifies the memory required per CPU.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example&amp;#039;&amp;#039;&amp;#039;:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#SBATCH --mem=4G              # Total memory for the job&lt;br /&gt;
#SBATCH --mem-per-cpu=2G      # Memory per CPU&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
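As a rule of thumb (an assumption to verify against your site's Slurm configuration), the effective memory limit with `--mem-per-cpu` scales with the CPUs allocated to the task:

```shell
# Sketch of the arithmetic: with --mem-per-cpu, the per-task memory limit
# is cpus-per-task multiplied by mem-per-cpu.
CPUS_PER_TASK=4
MEM_PER_CPU_GB=2
TOTAL_GB=$((CPUS_PER_TASK * MEM_PER_CPU_GB))
echo "Effective memory per task: ${TOTAL_GB}G"
```

So `--cpus-per-task=4 --mem-per-cpu=2G` and `--mem=8G` request comparable totals for a single-task job.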
&lt;br /&gt;
===Interactive Jobs===&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#Start an interactive session:&lt;br /&gt;
srun --ntasks=1 -p power-general-shared-pool -A public-users_v2 --qos=public --pty bash&lt;br /&gt;
&lt;br /&gt;
#Specify a compute node:&lt;br /&gt;
srun --ntasks=1 -p power-general-shared-pool -A public-users_v2 --qos=public --nodelist=&amp;quot;compute-0-12&amp;quot; --pty bash&lt;br /&gt;
&lt;br /&gt;
#Using GUI:&lt;br /&gt;
srun --ntasks=1 -p power-general-shared-pool -A public-users_v2 --qos=public --x11 /bin/bash&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Submitting RELION Jobs===&lt;br /&gt;
&lt;br /&gt;
To submit a RELION job interactively on the &amp;lt;code&amp;gt;gpu-relion&amp;lt;/code&amp;gt; queue with X11 forwarding, use the following steps:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#Start an interactive session with X11:&lt;br /&gt;
srun --ntasks=1 -p gpu-relion-pool -A gpu-relion-users_v2 --qos=owner --x11 --pty bash&lt;br /&gt;
#Load the RELION module:&lt;br /&gt;
module load relion/relion-4.0.1&lt;br /&gt;
#Launch RELION:&lt;br /&gt;
relion&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Using special resources==&lt;br /&gt;
There are several defined resources that can be used in addition to the regular resources (e.g., memory, CPUs).&lt;br /&gt;
Those resources are:&lt;br /&gt;
GresTypes=gpu,amd,af3,intel,disk700g&lt;br /&gt;
&lt;br /&gt;
When your job needs one or more of the above resources, add &amp;quot;--constraint=&amp;lt;comma-separated resource list&amp;gt;&amp;quot; to your submission.&lt;br /&gt;
For example, a job submitted with --constraint=disk700g,amd&lt;br /&gt;
requires a CPU from the AMD family and up to 700 GB of local cache disk space during its run.&lt;br /&gt;
&lt;br /&gt;
If you use the local disk as cache, you must remove the files when the script ends, whether it exits normally or abnormally.&lt;br /&gt;
&lt;br /&gt;
This can be achieved by adding the following to your script:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
export CACHEDIR=/tmp/${USER}_${SLURM_JOB_ID}&lt;br /&gt;
&lt;br /&gt;
mkdir -p &amp;quot;$CACHEDIR&amp;quot;&lt;br /&gt;
&lt;br /&gt;
cleanup() {&lt;br /&gt;
  rm -rf -- &amp;quot;$CACHEDIR&amp;quot; || true&lt;br /&gt;
}&lt;br /&gt;
trap cleanup EXIT INT TERM HUP&lt;br /&gt;
&lt;br /&gt;
# Your workload goes here (the original example used: sleep 200)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
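The trap pattern above can be tried outside SLURM as well; this standalone sketch substitutes `mktemp -d` for the job-ID-based directory:

```shell
# Standalone demonstration of cleanup-on-exit (no SLURM needed).
CACHEDIR=$(mktemp -d)

cleanup() {
  rm -rf -- "$CACHEDIR" || true
}
trap cleanup EXIT INT TERM HUP

touch "$CACHEDIR/scratch.dat"   # simulate writing cache files
# ... workload would run here; cleanup fires on any exit path ...
```

Because the trap is installed before the workload starts, the cache directory is removed even if the job is interrupted or killed with TERM.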
&lt;br /&gt;
===Feature===&lt;br /&gt;
You may see all the features, which are defined resources (a.k.a. constraints), by typing the command &amp;#039;features&amp;#039; on the login node.&lt;br /&gt;
The following features are available:&lt;br /&gt;
Af3, AMD, avx, disk700g, ib_bh, ib_dj, Intel&lt;br /&gt;
&lt;br /&gt;
Use them by adding the parameter --constraint=&amp;lt;feature&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The meaning of the features:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
1. Af3 - Nodes that can run AlphaFold3&lt;br /&gt;
&lt;br /&gt;
2. AMD - Nodes with CPUs of the AMD family&lt;br /&gt;
&lt;br /&gt;
3. avx - Nodes with AVX CPU capabilities&lt;br /&gt;
&lt;br /&gt;
4. disk700g - Nodes with at least 700 GB in their local /tmp partition&lt;br /&gt;
&lt;br /&gt;
5. ib_bh - Nodes on InfiniBand network #1&lt;br /&gt;
&lt;br /&gt;
6. ib_dj - Nodes on InfiniBand network #2&lt;br /&gt;
&lt;br /&gt;
7. Intel - Nodes with CPUs of the Intel family&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To see all the features you may type &amp;#039;&amp;#039;&amp;#039;features&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
==Running matlab example==&lt;br /&gt;
In this example there are 3 files:&lt;br /&gt;
&lt;br /&gt;
myTable.m ⇒ This MATLAB script prints a table of trigonometric values in an endless loop&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
fprintf(&amp;#039;=======================================\n&amp;#039;);&lt;br /&gt;
fprintf(&amp;#039; a             b             c              d             \n&amp;#039;);&lt;br /&gt;
fprintf(&amp;#039;=======================================\n&amp;#039;);&lt;br /&gt;
while 1&lt;br /&gt;
                for j = 1:10&lt;br /&gt;
                                a = sin(10*j);&lt;br /&gt;
                                b = a*cos(10*j);&lt;br /&gt;
                                c = a + b;&lt;br /&gt;
                                d = a - b;&lt;br /&gt;
                                fprintf(&amp;#039;%+6.5f   %+6.5f   %+6.5f   %+6.5f   \n&amp;#039;,a,b,c,d);&lt;br /&gt;
                end&lt;br /&gt;
end&lt;br /&gt;
fprintf(&amp;#039;=======================================\n&amp;#039;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
my_table_script.sh ⇒ This script executes the MATLAB program. Submit it with sbatch.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
&lt;br /&gt;
#SBATCH --mem=50M&lt;br /&gt;
#SBATCH --partition power-general-shared-pool&lt;br /&gt;
#SBATCH -A public-users_v2&lt;br /&gt;
hostname&lt;br /&gt;
&lt;br /&gt;
cd /a/home/cc/tree/taucc/staff/dvory/matlab&lt;br /&gt;
&lt;br /&gt;
matlab -nodisplay -nosplash -nodesktop -r &amp;quot;myTable; exit;&amp;quot;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
run_in_loop.sh ⇒ This script submits many copies of the job in a loop&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
&lt;br /&gt;
for i in {1..100}&lt;br /&gt;
do&lt;br /&gt;
        sbatch my_table_script.sh&lt;br /&gt;
done&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Run the jobs with the following command (after doing chmod +x run_in_loop.sh):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
./run_in_loop.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
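As noted in the job-array section above, a loop of individual `sbatch` calls loads the scheduler; the same 100 runs can be submitted as a single job array instead (a sketch, assuming `my_table_script.sh` needs no per-task changes):

```shell
# Scheduler-friendly alternative: one array submission instead of 100 sbatch calls.
sbatch --array=1-100 my_table_script.sh
```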
&lt;br /&gt;
==AlphaFold==&lt;br /&gt;
&lt;br /&gt;
AlphaFold is a deep learning tool designed for predicting protein structures.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Guides:&amp;#039;&amp;#039;&amp;#039;  &lt;br /&gt;
&lt;br /&gt;
[https://hpcguide.tau.ac.il/index.php?title=Alphafold AlphaFold Guide]&lt;br /&gt;
&lt;br /&gt;
[https://hpcguide.tau.ac.il/index.php?title=Alphafold3 AlphaFold3 Guide]&lt;br /&gt;
&lt;br /&gt;
==Common SLURM Commands==&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#View all queues (partitions):&lt;br /&gt;
sinfo&lt;br /&gt;
#View all jobs:&lt;br /&gt;
squeue&lt;br /&gt;
#View details of a specific job:&lt;br /&gt;
scontrol show job &amp;lt;job_number&amp;gt;&lt;br /&gt;
#Get information about partitions:&lt;br /&gt;
scontrol show partition&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
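A few additional variants are often handy (flag behavior may vary slightly between Slurm versions):

```shell
squeue -u $USER                       # only your own jobs
squeue -j <job_id>                    # one specific job
sinfo -p power-general-shared-pool    # state of a single partition
scancel <job_id>                      # cancel a job
```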
&lt;br /&gt;
== Troubleshooting &amp;amp; Tips ==&lt;br /&gt;
&lt;br /&gt;
=== Common Errors ===&lt;br /&gt;
&lt;br /&gt;
# &amp;lt;code&amp;gt;srun: error: Unable to allocate resources: No partition specified or system default partition&amp;lt;/code&amp;gt;  &amp;lt;br /&amp;gt;&amp;#039;&amp;#039;&amp;#039;Solution:&amp;#039;&amp;#039;&amp;#039; Always specify a partition. Example:  &amp;lt;code&amp;gt;srun --pty -c 1 --mem=2G -p power-general-shared-pool -A public-users_v2 --qos=public /bin/bash&amp;lt;/code&amp;gt;&lt;br /&gt;
# Job failed, and upon doing scontrol show job job_id or when running sacct -j job_id -o JobID,JobName,State%20  &amp;lt;br /&amp;gt;you see:   &amp;lt;code&amp;gt;JobState=OUT_OF_MEMORY Reason=OutOfMemory&amp;lt;/code&amp;gt;  or :&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
JobID           JobName                State &lt;br /&gt;
------------ ---------- -------------------- &lt;br /&gt;
71             oom_test        OUT_OF_MEMORY &lt;br /&gt;
71.batch          batch        OUT_OF_MEMORY &lt;br /&gt;
71.extern        extern            COMPLETED &lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;it means that the RAM requested for the job was not enough; please resubmit the job with more RAM. See [https://wikihpc.tau.ac.il/index.php?title=Slurm_user_guide#Estimating_RAM_Usage below] for help estimating how much RAM your job may need.&lt;br /&gt;
&lt;br /&gt;
=== Chain Jobs ===&lt;br /&gt;
Use the &amp;lt;code&amp;gt;--dependency&amp;lt;/code&amp;gt; flag to set job dependencies.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sbatch --ntasks=1 --time=60 -p power-general -A power-general-users --dependency=afterok:45001 do_work.bash&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Always Specify Resources ===&lt;br /&gt;
When submitting jobs, ensure you include all required resources like partition, memory, and CPUs to avoid job failures.&lt;br /&gt;
&lt;br /&gt;
=== Attaching to Running Jobs ===&lt;br /&gt;
If you need to monitor or interact with a running job, use &amp;lt;code&amp;gt;sattach&amp;lt;/code&amp;gt;. This command allows you to attach to a job&amp;#039;s input, output, and error streams in real-time.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sattach &amp;lt;job_id&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To view job steps of a specific job, use the following command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
scontrol show job &amp;lt;job_id&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Look for sections labeled &amp;quot;StepId&amp;quot; within the output. &lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;For specific job steps, use:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sattach &amp;lt;job_id.step_id&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Note:&amp;#039;&amp;#039;&amp;#039; &amp;lt;code&amp;gt;sattach&amp;lt;/code&amp;gt; is particularly useful for interactive jobs, where you can provide input directly. For non-interactive jobs, it acts like &amp;lt;code&amp;gt;tail -f&amp;lt;/code&amp;gt;, allowing you to monitor the output stream.&lt;br /&gt;
&lt;br /&gt;
=== Estimating RAM Usage ===&lt;br /&gt;
&lt;br /&gt;
Correctly estimating a job&amp;#039;s memory requirement avoids both OOM failures and wasted resources; the tips below help you arrive at a realistic number.&lt;br /&gt;
&lt;br /&gt;
==== Tips for Estimating RAM Usage ====&lt;br /&gt;
&lt;br /&gt;
* Check Application Documentation: Refer to the official documentation or user guides for memory-related information.&lt;br /&gt;
* Run a Small Test Job: Submit a smaller version of your job and monitor its memory usage using commands like `free -m`, `top`, or `htop`.&lt;br /&gt;
* Use Profiling Tools: Tools like `valgrind`, `gprof`, or built-in profilers can help you understand memory usage.&lt;br /&gt;
* Analyze Previous Jobs: Review SLURM logs and job statistics for insights into memory consumption of past jobs.&lt;br /&gt;
* Consult with Peers or Experts: Ask colleagues or experts who have experience with similar workloads.&lt;br /&gt;
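For a job that is still running, `sstat` reports live usage per step, and `sacct` reports the peak after it finishes (field availability may vary by Slurm version):

```shell
# Live usage of a running job's steps:
sstat -j <job_id> --format=JobID,MaxRSS,AveRSS
# Peak usage of a finished job:
sacct -j <job_id> --format=JobID,MaxRSS,ReqMem,State
```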
&lt;br /&gt;
==== Example: Monitoring Memory Usage ====&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
&lt;br /&gt;
#SBATCH --job-name=memory_test&lt;br /&gt;
#SBATCH --account=your_account&lt;br /&gt;
#SBATCH --partition=your_partition&lt;br /&gt;
#SBATCH --qos=your_qos&lt;br /&gt;
#SBATCH --time=01:00:00&lt;br /&gt;
#SBATCH --ntasks=1&lt;br /&gt;
#SBATCH --cpus-per-task=1&lt;br /&gt;
#SBATCH --mem=4G&lt;br /&gt;
#SBATCH --output=memory_test.out&lt;br /&gt;
#SBATCH --error=memory_test.err&lt;br /&gt;
&lt;br /&gt;
# Monitor memory usage&lt;br /&gt;
echo &amp;quot;Memory usage before running the job:&amp;quot;&lt;br /&gt;
free -m&lt;br /&gt;
&lt;br /&gt;
# Your application commands go here&lt;br /&gt;
# ./your_application&lt;br /&gt;
&lt;br /&gt;
# Monitor memory usage after running the job&lt;br /&gt;
echo &amp;quot;Memory usage after running the job:&amp;quot;&lt;br /&gt;
free -m&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== General Tips ====&lt;br /&gt;
&lt;br /&gt;
* Start Small: Begin with a conservative memory request and increase it based on observed usage.&lt;br /&gt;
* Consider Peak Usage: Plan for peak memory usage to avoid OOM errors.&lt;br /&gt;
* Use SLURM&amp;#039;s Memory Reporting: Use `sacct` to view memory usage statistics.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sacct -j &amp;lt;job_id&amp;gt; --format=JobID,JobName,MaxRSS,Elapsed&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;/div&gt;</summary>
		<author><name>Dvory</name></author>
	</entry>
	<entry>
		<id>https://hpcguide.tau.ac.il/index.php?title=Submitting_a_job_to_a_slurm_queue&amp;diff=1550</id>
		<title>Submitting a job to a slurm queue</title>
		<link rel="alternate" type="text/html" href="https://hpcguide.tau.ac.il/index.php?title=Submitting_a_job_to_a_slurm_queue&amp;diff=1550"/>
		<updated>2026-01-19T14:39:40Z</updated>

		<summary type="html">&lt;p&gt;Dvory: /* Using special resources */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Accessing the System ==&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;We have chatgpt page for the new qos configuration, please look in [https://chatgpt.com/g/g-68be7f9acfb88191978615c1693e2cff-hpc-helper-toolkit HPC-helper-toolkit]&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
To submit jobs to SLURM at Tel Aviv University, you need to access the system through one of the following login nodes:&lt;br /&gt;
&lt;br /&gt;
* slurmlogin.tau.ac.il&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Requirements for Access ===&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Group Membership&amp;#039;&amp;#039;&amp;#039;: You must be part of the &amp;quot;power&amp;quot; group to access the resources.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;University Credentials&amp;#039;&amp;#039;&amp;#039;: Use your Tel Aviv University username and password to log in.&lt;br /&gt;
&lt;br /&gt;
These login nodes are your starting point for submitting jobs, checking job status, and managing your SLURM tasks.&lt;br /&gt;
&lt;br /&gt;
=== SSH Example ===&lt;br /&gt;
&lt;br /&gt;
To access the system using SSH, use the following example:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Replace &amp;#039;your_username&amp;#039; with your actual Tel Aviv University username&lt;br /&gt;
ssh your_username@slurmlogin.tau.ac.il&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Your connection will be automatically routed to one of the login nodes:&lt;br /&gt;
powerslurm-login, powerslurm-login2, or powerslurm-login3.&lt;br /&gt;
&lt;br /&gt;
If you have an SSH key set up for password-less login, you can specify it like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Replace &amp;#039;your_username&amp;#039; and &amp;#039;/path/to/your/private_key&amp;#039; accordingly&lt;br /&gt;
ssh -i /path/to/your/private_key your_username@slurmlogin.tau.ac.il&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
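If you connect frequently, an entry in `~/.ssh/config` saves typing; the alias `tauhpc` and the key path below are hypothetical, so adjust them to your own setup:

```
# Hypothetical ~/.ssh/config entry
Host tauhpc
    HostName slurmlogin.tau.ac.il
    User your_username
    IdentityFile ~/.ssh/id_ed25519
```

After adding it, `ssh tauhpc` is equivalent to the full command above.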
&lt;br /&gt;
== Environment Modules ==&lt;br /&gt;
&lt;br /&gt;
Environment Modules in SLURM allow users to dynamically modify their shell environment, providing an easy way to load and unload different software applications, libraries, and their dependencies. This system helps avoid conflicts between software versions and ensures the correct environment for running specific applications.&lt;br /&gt;
&lt;br /&gt;
Here are some common commands to work with environment modules:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#List Available Modules: To see all the modules available on the system, use:&lt;br /&gt;
module avail&lt;br /&gt;
&lt;br /&gt;
#To search for a specific module by name (e.g., `gcc`), use:&lt;br /&gt;
module avail gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#Get Detailed Information About a Module: The `module spider` command provides detailed information about a module, including versions, dependencies, and descriptions:&lt;br /&gt;
module spider gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#View Module Settings: To see what environment variables and settings will be modified by a module, use:&lt;br /&gt;
module show gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#Load a Module: To set up the environment for a specific software, use the `module load` command. For example, to load GCC version 12.1.0:&lt;br /&gt;
module load gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#List Loaded Modules: To view all currently loaded modules in your session, use:&lt;br /&gt;
module list&lt;br /&gt;
&lt;br /&gt;
#Unload a Module: To unload a specific module from your environment, use:&lt;br /&gt;
module unload gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#Unload All Modules: If you need to clear your environment of all loaded modules, use:&lt;br /&gt;
module purge&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;By using these commands, you can easily manage the software environments needed for different tasks, ensuring compatibility and reducing potential conflicts between software versions.&lt;br /&gt;
&lt;br /&gt;
== Basic Job Submission Commands ==&lt;br /&gt;
&lt;br /&gt;
=== Finding Your Account and Partition ===&lt;br /&gt;
&lt;br /&gt;
Before submitting a job, you need to know which partitions you have permission to use.&lt;br /&gt;
&lt;br /&gt;
Run the command &amp;lt;code&amp;gt;check_my_partitions&amp;lt;/code&amp;gt; to view a list of all the partitions you have permission to send jobs to.&lt;br /&gt;
&lt;br /&gt;
== Submitting Jobs==&lt;br /&gt;
sbatch: Submits a job script for batch processing.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example&amp;#039;&amp;#039;&amp;#039;:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sbatch --ntasks=1 --time=10 -p power-general-shared-pool -A public-users_v2 --qos=public pre_process.bash&lt;br /&gt;
# This command submits pre_process.bash to the power-general-shared-pool partition with a 10-minute time limit.&lt;br /&gt;
&lt;br /&gt;
# With 1 GPU:&lt;br /&gt;
sbatch --gres=gpu:1 -p gpu-general-pool -A public-users_v2 --qos=public gpu_job.sh&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Submitting Multiple Jobs ===&lt;br /&gt;
&lt;br /&gt;
If you need to submit many similar jobs (hundreds or more), you should use a &amp;#039;&amp;#039;&amp;#039;Slurm job array&amp;#039;&amp;#039;&amp;#039;. Submitting each job individually using separate `sbatch` commands places a heavy load on the scheduler, slowing down job processing across the cluster. Job arrays allow you to bundle many related jobs together as a single submission. This is more efficient and easier to manage.&lt;br /&gt;
&lt;br /&gt;
Each task in the array runs independently like a separate job, but the array is submitted as a single job ID for scheduling and tracking purposes.&lt;br /&gt;
You can customize the behavior of each task using the environment variable &amp;lt;code&amp;gt;SLURM_ARRAY_TASK_ID&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
==== Script Example: Job Array ====&lt;br /&gt;
&lt;br /&gt;
This script submits a job array with 100 tasks, each processing a different input file. The array reduces scheduler load and simplifies job tracking.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --job-name=array_job            # Job name&lt;br /&gt;
#SBATCH --account=public-users_v2       # Account name&lt;br /&gt;
#SBATCH --partition=power-general-shared-pool       # Partition name&lt;br /&gt;
#SBATCH --qos=public                    # qos type&lt;br /&gt;
#SBATCH --time=02:00:00                 # Max run time (hh:mm:ss)&lt;br /&gt;
#SBATCH --ntasks=1                      # Number of tasks per array job&lt;br /&gt;
#SBATCH --nodes=1                       # Number of nodes&lt;br /&gt;
#SBATCH --cpus-per-task=1               # CPUs per task&lt;br /&gt;
#SBATCH --mem-per-cpu=4G                # Memory per CPU&lt;br /&gt;
#SBATCH --array=1-100                   # Array range: 100 tasks&lt;br /&gt;
#SBATCH --output=array_job_%A_%a.out    # Output file: Job ID and array task ID&lt;br /&gt;
#SBATCH --error=array_job_%A_%a.err     # Error file: Job ID and array task ID&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Starting SLURM array task&amp;quot;&lt;br /&gt;
echo &amp;quot;Job ID: $SLURM_JOB_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Array Task ID: $SLURM_ARRAY_TASK_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Running on node(s): $SLURM_JOB_NODELIST&amp;quot;&lt;br /&gt;
echo &amp;quot;Allocated CPUs: $SLURM_JOB_CPUS_PER_NODE&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Your application commands go here&lt;br /&gt;
# You can use $SLURM_ARRAY_TASK_ID to customize behavior per task&lt;br /&gt;
# ./my_program input_${SLURM_ARRAY_TASK_ID}.txt&lt;br /&gt;
echo &amp;quot;Task completed&amp;quot;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
In this example:&lt;br /&gt;
* The job array consists of 100 tasks.&lt;br /&gt;
* Each task runs the same script but with a different input file.&lt;br /&gt;
* You access the task ID using the environment variable &amp;lt;code&amp;gt;SLURM_ARRAY_TASK_ID&amp;lt;/code&amp;gt;.&lt;br /&gt;
* The output and error logs are separated per task using &amp;lt;code&amp;gt;%A&amp;lt;/code&amp;gt; (job ID) and &amp;lt;code&amp;gt;%a&amp;lt;/code&amp;gt; (array task ID).&lt;br /&gt;
&lt;br /&gt;
==== Script Example: Job Array with different parameters per task ====&lt;br /&gt;
&lt;br /&gt;
This script submits a job array with 3 tasks. Each task runs the same program with a different input file: `data1.txt`, `data2.txt`, and `data3.txt`.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --job-name=array_job            # Job name&lt;br /&gt;
#SBATCH --account=public-users_v2       # Account name&lt;br /&gt;
#SBATCH --partition=power-general-shared-pool       # Partition name&lt;br /&gt;
#SBATCH --qos=public                    # qos type&lt;br /&gt;
#SBATCH --time=01:00:00                 # Max run time (hh:mm:ss)&lt;br /&gt;
#SBATCH --ntasks=1                      # Number of tasks per array job&lt;br /&gt;
#SBATCH --nodes=1                       # Number of nodes&lt;br /&gt;
#SBATCH --cpus-per-task=1               # CPUs per task&lt;br /&gt;
#SBATCH --mem-per-cpu=2G                # Memory per CPU&lt;br /&gt;
#SBATCH --array=1-3                     # Run 3 tasks with IDs 1, 2, 3&lt;br /&gt;
#SBATCH --output=array_%A_%a.out        # Output file: Job ID and task ID&lt;br /&gt;
#SBATCH --error=array_%A_%a.err         # Error file: Job ID and task ID&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Starting SLURM array task&amp;quot;&lt;br /&gt;
echo &amp;quot;Job ID: $SLURM_JOB_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Array Task ID: $SLURM_ARRAY_TASK_ID&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Each task runs the program with a different input file&lt;br /&gt;
./my_program data${SLURM_ARRAY_TASK_ID}.txt&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Task completed&amp;quot;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
===Writing Single SLURM Job Scripts===&lt;br /&gt;
&lt;br /&gt;
Here is a simple job script example:&lt;br /&gt;
&lt;br /&gt;
==== Basic Script====&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --job-name=my_job             # Job name&lt;br /&gt;
#SBATCH --account=public-users_v2     # Account name&lt;br /&gt;
#SBATCH --partition=power-general-shared-pool       # Partition name&lt;br /&gt;
#SBATCH --qos=public                  # qos type&lt;br /&gt;
#SBATCH --time=02:00:00               # Max run time (hh:mm:ss)&lt;br /&gt;
#SBATCH --ntasks=1                    # Number of tasks&lt;br /&gt;
#SBATCH --nodes=1                     # Number of nodes&lt;br /&gt;
#SBATCH --cpus-per-task=1             # CPUs per task&lt;br /&gt;
#SBATCH --mem-per-cpu=4G              # Memory per CPU&lt;br /&gt;
#SBATCH --output=my_job_%j.out        # Output file&lt;br /&gt;
#SBATCH --error=my_job_%j.err         # Error file&lt;br /&gt;
#SBATCH --mail-user=&amp;lt;your email&amp;gt;      # Your mail address to receive an email&lt;br /&gt;
#SBATCH --mail-type=END,FAIL          # The mail will be sent upon ending the script successfully or not&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Starting my SLURM job&amp;quot;&lt;br /&gt;
echo &amp;quot;Job ID: $SLURM_JOB_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Running on nodes: $SLURM_JOB_NODELIST&amp;quot;&lt;br /&gt;
echo &amp;quot;Allocated CPUs: $SLURM_JOB_CPUS_PER_NODE&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Your application commands go here&lt;br /&gt;
# ./my_program&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Job completed&amp;quot;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To ask for x cores interactively:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
srun --ntasks=1 --cpus-per-task=x  --partition=power-general-public-pool --account=public-users_v2 --qos=public --nodes=1 --pty bash&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Currently, you also need to set the following SLURM parameters inside the script or within the interactive job (adjust 48 to match the number of cores you requested):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
export SLURM_TASKS_PER_NODE=48&lt;br /&gt;
export SLURM_CPUS_ON_NODE=48&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Script for 1 GPU ====&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --job-name=gpu_job             # Job name&lt;br /&gt;
#SBATCH --account=my_account           # Account name&lt;br /&gt;
#SBATCH --partition=gpu-general-pool   # Partition name&lt;br /&gt;
#SBATCH --qos=my_qos                   # qos type&lt;br /&gt;
#SBATCH --time=02:00:00                # Max run time&lt;br /&gt;
#SBATCH --ntasks=1                     # Number of tasks&lt;br /&gt;
#SBATCH --nodes=1                      # Number of nodes&lt;br /&gt;
#SBATCH --cpus-per-task=1              # CPUs per task&lt;br /&gt;
#SBATCH --gres=gpu:1                   # Number of GPUs&lt;br /&gt;
#SBATCH --mem-per-cpu=4G               # Memory per CPU&lt;br /&gt;
#SBATCH --output=my_job_%j.out         # Output file&lt;br /&gt;
#SBATCH --error=my_job_%j.err          # Error file&lt;br /&gt;
&lt;br /&gt;
module load python/python-3.8&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Starting GPU job&amp;quot;&lt;br /&gt;
echo &amp;quot;Job ID: $SLURM_JOB_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Running on nodes: $SLURM_JOB_NODELIST&amp;quot;&lt;br /&gt;
echo &amp;quot;Allocated CPUs: $SLURM_JOB_CPUS_PER_NODE&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Your GPU commands go here&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Job completed&amp;quot;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
To exclude specific nodes, add the following:&lt;br /&gt;
&amp;lt;syntaxhighlight&amp;gt;&lt;br /&gt;
#SBATCH --exclude=compute-0-[100-103],compute-0-67&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Importance of Correct RAM Usage in Jobs===&lt;br /&gt;
&lt;br /&gt;
When writing SLURM job scripts, it&amp;#039;s crucial to understand and correctly specify the memory requirements for your job. &lt;br /&gt;
&lt;br /&gt;
Proper memory allocation ensures efficient resource usage and prevents job failures due to out-of-memory (OOM) errors.&lt;br /&gt;
&lt;br /&gt;
==== Why Correct RAM Usage Matters ====&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Resource Efficiency&amp;#039;&amp;#039;&amp;#039;: Allocating the right amount of memory helps in optimal resource utilization, allowing more jobs to run simultaneously on the cluster.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Job Stability&amp;#039;&amp;#039;&amp;#039;: Underestimating memory requirements can lead to OOM errors, causing your job to fail and waste computational resources.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Performance&amp;#039;&amp;#039;&amp;#039;: Overestimating memory needs can lead to underutilization of resources, potentially delaying other jobs in the queue.&lt;br /&gt;
&lt;br /&gt;
==== How to Specify Memory in SLURM ====&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;--mem&amp;#039;&amp;#039;&amp;#039;: Specifies the total memory required for the job.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;--mem-per-cpu&amp;#039;&amp;#039;&amp;#039;: Specifies the memory required per CPU.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example&amp;#039;&amp;#039;&amp;#039;:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#SBATCH --mem=4G              # Total memory for the job&lt;br /&gt;
#SBATCH --mem-per-cpu=2G      # Memory per CPU&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Interactive Jobs===&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#Start an interactive session:&lt;br /&gt;
srun --ntasks=1 -p power-general-shared-pool -A public-users_v2 --qos=public --pty bash&lt;br /&gt;
&lt;br /&gt;
#Specify a compute node:&lt;br /&gt;
srun --ntasks=1 -p power-general-shared-pool -A public-users_v2 --qos=public --nodelist=&amp;quot;compute-0-12&amp;quot; --pty bash&lt;br /&gt;
&lt;br /&gt;
#Using GUI:&lt;br /&gt;
srun --ntasks=1 -p power-general-shared-pool -A public-users_v2 --qos=public --x11 /bin/bash&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Submitting RELION Jobs===&lt;br /&gt;
&lt;br /&gt;
To submit a RELION job interactively on the &amp;lt;code&amp;gt;gpu-relion&amp;lt;/code&amp;gt; queue with X11 forwarding, use the following steps:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#Start an interactive session with X11:&lt;br /&gt;
srun --ntasks=1 -p gpu-relion-pool -A gpu-relion-users_v2 --qos=owner --x11 --pty bash&lt;br /&gt;
#Load the RELION module:&lt;br /&gt;
module load relion/relion-4.0.1&lt;br /&gt;
#Launch RELION:&lt;br /&gt;
relion&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Using special resources==&lt;br /&gt;
There are several defined resources that can be used in addition to the regular resources (e.g. memory, CPUs).&lt;br /&gt;
These resources are:&lt;br /&gt;
GresTypes=gpu,amd,af3,intel,disk700g&lt;br /&gt;
&lt;br /&gt;
When your job needs one or more of the above resources, add the parameter &amp;quot;--constraint=&amp;lt;comma-separated list of resources&amp;gt;&amp;quot;.&lt;br /&gt;
For example, a job submitted with the parameter --constraint=disk700g,amd&lt;br /&gt;
requires nodes with AMD-family processors and up to 700 GB of local cache disk space during the job&amp;#039;s run.&lt;br /&gt;
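For example, a full set of submission directives requesting both of these features might look like the following sketch (the job name and ./my_program are placeholders; partition, account, and qos follow the public examples on this page):&lt;br /&gt;

```shell
#!/bin/bash
#SBATCH --job-name=constraint_job              # illustrative job name
#SBATCH --partition=power-general-shared-pool
#SBATCH --account=public-users_v2
#SBATCH --qos=public
#SBATCH --constraint=disk700g,amd              # AMD CPUs and >= 700 GB local /tmp

./my_program                                   # placeholder for your application
```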
&lt;br /&gt;
If you use the local disk as cache, you must remove the files when the script ends, whether it exits normally or abnormally.&lt;br /&gt;
&lt;br /&gt;
This can be achieved by adding the following to the script:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
export CACHEDIR=/tmp/dvory_${SLURM_JOB_ID}&lt;br /&gt;
&lt;br /&gt;
mkdir -p $CACHEDIR&lt;br /&gt;
&lt;br /&gt;
# Register the cleanup before doing any work, so it runs on every exit path&lt;br /&gt;
cleanup() {&lt;br /&gt;
  rm -rf -- &amp;quot;$CACHEDIR&amp;quot; || true&lt;br /&gt;
}&lt;br /&gt;
trap cleanup EXIT INT TERM HUP&lt;br /&gt;
&lt;br /&gt;
sleep 200   # placeholder for the actual job commands&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
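The same trap-based cleanup pattern can be tried outside SLURM; this sketch uses mktemp in place of the fixed /tmp path and registers the trap before any work is done, so the cleanup also runs if the job is cancelled mid-run:&lt;br /&gt;

```shell
#!/bin/bash
# Stand-in for the per-job cache path; mktemp creates a unique directory.
CACHEDIR=$(mktemp -d)

cleanup() {
  # Remove the cache directory on any exit path (normal end, INT, TERM, HUP).
  rm -rf -- "$CACHEDIR" || true
}
trap cleanup EXIT INT TERM HUP

# ... job commands that write scratch files into $CACHEDIR ...
touch "$CACHEDIR/scratch.dat"
```

When the script finishes (or is interrupted), the EXIT trap fires and the directory is removed.&lt;br /&gt;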
&lt;br /&gt;
===Feature===&lt;br /&gt;
You can see all the features, which are defined resources (a.k.a. constraints), by typing the command &amp;#039;features&amp;#039; on the login node.&lt;br /&gt;
The following features are available:&lt;br /&gt;
Af3, AMD, avx, disk700g, ib_bh, ib_dj, Intel&lt;br /&gt;
&lt;br /&gt;
Use them by adding the parameter --constraint=&amp;lt;feature&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
The meaning of the features:&lt;br /&gt;
&lt;br /&gt;
1. Af3 - nodes which may run alphafold3&lt;br /&gt;
&lt;br /&gt;
2. AMD - Nodes with cpu of amd family&lt;br /&gt;
&lt;br /&gt;
3. avx - Nodes with avx cpu capabilities&lt;br /&gt;
&lt;br /&gt;
4. disk700g - nodes which have at least 700 gb in their local /tmp partition&lt;br /&gt;
&lt;br /&gt;
5. ib_bh - Nodes with infiniband network #1&lt;br /&gt;
&lt;br /&gt;
6. ib_dj - Nodes with infiniband network #2&lt;br /&gt;
&lt;br /&gt;
7. Intel - Nodes with cpu of intel family&lt;br /&gt;
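For instance, an interactive session restricted to Intel nodes with AVX support could be requested like this (partition, account, and qos follow the public examples on this page):&lt;br /&gt;

```shell
srun --ntasks=1 -p power-general-shared-pool -A public-users_v2 --qos=public --constraint=Intel,avx --pty bash
```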
&lt;br /&gt;
==Running matlab example==&lt;br /&gt;
In this example there are 3 files:&lt;br /&gt;
&lt;br /&gt;
myTable.m ⇒ This MATLAB script repeatedly computes and prints a table of trigonometric values&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
fprintf(&amp;#039;=======================================\n&amp;#039;);&lt;br /&gt;
fprintf(&amp;#039; a             b             c              d             \n&amp;#039;);&lt;br /&gt;
fprintf(&amp;#039;=======================================\n&amp;#039;);&lt;br /&gt;
while 1&lt;br /&gt;
    for j = 1:10&lt;br /&gt;
        a = sin(10*j);&lt;br /&gt;
        b = a*cos(10*j);&lt;br /&gt;
        c = a + b;&lt;br /&gt;
        d = a - b;&lt;br /&gt;
        fprintf(&amp;#039;%+6.5f   %+6.5f   %+6.5f   %+6.5f   \n&amp;#039;,a,b,c,d);&lt;br /&gt;
    end&lt;br /&gt;
end&lt;br /&gt;
fprintf(&amp;#039;=======================================\n&amp;#039;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
my_table_script.sh ⇒ This script executes the MATLAB program; submit it with sbatch&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
&lt;br /&gt;
#SBATCH --mem=50M&lt;br /&gt;
#SBATCH --partition power-general-shared-pool&lt;br /&gt;
#SBATCH -A public-users_v2&lt;br /&gt;
hostname&lt;br /&gt;
&lt;br /&gt;
cd /a/home/cc/tree/taucc/staff/dvory/matlab&lt;br /&gt;
&lt;br /&gt;
matlab -nodisplay -nosplash -nodesktop -r &amp;quot;myTable; exit;&amp;quot;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
run_in_loop.sh ⇒ Alternatively, this script submits 100 copies of the job&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
&lt;br /&gt;
for i in {1..100}&lt;br /&gt;
do&lt;br /&gt;
        sbatch my_table_script.sh&lt;br /&gt;
done&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Run it with the following command (after making it executable with chmod +x run_in_loop.sh):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
./run_in_loop.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
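As noted in the job-array section of this guide, submitting hundreds of individual jobs loads the scheduler; the same 100 runs could instead be submitted as a single array job, replacing the whole loop with one command:&lt;br /&gt;

```shell
sbatch --array=1-100 my_table_script.sh
```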
&lt;br /&gt;
==AlphaFold==&lt;br /&gt;
&lt;br /&gt;
AlphaFold is a deep learning tool designed for predicting protein structures.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Guides:&amp;#039;&amp;#039;&amp;#039;  &lt;br /&gt;
&lt;br /&gt;
[https://hpcguide.tau.ac.il/index.php?title=Alphafold AlphaFold Guide]&lt;br /&gt;
&lt;br /&gt;
[https://hpcguide.tau.ac.il/index.php?title=Alphafold3 AlphaFold3 Guide]&lt;br /&gt;
&lt;br /&gt;
==Common SLURM Commands==&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#View all queues (partitions):&lt;br /&gt;
sinfo&lt;br /&gt;
#View all jobs:&lt;br /&gt;
squeue&lt;br /&gt;
#View details of a specific job:&lt;br /&gt;
scontrol show job &amp;lt;job_number&amp;gt;&lt;br /&gt;
#Get information about partitions:&lt;br /&gt;
scontrol show partition&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Troubleshooting &amp;amp; Tips ==&lt;br /&gt;
&lt;br /&gt;
=== Common Errors ===&lt;br /&gt;
&lt;br /&gt;
# &amp;lt;code&amp;gt;srun: error: Unable to allocate resources: No partition specified or system default partition&amp;lt;/code&amp;gt;  &amp;lt;br /&amp;gt;&amp;#039;&amp;#039;&amp;#039;Solution:&amp;#039;&amp;#039;&amp;#039; Always specify a partition. Example:  &amp;lt;code&amp;gt;srun --pty -c 1 --mem=2G -p power-general /bin/bash&amp;lt;/code&amp;gt;&lt;br /&gt;
# Job failed, and when running &amp;lt;code&amp;gt;scontrol show job job_id&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;sacct -j job_id -o JobID,JobName,State%20&amp;lt;/code&amp;gt; &amp;lt;br /&amp;gt;you see &amp;lt;code&amp;gt;JobState=OUT_OF_MEMORY Reason=OutOfMemory&amp;lt;/code&amp;gt; or:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
JobID           JobName                State &lt;br /&gt;
------------ ---------- -------------------- &lt;br /&gt;
71             oom_test        OUT_OF_MEMORY &lt;br /&gt;
71.batch          batch        OUT_OF_MEMORY &lt;br /&gt;
71.extern        extern            COMPLETED &lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;it means that the RAM requested for the job was not enough; please resubmit the job with more RAM. See [https://wikihpc.tau.ac.il/index.php?title=Slurm_user_guide#Estimating_RAM_Usage below] for help understanding how much RAM your job may need.&lt;br /&gt;
&lt;br /&gt;
=== Chain Jobs ===&lt;br /&gt;
Use the &amp;lt;code&amp;gt;--depend&amp;lt;/code&amp;gt; flag to set job dependencies.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sbatch --ntasks=1 --time=60 -p power-general -A power-general-users --depend=afterok:45001 do_work.bash&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Always Specify Resources ===&lt;br /&gt;
When submitting jobs, ensure you include all required resources like partition, memory, and CPUs to avoid job failures.&lt;br /&gt;
&lt;br /&gt;
=== Attaching to Running Jobs ===&lt;br /&gt;
If you need to monitor or interact with a running job, use &amp;lt;code&amp;gt;sattach&amp;lt;/code&amp;gt;. This command allows you to attach to a job&amp;#039;s input, output, and error streams in real-time.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sattach &amp;lt;job_id&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To view job steps of a specific job, use the following command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
scontrol show job &amp;lt;job_id&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Look for sections labeled &amp;quot;StepId&amp;quot; within the output. &lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;For specific job steps, use:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sattach &amp;lt;job_id.step_id&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Note:&amp;#039;&amp;#039;&amp;#039; &amp;lt;code&amp;gt;sattach&amp;lt;/code&amp;gt; is particularly useful for interactive jobs, where you can provide input directly. For non-interactive jobs, it acts like &amp;lt;code&amp;gt;tail -f&amp;lt;/code&amp;gt;, allowing you to monitor the output stream.&lt;br /&gt;
&lt;br /&gt;
=== Estimating RAM Usage ===&lt;br /&gt;
&lt;br /&gt;
When writing SLURM job scripts, it&amp;#039;s crucial to understand and correctly specify the memory requirements for your job. Proper memory allocation ensures efficient resource usage and prevents job failures due to out-of-memory (OOM) errors.&lt;br /&gt;
&lt;br /&gt;
==== Tips for Estimating RAM Usage ====&lt;br /&gt;
&lt;br /&gt;
* Check Application Documentation: Refer to the official documentation or user guides for memory-related information.&lt;br /&gt;
* Run a Small Test Job: Submit a smaller version of your job and monitor its memory usage using commands like &amp;lt;code&amp;gt;free -m&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;top&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;htop&amp;lt;/code&amp;gt;.&lt;br /&gt;
* Use Profiling Tools: Tools like &amp;lt;code&amp;gt;valgrind&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;gprof&amp;lt;/code&amp;gt;, or built-in profilers can help you understand memory usage.&lt;br /&gt;
* Analyze Previous Jobs: Review SLURM logs and job statistics for insights into memory consumption of past jobs.&lt;br /&gt;
* Consult with Peers or Experts: Ask colleagues or experts who have experience with similar workloads.&lt;br /&gt;
&lt;br /&gt;
==== Example: Monitoring Memory Usage ====&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
&lt;br /&gt;
#SBATCH --job-name=memory_test&lt;br /&gt;
#SBATCH --account=your_account&lt;br /&gt;
#SBATCH --partition=your_partition&lt;br /&gt;
#SBATCH --qos=your_qos&lt;br /&gt;
#SBATCH --time=01:00:00&lt;br /&gt;
#SBATCH --ntasks=1&lt;br /&gt;
#SBATCH --cpus-per-task=1&lt;br /&gt;
#SBATCH --mem=4G&lt;br /&gt;
#SBATCH --output=memory_test.out&lt;br /&gt;
#SBATCH --error=memory_test.err&lt;br /&gt;
&lt;br /&gt;
# Monitor memory usage&lt;br /&gt;
echo &amp;quot;Memory usage before running the job:&amp;quot;&lt;br /&gt;
free -m&lt;br /&gt;
&lt;br /&gt;
# Your application commands go here&lt;br /&gt;
# ./your_application&lt;br /&gt;
&lt;br /&gt;
# Monitor memory usage after running the job&lt;br /&gt;
echo &amp;quot;Memory usage after running the job:&amp;quot;&lt;br /&gt;
free -m&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== General Tips ====&lt;br /&gt;
&lt;br /&gt;
* Start Small: Begin with a conservative memory request and increase it based on observed usage.&lt;br /&gt;
* Consider Peak Usage: Plan for peak memory usage to avoid OOM errors.&lt;br /&gt;
* Use SLURM&amp;#039;s Memory Reporting: Use &amp;lt;code&amp;gt;sacct&amp;lt;/code&amp;gt; to view memory usage statistics.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sacct -j &amp;lt;job_id&amp;gt; --format=JobID,JobName,MaxRSS,Elapsed&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;/div&gt;</summary>
		<author><name>Dvory</name></author>
	</entry>
	<entry>
		<id>https://hpcguide.tau.ac.il/index.php?title=Submitting_a_job_to_a_slurm_queue&amp;diff=1549</id>
		<title>Submitting a job to a slurm queue</title>
		<link rel="alternate" type="text/html" href="https://hpcguide.tau.ac.il/index.php?title=Submitting_a_job_to_a_slurm_queue&amp;diff=1549"/>
		<updated>2026-01-18T08:47:46Z</updated>

		<summary type="html">&lt;p&gt;Dvory: /* Using special resources */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Accessing the System ==&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;We have chatgpt page for the new qos configuration, please look in [https://chatgpt.com/g/g-68be7f9acfb88191978615c1693e2cff-hpc-helper-toolkit HPC-helper-toolkit]&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
To submit jobs to SLURM at Tel Aviv University, you need to access the system through one of the following login nodes:&lt;br /&gt;
&lt;br /&gt;
* slurmlogin.tau.ac.il&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Requirements for Access ===&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Group Membership&amp;#039;&amp;#039;&amp;#039;: You must be part of the &amp;quot;power&amp;quot; group to access the resources.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;University Credentials&amp;#039;&amp;#039;&amp;#039;: Use your Tel Aviv University username and password to log in.&lt;br /&gt;
&lt;br /&gt;
These login nodes are your starting point for submitting jobs, checking job status, and managing your SLURM tasks.&lt;br /&gt;
&lt;br /&gt;
=== SSH Example ===&lt;br /&gt;
&lt;br /&gt;
To access the system using SSH, use the following example:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Replace &amp;#039;your_username&amp;#039; with your actual Tel Aviv University username&lt;br /&gt;
ssh your_username@slurmlogin.tau.ac.il&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Your connection will be automatically routed to one of the login nodes:&lt;br /&gt;
powerslurm-login, powerslurm-login2, or powerslurm-login3.&lt;br /&gt;
&lt;br /&gt;
If you have an SSH key set up for password-less login, you can specify it like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Replace &amp;#039;your_username&amp;#039; and &amp;#039;/path/to/your/private_key&amp;#039; accordingly&lt;br /&gt;
ssh -i /path/to/your/private_key your_username@slurmlogin.tau.ac.il&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Environment Modules ==&lt;br /&gt;
&lt;br /&gt;
Environment Modules in SLURM allow users to dynamically modify their shell environment, providing an easy way to load and unload different software applications, libraries, and their dependencies. This system helps avoid conflicts between software versions and ensures the correct environment for running specific applications.&lt;br /&gt;
&lt;br /&gt;
Here are some common commands to work with environment modules:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#List Available Modules: To see all the modules available on the system, use:&lt;br /&gt;
module avail&lt;br /&gt;
&lt;br /&gt;
#To search for a specific module by name (e.g., `gcc`), use:&lt;br /&gt;
module avail gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#Get Detailed Information About a Module: The `module spider` command provides detailed information about a module, including versions, dependencies, and descriptions:&lt;br /&gt;
module spider gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#View Module Settings: To see what environment variables and settings will be modified by a module, use:&lt;br /&gt;
module show gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#Load a Module: To set up the environment for a specific software, use the `module load` command. For example, to load GCC version 12.1.0:&lt;br /&gt;
module load gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#List Loaded Modules: To view all currently loaded modules in your session, use:&lt;br /&gt;
module list&lt;br /&gt;
&lt;br /&gt;
#Unload a Module: To unload a specific module from your environment, use:&lt;br /&gt;
module unload gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#Unload All Modules: If you need to clear your environment of all loaded modules, use:&lt;br /&gt;
module purge&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;By using these commands, you can easily manage the software environments needed for different tasks, ensuring compatibility and reducing potential conflicts between software versions.&lt;br /&gt;
&lt;br /&gt;
== Basic Job Submission Commands ==&lt;br /&gt;
&lt;br /&gt;
=== Finding Your Account and Partition ===&lt;br /&gt;
&lt;br /&gt;
Before submitting a job, you need to know which partitions you have permission to use.&lt;br /&gt;
&lt;br /&gt;
Run the command &amp;lt;code&amp;gt;check_my_partitions&amp;lt;/code&amp;gt; to view a list of all the partitions you have permission to send jobs to.&lt;br /&gt;
&lt;br /&gt;
== Submitting Jobs==&lt;br /&gt;
sbatch: Submits a job script for batch processing.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example&amp;#039;&amp;#039;&amp;#039;:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Submit pre_process.bash to the power-general-shared-pool partition for 10 minutes:&lt;br /&gt;
sbatch --ntasks=1 --time=10 -p power-general-shared-pool -A public-users_v2 --qos=public pre_process.bash&lt;br /&gt;
&lt;br /&gt;
# With 1 GPU:&lt;br /&gt;
sbatch --gres=gpu:1 -p gpu-general-pool -A public-users_v2 --qos=public gpu_job.sh&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Submitting Multiple Jobs ===&lt;br /&gt;
&lt;br /&gt;
If you need to submit many similar jobs (hundreds or more), you should use a &amp;#039;&amp;#039;&amp;#039;Slurm job array&amp;#039;&amp;#039;&amp;#039;. Submitting each job individually using separate &amp;lt;code&amp;gt;sbatch&amp;lt;/code&amp;gt; commands places a heavy load on the scheduler, slowing down job processing across the cluster. Job arrays allow you to bundle many related jobs together as a single submission. This is more efficient and easier to manage.&lt;br /&gt;
&lt;br /&gt;
Each task in the array runs independently like a separate job, but the array is submitted as a single job ID for scheduling and tracking purposes.&lt;br /&gt;
You can customize the behavior of each task using the environment variable &amp;lt;code&amp;gt;SLURM_ARRAY_TASK_ID&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
==== Script Example: Job Array ====&lt;br /&gt;
&lt;br /&gt;
This script submits a job array with 100 tasks, each processing a different input file. The array reduces scheduler load and simplifies job tracking.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --job-name=array_job            # Job name&lt;br /&gt;
#SBATCH --account=public-users_v2       # Account name&lt;br /&gt;
#SBATCH --partition=power-general-shared-pool       # Partition name&lt;br /&gt;
#SBATCH --qos=public                    # qos type&lt;br /&gt;
#SBATCH --time=02:00:00                 # Max run time (hh:mm:ss)&lt;br /&gt;
#SBATCH --ntasks=1                      # Number of tasks per array job&lt;br /&gt;
#SBATCH --nodes=1                       # Number of nodes&lt;br /&gt;
#SBATCH --cpus-per-task=1               # CPUs per task&lt;br /&gt;
#SBATCH --mem-per-cpu=4G                # Memory per CPU&lt;br /&gt;
#SBATCH --array=1-100                   # Array range: 100 tasks&lt;br /&gt;
#SBATCH --output=array_job_%A_%a.out    # Output file: Job ID and array task ID&lt;br /&gt;
#SBATCH --error=array_job_%A_%a.err     # Error file: Job ID and array task ID&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Starting SLURM array task&amp;quot;&lt;br /&gt;
echo &amp;quot;Job ID: $SLURM_JOB_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Array Task ID: $SLURM_ARRAY_TASK_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Running on node(s): $SLURM_JOB_NODELIST&amp;quot;&lt;br /&gt;
echo &amp;quot;Allocated CPUs: $SLURM_JOB_CPUS_PER_NODE&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Your application commands go here&lt;br /&gt;
# You can use $SLURM_ARRAY_TASK_ID to customize behavior per task&lt;br /&gt;
# ./my_program input_${SLURM_ARRAY_TASK_ID}.txt&lt;br /&gt;
echo &amp;quot;Task completed&amp;quot;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
In this example:&lt;br /&gt;
* The job array consists of 100 tasks.&lt;br /&gt;
* Each task runs the same script but with a different input file.&lt;br /&gt;
* You access the task ID using the environment variable &amp;lt;code&amp;gt;SLURM_ARRAY_TASK_ID&amp;lt;/code&amp;gt;.&lt;br /&gt;
* The output and error logs are separated per task using &amp;lt;code&amp;gt;%A&amp;lt;/code&amp;gt; (job ID) and &amp;lt;code&amp;gt;%a&amp;lt;/code&amp;gt; (array task ID).&lt;br /&gt;
&lt;br /&gt;
==== Script Example: Job Array with different parameters per task ====&lt;br /&gt;
&lt;br /&gt;
This script submits a job array with 3 tasks. Each task runs the same program with a different input file: &amp;lt;code&amp;gt;data1.txt&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;data2.txt&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;data3.txt&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --job-name=array_job            # Job name&lt;br /&gt;
#SBATCH --account=public-users_v2       # Account name&lt;br /&gt;
#SBATCH --partition=power-general-shared-pool       # Partition name&lt;br /&gt;
#SBATCH --qos=public                    # qos type&lt;br /&gt;
#SBATCH --time=01:00:00                 # Max run time (hh:mm:ss)&lt;br /&gt;
#SBATCH --ntasks=1                      # Number of tasks per array job&lt;br /&gt;
#SBATCH --nodes=1                       # Number of nodes&lt;br /&gt;
#SBATCH --cpus-per-task=1               # CPUs per task&lt;br /&gt;
#SBATCH --mem-per-cpu=2G                # Memory per CPU&lt;br /&gt;
#SBATCH --array=1-3                     # Run 3 tasks with IDs 1, 2, 3&lt;br /&gt;
#SBATCH --output=array_%A_%a.out        # Output file: Job ID and task ID&lt;br /&gt;
#SBATCH --error=array_%A_%a.err         # Error file: Job ID and task ID&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Starting SLURM array task&amp;quot;&lt;br /&gt;
echo &amp;quot;Job ID: $SLURM_JOB_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Array Task ID: $SLURM_ARRAY_TASK_ID&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Each task runs the program with a different input file&lt;br /&gt;
./my_program data${SLURM_ARRAY_TASK_ID}.txt&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Task completed&amp;quot;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
===Writing Single SLURM Job Scripts===&lt;br /&gt;
&lt;br /&gt;
Here is a simple job script example:&lt;br /&gt;
&lt;br /&gt;
==== Basic Script ====&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --job-name=my_job             # Job name&lt;br /&gt;
#SBATCH --account=public-users_v2     # Account name&lt;br /&gt;
#SBATCH --partition=power-general-shared-pool       # Partition name&lt;br /&gt;
#SBATCH --qos=public                  # qos type&lt;br /&gt;
#SBATCH --time=02:00:00               # Max run time (hh:mm:ss)&lt;br /&gt;
#SBATCH --ntasks=1                    # Number of tasks&lt;br /&gt;
#SBATCH --nodes=1                     # Number of nodes&lt;br /&gt;
#SBATCH --cpus-per-task=1             # CPUs per task&lt;br /&gt;
#SBATCH --mem-per-cpu=4G              # Memory per CPU&lt;br /&gt;
#SBATCH --output=my_job_%j.out        # Output file&lt;br /&gt;
#SBATCH --error=my_job_%j.err         # Error file&lt;br /&gt;
#SBATCH --mail-user=&amp;lt;your email&amp;gt;      # Your mail address to receive an email&lt;br /&gt;
#SBATCH --mail-type=END,FAIL          # The mail will be sent upon ending the script successfully or not&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Starting my SLURM job&amp;quot;&lt;br /&gt;
echo &amp;quot;Job ID: $SLURM_JOB_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Running on nodes: $SLURM_JOB_NODELIST&amp;quot;&lt;br /&gt;
echo &amp;quot;Allocated CPUs: $SLURM_JOB_CPUS_PER_NODE&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Your application commands go here&lt;br /&gt;
# ./my_program&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Job completed&amp;quot;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To ask for x cores interactively:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
srun --ntasks=1 --cpus-per-task=x  --partition=power-general-public-pool --account=public-users_v2 --qos=public --nodes=1 --pty bash&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Currently, you also need to set the following SLURM parameters inside the script or within the interactive job (adjust 48 to match the number of cores you requested):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
export SLURM_TASKS_PER_NODE=48&lt;br /&gt;
export SLURM_CPUS_ON_NODE=48&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Script for 1 GPU ====&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --job-name=gpu_job             # Job name&lt;br /&gt;
#SBATCH --account=my_account           # Account name&lt;br /&gt;
#SBATCH --partition=gpu-general-pool   # Partition name&lt;br /&gt;
#SBATCH --qos=my_qos                   # qos type&lt;br /&gt;
#SBATCH --time=02:00:00                # Max run time&lt;br /&gt;
#SBATCH --ntasks=1                     # Number of tasks&lt;br /&gt;
#SBATCH --nodes=1                      # Number of nodes&lt;br /&gt;
#SBATCH --cpus-per-task=1              # CPUs per task&lt;br /&gt;
#SBATCH --gres=gpu:1                   # Number of GPUs&lt;br /&gt;
#SBATCH --mem-per-cpu=4G               # Memory per CPU&lt;br /&gt;
#SBATCH --output=my_job_%j.out         # Output file&lt;br /&gt;
#SBATCH --error=my_job_%j.err          # Error file&lt;br /&gt;
&lt;br /&gt;
module load python/python-3.8&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Starting GPU job&amp;quot;&lt;br /&gt;
echo &amp;quot;Job ID: $SLURM_JOB_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Running on nodes: $SLURM_JOB_NODELIST&amp;quot;&lt;br /&gt;
echo &amp;quot;Allocated CPUs: $SLURM_JOB_CPUS_PER_NODE&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Your GPU commands go here&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Job completed&amp;quot;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
To exclude specific nodes, add the following:&lt;br /&gt;
&amp;lt;syntaxhighlight&amp;gt;&lt;br /&gt;
#SBATCH --exclude=compute-0-[100-103],compute-0-67&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Importance of Correct RAM Usage in Jobs===&lt;br /&gt;
&lt;br /&gt;
When writing SLURM job scripts, it&amp;#039;s crucial to understand and correctly specify the memory requirements for your job. &lt;br /&gt;
&lt;br /&gt;
Proper memory allocation ensures efficient resource usage and prevents job failures due to out-of-memory (OOM) errors.&lt;br /&gt;
&lt;br /&gt;
==== Why Correct RAM Usage Matters ====&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Resource Efficiency&amp;#039;&amp;#039;&amp;#039;: Allocating the right amount of memory helps in optimal resource utilization, allowing more jobs to run simultaneously on the cluster.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Job Stability&amp;#039;&amp;#039;&amp;#039;: Underestimating memory requirements can lead to OOM errors, causing your job to fail and waste computational resources.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Performance&amp;#039;&amp;#039;&amp;#039;: Overestimating memory needs can lead to underutilization of resources, potentially delaying other jobs in the queue.&lt;br /&gt;
&lt;br /&gt;
==== How to Specify Memory in SLURM ====&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;--mem&amp;#039;&amp;#039;&amp;#039;: Specifies the total memory required for the job.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;--mem-per-cpu&amp;#039;&amp;#039;&amp;#039;: Specifies the memory required per CPU.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example&amp;#039;&amp;#039;&amp;#039;:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#SBATCH --mem=4G              # Total memory for the job&lt;br /&gt;
#SBATCH --mem-per-cpu=2G      # Memory per CPU&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Interactive Jobs===&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#Start an interactive session:&lt;br /&gt;
srun --ntasks=1 -p power-general-shared-pool -A public-users_v2 --qos=public --pty bash&lt;br /&gt;
&lt;br /&gt;
#Specify a compute node:&lt;br /&gt;
srun --ntasks=1 -p power-general-shared-pool -A public-users_v2 --qos=public --nodelist=&amp;quot;compute-0-12&amp;quot; --pty bash&lt;br /&gt;
&lt;br /&gt;
#Using GUI:&lt;br /&gt;
srun --ntasks=1 -p power-general-shared-pool -A public-users_v2 --qos=public --x11 /bin/bash&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Submitting RELION Jobs===&lt;br /&gt;
&lt;br /&gt;
To submit a RELION job interactively on the &amp;lt;code&amp;gt;gpu-relion&amp;lt;/code&amp;gt; queue with X11 forwarding, use the following steps:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#Start an interactive session with X11:&lt;br /&gt;
srun --ntasks=1 -p gpu-relion-pool -A gpu-relion-users_v2 --qos=owner --x11 --pty bash&lt;br /&gt;
#Load the RELION module:&lt;br /&gt;
module load relion/relion-4.0.1&lt;br /&gt;
#Launch RELION:&lt;br /&gt;
relion&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Using special resources==&lt;br /&gt;
There are several defined resources that can be used in addition to the regular resources (e.g. memory, CPUs).&lt;br /&gt;
Those resources are:&lt;br /&gt;
GresTypes=gpu,amd,af3,intel,disk700g&lt;br /&gt;
&lt;br /&gt;
When your job needs one or more of the above resources, add &amp;quot;--constraint=&amp;lt;comma-separated resource list&amp;gt;&amp;quot; to your submission.&lt;br /&gt;
For example, one may submit a job with the parameter --constraint=disk700g,amd&lt;br /&gt;
This means that the job requires AMD-family processors and up to 700 GB of local cache disk space during the job&amp;#039;s run.&lt;br /&gt;
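As a sketch, the constraint can also be placed directly in a batch script; the partition, account, and QOS below follow the public examples elsewhere on this page and should be adjusted to your own:

```shell
#!/bin/bash
#SBATCH --job-name=constraint_job
#SBATCH --partition=power-general-shared-pool
#SBATCH --account=public-users_v2
#SBATCH --qos=public
#SBATCH --constraint=disk700g,amd   # AMD node with 700 GB local cache disk
#SBATCH --mem-per-cpu=4G

hostname
```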
&lt;br /&gt;
If you use the local disk as cache, you need to remove the files when the script ends, whether it exits normally or abnormally.&lt;br /&gt;
&lt;br /&gt;
This can be achieved by adding the following to the script:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
export CACHEDIR=/tmp/dvory_${SLURM_JOB_ID}&lt;br /&gt;
 &lt;br /&gt;
mkdir -p $CACHEDIR&lt;br /&gt;
sleep 200&lt;br /&gt;
 &lt;br /&gt;
cleanup() {&lt;br /&gt;
  rm -rf -- &amp;quot;$CACHEDIR&amp;quot; || true&lt;br /&gt;
}&lt;br /&gt;
trap cleanup EXIT INT TERM HUP&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Running matlab example==&lt;br /&gt;
In this example there are 3 files:&lt;br /&gt;
&lt;br /&gt;
myTable.m ⇒ This MATLAB script prints sample calculations in an endless loop&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
fprintf(&amp;#039;=======================================\n&amp;#039;);&lt;br /&gt;
fprintf(&amp;#039; a             b             c              d             \n&amp;#039;);&lt;br /&gt;
fprintf(&amp;#039;=======================================\n&amp;#039;);&lt;br /&gt;
while 1&lt;br /&gt;
                for j = 1:10&lt;br /&gt;
                                a = sin(10*j);&lt;br /&gt;
                                b = a*cos(10*j);&lt;br /&gt;
                                c = a + b;&lt;br /&gt;
                                d = a - b;&lt;br /&gt;
                                fprintf(&amp;#039;%+6.5f   %+6.5f   %+6.5f   %+6.5f   \n&amp;#039;,a,b,c,d);&lt;br /&gt;
                end&lt;br /&gt;
end&lt;br /&gt;
fprintf(&amp;#039;=======================================\n&amp;#039;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
my_table_script.sh ⇒ This script runs the MATLAB program. Just submit it with sbatch&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
&lt;br /&gt;
#SBATCH --mem=50M&lt;br /&gt;
#SBATCH --partition power-general-shared-pool&lt;br /&gt;
#SBATCH -A public-users_v2&lt;br /&gt;
hostname&lt;br /&gt;
&lt;br /&gt;
cd /a/home/cc/tree/taucc/staff/dvory/matlab&lt;br /&gt;
&lt;br /&gt;
matlab -nodisplay -nosplash -nodesktop -r &amp;quot;myTable; exit;&amp;quot;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
run_in_loop.sh ⇒ This script submits my_table_script.sh 100 times, generating many jobs&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
&lt;br /&gt;
for i in {1..100}&lt;br /&gt;
&lt;br /&gt;
do&lt;br /&gt;
&lt;br /&gt;
        sbatch my_table_script.sh&lt;br /&gt;
&lt;br /&gt;
done&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Run the jobs with the following command (after making the script executable with chmod +x run_in_loop.sh):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
./run_in_loop.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
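As an alternative sketch, a SLURM job array submits the same 100 runs as a single job, which is lighter on the scheduler than 100 separate sbatch calls; the directives below mirror my_table_script.sh above:

```shell
#!/bin/bash
#SBATCH --mem=50M
#SBATCH --partition power-general-shared-pool
#SBATCH -A public-users_v2
#SBATCH --array=1-100          # one array task per former loop iteration

hostname
cd /a/home/cc/tree/taucc/staff/dvory/matlab
matlab -nodisplay -nosplash -nodesktop -r "myTable; exit;"
```

Submit it once with &amp;lt;code&amp;gt;sbatch&amp;lt;/code&amp;gt;; SLURM then schedules the 100 tasks under a single job ID.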
&lt;br /&gt;
&lt;br /&gt;
==AlphaFold==&lt;br /&gt;
&lt;br /&gt;
AlphaFold is a deep learning tool designed for predicting protein structures.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Guides:&amp;#039;&amp;#039;&amp;#039;  &lt;br /&gt;
&lt;br /&gt;
[https://hpcguide.tau.ac.il/index.php?title=Alphafold AlphaFold Guide]&lt;br /&gt;
&lt;br /&gt;
[https://hpcguide.tau.ac.il/index.php?title=Alphafold3 AlphaFold3 Guide]&lt;br /&gt;
&lt;br /&gt;
==Common SLURM Commands==&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#View all queues (partitions):&lt;br /&gt;
sinfo&lt;br /&gt;
#View all jobs:&lt;br /&gt;
squeue&lt;br /&gt;
#View details of a specific job:&lt;br /&gt;
scontrol show job &amp;lt;job_number&amp;gt;&lt;br /&gt;
#Get information about partitions:&lt;br /&gt;
scontrol show partition&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Troubleshooting &amp;amp; Tips ==&lt;br /&gt;
&lt;br /&gt;
=== Common Errors ===&lt;br /&gt;
&lt;br /&gt;
# &amp;lt;code&amp;gt;srun: error: Unable to allocate resources: No partition specified or system default partition&amp;lt;/code&amp;gt;  &amp;lt;br /&amp;gt;&amp;#039;&amp;#039;&amp;#039;Solution:&amp;#039;&amp;#039;&amp;#039; Always specify a partition. Example:  &amp;lt;code&amp;gt;srun --pty -c 1 --mem=2G -p power-general /bin/bash&amp;lt;/code&amp;gt;&lt;br /&gt;
# Job failed, and running &amp;lt;code&amp;gt;scontrol show job &amp;lt;job_id&amp;gt;&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;sacct -j &amp;lt;job_id&amp;gt; -o JobID,JobName,State%20&amp;lt;/code&amp;gt;  &amp;lt;br /&amp;gt;shows &amp;lt;code&amp;gt;JobState=OUT_OF_MEMORY Reason=OutOfMemory&amp;lt;/code&amp;gt; or:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
JobID           JobName                State &lt;br /&gt;
------------ ---------- -------------------- &lt;br /&gt;
71             oom_test        OUT_OF_MEMORY &lt;br /&gt;
71.batch          batch        OUT_OF_MEMORY &lt;br /&gt;
71.extern        extern            COMPLETED &lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;This means the RAM requested for the job was not enough; resubmit the job with more RAM. See [https://wikihpc.tau.ac.il/index.php?title=Slurm_user_guide#Estimating_RAM_Usage below] for help estimating how much RAM your job may need.&lt;br /&gt;
&lt;br /&gt;
=== Chain Jobs ===&lt;br /&gt;
Use the &amp;lt;code&amp;gt;--depend&amp;lt;/code&amp;gt; flag to set job dependencies, e.g. &amp;lt;code&amp;gt;afterok:&amp;lt;job_id&amp;gt;&amp;lt;/code&amp;gt; to start a job only after the listed job completes successfully.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sbatch --ntasks=1 --time=60 -p power-general -A power-general-users --depend=afterok:45001 do_work.bash&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Always Specify Resources ===&lt;br /&gt;
When submitting jobs, ensure you include all required resources like partition, memory, and CPUs to avoid job failures.&lt;br /&gt;
&lt;br /&gt;
=== Attaching to Running Jobs ===&lt;br /&gt;
If you need to monitor or interact with a running job, use &amp;lt;code&amp;gt;sattach&amp;lt;/code&amp;gt;. This command allows you to attach to a job&amp;#039;s input, output, and error streams in real-time.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sattach &amp;lt;job_id&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To view job steps of a specific job, use the following command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
scontrol show job &amp;lt;job_id&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Look for sections labeled &amp;quot;StepId&amp;quot; within the output. &lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;For specific job steps, use:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sattach &amp;lt;job_id.step_id&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Note:&amp;#039;&amp;#039;&amp;#039; &amp;lt;code&amp;gt;sattach&amp;lt;/code&amp;gt; is particularly useful for interactive jobs, where you can provide input directly. For non-interactive jobs, it acts like &amp;lt;code&amp;gt;tail -f&amp;lt;/code&amp;gt;, allowing you to monitor the output stream.&lt;br /&gt;
&lt;br /&gt;
=== Estimating RAM Usage ===&lt;br /&gt;
&lt;br /&gt;
When writing SLURM job scripts, it&amp;#039;s crucial to understand and correctly specify the memory requirements for your job. Proper memory allocation ensures efficient resource usage and prevents job failures due to out-of-memory (OOM) errors.&lt;br /&gt;
&lt;br /&gt;
==== Tips for Estimating RAM Usage ====&lt;br /&gt;
&lt;br /&gt;
* Check Application Documentation: Refer to the official documentation or user guides for memory-related information.&lt;br /&gt;
* Run a Small Test Job: Submit a smaller version of your job and monitor its memory usage using commands like `free -m`, `top`, or `htop`.&lt;br /&gt;
* Use Profiling Tools: Tools like `valgrind`, `gprof`, or built-in profilers can help you understand memory usage.&lt;br /&gt;
* Analyze Previous Jobs: Review SLURM logs and job statistics for insights into memory consumption of past jobs.&lt;br /&gt;
* Consult with Peers or Experts: Ask colleagues or experts who have experience with similar workloads.&lt;br /&gt;
&lt;br /&gt;
==== Example: Monitoring Memory Usage ====&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
&lt;br /&gt;
#SBATCH --job-name=memory_test&lt;br /&gt;
#SBATCH --account=your_account&lt;br /&gt;
#SBATCH --partition=your_partition&lt;br /&gt;
#SBATCH --qos=your_qos&lt;br /&gt;
#SBATCH --time=01:00:00&lt;br /&gt;
#SBATCH --ntasks=1&lt;br /&gt;
#SBATCH --cpus-per-task=1&lt;br /&gt;
#SBATCH --mem=4G&lt;br /&gt;
#SBATCH --output=memory_test.out&lt;br /&gt;
#SBATCH --error=memory_test.err&lt;br /&gt;
&lt;br /&gt;
# Monitor memory usage&lt;br /&gt;
echo &amp;quot;Memory usage before running the job:&amp;quot;&lt;br /&gt;
free -m&lt;br /&gt;
&lt;br /&gt;
# Your application commands go here&lt;br /&gt;
# ./your_application&lt;br /&gt;
&lt;br /&gt;
# Monitor memory usage after running the job&lt;br /&gt;
echo &amp;quot;Memory usage after running the job:&amp;quot;&lt;br /&gt;
free -m&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== General Tips ====&lt;br /&gt;
&lt;br /&gt;
* Start Small: Begin with a conservative memory request and increase it based on observed usage.&lt;br /&gt;
* Consider Peak Usage: Plan for peak memory usage to avoid OOM errors.&lt;br /&gt;
* Use SLURM&amp;#039;s Memory Reporting: Use `sacct` to view memory usage statistics.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sacct -j &amp;lt;job_id&amp;gt; --format=JobID,JobName,MaxRSS,Elapsed&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;/div&gt;</summary>
		<author><name>Dvory</name></author>
	</entry>
	<entry>
		<id>https://hpcguide.tau.ac.il/index.php?title=Submitting_a_job_to_a_slurm_queue&amp;diff=1548</id>
		<title>Submitting a job to a slurm queue</title>
		<link rel="alternate" type="text/html" href="https://hpcguide.tau.ac.il/index.php?title=Submitting_a_job_to_a_slurm_queue&amp;diff=1548"/>
		<updated>2026-01-18T08:45:35Z</updated>

		<summary type="html">&lt;p&gt;Dvory: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Accessing the System ==&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;We have chatgpt page for the new qos configuration, please look in [https://chatgpt.com/g/g-68be7f9acfb88191978615c1693e2cff-hpc-helper-toolkit HPC-helper-toolkit]&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
To submit jobs to SLURM at Tel Aviv University, you need to access the system through one of the following login nodes:&lt;br /&gt;
&lt;br /&gt;
* slurmlogin.tau.ac.il&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Requirements for Access ===&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Group Membership&amp;#039;&amp;#039;&amp;#039;: You must be part of the &amp;quot;power&amp;quot; group to access the resources.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;University Credentials&amp;#039;&amp;#039;&amp;#039;: Use your Tel Aviv University username and password to log in.&lt;br /&gt;
&lt;br /&gt;
These login nodes are your starting point for submitting jobs, checking job status, and managing your SLURM tasks.&lt;br /&gt;
&lt;br /&gt;
=== SSH Example ===&lt;br /&gt;
&lt;br /&gt;
To access the system using SSH, use the following example:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Replace &amp;#039;your_username&amp;#039; with your actual Tel Aviv University username&lt;br /&gt;
ssh your_username@slurmlogin.tau.ac.il&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Your connection will be automatically routed to one of the login nodes:&lt;br /&gt;
powerslurm-login, powerslurm-login2, or powerslurm-login3.&lt;br /&gt;
&lt;br /&gt;
If you have an SSH key set up for password-less login, you can specify it like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Replace &amp;#039;your_username&amp;#039; and &amp;#039;/path/to/your/private_key&amp;#039; accordingly&lt;br /&gt;
ssh -i /path/to/your/private_key your_username@slurmlogin.tau.ac.il&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Environment Modules ==&lt;br /&gt;
&lt;br /&gt;
Environment Modules in SLURM allow users to dynamically modify their shell environment, providing an easy way to load and unload different software applications, libraries, and their dependencies. This system helps avoid conflicts between software versions and ensures the correct environment for running specific applications.&lt;br /&gt;
&lt;br /&gt;
Here are some common commands to work with environment modules:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#List Available Modules: To see all the modules available on the system, use:&lt;br /&gt;
module avail&lt;br /&gt;
&lt;br /&gt;
#To search for a specific module by name (e.g., `gcc`), use:&lt;br /&gt;
module avail gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#Get Detailed Information About a Module: The `module spider` command provides detailed information about a module, including versions, dependencies, and descriptions:&lt;br /&gt;
module spider gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#View Module Settings: To see what environment variables and settings will be modified by a module, use:&lt;br /&gt;
module show gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#Load a Module: To set up the environment for a specific software, use the `module load` command. For example, to load GCC version 12.1.0:&lt;br /&gt;
module load gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#List Loaded Modules: To view all currently loaded modules in your session, use:&lt;br /&gt;
module list&lt;br /&gt;
&lt;br /&gt;
#Unload a Module: To unload a specific module from your environment, use:&lt;br /&gt;
module unload gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#Unload All Modules: If you need to clear your environment of all loaded modules, use:&lt;br /&gt;
module purge&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;By using these commands, you can easily manage the software environments needed for different tasks, ensuring compatibility and reducing potential conflicts between software versions.&lt;br /&gt;
&lt;br /&gt;
== Basic Job Submission Commands ==&lt;br /&gt;
&lt;br /&gt;
=== Finding Your Account and Partition ===&lt;br /&gt;
&lt;br /&gt;
Before submitting a job, you need to know which partitions you have permission to use.&lt;br /&gt;
&lt;br /&gt;
Run the command &amp;lt;code&amp;gt;check_my_partitions&amp;lt;/code&amp;gt; to view a list of all the partitions you have permission to send jobs to.&lt;br /&gt;
&lt;br /&gt;
== Submitting Jobs==&lt;br /&gt;
sbatch: Submits a job script for batch processing.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example&amp;#039;&amp;#039;&amp;#039;:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
    sbatch --ntasks=1 --time=10 -p power-general-shared-pool -A public-users_v2 --qos=public pre_process.bash&lt;br /&gt;
   # This command submits pre_process.bash to the power-general-shared-pool partition for 10 minutes. &lt;br /&gt;
   # With 1 GPU:&lt;br /&gt;
    sbatch --gres=gpu:1 -p gpu-general-pool -A public-users_v2 --qos=public gpu_job.sh&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Submitting Multiple Jobs ===&lt;br /&gt;
&lt;br /&gt;
If you need to submit many similar jobs (hundreds or more), you should use a &amp;#039;&amp;#039;&amp;#039;Slurm job array&amp;#039;&amp;#039;&amp;#039;. Submitting each job individually with separate &amp;lt;code&amp;gt;sbatch&amp;lt;/code&amp;gt; commands places a heavy load on the scheduler, slowing down job processing across the cluster. Job arrays bundle many related jobs into a single submission, which is more efficient and easier to manage.&lt;br /&gt;
&lt;br /&gt;
Each task in the array runs independently like a separate job, but the array is submitted as a single job ID for scheduling and tracking purposes.&lt;br /&gt;
You can customize the behavior of each task using the environment variable &amp;lt;code&amp;gt;SLURM_ARRAY_TASK_ID&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
==== Script Example: Job Array ====&lt;br /&gt;
&lt;br /&gt;
This script submits a job array with 100 tasks, each processing a different input file. The array reduces scheduler load and simplifies job tracking.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --job-name=array_job            # Job name&lt;br /&gt;
#SBATCH --account=public-users_v2       # Account name&lt;br /&gt;
#SBATCH --partition=power-general-shared-pool       # Partition name&lt;br /&gt;
#SBATCH --qos=public                    # qos type&lt;br /&gt;
#SBATCH --time=02:00:00                 # Max run time (hh:mm:ss)&lt;br /&gt;
#SBATCH --ntasks=1                      # Number of tasks per array job&lt;br /&gt;
#SBATCH --nodes=1                       # Number of nodes&lt;br /&gt;
#SBATCH --cpus-per-task=1               # CPUs per task&lt;br /&gt;
#SBATCH --mem-per-cpu=4G                # Memory per CPU&lt;br /&gt;
#SBATCH --array=1-100                   # Array range: 100 tasks&lt;br /&gt;
#SBATCH --output=array_job_%A_%a.out    # Output file: Job ID and array task ID&lt;br /&gt;
#SBATCH --error=array_job_%A_%a.err     # Error file: Job ID and array task ID&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Starting SLURM array task&amp;quot;&lt;br /&gt;
echo &amp;quot;Job ID: $SLURM_JOB_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Array Task ID: $SLURM_ARRAY_TASK_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Running on node(s): $SLURM_JOB_NODELIST&amp;quot;&lt;br /&gt;
echo &amp;quot;Allocated CPUs: $SLURM_JOB_CPUS_PER_NODE&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Your application commands go here&lt;br /&gt;
# You can use $SLURM_ARRAY_TASK_ID to customize behavior per task&lt;br /&gt;
# ./my_program input_${SLURM_ARRAY_TASK_ID}.txt&lt;br /&gt;
echo &amp;quot;Task completed&amp;quot;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
In this example:&lt;br /&gt;
* The job array consists of 100 tasks.&lt;br /&gt;
* Each task runs the same script but with a different input file.&lt;br /&gt;
* You access the task ID using the environment variable &amp;lt;code&amp;gt;SLURM_ARRAY_TASK_ID&amp;lt;/code&amp;gt;.&lt;br /&gt;
* The output and error logs are separated per task using &amp;lt;code&amp;gt;%A&amp;lt;/code&amp;gt; (job ID) and &amp;lt;code&amp;gt;%a&amp;lt;/code&amp;gt; (array task ID).&lt;br /&gt;
&lt;br /&gt;
==== Script Example: Job Array with different parameters per task ====&lt;br /&gt;
&lt;br /&gt;
This script submits a job array with 3 tasks. Each task runs the same program with a different input file: `data1.txt`, `data2.txt`, and `data3.txt`.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --job-name=array_job            # Job name&lt;br /&gt;
#SBATCH --account=public-users_v2       # Account name&lt;br /&gt;
#SBATCH --partition=power-general-shared-pool       # Partition name&lt;br /&gt;
#SBATCH --qos=public                    # qos type&lt;br /&gt;
#SBATCH --time=01:00:00                 # Max run time (hh:mm:ss)&lt;br /&gt;
#SBATCH --ntasks=1                      # Number of tasks per array job&lt;br /&gt;
#SBATCH --nodes=1                       # Number of nodes&lt;br /&gt;
#SBATCH --cpus-per-task=1               # CPUs per task&lt;br /&gt;
#SBATCH --mem-per-cpu=2G                # Memory per CPU&lt;br /&gt;
#SBATCH --array=1-3                     # Run 3 tasks with IDs 1, 2, 3&lt;br /&gt;
#SBATCH --output=array_%A_%a.out        # Output file: Job ID and task ID&lt;br /&gt;
#SBATCH --error=array_%A_%a.err         # Error file: Job ID and task ID&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Starting SLURM array task&amp;quot;&lt;br /&gt;
echo &amp;quot;Job ID: $SLURM_JOB_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Array Task ID: $SLURM_ARRAY_TASK_ID&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Each task runs the program with a different input file&lt;br /&gt;
./my_program data${SLURM_ARRAY_TASK_ID}.txt&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Task completed&amp;quot;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
===Writing Single SLURM Job Scripts===&lt;br /&gt;
&lt;br /&gt;
Here is a simple job script example:&lt;br /&gt;
&lt;br /&gt;
==== Basic Script====&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --job-name=my_job             # Job name&lt;br /&gt;
#SBATCH --account=public-users_v2     # Account name&lt;br /&gt;
#SBATCH --partition=power-general-shared-pool       # Partition name&lt;br /&gt;
#SBATCH --qos=public                  # qos type&lt;br /&gt;
#SBATCH --time=02:00:00               # Max run time (hh:mm:ss)&lt;br /&gt;
#SBATCH --ntasks=1                    # Number of tasks&lt;br /&gt;
#SBATCH --nodes=1                     # Number of nodes&lt;br /&gt;
#SBATCH --cpus-per-task=1             # CPUs per task&lt;br /&gt;
#SBATCH --mem-per-cpu=4G              # Memory per CPU&lt;br /&gt;
#SBATCH --output=my_job_%j.out        # Output file&lt;br /&gt;
#SBATCH --error=my_job_%j.err         # Error file&lt;br /&gt;
#SBATCH --mail-user=&amp;lt;your email&amp;gt;      # Your mail address to receive an email&lt;br /&gt;
#SBATCH --mail-type=END,FAIL          # The mail will be sent upon ending the script successfully or not&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Starting my SLURM job&amp;quot;&lt;br /&gt;
echo &amp;quot;Job ID: $SLURM_JOB_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Running on nodes: $SLURM_JOB_NODELIST&amp;quot;&lt;br /&gt;
echo &amp;quot;Allocated CPUs: $SLURM_JOB_CPUS_PER_NODE&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Your application commands go here&lt;br /&gt;
# ./my_program&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Job completed&amp;quot;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To ask for x cores interactively:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
srun --ntasks=1 --cpus-per-task=x  --partition=power-general-public-pool --account=public-users_v2 --qos=public --nodes=1 --pty bash&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
However, for now you also need to set these SLURM parameters inside the script, or within the interactive job:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
export SLURM_TASKS_PER_NODE=48&lt;br /&gt;
export SLURM_CPUS_ON_NODE=48&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Script for 1 GPU ====&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --job-name=gpu_job             # Job name&lt;br /&gt;
#SBATCH --account=my_account           # Account name&lt;br /&gt;
#SBATCH --partition=gpu-general-pool   # Partition name&lt;br /&gt;
#SBATCH --qos=my_qos                   # qos type&lt;br /&gt;
#SBATCH --time=02:00:00                # Max run time&lt;br /&gt;
#SBATCH --ntasks=1                     # Number of tasks&lt;br /&gt;
#SBATCH --nodes=1                      # Number of nodes&lt;br /&gt;
#SBATCH --cpus-per-task=1              # CPUs per task&lt;br /&gt;
#SBATCH --gres=gpu:1                   # Number of GPUs&lt;br /&gt;
#SBATCH --mem-per-cpu=4G               # Memory per CPU&lt;br /&gt;
#SBATCH --output=my_job_%j.out         # Output file&lt;br /&gt;
#SBATCH --error=my_job_%j.err          # Error file&lt;br /&gt;
&lt;br /&gt;
module load python/python-3.8&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Starting GPU job&amp;quot;&lt;br /&gt;
echo &amp;quot;Job ID: $SLURM_JOB_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Running on nodes: $SLURM_JOB_NODELIST&amp;quot;&lt;br /&gt;
echo &amp;quot;Allocated CPUs: $SLURM_JOB_CPUS_PER_NODE&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Your GPU commands go here&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Job completed&amp;quot;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
To exclude one or more nodes, add the following:&lt;br /&gt;
&amp;lt;syntaxhighlight&amp;gt;&lt;br /&gt;
#SBATCH --exclude=compute-0-[100-103],compute-0-67&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Importance of Correct RAM Usage in Jobs===&lt;br /&gt;
&lt;br /&gt;
When writing SLURM job scripts, it&amp;#039;s crucial to understand and correctly specify the memory requirements for your job. &lt;br /&gt;
&lt;br /&gt;
Proper memory allocation ensures efficient resource usage and prevents job failures due to out-of-memory (OOM) errors.&lt;br /&gt;
&lt;br /&gt;
==== Why Correct RAM Usage Matters ====&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Resource Efficiency&amp;#039;&amp;#039;&amp;#039;: Allocating the right amount of memory helps in optimal resource utilization, allowing more jobs to run simultaneously on the cluster.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Job Stability&amp;#039;&amp;#039;&amp;#039;: Underestimating memory requirements can lead to OOM errors, causing your job to fail and waste computational resources.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Performance&amp;#039;&amp;#039;&amp;#039;: Overestimating memory needs can lead to underutilization of resources, potentially delaying other jobs in the queue.&lt;br /&gt;
&lt;br /&gt;
==== How to Specify Memory in SLURM ====&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;--mem&amp;#039;&amp;#039;&amp;#039;: Specifies the total memory required for the job.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;--mem-per-cpu&amp;#039;&amp;#039;&amp;#039;: Specifies the memory required per CPU.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example&amp;#039;&amp;#039;&amp;#039;:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#SBATCH --mem=4G              # Total memory for the job&lt;br /&gt;
#SBATCH --mem-per-cpu=2G      # Memory per CPU&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Interactive Jobs===&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#Start an interactive session:&lt;br /&gt;
srun --ntasks=1 -p power-general-shared-pool -A public-users_v2 --qos=public --pty bash&lt;br /&gt;
&lt;br /&gt;
#Specify a compute node:&lt;br /&gt;
srun --ntasks=1 -p power-general-shared-pool -A public-users_v2 --qos=public --nodelist=&amp;quot;compute-0-12&amp;quot; --pty bash&lt;br /&gt;
&lt;br /&gt;
#Using GUI:&lt;br /&gt;
srun --ntasks=1 -p power-general-shared-pool -A public-users_v2 --qos=public --x11 /bin/bash&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Submitting RELION Jobs===&lt;br /&gt;
&lt;br /&gt;
To submit a RELION job interactively on the &amp;lt;code&amp;gt;gpu-relion&amp;lt;/code&amp;gt; queue with X11 forwarding, use the following steps:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#Start an interactive session with X11:&lt;br /&gt;
srun --ntasks=1 -p gpu-relion-pool -A gpu-relion-users_v2 --qos=owner --x11 --pty bash&lt;br /&gt;
#Load the RELION module:&lt;br /&gt;
module load relion/relion-4.0.1&lt;br /&gt;
#Launch RELION:&lt;br /&gt;
relion&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Using special resources==&lt;br /&gt;
There are several defined resources that can be used in addition to the regular resources (e.g. memory, CPUs).&lt;br /&gt;
Those resources are:&lt;br /&gt;
GresTypes=gpu,amd,af3,intel,disk700g&lt;br /&gt;
&lt;br /&gt;
When your job needs one or more of the above resources, add &amp;quot;--constraint=&amp;lt;comma-separated resource list&amp;gt;&amp;quot; to your submission.&lt;br /&gt;
For example, one may submit a job with the parameter --constraint=disk700g,amd&lt;br /&gt;
This means that the job requires AMD-family processors and up to 700 GB of local cache disk space during the job&amp;#039;s run.&lt;br /&gt;
&lt;br /&gt;
If you use the local disk as a cache, you must remove your files when the script ends, whether it exits normally or abnormally.&lt;br /&gt;
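&lt;br /&gt;
For example, a job script that requests these resources and cleans up its local cache on exit might look like this (a sketch; the scratch path is illustrative, so adjust it to your cluster&amp;#039;s convention):&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --mem=1G&lt;br /&gt;
#SBATCH --constraint=disk700g,amd&lt;br /&gt;
&lt;br /&gt;
# Hypothetical local cache directory on the compute node&lt;br /&gt;
SCRATCH=&amp;quot;/tmp/${USER}_${SLURM_JOB_ID}&amp;quot;&lt;br /&gt;
mkdir -p &amp;quot;$SCRATCH&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Remove the cache files on any exit: normal, error, or signal&lt;br /&gt;
trap &amp;#039;rm -rf &amp;quot;$SCRATCH&amp;quot;&amp;#039; EXIT TERM INT&lt;br /&gt;
&lt;br /&gt;
# ... your computation, using $SCRATCH as the local disk cache ...&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;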
&lt;br /&gt;
==Running a MATLAB example==&lt;br /&gt;
This example uses three files:&lt;br /&gt;
&lt;br /&gt;
myTable.m ⇒ This MATLAB script performs the calculation:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
fprintf(&amp;#039;=======================================\n&amp;#039;);&lt;br /&gt;
fprintf(&amp;#039; a             b             c              d             \n&amp;#039;);&lt;br /&gt;
fprintf(&amp;#039;=======================================\n&amp;#039;);&lt;br /&gt;
for j = 1:10&lt;br /&gt;
    a = sin(10*j);&lt;br /&gt;
    b = a*cos(10*j);&lt;br /&gt;
    c = a + b;&lt;br /&gt;
    d = a - b;&lt;br /&gt;
    fprintf(&amp;#039;%+6.5f   %+6.5f   %+6.5f   %+6.5f   \n&amp;#039;,a,b,c,d);&lt;br /&gt;
end&lt;br /&gt;
fprintf(&amp;#039;=======================================\n&amp;#039;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
my_table_script.sh ⇒ This script runs the MATLAB program; submit it with sbatch.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
&lt;br /&gt;
#SBATCH --mem=50M&lt;br /&gt;
#SBATCH --partition power-general-shared-pool&lt;br /&gt;
#SBATCH -A public-users_v2&lt;br /&gt;
hostname&lt;br /&gt;
&lt;br /&gt;
cd /a/home/cc/tree/taucc/staff/dvory/matlab&lt;br /&gt;
&lt;br /&gt;
matlab -nodisplay -nosplash -nodesktop -r &amp;quot;myTable; exit;&amp;quot;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
run_in_loop.sh ⇒ This script submits many copies of the job:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
&lt;br /&gt;
for i in {1..100}&lt;br /&gt;
&lt;br /&gt;
do&lt;br /&gt;
&lt;br /&gt;
        sbatch my_table_script.sh&lt;br /&gt;
&lt;br /&gt;
done&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Run the jobs with the following command (after making the script executable with chmod +x run_in_loop.sh):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
./run_in_loop.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==AlphaFold==&lt;br /&gt;
&lt;br /&gt;
AlphaFold is a deep learning tool designed for predicting protein structures.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Guides:&amp;#039;&amp;#039;&amp;#039;  &lt;br /&gt;
&lt;br /&gt;
[https://hpcguide.tau.ac.il/index.php?title=Alphafold AlphaFold Guide]&lt;br /&gt;
&lt;br /&gt;
[https://hpcguide.tau.ac.il/index.php?title=Alphafold3 AlphaFold3 Guide]&lt;br /&gt;
&lt;br /&gt;
==Common SLURM Commands==&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#View all queues (partitions):&lt;br /&gt;
sinfo&lt;br /&gt;
#View all jobs:&lt;br /&gt;
squeue&lt;br /&gt;
#View details of a specific job:&lt;br /&gt;
scontrol show job &amp;lt;job_number&amp;gt;&lt;br /&gt;
#Get information about partitions:&lt;br /&gt;
scontrol show partition&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Troubleshooting &amp;amp; Tips ==&lt;br /&gt;
&lt;br /&gt;
=== Common Errors ===&lt;br /&gt;
&lt;br /&gt;
# &amp;lt;code&amp;gt;srun: error: Unable to allocate resources: No partition specified or system default partition&amp;lt;/code&amp;gt;  &amp;lt;br /&amp;gt;&amp;#039;&amp;#039;&amp;#039;Solution:&amp;#039;&amp;#039;&amp;#039; Always specify a partition. Example:  &amp;lt;code&amp;gt;srun --pty -c 1 --mem=2G -p power-general /bin/bash&amp;lt;/code&amp;gt;&lt;br /&gt;
# Your job failed, and &amp;lt;code&amp;gt;scontrol show job &amp;lt;job_id&amp;gt;&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;sacct -j &amp;lt;job_id&amp;gt; -o JobID,JobName,State%20&amp;lt;/code&amp;gt;  &amp;lt;br /&amp;gt;shows &amp;lt;code&amp;gt;JobState=OUT_OF_MEMORY Reason=OutOfMemory&amp;lt;/code&amp;gt; or:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
JobID           JobName                State &lt;br /&gt;
------------ ---------- -------------------- &lt;br /&gt;
71             oom_test        OUT_OF_MEMORY &lt;br /&gt;
71.batch          batch        OUT_OF_MEMORY &lt;br /&gt;
71.extern        extern            COMPLETED &lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;This means the RAM requested for the job was not enough; resubmit the job with more RAM. See [https://wikihpc.tau.ac.il/index.php?title=Slurm_user_guide#Estimating_RAM_Usage below] for help estimating how much RAM your job may need.&lt;br /&gt;
&lt;br /&gt;
=== Chain Jobs ===&lt;br /&gt;
Use the &amp;lt;code&amp;gt;--depend&amp;lt;/code&amp;gt; flag to set job dependencies, e.g. &amp;lt;code&amp;gt;afterok:&amp;lt;job_id&amp;gt;&amp;lt;/code&amp;gt; to start a job only after another job completes successfully.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sbatch --ntasks=1 --time=60 -p power-general -A power-general-users --depend=afterok:45001 do_work.bash&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
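&lt;br /&gt;
To chain on a job you have just submitted, you can capture its ID with &amp;lt;code&amp;gt;--parsable&amp;lt;/code&amp;gt; (a sketch; the partition, account, and script names follow the example above):&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# --parsable makes sbatch print only the job ID&lt;br /&gt;
jid=$(sbatch --parsable --ntasks=1 -p power-general -A power-general-users pre_work.bash)&lt;br /&gt;
&lt;br /&gt;
# Start do_work.bash only after that job completes successfully&lt;br /&gt;
sbatch --ntasks=1 -p power-general -A power-general-users --depend=afterok:$jid do_work.bash&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;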
&lt;br /&gt;
=== Always Specify Resources ===&lt;br /&gt;
When submitting jobs, ensure you include all required resources like partition, memory, and CPUs to avoid job failures.&lt;br /&gt;
&lt;br /&gt;
=== Attaching to Running Jobs ===&lt;br /&gt;
If you need to monitor or interact with a running job, use &amp;lt;code&amp;gt;sattach&amp;lt;/code&amp;gt;. This command allows you to attach to a job&amp;#039;s input, output, and error streams in real-time.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sattach &amp;lt;job_id&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To view job steps of a specific job, use the following command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
scontrol show job &amp;lt;job_id&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Look for sections labeled &amp;quot;StepId&amp;quot; within the output. &lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;For specific job steps, use:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sattach &amp;lt;job_id.step_id&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Note:&amp;#039;&amp;#039;&amp;#039; &amp;lt;code&amp;gt;sattach&amp;lt;/code&amp;gt; is particularly useful for interactive jobs, where you can provide input directly. For non-interactive jobs, it acts like &amp;lt;code&amp;gt;tail -f&amp;lt;/code&amp;gt;, allowing you to monitor the output stream.&lt;br /&gt;
&lt;br /&gt;
=== Estimating RAM Usage ===&lt;br /&gt;
&lt;br /&gt;
When writing SLURM job scripts, it&amp;#039;s crucial to understand and correctly specify the memory requirements for your job. Proper memory allocation ensures efficient resource usage and prevents job failures due to out-of-memory (OOM) errors.&lt;br /&gt;
&lt;br /&gt;
==== Tips for Estimating RAM Usage ====&lt;br /&gt;
&lt;br /&gt;
* Check Application Documentation: Refer to the official documentation or user guides for memory-related information.&lt;br /&gt;
* Run a Small Test Job: Submit a smaller version of your job and monitor its memory usage using commands like `free -m`, `top`, or `htop`.&lt;br /&gt;
* Use Profiling Tools: Tools like `valgrind`, `gprof`, or built-in profilers can help you understand memory usage.&lt;br /&gt;
* Analyze Previous Jobs: Review SLURM logs and job statistics for insights into memory consumption of past jobs.&lt;br /&gt;
* Consult with Peers or Experts: Ask colleagues or experts who have experience with similar workloads.&lt;br /&gt;
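&lt;br /&gt;
For a quick peak-memory measurement when test-running a program, GNU time can be used (note the full path; the shell builtin &amp;lt;code&amp;gt;time&amp;lt;/code&amp;gt; does not support &amp;lt;code&amp;gt;-v&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;./your_application&amp;lt;/code&amp;gt; is a placeholder):&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
/usr/bin/time -v ./your_application&lt;br /&gt;
# Look for the &amp;quot;Maximum resident set size (kbytes)&amp;quot; line in the output&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;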
&lt;br /&gt;
==== Example: Monitoring Memory Usage ====&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
&lt;br /&gt;
#SBATCH --job-name=memory_test&lt;br /&gt;
#SBATCH --account=your_account&lt;br /&gt;
#SBATCH --partition=your_partition&lt;br /&gt;
#SBATCH --qos=your_qos&lt;br /&gt;
#SBATCH --time=01:00:00&lt;br /&gt;
#SBATCH --ntasks=1&lt;br /&gt;
#SBATCH --cpus-per-task=1&lt;br /&gt;
#SBATCH --mem=4G&lt;br /&gt;
#SBATCH --output=memory_test.out&lt;br /&gt;
#SBATCH --error=memory_test.err&lt;br /&gt;
&lt;br /&gt;
# Monitor memory usage&lt;br /&gt;
echo &amp;quot;Memory usage before running the job:&amp;quot;&lt;br /&gt;
free -m&lt;br /&gt;
&lt;br /&gt;
# Your application commands go here&lt;br /&gt;
# ./your_application&lt;br /&gt;
&lt;br /&gt;
# Monitor memory usage after running the job&lt;br /&gt;
echo &amp;quot;Memory usage after running the job:&amp;quot;&lt;br /&gt;
free -m&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== General Tips ====&lt;br /&gt;
&lt;br /&gt;
* Start Small: Begin with a conservative memory request and increase it based on observed usage.&lt;br /&gt;
* Consider Peak Usage: Plan for peak memory usage to avoid OOM errors.&lt;br /&gt;
* Use SLURM&amp;#039;s Memory Reporting: Use `sacct` to view memory usage statistics.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sacct -j &amp;lt;job_id&amp;gt; --format=JobID,JobName,MaxRSS,Elapsed&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;/div&gt;</summary>
		<author><name>Dvory</name></author>
	</entry>
	<entry>
		<id>https://hpcguide.tau.ac.il/index.php?title=New_slurm_qos_usage&amp;diff=1547</id>
		<title>New slurm qos usage</title>
		<link rel="alternate" type="text/html" href="https://hpcguide.tau.ac.il/index.php?title=New_slurm_qos_usage&amp;diff=1547"/>
		<updated>2025-12-04T14:41:32Z</updated>

		<summary type="html">&lt;p&gt;Dvory: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;#039;&amp;#039;&amp;#039;We have a ChatGPT page that explains the new configuration: see [https://chatgpt.com/g/g-68be7f9acfb88191978615c1693e2cff-hpc-helper-toolkit HPC-helper-toolkit]&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
Please use the &amp;#039;&amp;#039;&amp;#039;check_my_partitions&amp;#039;&amp;#039;&amp;#039; script to find out your partitions, accounts, and QoS.&lt;br /&gt;
&lt;br /&gt;
==QOS==&lt;br /&gt;
Each partition (or “pool”) now has several QoS tiers that determine job priority and preemption behavior.&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|+ QOS types for each pool&lt;br /&gt;
|-&lt;br /&gt;
! QOS !! Purpose !! Preempts !! Can be preempted by&lt;br /&gt;
|-&lt;br /&gt;
| Share-type QoS (e.g. 0.125_48c_8g, 0.75_48c_8g) || For multi-owner pools; defines each owner’s guaranteed slice (CPU/GPU portion). || owner,public || --&lt;br /&gt;
|-&lt;br /&gt;
| owner|| Used on your lab’s pool to run above your guaranteed slice (higher than public). || public || share-type QoS&lt;br /&gt;
|-&lt;br /&gt;
| public (partition: power-general-shared-pool) || Used on cluster-wide shared pools for friendly or opportunistic runs || -- || owner, share-type QoS&lt;br /&gt;
|-&lt;br /&gt;
| public (partition: power-general-public-pool) || A small, protected, cluster-wide group of shared nodes; jobs here are not preemptable. || -- || --&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
==Billing==&lt;br /&gt;
Each user has a &amp;quot;billing&amp;quot; parameter, which is derived from the amount of resources he or she requests.&lt;br /&gt;
&lt;br /&gt;
Therefore, when you request more memory, your billing increases. The billing parameter affects priority, so users who requested fewer resources in the past will have higher priority in the future.&lt;br /&gt;
&lt;br /&gt;
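You can inspect the billing value SLURM computed for a running or pending job in its TRES string (a sketch; replace &amp;lt;jobid&amp;gt; with your job ID):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
scontrol show job &amp;lt;jobid&amp;gt; | grep -o &amp;#039;billing=[0-9]*&amp;#039;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;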
==Preemption rule summary==&lt;br /&gt;
share-type QoS &amp;gt; owner &amp;gt; public&lt;br /&gt;
&lt;br /&gt;
This means:&lt;br /&gt;
&lt;br /&gt;
* A share-type QoS job can preempt owner or public jobs on the same pool.&lt;br /&gt;
* An owner job can preempt public jobs.&lt;br /&gt;
* Public jobs cannot preempt any other jobs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==How to Submit Jobs with the Correct QoS==&lt;br /&gt;
Below are examples of how to use the new QoS tiers with your account:&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Owner QoS (on your lab’s pool)&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
sbatch -A UIDHERE-users_v2 -p UIDHERE-pool --qos=owner --time=02:00:00 run.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Share-type QoS (on a multi-owner pool, for your guaranteed slice)&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
sbatch -A UIDHERE-users_v2 -p gpu-dudu-tzach-yoav-pool --qos=0.125_48c_8g --gres=gpu:A100:1 run.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Public QoS (friendly, cluster-wide)&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
sbatch -A UIDHERE-users_v2 -p power-general-shared-pool --qos=public --time=01:00:00 run.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;For the small, protected CPU pool&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
sbatch -A UIDHERE-users_v2 -p power-general-public-pool --qos=public --time=01:00:00 run.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
===Handy Checks During Usage===&lt;br /&gt;
You can monitor your jobs and see their QoS and reasons:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
squeue --me -O &amp;quot;JOBID,ACCOUNT,PARTITION,QOS,STATE,REASON&amp;quot;&lt;br /&gt;
sprio -w&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
If your job was preempted, check:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
sacct -j &amp;lt;jobid&amp;gt; --format=JobID,State,Reason&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>Dvory</name></author>
	</entry>
	<entry>
		<id>https://hpcguide.tau.ac.il/index.php?title=New_slurm_qos_usage&amp;diff=1540</id>
		<title>New slurm qos usage</title>
		<link rel="alternate" type="text/html" href="https://hpcguide.tau.ac.il/index.php?title=New_slurm_qos_usage&amp;diff=1540"/>
		<updated>2025-11-09T07:57:25Z</updated>

		<summary type="html">&lt;p&gt;Dvory: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;#039;&amp;#039;&amp;#039;We have a ChatGPT page that explains the new configuration: see [https://chatgpt.com/g/g-68be7f9acfb88191978615c1693e2cff-hpc-helper-toolkit HPC-helper-toolkit]&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
==QOS==&lt;br /&gt;
Each partition (or “pool”) now has several QoS tiers that determine job priority and preemption behavior.&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|+ QOS types for each pool&lt;br /&gt;
|-&lt;br /&gt;
! QOS !! Purpose !! Preempts !! Can be preempted by&lt;br /&gt;
|-&lt;br /&gt;
| Share-type QoS (e.g. 0.125_48c_8g, 0.75_48c_8g) || For multi-owner pools; defines each owner’s guaranteed slice (CPU/GPU portion). || owner,public || --&lt;br /&gt;
|-&lt;br /&gt;
| owner|| Used on your lab’s pool to run above your guaranteed slice (higher than public). || public || share-type QoS&lt;br /&gt;
|-&lt;br /&gt;
| public (partition: power-general-shared-pool) || Used on cluster-wide shared pools for friendly or opportunistic runs || -- || owner, share-type QoS&lt;br /&gt;
|-&lt;br /&gt;
| public (partition: power-general-public-pool) || A small, protected, cluster-wide group of shared nodes; jobs here are not preemptable. || -- || --&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
==Billing==&lt;br /&gt;
Each user has a &amp;quot;billing&amp;quot; parameter, which is derived from the amount of resources he or she requests.&lt;br /&gt;
&lt;br /&gt;
Therefore, when you request more memory, your billing increases. The billing parameter affects priority, so users who requested fewer resources in the past will have higher priority in the future.&lt;br /&gt;
&lt;br /&gt;
==Preemption rule summary==&lt;br /&gt;
share-type QoS &amp;gt; owner &amp;gt; public&lt;br /&gt;
&lt;br /&gt;
This means:&lt;br /&gt;
&lt;br /&gt;
* A share-type QoS job can preempt owner or public jobs on the same pool.&lt;br /&gt;
* An owner job can preempt public jobs.&lt;br /&gt;
* Public jobs cannot preempt any other jobs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==How to Submit Jobs with the Correct QoS==&lt;br /&gt;
Below are examples of how to use the new QoS tiers with your account:&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Owner QoS (on your lab’s pool)&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
sbatch -A UIDHERE-users_v2 -p UIDHERE-pool --qos=owner --time=02:00:00 run.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Share-type QoS (on a multi-owner pool, for your guaranteed slice)&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
sbatch -A UIDHERE-users_v2 -p gpu-dudu-tzach-yoav-pool --qos=0.125_48c_8g --gres=gpu:A100:1 run.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Public QoS (friendly, cluster-wide)&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
sbatch -A UIDHERE-users_v2 -p power-general-shared-pool --qos=public --time=01:00:00 run.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;For the small, protected CPU pool&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
sbatch -A UIDHERE-users_v2 -p power-general-public-pool --qos=public --time=01:00:00 run.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
===Handy Checks During Usage===&lt;br /&gt;
You can monitor your jobs and see their QoS and reasons:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
squeue --me -O &amp;quot;JOBID,ACCOUNT,PARTITION,QOS,STATE,REASON&amp;quot;&lt;br /&gt;
sprio -w&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
If your job was preempted, check:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
sacct -j &amp;lt;jobid&amp;gt; --format=JobID,State,Reason&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>Dvory</name></author>
	</entry>
	<entry>
		<id>https://hpcguide.tau.ac.il/index.php?title=New_slurm_qos_usage&amp;diff=1538</id>
		<title>New slurm qos usage</title>
		<link rel="alternate" type="text/html" href="https://hpcguide.tau.ac.il/index.php?title=New_slurm_qos_usage&amp;diff=1538"/>
		<updated>2025-10-23T06:45:01Z</updated>

		<summary type="html">&lt;p&gt;Dvory: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;#039;&amp;#039;&amp;#039;We have a ChatGPT page that explains the new configuration: see [https://chatgpt.com/g/g-68be7f9acfb88191978615c1693e2cff-hpc-helper-toolkit HPC-helper-toolkit]&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
Each partition (or “pool”) now has several QoS tiers that determine job priority and preemption behavior.&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|+ QOS types for each pool&lt;br /&gt;
|-&lt;br /&gt;
! QOS !! Purpose !! Preempts !! Can be preempted by&lt;br /&gt;
|-&lt;br /&gt;
| Share-type QoS (e.g. 0.125_48c_8g, 0.75_48c_8g) || For multi-owner pools; defines each owner’s guaranteed slice (CPU/GPU portion). || owner,public || --&lt;br /&gt;
|-&lt;br /&gt;
| owner|| Used on your lab’s pool to run above your guaranteed slice (higher than public). || public || share-type QoS&lt;br /&gt;
|-&lt;br /&gt;
| public (partition: power-general-shared-pool) || Used on cluster-wide shared pools for friendly or opportunistic runs || -- || owner, share-type QoS&lt;br /&gt;
|-&lt;br /&gt;
| public (partition: power-general-public-pool) || A small, protected, cluster-wide group of shared nodes; jobs here are not preemptable. || -- || --&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
Preemption rule summary:&lt;br /&gt;
share-type QoS &amp;gt; owner &amp;gt; public&lt;br /&gt;
&lt;br /&gt;
This means:&lt;br /&gt;
&lt;br /&gt;
* A share-type QoS job can preempt owner or public jobs on the same pool.&lt;br /&gt;
* An owner job can preempt public jobs.&lt;br /&gt;
* Public jobs cannot preempt any other jobs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==How to Submit Jobs with the Correct QoS==&lt;br /&gt;
Below are examples of how to use the new QoS tiers with your account:&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Owner QoS (on your lab’s pool)&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
sbatch -A UIDHERE-users_v2 -p UIDHERE-pool --qos=owner --time=02:00:00 run.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Share-type QoS (on a multi-owner pool, for your guaranteed slice)&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
sbatch -A UIDHERE-users_v2 -p gpu-dudu-tzach-yoav-pool --qos=0.125_48c_8g --gres=gpu:A100:1 run.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Public QoS (friendly, cluster-wide)&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
sbatch -A UIDHERE-users_v2 -p power-general-shared-pool --qos=public --time=01:00:00 run.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;For the small, protected CPU pool&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
sbatch -A UIDHERE-users_v2 -p power-general-public-pool --qos=public --time=01:00:00 run.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
===Handy Checks During Usage===&lt;br /&gt;
You can monitor your jobs and see their QoS and reasons:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
squeue --me -O &amp;quot;JOBID,ACCOUNT,PARTITION,QOS,STATE,REASON&amp;quot;&lt;br /&gt;
sprio -w&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
If your job was preempted, check:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
sacct -j &amp;lt;jobid&amp;gt; --format=JobID,State,Reason&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>Dvory</name></author>
	</entry>
	<entry>
		<id>https://hpcguide.tau.ac.il/index.php?title=New_slurm_qos_usage&amp;diff=1537</id>
		<title>New slurm qos usage</title>
		<link rel="alternate" type="text/html" href="https://hpcguide.tau.ac.il/index.php?title=New_slurm_qos_usage&amp;diff=1537"/>
		<updated>2025-10-23T06:44:41Z</updated>

		<summary type="html">&lt;p&gt;Dvory: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;#039;&amp;#039;&amp;#039;We have a ChatGPT page that explains the new configuration: see [https://chatgpt.com/g/g-68be7f9acfb88191978615c1693e2cff-hpc-helper-toolkit HPC-helper-toolkit]&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
Each partition (or “pool”) now has several QoS tiers that determine job priority and preemption behavior.&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|+ QOS types for each pool&lt;br /&gt;
|-&lt;br /&gt;
! QOS !! Purpose !! Preempts !! Can be preempted by&lt;br /&gt;
|-&lt;br /&gt;
| Share-type QoS (e.g. 0.125_48c_8g, 0.75_48c_8g) || For multi-owner pools; defines each owner’s guaranteed slice (CPU/GPU portion). || owner,public || --&lt;br /&gt;
|-&lt;br /&gt;
| owner|| Used on your lab’s pool to run above your guaranteed slice (higher than public). || public || share-type QoS&lt;br /&gt;
|-&lt;br /&gt;
| public (partition: power-general-shared-pool) || Used on cluster-wide shared pools for friendly or opportunistic runs || -- || owner, share-type QoS&lt;br /&gt;
|-&lt;br /&gt;
| public (partition: power-general-public-pool) || A small, protected, cluster-wide group of shared nodes; jobs here are not preemptable. || -- || --&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
Preemption rule summary:&lt;br /&gt;
share-type QoS &amp;gt; owner &amp;gt; public&lt;br /&gt;
&lt;br /&gt;
This means:&lt;br /&gt;
&lt;br /&gt;
* A share-type QoS job can preempt owner or public jobs on the same pool.&lt;br /&gt;
* An owner job can preempt public jobs.&lt;br /&gt;
* Public jobs cannot preempt any other jobs.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==How to Submit Jobs with the Correct QoS==&lt;br /&gt;
Below are examples of how to use the new QoS tiers with your account:&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Owner QoS (on your lab’s pool)&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
sbatch -A UIDHERE-users_v2 -p UIDHERE-pool --qos=owner --time=02:00:00 run.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Share-type QoS (on a multi-owner pool, for your guaranteed slice)&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
sbatch -A UIDHERE-users_v2 -p gpu-dudu-tzach-yoav-pool --qos=0.125_48c_8g --gres=gpu:A100:1 run.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Public QoS (friendly, cluster-wide)&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
sbatch -A UIDHERE-users_v2 -p power-general-shared-pool --qos=public --time=01:00:00 run.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;For the small, protected CPU pool&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
sbatch -A UIDHERE-users_v2 -p power-general-public-pool --qos=public --time=01:00:00 run.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
===Handy Checks During Usage===&lt;br /&gt;
You can monitor your jobs and see their QoS and reasons:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
squeue --me -O &amp;quot;JOBID,ACCOUNT,PARTITION,QOS,STATE,REASON&amp;quot;&lt;br /&gt;
sprio -w&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
If your job was preempted, check:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
sacct -j &amp;lt;jobid&amp;gt; --format=JobID,State,Reason&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>Dvory</name></author>
	</entry>
	<entry>
		<id>https://hpcguide.tau.ac.il/index.php?title=Submitting_a_job_to_a_slurm_queue&amp;diff=1536</id>
		<title>Submitting a job to a slurm queue</title>
		<link rel="alternate" type="text/html" href="https://hpcguide.tau.ac.il/index.php?title=Submitting_a_job_to_a_slurm_queue&amp;diff=1536"/>
		<updated>2025-10-23T06:44:09Z</updated>

		<summary type="html">&lt;p&gt;Dvory: /* Accessing the System */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Accessing the System ==&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;We have a ChatGPT page for the new QoS configuration; please see [https://chatgpt.com/g/g-68be7f9acfb88191978615c1693e2cff-hpc-helper-toolkit HPC-helper-toolkit]&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
To submit jobs to SLURM at Tel Aviv University, you need to access the system through the following login address:&lt;br /&gt;
&lt;br /&gt;
* slurmlogin.tau.ac.il&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Requirements for Access ===&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Group Membership&amp;#039;&amp;#039;&amp;#039;: You must be part of the &amp;quot;power&amp;quot; group to access the resources.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;University Credentials&amp;#039;&amp;#039;&amp;#039;: Use your Tel Aviv University username and password to log in.&lt;br /&gt;
&lt;br /&gt;
These login nodes are your starting point for submitting jobs, checking job status, and managing your SLURM tasks.&lt;br /&gt;
&lt;br /&gt;
=== SSH Example ===&lt;br /&gt;
&lt;br /&gt;
To access the system using SSH, use the following example:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Replace &amp;#039;your_username&amp;#039; with your actual Tel Aviv University username&lt;br /&gt;
ssh your_username@slurmlogin.tau.ac.il&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Your connection will be automatically routed to one of the login nodes:&lt;br /&gt;
powerslurm-login, powerslurm-login2, or powerslurm-login3.&lt;br /&gt;
&lt;br /&gt;
If you have an SSH key set up for password-less login, you can specify it like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Replace &amp;#039;your_username&amp;#039; and &amp;#039;/path/to/your/private_key&amp;#039; accordingly&lt;br /&gt;
ssh -i /path/to/your/private_key your_username@slurmlogin.tau.ac.il&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Environment Modules ==&lt;br /&gt;
&lt;br /&gt;
Environment Modules in SLURM allow users to dynamically modify their shell environment, providing an easy way to load and unload different software applications, libraries, and their dependencies. This system helps avoid conflicts between software versions and ensures the correct environment for running specific applications.&lt;br /&gt;
&lt;br /&gt;
Here are some common commands to work with environment modules:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#List Available Modules: To see all the modules available on the system, use:&lt;br /&gt;
module avail&lt;br /&gt;
&lt;br /&gt;
#To search for a specific module by name (e.g., `gcc`), use:&lt;br /&gt;
module avail gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#Get Detailed Information About a Module: The `module spider` command provides detailed information about a module, including versions, dependencies, and descriptions:&lt;br /&gt;
module spider gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#View Module Settings: To see what environment variables and settings will be modified by a module, use:&lt;br /&gt;
module show gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#Load a Module: To set up the environment for a specific software, use the `module load` command. For example, to load GCC version 12.1.0:&lt;br /&gt;
module load gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#List Loaded Modules: To view all currently loaded modules in your session, use:&lt;br /&gt;
module list&lt;br /&gt;
&lt;br /&gt;
#Unload a Module: To unload a specific module from your environment, use:&lt;br /&gt;
module unload gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#Unload all modules: If you need to clear your environment of all loaded modules, use:&lt;br /&gt;
module purge&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;By using these commands, you can easily manage the software environments needed for different tasks, ensuring compatibility and reducing potential conflicts between software versions.&lt;br /&gt;
&lt;br /&gt;
== Basic Job Submission Commands ==&lt;br /&gt;
&lt;br /&gt;
=== Finding Your Account and Partition ===&lt;br /&gt;
&lt;br /&gt;
Before submitting a job, you need to know which partitions you have permission to use.&lt;br /&gt;
&lt;br /&gt;
Run the command &amp;lt;code&amp;gt;check_my_partitions&amp;lt;/code&amp;gt; to view a list of all the partitions you have permission to send jobs to.&lt;br /&gt;
&lt;br /&gt;
== Submitting Jobs==&lt;br /&gt;
sbatch: Submits a job script for batch processing.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example&amp;#039;&amp;#039;&amp;#039;:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Submit pre_process.bash to the power-general-shared-pool partition for 10 minutes:&lt;br /&gt;
sbatch --ntasks=1 --time=10 -p power-general-shared-pool -A public-users_v2 --qos=public pre_process.bash&lt;br /&gt;
&lt;br /&gt;
# With 1 GPU:&lt;br /&gt;
sbatch --gres=gpu:1 -p gpu-general-pool -A public-users_v2 --qos=public gpu_job.sh&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Submitting Multiple Jobs ===&lt;br /&gt;
&lt;br /&gt;
If you need to submit many similar jobs (hundreds or more), you should use a &amp;#039;&amp;#039;&amp;#039;Slurm job array&amp;#039;&amp;#039;&amp;#039;. Submitting each job individually with separate &amp;lt;code&amp;gt;sbatch&amp;lt;/code&amp;gt; commands places a heavy load on the scheduler, slowing down job processing across the cluster. Job arrays let you bundle many related jobs into a single submission, which is more efficient and easier to manage.&lt;br /&gt;
&lt;br /&gt;
Each task in the array runs independently like a separate job, but the array is submitted as a single job ID for scheduling and tracking purposes.&lt;br /&gt;
You can customize the behavior of each task using the environment variable &amp;lt;code&amp;gt;SLURM_ARRAY_TASK_ID&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
==== Script Example: Job Array ====&lt;br /&gt;
&lt;br /&gt;
This script submits a job array with 100 tasks, each processing a different input file. The array reduces scheduler load and simplifies job tracking.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --job-name=array_job            # Job name&lt;br /&gt;
#SBATCH --account=public-users_v2       # Account name&lt;br /&gt;
#SBATCH --partition=power-general-shared-pool       # Partition name&lt;br /&gt;
#SBATCH --qos=public                    # qos type&lt;br /&gt;
#SBATCH --time=02:00:00                 # Max run time (hh:mm:ss)&lt;br /&gt;
#SBATCH --ntasks=1                      # Number of tasks per array job&lt;br /&gt;
#SBATCH --nodes=1                       # Number of nodes&lt;br /&gt;
#SBATCH --cpus-per-task=1               # CPUs per task&lt;br /&gt;
#SBATCH --mem-per-cpu=4G                # Memory per CPU&lt;br /&gt;
#SBATCH --array=1-100                   # Array range: 100 tasks&lt;br /&gt;
#SBATCH --output=array_job_%A_%a.out    # Output file: Job ID and array task ID&lt;br /&gt;
#SBATCH --error=array_job_%A_%a.err     # Error file: Job ID and array task ID&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Starting SLURM array task&amp;quot;&lt;br /&gt;
echo &amp;quot;Job ID: $SLURM_JOB_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Array Task ID: $SLURM_ARRAY_TASK_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Running on node(s): $SLURM_JOB_NODELIST&amp;quot;&lt;br /&gt;
echo &amp;quot;Allocated CPUs: $SLURM_JOB_CPUS_PER_NODE&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Your application commands go here&lt;br /&gt;
# You can use $SLURM_ARRAY_TASK_ID to customize behavior per task&lt;br /&gt;
# ./my_program input_${SLURM_ARRAY_TASK_ID}.txt&lt;br /&gt;
echo &amp;quot;Task completed&amp;quot;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
In this example:&lt;br /&gt;
* The job array consists of 100 tasks.&lt;br /&gt;
* Each task runs the same script but with a different input file.&lt;br /&gt;
* You access the task ID using the environment variable &amp;lt;code&amp;gt;SLURM_ARRAY_TASK_ID&amp;lt;/code&amp;gt;.&lt;br /&gt;
* The output and error logs are separated per task using &amp;lt;code&amp;gt;%A&amp;lt;/code&amp;gt; (job ID) and &amp;lt;code&amp;gt;%a&amp;lt;/code&amp;gt; (array task ID).&lt;br /&gt;
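&lt;br /&gt;
You can also limit how many array tasks run simultaneously by adding a throttle with the &amp;lt;code&amp;gt;%&amp;lt;/code&amp;gt; separator in the array specification. A minimal sketch, reusing the 100-task array from the example above:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#SBATCH --array=1-100%10    # 100 tasks total, but at most 10 running at the same time&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;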
&lt;br /&gt;
==== Script Example: Job Array with different parameters per task ====&lt;br /&gt;
&lt;br /&gt;
This script submits a job array with 3 tasks. Each task runs the same program with a different input file: &amp;lt;code&amp;gt;data1.txt&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;data2.txt&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;data3.txt&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --job-name=array_job            # Job name&lt;br /&gt;
#SBATCH --account=public-users_v2       # Account name&lt;br /&gt;
#SBATCH --partition=power-general-shared-pool       # Partition name&lt;br /&gt;
#SBATCH --qos=public                    # qos type&lt;br /&gt;
#SBATCH --time=01:00:00                 # Max run time (hh:mm:ss)&lt;br /&gt;
#SBATCH --ntasks=1                      # Number of tasks per array job&lt;br /&gt;
#SBATCH --nodes=1                       # Number of nodes&lt;br /&gt;
#SBATCH --cpus-per-task=1               # CPUs per task&lt;br /&gt;
#SBATCH --mem-per-cpu=2G                # Memory per CPU&lt;br /&gt;
#SBATCH --array=1-3                     # Run 3 tasks with IDs 1, 2, 3&lt;br /&gt;
#SBATCH --output=array_%A_%a.out        # Output file: Job ID and task ID&lt;br /&gt;
#SBATCH --error=array_%A_%a.err         # Error file: Job ID and task ID&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Starting SLURM array task&amp;quot;&lt;br /&gt;
echo &amp;quot;Job ID: $SLURM_JOB_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Array Task ID: $SLURM_ARRAY_TASK_ID&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Each task runs the program with a different input file&lt;br /&gt;
./my_program data${SLURM_ARRAY_TASK_ID}.txt&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Task completed&amp;quot;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
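&lt;br /&gt;
When the per-task parameters are more complex than a numbered file name, a common pattern is to keep one parameter set per line in a text file and let each array task select its own line. This is a sketch; the file name &amp;lt;code&amp;gt;params.txt&amp;lt;/code&amp;gt; and the program name are hypothetical:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# params.txt (hypothetical): one parameter set per line, e.g.&lt;br /&gt;
#   alpha 0.1&lt;br /&gt;
#   beta 0.2&lt;br /&gt;
#   gamma 0.3&lt;br /&gt;
&lt;br /&gt;
# Inside the array job script, pick the line matching this task&amp;#039;s ID:&lt;br /&gt;
PARAMS=$(sed -n &amp;quot;${SLURM_ARRAY_TASK_ID}p&amp;quot; params.txt)&lt;br /&gt;
echo &amp;quot;Task $SLURM_ARRAY_TASK_ID parameters: $PARAMS&amp;quot;&lt;br /&gt;
# ./my_program $PARAMS&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;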
===Writing Single SLURM Job Scripts===&lt;br /&gt;
&lt;br /&gt;
Here is a simple job script example:&lt;br /&gt;
&lt;br /&gt;
==== Basic Script====&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --job-name=my_job             # Job name&lt;br /&gt;
#SBATCH --account=public-users_v2     # Account name&lt;br /&gt;
#SBATCH --partition=power-general-shared-pool       # Partition name&lt;br /&gt;
#SBATCH --qos=public                  # qos type&lt;br /&gt;
#SBATCH --time=02:00:00               # Max run time (hh:mm:ss)&lt;br /&gt;
#SBATCH --ntasks=1                    # Number of tasks&lt;br /&gt;
#SBATCH --nodes=1                     # Number of nodes&lt;br /&gt;
#SBATCH --cpus-per-task=1             # CPUs per task&lt;br /&gt;
#SBATCH --mem-per-cpu=4G              # Memory per CPU&lt;br /&gt;
#SBATCH --output=my_job_%j.out        # Output file&lt;br /&gt;
#SBATCH --error=my_job_%j.err         # Error file&lt;br /&gt;
#SBATCH --mail-user=&amp;lt;your email&amp;gt;      # Your mail address to receive an email&lt;br /&gt;
#SBATCH --mail-type=END,FAIL          # The mail will be sent upon ending the script successfully or not&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Starting my SLURM job&amp;quot;&lt;br /&gt;
echo &amp;quot;Job ID: $SLURM_JOB_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Running on nodes: $SLURM_JOB_NODELIST&amp;quot;&lt;br /&gt;
echo &amp;quot;Allocated CPUs: $SLURM_JOB_CPUS_PER_NODE&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Your application commands go here&lt;br /&gt;
# ./my_program&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Job completed&amp;quot;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To request x cores interactively (replace x with the number of cores you need):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
srun --ntasks=1 --cpus-per-task=x --partition=power-general-shared-pool --account=public-users_v2 --qos=public --nodes=1 --pty bash&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Note that, for now, some applications also require these SLURM variables to be set inside the script or within the interactive job (replace 48 with the number of cores you requested):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
export SLURM_TASKS_PER_NODE=48&lt;br /&gt;
export SLURM_CPUS_ON_NODE=48&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Script for 1 GPU ====&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --job-name=gpu_job             # Job name&lt;br /&gt;
#SBATCH --account=my_account           # Account name&lt;br /&gt;
#SBATCH --partition=gpu-general-pool   # Partition name&lt;br /&gt;
#SBATCH --qos=my_qos                   # qos type&lt;br /&gt;
#SBATCH --time=02:00:00                # Max run time&lt;br /&gt;
#SBATCH --ntasks=1                     # Number of tasks&lt;br /&gt;
#SBATCH --nodes=1                      # Number of nodes&lt;br /&gt;
#SBATCH --cpus-per-task=1              # CPUs per task&lt;br /&gt;
#SBATCH --gres=gpu:1                   # Number of GPUs&lt;br /&gt;
#SBATCH --mem-per-cpu=4G               # Memory per CPU&lt;br /&gt;
#SBATCH --output=my_job_%j.out         # Output file&lt;br /&gt;
#SBATCH --error=my_job_%j.err          # Error file&lt;br /&gt;
&lt;br /&gt;
module load python/python-3.8&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Starting GPU job&amp;quot;&lt;br /&gt;
echo &amp;quot;Job ID: $SLURM_JOB_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Running on nodes: $SLURM_JOB_NODELIST&amp;quot;&lt;br /&gt;
echo &amp;quot;Allocated CPUs: $SLURM_JOB_CPUS_PER_NODE&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Your GPU commands go here&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Job completed&amp;quot;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
To exclude specific nodes from scheduling, add the following directive:&lt;br /&gt;
&amp;lt;syntaxhighlight&amp;gt;&lt;br /&gt;
#SBATCH --exclude=compute-0-[100-103],compute-0-67&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Importance of Correct RAM Usage in Jobs===&lt;br /&gt;
&lt;br /&gt;
When writing SLURM job scripts, it&amp;#039;s crucial to understand and correctly specify the memory requirements for your job. &lt;br /&gt;
&lt;br /&gt;
Proper memory allocation ensures efficient resource usage and prevents job failures due to out-of-memory (OOM) errors.&lt;br /&gt;
&lt;br /&gt;
==== Why Correct RAM Usage Matters ====&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Resource Efficiency&amp;#039;&amp;#039;&amp;#039;: Allocating the right amount of memory helps in optimal resource utilization, allowing more jobs to run simultaneously on the cluster.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Job Stability&amp;#039;&amp;#039;&amp;#039;: Underestimating memory requirements can lead to OOM errors, causing your job to fail and waste computational resources.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Performance&amp;#039;&amp;#039;&amp;#039;: Overestimating memory needs can lead to underutilization of resources, potentially delaying other jobs in the queue.&lt;br /&gt;
&lt;br /&gt;
==== How to Specify Memory in SLURM ====&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;--mem&amp;#039;&amp;#039;&amp;#039;: Specifies the total memory required for the job.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;--mem-per-cpu&amp;#039;&amp;#039;&amp;#039;: Specifies the memory required per CPU.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example&amp;#039;&amp;#039;&amp;#039;:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#SBATCH --mem=4G              # Total memory for the job&lt;br /&gt;
#SBATCH --mem-per-cpu=2G      # Memory per CPU&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
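&lt;br /&gt;
Note that &amp;lt;code&amp;gt;--mem&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;--mem-per-cpu&amp;lt;/code&amp;gt; are mutually exclusive; request one or the other, not both. With &amp;lt;code&amp;gt;--mem-per-cpu&amp;lt;/code&amp;gt;, the total allocation scales with the number of CPUs, as this small illustration shows:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# A job with --cpus-per-task=4 and --mem-per-cpu=2G is allocated 4 x 2G = 8G in total:&lt;br /&gt;
CPUS_PER_TASK=4&lt;br /&gt;
MEM_PER_CPU_G=2&lt;br /&gt;
echo &amp;quot;$((CPUS_PER_TASK * MEM_PER_CPU_G))G&amp;quot;    # prints 8G&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;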
&lt;br /&gt;
===Interactive Jobs===&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#Start an interactive session:&lt;br /&gt;
srun --ntasks=1 -p power-general-shared-pool -A public-users_v2 --qos=public --pty bash&lt;br /&gt;
&lt;br /&gt;
#Specify a compute node:&lt;br /&gt;
srun --ntasks=1 -p power-general-shared-pool -A public-users_v2 --qos=public --nodelist=&amp;quot;compute-0-12&amp;quot; --pty bash&lt;br /&gt;
&lt;br /&gt;
#Using GUI:&lt;br /&gt;
srun --ntasks=1 -p power-general-shared-pool -A public-users_v2 --qos=public --x11 /bin/bash&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Submitting RELION Jobs===&lt;br /&gt;
&lt;br /&gt;
To submit a RELION job interactively on the &amp;lt;code&amp;gt;gpu-relion&amp;lt;/code&amp;gt; queue with X11 forwarding, use the following steps:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#Start an interactive session with X11:&lt;br /&gt;
srun --ntasks=1 -p gpu-relion-pool -A gpu-relion-users_v2 --qos=owner --x11 --pty bash&lt;br /&gt;
#Load the RELION module:&lt;br /&gt;
module load relion/relion-4.0.1&lt;br /&gt;
#Launch RELION:&lt;br /&gt;
relion&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Running a MATLAB example==&lt;br /&gt;
In this example there are 3 files:&lt;br /&gt;
&lt;br /&gt;
myTable.m ⇒ a MATLAB script that prints a table of computed values&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
fprintf(&amp;#039;=======================================\n&amp;#039;);&lt;br /&gt;
fprintf(&amp;#039; a             b             c              d             \n&amp;#039;);&lt;br /&gt;
fprintf(&amp;#039;=======================================\n&amp;#039;);&lt;br /&gt;
for j = 1:10&lt;br /&gt;
    a = sin(10*j);&lt;br /&gt;
    b = a*cos(10*j);&lt;br /&gt;
    c = a + b;&lt;br /&gt;
    d = a - b;&lt;br /&gt;
    fprintf(&amp;#039;%+6.5f   %+6.5f   %+6.5f   %+6.5f   \n&amp;#039;,a,b,c,d);&lt;br /&gt;
end&lt;br /&gt;
fprintf(&amp;#039;=======================================\n&amp;#039;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
my_table_script.sh ⇒ the batch script that runs the MATLAB program. Submit it with sbatch:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
&lt;br /&gt;
#SBATCH --mem=50M&lt;br /&gt;
#SBATCH --partition power-general-shared-pool&lt;br /&gt;
#SBATCH -A public-users_v2&lt;br /&gt;
hostname&lt;br /&gt;
&lt;br /&gt;
cd /a/home/cc/tree/taucc/staff/dvory/matlab&lt;br /&gt;
&lt;br /&gt;
matlab -nodisplay -nosplash -nodesktop -r &amp;quot;myTable; exit;&amp;quot;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
run_in_loop.sh ⇒ a helper script that submits many copies of the job&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
&lt;br /&gt;
for i in {1..100}&lt;br /&gt;
do&lt;br /&gt;
    sbatch my_table_script.sh&lt;br /&gt;
done&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Run the jobs with the following command (after making the script executable with &amp;lt;code&amp;gt;chmod +x run_in_loop.sh&amp;lt;/code&amp;gt;):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
./run_in_loop.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
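&lt;br /&gt;
As noted in the Submitting Multiple Jobs section above, a loop of separate &amp;lt;code&amp;gt;sbatch&amp;lt;/code&amp;gt; calls places unnecessary load on the scheduler. A job array achieves the same result with a single submission (a sketch, reusing my_table_script.sh from above):&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# One submission instead of 100; each task gets its own SLURM_ARRAY_TASK_ID&lt;br /&gt;
sbatch --array=1-100 my_table_script.sh&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;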
&lt;br /&gt;
==AlphaFold==&lt;br /&gt;
&lt;br /&gt;
AlphaFold is a deep learning tool designed for predicting protein structures.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Guides:&amp;#039;&amp;#039;&amp;#039;  &lt;br /&gt;
&lt;br /&gt;
[https://hpcguide.tau.ac.il/index.php?title=Alphafold AlphaFold Guide]&lt;br /&gt;
&lt;br /&gt;
[https://hpcguide.tau.ac.il/index.php?title=Alphafold3 AlphaFold3 Guide]&lt;br /&gt;
&lt;br /&gt;
==Common SLURM Commands==&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#View all queues (partitions):&lt;br /&gt;
sinfo&lt;br /&gt;
#View all jobs:&lt;br /&gt;
squeue&lt;br /&gt;
#View details of a specific job:&lt;br /&gt;
scontrol show job &amp;lt;job_number&amp;gt;&lt;br /&gt;
#Get information about partitions:&lt;br /&gt;
scontrol show partition&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Troubleshooting &amp;amp; Tips ==&lt;br /&gt;
&lt;br /&gt;
=== Common Errors ===&lt;br /&gt;
&lt;br /&gt;
# &amp;lt;code&amp;gt;srun: error: Unable to allocate resources: No partition specified or system default partition&amp;lt;/code&amp;gt;  &amp;lt;br /&amp;gt;&amp;#039;&amp;#039;&amp;#039;Solution:&amp;#039;&amp;#039;&amp;#039; Always specify a partition (together with your account and QOS). Example:  &amp;lt;code&amp;gt;srun --pty -c 1 --mem=2G -p power-general-shared-pool -A public-users_v2 --qos=public /bin/bash&amp;lt;/code&amp;gt;&lt;br /&gt;
# Your job failed, and when running &amp;lt;code&amp;gt;scontrol show job job_id&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;sacct -j job_id -o JobID,JobName,State%20&amp;lt;/code&amp;gt;  &amp;lt;br /&amp;gt;you see &amp;lt;code&amp;gt;JobState=OUT_OF_MEMORY Reason=OutOfMemory&amp;lt;/code&amp;gt; or:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
JobID           JobName                State &lt;br /&gt;
------------ ---------- -------------------- &lt;br /&gt;
71             oom_test        OUT_OF_MEMORY &lt;br /&gt;
71.batch          batch        OUT_OF_MEMORY &lt;br /&gt;
71.extern        extern            COMPLETED &lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;it means the RAM requested for the job was insufficient. Resubmit the job with more RAM; see [https://wikihpc.tau.ac.il/index.php?title=Slurm_user_guide#Estimating_RAM_Usage below] for help estimating how much RAM your job may need.&lt;br /&gt;
&lt;br /&gt;
=== Chain Jobs ===&lt;br /&gt;
Use the &amp;lt;code&amp;gt;--dependency&amp;lt;/code&amp;gt; flag to set job dependencies.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sbatch --ntasks=1 --time=60 -p power-general-shared-pool -A public-users_v2 --qos=public --dependency=afterok:45001 do_work.bash&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Always Specify Resources ===&lt;br /&gt;
When submitting jobs, ensure you include all required resources like partition, memory, and CPUs to avoid job failures.&lt;br /&gt;
&lt;br /&gt;
=== Attaching to Running Jobs ===&lt;br /&gt;
If you need to monitor or interact with a running job, use &amp;lt;code&amp;gt;sattach&amp;lt;/code&amp;gt;. This command allows you to attach to a job&amp;#039;s input, output, and error streams in real-time.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sattach &amp;lt;job_id&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To view job steps of a specific job, use the following command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
scontrol show job &amp;lt;job_id&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Look for sections labeled &amp;quot;StepId&amp;quot; within the output. &lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;For specific job steps, use:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sattach &amp;lt;job_id.step_id&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Note:&amp;#039;&amp;#039;&amp;#039; &amp;lt;code&amp;gt;sattach&amp;lt;/code&amp;gt; is particularly useful for interactive jobs, where you can provide input directly. For non-interactive jobs, it acts like &amp;lt;code&amp;gt;tail -f&amp;lt;/code&amp;gt;, allowing you to monitor the output stream.&lt;br /&gt;
&lt;br /&gt;
=== Estimating RAM Usage ===&lt;br /&gt;
&lt;br /&gt;
When writing SLURM job scripts, it&amp;#039;s crucial to understand and correctly specify the memory requirements for your job. Proper memory allocation ensures efficient resource usage and prevents job failures due to out-of-memory (OOM) errors.&lt;br /&gt;
&lt;br /&gt;
==== Tips for Estimating RAM Usage ====&lt;br /&gt;
&lt;br /&gt;
* Check Application Documentation: Refer to the official documentation or user guides for memory-related information.&lt;br /&gt;
* Run a Small Test Job: Submit a smaller version of your job and monitor its memory usage using commands like &amp;lt;code&amp;gt;free -m&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;top&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;htop&amp;lt;/code&amp;gt;.&lt;br /&gt;
* Use Profiling Tools: Tools like &amp;lt;code&amp;gt;valgrind&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;gprof&amp;lt;/code&amp;gt;, or built-in profilers can help you understand memory usage.&lt;br /&gt;
* Analyze Previous Jobs: Review SLURM logs and job statistics for insights into memory consumption of past jobs.&lt;br /&gt;
* Consult with Peers or Experts: Ask colleagues or experts who have experience with similar workloads.&lt;br /&gt;
&lt;br /&gt;
==== Example: Monitoring Memory Usage ====&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
&lt;br /&gt;
#SBATCH --job-name=memory_test&lt;br /&gt;
#SBATCH --account=your_account&lt;br /&gt;
#SBATCH --partition=your_partition&lt;br /&gt;
#SBATCH --qos=your_qos&lt;br /&gt;
#SBATCH --time=01:00:00&lt;br /&gt;
#SBATCH --ntasks=1&lt;br /&gt;
#SBATCH --cpus-per-task=1&lt;br /&gt;
#SBATCH --mem=4G&lt;br /&gt;
#SBATCH --output=memory_test.out&lt;br /&gt;
#SBATCH --error=memory_test.err&lt;br /&gt;
&lt;br /&gt;
# Monitor memory usage&lt;br /&gt;
echo &amp;quot;Memory usage before running the job:&amp;quot;&lt;br /&gt;
free -m&lt;br /&gt;
&lt;br /&gt;
# Your application commands go here&lt;br /&gt;
# ./your_application&lt;br /&gt;
&lt;br /&gt;
# Monitor memory usage after running the job&lt;br /&gt;
echo &amp;quot;Memory usage after running the job:&amp;quot;&lt;br /&gt;
free -m&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== General Tips ====&lt;br /&gt;
&lt;br /&gt;
* Start Small: Begin with a conservative memory request and increase it based on observed usage.&lt;br /&gt;
* Consider Peak Usage: Plan for peak memory usage to avoid OOM errors.&lt;br /&gt;
* Use SLURM&amp;#039;s Memory Reporting: Use &amp;lt;code&amp;gt;sacct&amp;lt;/code&amp;gt; to view memory usage statistics.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sacct -j &amp;lt;job_id&amp;gt; --format=JobID,JobName,MaxRSS,Elapsed&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;/div&gt;</summary>
		<author><name>Dvory</name></author>
	</entry>
	<entry>
		<id>https://hpcguide.tau.ac.il/index.php?title=Submitting_a_job_to_a_slurm_queue&amp;diff=1535</id>
		<title>Submitting a job to a slurm queue</title>
		<link rel="alternate" type="text/html" href="https://hpcguide.tau.ac.il/index.php?title=Submitting_a_job_to_a_slurm_queue&amp;diff=1535"/>
		<updated>2025-10-23T06:43:48Z</updated>

		<summary type="html">&lt;p&gt;Dvory: /* Accessing the System */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Accessing the System ==&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;We have a ChatGPT page for the new QOS configuration; please see [https://chatgpt.com/g/g-68be7f9acfb88191978615c1693e2cff-hpc-helper-toolkit HPC-helper-toolkit].&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
To submit jobs to SLURM at Tel Aviv University, you need to access the system through one of the following login nodes:&lt;br /&gt;
&lt;br /&gt;
* slurmlogin.tau.ac.il&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Requirements for Access ===&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Group Membership&amp;#039;&amp;#039;&amp;#039;: You must be part of the &amp;quot;power&amp;quot; group to access the resources.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;University Credentials&amp;#039;&amp;#039;&amp;#039;: Use your Tel Aviv University username and password to log in.&lt;br /&gt;
&lt;br /&gt;
These login nodes are your starting point for submitting jobs, checking job status, and managing your SLURM tasks.&lt;br /&gt;
&lt;br /&gt;
=== SSH Example ===&lt;br /&gt;
&lt;br /&gt;
To access the system using SSH, use the following example:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Replace &amp;#039;your_username&amp;#039; with your actual Tel Aviv University username&lt;br /&gt;
ssh your_username@slurmlogin.tau.ac.il&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Your connection will be automatically routed to one of the login nodes:&lt;br /&gt;
powerslurm-login, powerslurm-login2, or powerslurm-login3.&lt;br /&gt;
&lt;br /&gt;
If you have an SSH key set up for password-less login, you can specify it like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Replace &amp;#039;your_username&amp;#039; and &amp;#039;/path/to/your/private_key&amp;#039; accordingly&lt;br /&gt;
ssh -i /path/to/your/private_key your_username@slurmlogin.tau.ac.il&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Environment Modules ==&lt;br /&gt;
&lt;br /&gt;
Environment Modules in SLURM allow users to dynamically modify their shell environment, providing an easy way to load and unload different software applications, libraries, and their dependencies. This system helps avoid conflicts between software versions and ensures the correct environment for running specific applications.&lt;br /&gt;
&lt;br /&gt;
Here are some common commands to work with environment modules:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#List Available Modules: To see all the modules available on the system, use:&lt;br /&gt;
module avail&lt;br /&gt;
&lt;br /&gt;
#To search for a specific module by name (e.g., `gcc`), use:&lt;br /&gt;
module avail gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#Get Detailed Information About a Module: The `module spider` command provides detailed information about a module, including versions, dependencies, and descriptions:&lt;br /&gt;
module spider gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#View Module Settings: To see what environment variables and settings will be modified by a module, use:&lt;br /&gt;
module show gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#Load a Module: To set up the environment for a specific software, use the `module load` command. For example, to load GCC version 12.1.0:&lt;br /&gt;
module load gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#List Loaded Modules: To view all currently loaded modules in your session, use:&lt;br /&gt;
module list&lt;br /&gt;
&lt;br /&gt;
#Unload a Module: To unload a specific module from your environment, use:&lt;br /&gt;
module unload gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#Unload All Modules: If you need to clear your environment of all loaded modules, use:&lt;br /&gt;
module purge&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;By using these commands, you can easily manage the software environments needed for different tasks, ensuring compatibility and reducing potential conflicts between software versions.&lt;br /&gt;
&lt;br /&gt;
== Basic Job Submission Commands ==&lt;br /&gt;
&lt;br /&gt;
=== Finding Your Account and Partition ===&lt;br /&gt;
&lt;br /&gt;
Before submitting a job, you need to know which partitions you have permission to use.&lt;br /&gt;
&lt;br /&gt;
Run the command &amp;lt;code&amp;gt;check_my_partitions&amp;lt;/code&amp;gt; to view a list of all the partitions you have permission to send jobs to.&lt;br /&gt;
&lt;br /&gt;
== Submitting Jobs ==&lt;br /&gt;
sbatch: Submits a job script for batch processing.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example&amp;#039;&amp;#039;&amp;#039;:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Submit pre_process.bash to the power-general-shared-pool partition with a 10-minute time limit:&lt;br /&gt;
sbatch --ntasks=1 --time=10 -p power-general-shared-pool -A public-users_v2 --qos=public pre_process.bash&lt;br /&gt;
&lt;br /&gt;
# With 1 GPU:&lt;br /&gt;
sbatch --gres=gpu:1 -p gpu-general-pool -A public-users_v2 --qos=public gpu_job.sh&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Submitting Multiple Jobs ===&lt;br /&gt;
&lt;br /&gt;
If you need to submit many similar jobs (hundreds or more), you should use a &amp;#039;&amp;#039;&amp;#039;Slurm job array&amp;#039;&amp;#039;&amp;#039;. Submitting each job individually using separate &amp;lt;code&amp;gt;sbatch&amp;lt;/code&amp;gt; commands places a heavy load on the scheduler, slowing down job processing across the cluster. Job arrays allow you to bundle many related jobs together as a single submission. This is more efficient and easier to manage.&lt;br /&gt;
&lt;br /&gt;
Each task in the array runs independently like a separate job, but the array is submitted as a single job ID for scheduling and tracking purposes.&lt;br /&gt;
You can customize the behavior of each task using the environment variable &amp;lt;code&amp;gt;SLURM_ARRAY_TASK_ID&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
==== Script Example: Job Array ====&lt;br /&gt;
&lt;br /&gt;
This script submits a job array with 100 tasks, each processing a different input file. The array reduces scheduler load and simplifies job tracking.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --job-name=array_job            # Job name&lt;br /&gt;
#SBATCH --account=public-users_v2       # Account name&lt;br /&gt;
#SBATCH --partition=power-general-shared-pool       # Partition name&lt;br /&gt;
#SBATCH --qos=public                    # qos type&lt;br /&gt;
#SBATCH --time=02:00:00                 # Max run time (hh:mm:ss)&lt;br /&gt;
#SBATCH --ntasks=1                      # Number of tasks per array job&lt;br /&gt;
#SBATCH --nodes=1                       # Number of nodes&lt;br /&gt;
#SBATCH --cpus-per-task=1               # CPUs per task&lt;br /&gt;
#SBATCH --mem-per-cpu=4G                # Memory per CPU&lt;br /&gt;
#SBATCH --array=1-100                   # Array range: 100 tasks&lt;br /&gt;
#SBATCH --output=array_job_%A_%a.out    # Output file: Job ID and array task ID&lt;br /&gt;
#SBATCH --error=array_job_%A_%a.err     # Error file: Job ID and array task ID&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Starting SLURM array task&amp;quot;&lt;br /&gt;
echo &amp;quot;Job ID: $SLURM_JOB_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Array Task ID: $SLURM_ARRAY_TASK_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Running on node(s): $SLURM_JOB_NODELIST&amp;quot;&lt;br /&gt;
echo &amp;quot;Allocated CPUs: $SLURM_JOB_CPUS_PER_NODE&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Your application commands go here&lt;br /&gt;
# You can use $SLURM_ARRAY_TASK_ID to customize behavior per task&lt;br /&gt;
# ./my_program input_${SLURM_ARRAY_TASK_ID}.txt&lt;br /&gt;
echo &amp;quot;Task completed&amp;quot;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
In this example:&lt;br /&gt;
* The job array consists of 100 tasks.&lt;br /&gt;
* Each task runs the same script but with a different input file.&lt;br /&gt;
* You access the task ID using the environment variable &amp;lt;code&amp;gt;SLURM_ARRAY_TASK_ID&amp;lt;/code&amp;gt;.&lt;br /&gt;
* The output and error logs are separated per task using &amp;lt;code&amp;gt;%A&amp;lt;/code&amp;gt; (job ID) and &amp;lt;code&amp;gt;%a&amp;lt;/code&amp;gt; (array task ID).&lt;br /&gt;
&lt;br /&gt;
==== Script Example: Job Array with different parameters per task ====&lt;br /&gt;
&lt;br /&gt;
This script submits a job array with 3 tasks. Each task runs the same program with a different input file: &amp;lt;code&amp;gt;data1.txt&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;data2.txt&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;data3.txt&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --job-name=array_job            # Job name&lt;br /&gt;
#SBATCH --account=public-users_v2       # Account name&lt;br /&gt;
#SBATCH --partition=power-general-shared-pool       # Partition name&lt;br /&gt;
#SBATCH --qos=public                    # qos type&lt;br /&gt;
#SBATCH --time=01:00:00                 # Max run time (hh:mm:ss)&lt;br /&gt;
#SBATCH --ntasks=1                      # Number of tasks per array job&lt;br /&gt;
#SBATCH --nodes=1                       # Number of nodes&lt;br /&gt;
#SBATCH --cpus-per-task=1               # CPUs per task&lt;br /&gt;
#SBATCH --mem-per-cpu=2G                # Memory per CPU&lt;br /&gt;
#SBATCH --array=1-3                     # Run 3 tasks with IDs 1, 2, 3&lt;br /&gt;
#SBATCH --output=array_%A_%a.out        # Output file: Job ID and task ID&lt;br /&gt;
#SBATCH --error=array_%A_%a.err         # Error file: Job ID and task ID&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Starting SLURM array task&amp;quot;&lt;br /&gt;
echo &amp;quot;Job ID: $SLURM_JOB_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Array Task ID: $SLURM_ARRAY_TASK_ID&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Each task runs the program with a different input file&lt;br /&gt;
./my_program data${SLURM_ARRAY_TASK_ID}.txt&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Task completed&amp;quot;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
===Writing Single SLURM Job Scripts===&lt;br /&gt;
&lt;br /&gt;
Here is a simple job script example:&lt;br /&gt;
&lt;br /&gt;
==== Basic Script====&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --job-name=my_job             # Job name&lt;br /&gt;
#SBATCH --account=public-users_v2     # Account name&lt;br /&gt;
#SBATCH --partition=power-general-shared-pool       # Partition name&lt;br /&gt;
#SBATCH --qos=public                  # qos type&lt;br /&gt;
#SBATCH --time=02:00:00               # Max run time (hh:mm:ss)&lt;br /&gt;
#SBATCH --ntasks=1                    # Number of tasks&lt;br /&gt;
#SBATCH --nodes=1                     # Number of nodes&lt;br /&gt;
#SBATCH --cpus-per-task=1             # CPUs per task&lt;br /&gt;
#SBATCH --mem-per-cpu=4G              # Memory per CPU&lt;br /&gt;
#SBATCH --output=my_job_%j.out        # Output file&lt;br /&gt;
#SBATCH --error=my_job_%j.err         # Error file&lt;br /&gt;
#SBATCH --mail-user=&amp;lt;your email&amp;gt;      # Your mail address to receive an email&lt;br /&gt;
#SBATCH --mail-type=END,FAIL          # The mail will be sent upon ending the script successfully or not&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Starting my SLURM job&amp;quot;&lt;br /&gt;
echo &amp;quot;Job ID: $SLURM_JOB_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Running on nodes: $SLURM_JOB_NODELIST&amp;quot;&lt;br /&gt;
echo &amp;quot;Allocated CPUs: $SLURM_JOB_CPUS_PER_NODE&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Your application commands go here&lt;br /&gt;
# ./my_program&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Job completed&amp;quot;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To request x cores interactively:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
srun --ntasks=1 --cpus-per-task=x --partition=power-general-shared-pool --account=public-users_v2 --qos=public --nodes=1 --pty bash&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
For now, you also need to set the following SLURM parameters inside the script, or within the interactive job:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
export SLURM_TASKS_PER_NODE=48&lt;br /&gt;
export SLURM_CPUS_ON_NODE=48&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Script for 1 GPU ====&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --job-name=gpu_job             # Job name&lt;br /&gt;
#SBATCH --account=my_account           # Account name&lt;br /&gt;
#SBATCH --partition=gpu-general-pool   # Partition name&lt;br /&gt;
#SBATCH --qos=my_qos                   # qos type&lt;br /&gt;
#SBATCH --time=02:00:00                # Max run time&lt;br /&gt;
#SBATCH --ntasks=1                     # Number of tasks&lt;br /&gt;
#SBATCH --nodes=1                      # Number of nodes&lt;br /&gt;
#SBATCH --cpus-per-task=1              # CPUs per task&lt;br /&gt;
#SBATCH --gres=gpu:1                   # Number of GPUs&lt;br /&gt;
#SBATCH --mem-per-cpu=4G               # Memory per CPU&lt;br /&gt;
#SBATCH --output=my_job_%j.out         # Output file&lt;br /&gt;
#SBATCH --error=my_job_%j.err          # Error file&lt;br /&gt;
&lt;br /&gt;
module load python/python-3.8&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Starting GPU job&amp;quot;&lt;br /&gt;
echo &amp;quot;Job ID: $SLURM_JOB_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Running on nodes: $SLURM_JOB_NODELIST&amp;quot;&lt;br /&gt;
echo &amp;quot;Allocated CPUs: $SLURM_JOB_CPUS_PER_NODE&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Your GPU commands go here&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Job completed&amp;quot;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
To exclude specific nodes, add the following:&lt;br /&gt;
&amp;lt;syntaxhighlight&amp;gt;&lt;br /&gt;
#SBATCH --exclude=compute-0-[100-103],compute-0-67&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Importance of Correct RAM Usage in Jobs===&lt;br /&gt;
&lt;br /&gt;
When writing SLURM job scripts, it&amp;#039;s crucial to understand and correctly specify the memory requirements for your job. &lt;br /&gt;
&lt;br /&gt;
Proper memory allocation ensures efficient resource usage and prevents job failures due to out-of-memory (OOM) errors.&lt;br /&gt;
&lt;br /&gt;
==== Why Correct RAM Usage Matters ====&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Resource Efficiency&amp;#039;&amp;#039;&amp;#039;: Allocating the right amount of memory helps in optimal resource utilization, allowing more jobs to run simultaneously on the cluster.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Job Stability&amp;#039;&amp;#039;&amp;#039;: Underestimating memory requirements can lead to OOM errors, causing your job to fail and waste computational resources.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Performance&amp;#039;&amp;#039;&amp;#039;: Overestimating memory needs can lead to underutilization of resources, potentially delaying other jobs in the queue.&lt;br /&gt;
&lt;br /&gt;
==== How to Specify Memory in SLURM ====&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;--mem&amp;#039;&amp;#039;&amp;#039;: Specifies the total memory required for the job.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;--mem-per-cpu&amp;#039;&amp;#039;&amp;#039;: Specifies the memory required per CPU.&lt;br /&gt;
&lt;br /&gt;
Note: &amp;lt;code&amp;gt;--mem&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;--mem-per-cpu&amp;lt;/code&amp;gt; are mutually exclusive; specify only one of them per job.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example&amp;#039;&amp;#039;&amp;#039;:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#SBATCH --mem=4G              # Total memory for the job, or:&lt;br /&gt;
#SBATCH --mem-per-cpu=2G      # Memory per CPU (do not combine with --mem)&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
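As a sanity check, the effective memory limit of a per-CPU request is the product of the two values; a small sketch of the arithmetic (the numbers are only an example):&lt;br /&gt;

```shell
# For a request with --cpus-per-task=4 and --mem-per-cpu=2G (illustrative
# values), the job's effective memory limit is their product.
cpus_per_task=4
mem_per_cpu_g=2
total_g=$((cpus_per_task * mem_per_cpu_g))
echo "effective limit: ${total_g}G"
```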
&lt;br /&gt;
===Interactive Jobs===&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#Start an interactive session:&lt;br /&gt;
srun --ntasks=1 -p power-general-shared-pool -A public-users_v2 --qos=public --pty bash&lt;br /&gt;
&lt;br /&gt;
#Specify a compute node:&lt;br /&gt;
srun --ntasks=1 -p power-general-shared-pool -A public-users_v2 --qos=public --nodelist=&amp;quot;compute-0-12&amp;quot; --pty bash&lt;br /&gt;
&lt;br /&gt;
#Using GUI:&lt;br /&gt;
srun --ntasks=1 -p power-general-shared-pool -A public-users_v2 --qos=public --x11 /bin/bash&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Submitting RELION Jobs===&lt;br /&gt;
&lt;br /&gt;
To submit a RELION job interactively on the &amp;lt;code&amp;gt;gpu-relion&amp;lt;/code&amp;gt; queue with X11 forwarding, use the following steps:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#Start an interactive session with X11:&lt;br /&gt;
srun --ntasks=1 -p gpu-relion-pool -A gpu-relion-users_v2 --qos=owner --x11 --pty bash&lt;br /&gt;
#Load the RELION module:&lt;br /&gt;
module load relion/relion-4.0.1&lt;br /&gt;
#Launch RELION:&lt;br /&gt;
relion&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Running a MATLAB example==&lt;br /&gt;
In this example there are 3 files:&lt;br /&gt;
&lt;br /&gt;
myTable.m ⇒ This MATLAB file computes and prints a table of values&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
fprintf(&amp;#039;=======================================\n&amp;#039;);&lt;br /&gt;
fprintf(&amp;#039; a             b             c              d             \n&amp;#039;);&lt;br /&gt;
fprintf(&amp;#039;=======================================\n&amp;#039;);&lt;br /&gt;
while 1&lt;br /&gt;
                for j = 1:10&lt;br /&gt;
                                a = sin(10*j);&lt;br /&gt;
                                b = a*cos(10*j);&lt;br /&gt;
                                c = a + b;&lt;br /&gt;
                                d = a - b;&lt;br /&gt;
                                fprintf(&amp;#039;%+6.5f   %+6.5f   %+6.5f   %+6.5f   \n&amp;#039;,a,b,c,d);&lt;br /&gt;
                end&lt;br /&gt;
end&lt;br /&gt;
fprintf(&amp;#039;=======================================\n&amp;#039;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
my_table_script.sh ⇒ This script runs the MATLAB program; submit it with sbatch&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
&lt;br /&gt;
#SBATCH --mem=50M&lt;br /&gt;
#SBATCH --partition power-general-shared-pool&lt;br /&gt;
#SBATCH -A public-users_v2&lt;br /&gt;
hostname&lt;br /&gt;
&lt;br /&gt;
cd /a/home/cc/tree/taucc/staff/dvory/matlab&lt;br /&gt;
&lt;br /&gt;
matlab -nodisplay -nosplash -nodesktop -r &amp;quot;myTable; exit;&amp;quot;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
run_in_loop.sh ⇒ This script submits the job 100 times in a loop&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
&lt;br /&gt;
for i in {1..100}; do&lt;br /&gt;
        sbatch my_table_script.sh&lt;br /&gt;
done&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Run the jobs with the following command (after making the script executable with &amp;lt;code&amp;gt;chmod +x run_in_loop.sh&amp;lt;/code&amp;gt;):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
./run_in_loop.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
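The 100-iteration submission loop above can also be expressed as a single job array, which is much lighter on the scheduler; a sketch (with &amp;lt;code&amp;gt;sbatch&amp;lt;/code&amp;gt; replaced by an echo stub so the command can be inspected off the cluster):&lt;br /&gt;

```shell
# One array submission replaces 100 separate sbatch calls.
# sbatch is stubbed here so the command can be tried without a cluster;
# on the login node, drop the function definition and run the real sbatch.
sbatch() { echo "sbatch $*"; }

sbatch --array=1-100 my_table_script.sh
```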
&lt;br /&gt;
==AlphaFold==&lt;br /&gt;
&lt;br /&gt;
AlphaFold is a deep learning tool designed for predicting protein structures.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Guides:&amp;#039;&amp;#039;&amp;#039;  &lt;br /&gt;
&lt;br /&gt;
[https://hpcguide.tau.ac.il/index.php?title=Alphafold AlphaFold Guide]&lt;br /&gt;
&lt;br /&gt;
[https://hpcguide.tau.ac.il/index.php?title=Alphafold3 AlphaFold3 Guide]&lt;br /&gt;
&lt;br /&gt;
==Common SLURM Commands==&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#View all queues (partitions):&lt;br /&gt;
sinfo&lt;br /&gt;
#View all jobs:&lt;br /&gt;
squeue&lt;br /&gt;
#View details of a specific job:&lt;br /&gt;
scontrol show job &amp;lt;job_number&amp;gt;&lt;br /&gt;
#Get information about partitions:&lt;br /&gt;
scontrol show partition&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Troubleshooting &amp;amp; Tips ==&lt;br /&gt;
&lt;br /&gt;
=== Common Errors ===&lt;br /&gt;
&lt;br /&gt;
# &amp;lt;code&amp;gt;srun: error: Unable to allocate resources: No partition specified or system default partition&amp;lt;/code&amp;gt;  &amp;lt;br /&amp;gt;&amp;#039;&amp;#039;&amp;#039;Solution:&amp;#039;&amp;#039;&amp;#039; Always specify a partition. Example:  &amp;lt;code&amp;gt;srun --pty -c 1 --mem=2G -p power-general /bin/bash&amp;lt;/code&amp;gt;&lt;br /&gt;
# A job failed, and &amp;lt;code&amp;gt;scontrol show job &amp;lt;job_id&amp;gt;&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;sacct -j &amp;lt;job_id&amp;gt; -o JobID,JobName,State%20&amp;lt;/code&amp;gt;  &amp;lt;br /&amp;gt;shows &amp;lt;code&amp;gt;JobState=OUT_OF_MEMORY Reason=OutOfMemory&amp;lt;/code&amp;gt; or:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
JobID           JobName                State &lt;br /&gt;
------------ ---------- -------------------- &lt;br /&gt;
71             oom_test        OUT_OF_MEMORY &lt;br /&gt;
71.batch          batch        OUT_OF_MEMORY &lt;br /&gt;
71.extern        extern            COMPLETED &lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;it means the RAM requested for the job was not enough; resubmit the job with more RAM. See [https://wikihpc.tau.ac.il/index.php?title=Slurm_user_guide#Estimating_RAM_Usage below] for help estimating how much RAM your job may need.&lt;br /&gt;
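From a script, a finished job can be checked for OOM by scanning the sacct listing for the OUT_OF_MEMORY state; a sketch over a hard-coded sample (the job ID 71 and the states mirror the example above):&lt;br /&gt;

```shell
# Count job steps that ended OUT_OF_MEMORY in a (hard-coded) sacct-style
# listing; on the cluster this would be fed by
#   sacct -j <job_id> --parsable2 --noheader -o JobID,State
sample='71|OUT_OF_MEMORY
71.batch|OUT_OF_MEMORY
71.extern|COMPLETED'
oom_steps=$(printf '%s\n' "$sample" | awk -F'|' '$2 == "OUT_OF_MEMORY" { n++ } END { print n+0 }')
echo "steps out of memory: $oom_steps"
```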
&lt;br /&gt;
=== Chain Jobs ===&lt;br /&gt;
Use the &amp;lt;code&amp;gt;--dependency&amp;lt;/code&amp;gt; flag to set job dependencies.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sbatch --ntasks=1 --time=60 -p power-general -A power-general-users --dependency=afterok:45001 do_work.bash&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
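Instead of hard-coding a job ID, the first job's ID can be captured with &amp;lt;code&amp;gt;--parsable&amp;lt;/code&amp;gt; and fed to the dependent submission; a sketch (sbatch is stubbed and the script names step1.bash/step2.bash are hypothetical, so the flow can be tried anywhere):&lt;br /&gt;

```shell
# Chain two jobs: the second starts only if the first succeeds.
# sbatch is stubbed to print a fixed job ID; on the cluster, --parsable
# makes the real sbatch print just the job ID, which we capture.
sbatch() { if [ "$1" = "--parsable" ]; then echo 45001; else echo "submitted with $*"; fi; }

jid=$(sbatch --parsable step1.bash)
sbatch --dependency=afterok:"$jid" step2.bash
```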
&lt;br /&gt;
=== Always Specify Resources ===&lt;br /&gt;
When submitting jobs, ensure you include all required resources like partition, memory, and CPUs to avoid job failures.&lt;br /&gt;
&lt;br /&gt;
=== Attaching to Running Jobs ===&lt;br /&gt;
If you need to monitor or interact with a running job, use &amp;lt;code&amp;gt;sattach&amp;lt;/code&amp;gt;. This command allows you to attach to a job&amp;#039;s input, output, and error streams in real-time.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sattach &amp;lt;job_id&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To view job steps of a specific job, use the following command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
scontrol show job &amp;lt;job_id&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Look for sections labeled &amp;quot;StepId&amp;quot; within the output. &lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;For specific job steps, use:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sattach &amp;lt;job_id.step_id&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Note:&amp;#039;&amp;#039;&amp;#039; &amp;lt;code&amp;gt;sattach&amp;lt;/code&amp;gt; is particularly useful for interactive jobs, where you can provide input directly. For non-interactive jobs, it acts like &amp;lt;code&amp;gt;tail -f&amp;lt;/code&amp;gt;, allowing you to monitor the output stream.&lt;br /&gt;
&lt;br /&gt;
=== Estimating RAM Usage ===&lt;br /&gt;
&lt;br /&gt;
When writing SLURM job scripts, it&amp;#039;s crucial to understand and correctly specify the memory requirements for your job. Proper memory allocation ensures efficient resource usage and prevents job failures due to out-of-memory (OOM) errors.&lt;br /&gt;
&lt;br /&gt;
==== Tips for Estimating RAM Usage ====&lt;br /&gt;
&lt;br /&gt;
* Check Application Documentation: Refer to the official documentation or user guides for memory-related information.&lt;br /&gt;
* Run a Small Test Job: Submit a smaller version of your job and monitor its memory usage using commands like `free -m`, `top`, or `htop`.&lt;br /&gt;
* Use Profiling Tools: Tools like `valgrind`, `gprof`, or built-in profilers can help you understand memory usage.&lt;br /&gt;
* Analyze Previous Jobs: Review SLURM logs and job statistics for insights into memory consumption of past jobs.&lt;br /&gt;
* Consult with Peers or Experts: Ask colleagues or experts who have experience with similar workloads.&lt;br /&gt;
&lt;br /&gt;
==== Example: Monitoring Memory Usage ====&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
&lt;br /&gt;
#SBATCH --job-name=memory_test&lt;br /&gt;
#SBATCH --account=your_account&lt;br /&gt;
#SBATCH --partition=your_partition&lt;br /&gt;
#SBATCH --qos=your_qos&lt;br /&gt;
#SBATCH --time=01:00:00&lt;br /&gt;
#SBATCH --ntasks=1&lt;br /&gt;
#SBATCH --cpus-per-task=1&lt;br /&gt;
#SBATCH --mem=4G&lt;br /&gt;
#SBATCH --output=memory_test.out&lt;br /&gt;
#SBATCH --error=memory_test.err&lt;br /&gt;
&lt;br /&gt;
# Monitor memory usage&lt;br /&gt;
echo &amp;quot;Memory usage before running the job:&amp;quot;&lt;br /&gt;
free -m&lt;br /&gt;
&lt;br /&gt;
# Your application commands go here&lt;br /&gt;
# ./your_application&lt;br /&gt;
&lt;br /&gt;
# Monitor memory usage after running the job&lt;br /&gt;
echo &amp;quot;Memory usage after running the job:&amp;quot;&lt;br /&gt;
free -m&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
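Once a test job finishes, the MaxRSS value reported by sacct (in KiB, with a K suffix) can be turned into a padded request for the real run; a sketch with an assumed sample value and an arbitrary 20% safety margin:&lt;br /&gt;

```shell
# Convert a sample MaxRSS reading (as sacct reports it, e.g. "1523456K")
# into a suggested --mem value with a 20% safety margin, rounded up to GiB.
maxrss=1523456K          # assumed sample value from a past test job
kib=${maxrss%K}          # strip the K suffix
suggest_g=$(awk -v k="$kib" 'BEGIN { g = k * 1.2 / 1048576; print int(g) + (g > int(g)) }')
echo "suggested request: --mem=${suggest_g}G"
```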
&lt;br /&gt;
==== General Tips ====&lt;br /&gt;
&lt;br /&gt;
* Start Small: Begin with a conservative memory request and increase it based on observed usage.&lt;br /&gt;
* Consider Peak Usage: Plan for peak memory usage to avoid OOM errors.&lt;br /&gt;
* Use SLURM&amp;#039;s Memory Reporting: Use `sacct` to view memory usage statistics.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sacct -j &amp;lt;job_id&amp;gt; --format=JobID,JobName,MaxRSS,Elapsed&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;/div&gt;</summary>
		<author><name>Dvory</name></author>
	</entry>
	<entry>
		<id>https://hpcguide.tau.ac.il/index.php?title=Submitting_a_job_to_a_slurm_queue&amp;diff=1534</id>
		<title>Submitting a job to a slurm queue</title>
		<link rel="alternate" type="text/html" href="https://hpcguide.tau.ac.il/index.php?title=Submitting_a_job_to_a_slurm_queue&amp;diff=1534"/>
		<updated>2025-10-23T06:43:06Z</updated>

		<summary type="html">&lt;p&gt;Dvory: /* Accessing the System */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Accessing the System ==&lt;br /&gt;
&lt;br /&gt;
* We have a ChatGPT page for the new QOS configuration; see [https://chatgpt.com/g/g-68be7f9acfb88191978615c1693e2cff-hpc-helper-toolkit HPC-helper-toolkit]&lt;br /&gt;
&lt;br /&gt;
To submit jobs to SLURM at Tel Aviv University, you need to access the system through one of the following login nodes:&lt;br /&gt;
&lt;br /&gt;
* slurmlogin.tau.ac.il&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Requirements for Access ===&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Group Membership&amp;#039;&amp;#039;&amp;#039;: You must be part of the &amp;quot;power&amp;quot; group to access the resources.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;University Credentials&amp;#039;&amp;#039;&amp;#039;: Use your Tel Aviv University username and password to log in.&lt;br /&gt;
&lt;br /&gt;
These login nodes are your starting point for submitting jobs, checking job status, and managing your SLURM tasks.&lt;br /&gt;
&lt;br /&gt;
=== SSH Example ===&lt;br /&gt;
&lt;br /&gt;
To access the system using SSH, use the following example:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Replace &amp;#039;your_username&amp;#039; with your actual Tel Aviv University username&lt;br /&gt;
ssh your_username@slurmlogin.tau.ac.il&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Your connection will be automatically routed to one of the login nodes:&lt;br /&gt;
powerslurm-login, powerslurm-login2, or powerslurm-login3.&lt;br /&gt;
&lt;br /&gt;
If you have an SSH key set up for password-less login, you can specify it like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Replace &amp;#039;your_username&amp;#039; and &amp;#039;/path/to/your/private_key&amp;#039; accordingly&lt;br /&gt;
ssh -i /path/to/your/private_key your_username@slurmlogin.tau.ac.il&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
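These options can also be kept in &amp;lt;code&amp;gt;~/.ssh/config&amp;lt;/code&amp;gt; so a short alias suffices; a sketch (the alias name and key path are placeholders):&lt;br /&gt;

```
Host slurmlogin
    HostName slurmlogin.tau.ac.il
    User your_username
    IdentityFile /path/to/your/private_key
```

With this in place, &amp;lt;code&amp;gt;ssh slurmlogin&amp;lt;/code&amp;gt; connects directly.&lt;br /&gt;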
&lt;br /&gt;
== Environment Modules ==&lt;br /&gt;
&lt;br /&gt;
Environment Modules in SLURM allow users to dynamically modify their shell environment, providing an easy way to load and unload different software applications, libraries, and their dependencies. This system helps avoid conflicts between software versions and ensures the correct environment for running specific applications.&lt;br /&gt;
&lt;br /&gt;
Here are some common commands to work with environment modules:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#List Available Modules: To see all the modules available on the system, use:&lt;br /&gt;
module avail&lt;br /&gt;
&lt;br /&gt;
#To search for a specific module by name (e.g., gcc), use:&lt;br /&gt;
module avail gcc&lt;br /&gt;
&lt;br /&gt;
#Get Detailed Information About a Module: The `module spider` command provides detailed information about a module, including versions, dependencies, and descriptions:&lt;br /&gt;
module spider gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#View Module Settings: To see what environment variables and settings will be modified by a module, use:&lt;br /&gt;
module show gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#Load a Module: To set up the environment for a specific software, use the `module load` command. For example, to load GCC version 12.1.0:&lt;br /&gt;
module load gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#List Loaded Modules: To view all currently loaded modules in your session, use:&lt;br /&gt;
module list&lt;br /&gt;
&lt;br /&gt;
#Unload a Module: To unload a specific module from your environment, use:&lt;br /&gt;
module unload gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#Unload All Modules: If you need to clear your environment of all loaded modules, use:&lt;br /&gt;
module purge&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;By using these commands, you can easily manage the software environments needed for different tasks, ensuring compatibility and reducing potential conflicts between software versions.&lt;br /&gt;
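A typical ordering inside a job script is purge, load, then list; sketched below with &amp;lt;code&amp;gt;module&amp;lt;/code&amp;gt; stubbed so the sequence can be tried off the cluster (the gcc version is only the example used above):&lt;br /&gt;

```shell
# Typical ordering inside a job script: start clean, load, verify.
# module is stubbed here; on the cluster the real command is provided
# by the environment-modules system and the stub must be removed.
module() { echo "module $*"; }

module purge                    # start from a clean environment
module load gcc/gcc-12.1.0      # load the compiler the job needs
module list                     # record what is loaded, for reproducibility
```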
&lt;br /&gt;
== Basic Job Submission Commands ==&lt;br /&gt;
&lt;br /&gt;
=== Finding Your Account and Partition ===&lt;br /&gt;
&lt;br /&gt;
Before submitting a job, you need to know which partitions you have permission to use.&lt;br /&gt;
&lt;br /&gt;
Run the command &amp;lt;code&amp;gt;check_my_partitions&amp;lt;/code&amp;gt; to view a list of all the partitions you have permission to send jobs to.&lt;br /&gt;
&lt;br /&gt;
== Submitting Jobs==&lt;br /&gt;
sbatch: Submits a job script for batch processing.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example&amp;#039;&amp;#039;&amp;#039;:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Submit pre_process.bash to the power-general-shared-pool partition for 10 minutes:&lt;br /&gt;
sbatch --ntasks=1 --time=10 -p power-general-shared-pool -A public-users_v2 --qos=public pre_process.bash&lt;br /&gt;
&lt;br /&gt;
# With 1 GPU:&lt;br /&gt;
sbatch --gres=gpu:1 -p gpu-general-pool -A public-users_v2 --qos=public gpu_job.sh&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Submitting Multiple Jobs ===&lt;br /&gt;
&lt;br /&gt;
If you need to submit many similar jobs (hundreds or more), use a &amp;#039;&amp;#039;&amp;#039;Slurm job array&amp;#039;&amp;#039;&amp;#039;. Submitting each job individually with separate &amp;lt;code&amp;gt;sbatch&amp;lt;/code&amp;gt; commands places a heavy load on the scheduler, slowing down job processing across the cluster. Job arrays bundle many related jobs into a single submission, which is more efficient and easier to manage.&lt;br /&gt;
&lt;br /&gt;
Each task in the array runs independently like a separate job, but the array is submitted as a single job ID for scheduling and tracking purposes.&lt;br /&gt;
You can customize the behavior of each task using the environment variable &amp;lt;code&amp;gt;SLURM_ARRAY_TASK_ID&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
==== Script Example: Job Array ====&lt;br /&gt;
&lt;br /&gt;
This script submits a job array with 100 tasks, each processing a different input file. The array reduces scheduler load and simplifies job tracking.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --job-name=array_job            # Job name&lt;br /&gt;
#SBATCH --account=public-users_v2       # Account name&lt;br /&gt;
#SBATCH --partition=power-general-shared-pool       # Partition name&lt;br /&gt;
#SBATCH --qos=public                    # qos type&lt;br /&gt;
#SBATCH --time=02:00:00                 # Max run time (hh:mm:ss)&lt;br /&gt;
#SBATCH --ntasks=1                      # Number of tasks per array job&lt;br /&gt;
#SBATCH --nodes=1                       # Number of nodes&lt;br /&gt;
#SBATCH --cpus-per-task=1               # CPUs per task&lt;br /&gt;
#SBATCH --mem-per-cpu=4G                # Memory per CPU&lt;br /&gt;
#SBATCH --array=1-100                   # Array range: 100 tasks&lt;br /&gt;
#SBATCH --output=array_job_%A_%a.out    # Output file: Job ID and array task ID&lt;br /&gt;
#SBATCH --error=array_job_%A_%a.err     # Error file: Job ID and array task ID&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Starting SLURM array task&amp;quot;&lt;br /&gt;
echo &amp;quot;Job ID: $SLURM_JOB_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Array Task ID: $SLURM_ARRAY_TASK_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Running on node(s): $SLURM_JOB_NODELIST&amp;quot;&lt;br /&gt;
echo &amp;quot;Allocated CPUs: $SLURM_JOB_CPUS_PER_NODE&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Your application commands go here&lt;br /&gt;
# You can use $SLURM_ARRAY_TASK_ID to customize behavior per task&lt;br /&gt;
# ./my_program input_${SLURM_ARRAY_TASK_ID}.txt&lt;br /&gt;
echo &amp;quot;Task completed&amp;quot;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
In this example:&lt;br /&gt;
* The job array consists of 100 tasks.&lt;br /&gt;
* Each task runs the same script but with a different input file.&lt;br /&gt;
* You access the task ID using the environment variable &amp;lt;code&amp;gt;SLURM_ARRAY_TASK_ID&amp;lt;/code&amp;gt;.&lt;br /&gt;
* The output and error logs are separated per task using &amp;lt;code&amp;gt;%A&amp;lt;/code&amp;gt; (job ID) and &amp;lt;code&amp;gt;%a&amp;lt;/code&amp;gt; (array task ID).&lt;br /&gt;
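The placeholder expansion can be previewed locally; a sketch of how &amp;lt;code&amp;gt;%A&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;%a&amp;lt;/code&amp;gt; combine into a log file name (the job and task IDs are made up):&lt;br /&gt;

```shell
# How the %A (array job ID) and %a (array task ID) placeholders expand.
job_id=123          # what SLURM substitutes for %A (made-up value)
task_id=7           # what SLURM substitutes for %a (made-up value)
out_file="array_job_${job_id}_${task_id}.out"
echo "$out_file"
```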
&lt;br /&gt;
==== Script Example: Job Array with different parameters per task ====&lt;br /&gt;
&lt;br /&gt;
This script submits a job array with 3 tasks. Each task runs the same program with a different input file: &amp;lt;code&amp;gt;data1.txt&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;data2.txt&amp;lt;/code&amp;gt;, and &amp;lt;code&amp;gt;data3.txt&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --job-name=array_job            # Job name&lt;br /&gt;
#SBATCH --account=public-users_v2       # Account name&lt;br /&gt;
#SBATCH --partition=power-general-shared-pool       # Partition name&lt;br /&gt;
#SBATCH --qos=public                    # qos type&lt;br /&gt;
#SBATCH --time=01:00:00                 # Max run time (hh:mm:ss)&lt;br /&gt;
#SBATCH --ntasks=1                      # Number of tasks per array job&lt;br /&gt;
#SBATCH --nodes=1                       # Number of nodes&lt;br /&gt;
#SBATCH --cpus-per-task=1               # CPUs per task&lt;br /&gt;
#SBATCH --mem-per-cpu=2G                # Memory per CPU&lt;br /&gt;
#SBATCH --array=1-3                     # Run 3 tasks with IDs 1, 2, 3&lt;br /&gt;
#SBATCH --output=array_%A_%a.out        # Output file: Job ID and task ID&lt;br /&gt;
#SBATCH --error=array_%A_%a.err         # Error file: Job ID and task ID&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Starting SLURM array task&amp;quot;&lt;br /&gt;
echo &amp;quot;Job ID: $SLURM_JOB_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Array Task ID: $SLURM_ARRAY_TASK_ID&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Each task runs the program with a different input file&lt;br /&gt;
./my_program data${SLURM_ARRAY_TASK_ID}.txt&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Task completed&amp;quot;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
===Writing Single SLURM Job Scripts===&lt;br /&gt;
&lt;br /&gt;
Here is a simple job script example:&lt;br /&gt;
&lt;br /&gt;
==== Basic Script====&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --job-name=my_job             # Job name&lt;br /&gt;
#SBATCH --account=public-users_v2     # Account name&lt;br /&gt;
#SBATCH --partition=power-general-shared-pool       # Partition name&lt;br /&gt;
#SBATCH --qos=public                  # qos type&lt;br /&gt;
#SBATCH --time=02:00:00               # Max run time (hh:mm:ss)&lt;br /&gt;
#SBATCH --ntasks=1                    # Number of tasks&lt;br /&gt;
#SBATCH --nodes=1                     # Number of nodes&lt;br /&gt;
#SBATCH --cpus-per-task=1             # CPUs per task&lt;br /&gt;
#SBATCH --mem-per-cpu=4G              # Memory per CPU&lt;br /&gt;
#SBATCH --output=my_job_%j.out        # Output file&lt;br /&gt;
#SBATCH --error=my_job_%j.err         # Error file&lt;br /&gt;
#SBATCH --mail-user=&amp;lt;your email&amp;gt;      # Your mail address to receive an email&lt;br /&gt;
#SBATCH --mail-type=END,FAIL          # The mail will be sent upon ending the script successfully or not&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Starting my SLURM job&amp;quot;&lt;br /&gt;
echo &amp;quot;Job ID: $SLURM_JOB_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Running on nodes: $SLURM_JOB_NODELIST&amp;quot;&lt;br /&gt;
echo &amp;quot;Allocated CPUs: $SLURM_JOB_CPUS_PER_NODE&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Your application commands go here&lt;br /&gt;
# ./my_program&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Job completed&amp;quot;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To request x cores interactively:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
srun --ntasks=1 --cpus-per-task=x --partition=power-general-shared-pool --account=public-users_v2 --qos=public --nodes=1 --pty bash&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
For now, you also need to set the following SLURM parameters inside the script, or within the interactive job:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
export SLURM_TASKS_PER_NODE=48&lt;br /&gt;
export SLURM_CPUS_ON_NODE=48&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Script for 1 GPU ====&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --job-name=gpu_job             # Job name&lt;br /&gt;
#SBATCH --account=my_account           # Account name&lt;br /&gt;
#SBATCH --partition=gpu-general-pool   # Partition name&lt;br /&gt;
#SBATCH --qos=my_qos                   # qos type&lt;br /&gt;
#SBATCH --time=02:00:00                # Max run time&lt;br /&gt;
#SBATCH --ntasks=1                     # Number of tasks&lt;br /&gt;
#SBATCH --nodes=1                      # Number of nodes&lt;br /&gt;
#SBATCH --cpus-per-task=1              # CPUs per task&lt;br /&gt;
#SBATCH --gres=gpu:1                   # Number of GPUs&lt;br /&gt;
#SBATCH --mem-per-cpu=4G               # Memory per CPU&lt;br /&gt;
#SBATCH --output=my_job_%j.out         # Output file&lt;br /&gt;
#SBATCH --error=my_job_%j.err          # Error file&lt;br /&gt;
&lt;br /&gt;
module load python/python-3.8&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Starting GPU job&amp;quot;&lt;br /&gt;
echo &amp;quot;Job ID: $SLURM_JOB_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Running on nodes: $SLURM_JOB_NODELIST&amp;quot;&lt;br /&gt;
echo &amp;quot;Allocated CPUs: $SLURM_JOB_CPUS_PER_NODE&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Your GPU commands go here&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Job completed&amp;quot;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
To exclude specific nodes, add the following:&lt;br /&gt;
&amp;lt;syntaxhighlight&amp;gt;&lt;br /&gt;
#SBATCH --exclude=compute-0-[100-103],compute-0-67&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Importance of Correct RAM Usage in Jobs===&lt;br /&gt;
&lt;br /&gt;
When writing SLURM job scripts, it&amp;#039;s crucial to understand and correctly specify the memory requirements for your job. &lt;br /&gt;
&lt;br /&gt;
Proper memory allocation ensures efficient resource usage and prevents job failures due to out-of-memory (OOM) errors.&lt;br /&gt;
&lt;br /&gt;
==== Why Correct RAM Usage Matters ====&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Resource Efficiency&amp;#039;&amp;#039;&amp;#039;: Allocating the right amount of memory helps in optimal resource utilization, allowing more jobs to run simultaneously on the cluster.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Job Stability&amp;#039;&amp;#039;&amp;#039;: Underestimating memory requirements can lead to OOM errors, causing your job to fail and waste computational resources.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Performance&amp;#039;&amp;#039;&amp;#039;: Overestimating memory needs can lead to underutilization of resources, potentially delaying other jobs in the queue.&lt;br /&gt;
&lt;br /&gt;
==== How to Specify Memory in SLURM ====&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;--mem&amp;#039;&amp;#039;&amp;#039;: Specifies the total memory required for the job.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;--mem-per-cpu&amp;#039;&amp;#039;&amp;#039;: Specifies the memory required per CPU.&lt;br /&gt;
&lt;br /&gt;
Note: &amp;lt;code&amp;gt;--mem&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;--mem-per-cpu&amp;lt;/code&amp;gt; are mutually exclusive; specify only one of them per job.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example&amp;#039;&amp;#039;&amp;#039;:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#SBATCH --mem=4G              # Total memory for the job, or:&lt;br /&gt;
#SBATCH --mem-per-cpu=2G      # Memory per CPU (do not combine with --mem)&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Interactive Jobs===&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#Start an interactive session:&lt;br /&gt;
srun --ntasks=1 -p power-general-shared-pool -A public-users_v2 --qos=public --pty bash&lt;br /&gt;
&lt;br /&gt;
#Specify a compute node:&lt;br /&gt;
srun --ntasks=1 -p power-general-shared-pool -A public-users_v2 --qos=public --nodelist=&amp;quot;compute-0-12&amp;quot; --pty bash&lt;br /&gt;
&lt;br /&gt;
#Using GUI:&lt;br /&gt;
srun --ntasks=1 -p power-general-shared-pool -A public-users_v2 --qos=public --x11 /bin/bash&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Submitting RELION Jobs===&lt;br /&gt;
&lt;br /&gt;
To submit a RELION job interactively on the &amp;lt;code&amp;gt;gpu-relion&amp;lt;/code&amp;gt; queue with X11 forwarding, use the following steps:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#Start an interactive session with X11:&lt;br /&gt;
srun --ntasks=1 -p gpu-relion-pool -A gpu-relion-users_v2 --qos=owner --x11 --pty bash&lt;br /&gt;
#Load the RELION module:&lt;br /&gt;
module load relion/relion-4.0.1&lt;br /&gt;
#Launch RELION:&lt;br /&gt;
relion&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Running a MATLAB example==&lt;br /&gt;
This example uses three files:&lt;br /&gt;
&lt;br /&gt;
myTable.m ⇒ This MATLAB script prints a table of computed values&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
fprintf(&amp;#039;=======================================\n&amp;#039;);&lt;br /&gt;
fprintf(&amp;#039; a             b             c              d             \n&amp;#039;);&lt;br /&gt;
fprintf(&amp;#039;=======================================\n&amp;#039;);&lt;br /&gt;
for j = 1:10&lt;br /&gt;
    a = sin(10*j);&lt;br /&gt;
    b = a*cos(10*j);&lt;br /&gt;
    c = a + b;&lt;br /&gt;
    d = a - b;&lt;br /&gt;
    fprintf(&amp;#039;%+6.5f   %+6.5f   %+6.5f   %+6.5f   \n&amp;#039;,a,b,c,d);&lt;br /&gt;
end&lt;br /&gt;
fprintf(&amp;#039;=======================================\n&amp;#039;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
my_table_script.sh ⇒ This script executes the MATLAB program; submit it with &amp;lt;code&amp;gt;sbatch&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
&lt;br /&gt;
#SBATCH --mem=50M&lt;br /&gt;
#SBATCH --partition power-general-shared-pool&lt;br /&gt;
#SBATCH -A public-users_v2&lt;br /&gt;
hostname&lt;br /&gt;
&lt;br /&gt;
cd /a/home/cc/tree/taucc/staff/dvory/matlab&lt;br /&gt;
&lt;br /&gt;
matlab -nodisplay -nosplash -nodesktop -r &amp;quot;myTable; exit;&amp;quot;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
run_in_loop.sh ⇒ This script submits many copies of the job in a loop&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
&lt;br /&gt;
for i in {1..100}&lt;br /&gt;
&lt;br /&gt;
do&lt;br /&gt;
&lt;br /&gt;
        sbatch my_table_script.sh&lt;br /&gt;
&lt;br /&gt;
done&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Run the jobs with the following command (after making the script executable with &amp;lt;code&amp;gt;chmod +x run_in_loop.sh&amp;lt;/code&amp;gt;):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
./run_in_loop.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
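Note that a SLURM job array can replace the whole loop with a single submission, which places far less load on the scheduler. A dry-run sketch that only prints the command (drop the &amp;lt;code&amp;gt;echo&amp;lt;/code&amp;gt; to actually submit):&lt;br /&gt;

```shell
# Dry run: print the single job-array command that replaces the submission loop.
# Remove the echo to actually submit.
cmd="sbatch --array=1-100 my_table_script.sh"
echo "$cmd"
```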
&lt;br /&gt;
==AlphaFold==&lt;br /&gt;
&lt;br /&gt;
AlphaFold is a deep learning tool designed for predicting protein structures.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Guides:&amp;#039;&amp;#039;&amp;#039;  &lt;br /&gt;
&lt;br /&gt;
[https://hpcguide.tau.ac.il/index.php?title=Alphafold AlphaFold Guide]&lt;br /&gt;
&lt;br /&gt;
[https://hpcguide.tau.ac.il/index.php?title=Alphafold3 AlphaFold3 Guide]&lt;br /&gt;
&lt;br /&gt;
==Common SLURM Commands==&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#View all queues (partitions):&lt;br /&gt;
sinfo&lt;br /&gt;
#View all jobs:&lt;br /&gt;
squeue&lt;br /&gt;
#View details of a specific job:&lt;br /&gt;
scontrol show job &amp;lt;job_number&amp;gt;&lt;br /&gt;
#Get information about partitions:&lt;br /&gt;
scontrol show partition&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Troubleshooting &amp;amp; Tips ==&lt;br /&gt;
&lt;br /&gt;
=== Common Errors ===&lt;br /&gt;
&lt;br /&gt;
# &amp;lt;code&amp;gt;srun: error: Unable to allocate resources: No partition specified or system default partition&amp;lt;/code&amp;gt;  &amp;lt;br /&amp;gt;&amp;#039;&amp;#039;&amp;#039;Solution:&amp;#039;&amp;#039;&amp;#039; Always specify a partition. Example:  &amp;lt;code&amp;gt;srun --pty -c 1 --mem=2G -p power-general /bin/bash&amp;lt;/code&amp;gt;&lt;br /&gt;
# A job failed, and running &amp;lt;code&amp;gt;scontrol show job &amp;lt;job_id&amp;gt;&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;sacct -j &amp;lt;job_id&amp;gt; -o JobID,JobName,State%20&amp;lt;/code&amp;gt; shows &amp;lt;br /&amp;gt;&amp;lt;code&amp;gt;JobState=OUT_OF_MEMORY Reason=OutOfMemory&amp;lt;/code&amp;gt; or:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
JobID           JobName                State &lt;br /&gt;
------------ ---------- -------------------- &lt;br /&gt;
71             oom_test        OUT_OF_MEMORY &lt;br /&gt;
71.batch          batch        OUT_OF_MEMORY &lt;br /&gt;
71.extern        extern            COMPLETED &lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;This means the RAM requested for the job was not enough; resubmit the job with more RAM. See [https://wikihpc.tau.ac.il/index.php?title=Slurm_user_guide#Estimating_RAM_Usage below] for help estimating how much RAM your job may need.&lt;br /&gt;
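To size the resubmission, one approach is to convert the MaxRSS value that &amp;lt;code&amp;gt;sacct&amp;lt;/code&amp;gt; reports (e.g. 3500K or 2G) into megabytes and add some headroom. A minimal helper sketch; the function name and the 20% headroom factor are our own assumptions, not SLURM conventions:&lt;br /&gt;

```shell
# Convert a sacct MaxRSS value (e.g. 3500K, 512M, 2G) to a --mem request in
# megabytes with ~20% headroom. The 20% factor is a rule of thumb, not a rule.
rss_to_mem_mb() {
    val=$1
    num=${val%[KMG]}            # numeric part
    unit=${val#"$num"}          # trailing unit letter, if any
    case $unit in
        K) mb=$(( (num + 1023) / 1024 )) ;;
        G) mb=$(( num * 1024 )) ;;
        *) mb=$num ;;
    esac
    echo $(( mb * 12 / 10 ))
}

rss_to_mem_mb 2G    # a job that peaked at 2G could be resubmitted with --mem=2457M
```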
&lt;br /&gt;
=== Chain Jobs ===&lt;br /&gt;
Use the &amp;lt;code&amp;gt;--depend&amp;lt;/code&amp;gt; flag to set job dependencies.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sbatch --ntasks=1 --time=60 -p power-general -A power-general-users --depend=afterok:45001 do_work.bash&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Always Specify Resources ===&lt;br /&gt;
When submitting jobs, ensure you include all required resources like partition, memory, and CPUs to avoid job failures.&lt;br /&gt;
&lt;br /&gt;
=== Attaching to Running Jobs ===&lt;br /&gt;
If you need to monitor or interact with a running job, use &amp;lt;code&amp;gt;sattach&amp;lt;/code&amp;gt;. This command allows you to attach to a job&amp;#039;s input, output, and error streams in real-time.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sattach &amp;lt;job_id&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To view job steps of a specific job, use the following command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
scontrol show job &amp;lt;job_id&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Look for sections labeled &amp;quot;StepId&amp;quot; within the output. &lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;For specific job steps, use:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sattach &amp;lt;job_id.step_id&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Note:&amp;#039;&amp;#039;&amp;#039; &amp;lt;code&amp;gt;sattach&amp;lt;/code&amp;gt; is particularly useful for interactive jobs, where you can provide input directly. For non-interactive jobs, it acts like &amp;lt;code&amp;gt;tail -f&amp;lt;/code&amp;gt;, allowing you to monitor the output stream.&lt;br /&gt;
&lt;br /&gt;
=== Estimating RAM Usage ===&lt;br /&gt;
&lt;br /&gt;
When writing SLURM job scripts, it&amp;#039;s crucial to understand and correctly specify the memory requirements for your job. Proper memory allocation ensures efficient resource usage and prevents job failures due to out-of-memory (OOM) errors.&lt;br /&gt;
&lt;br /&gt;
==== Tips for Estimating RAM Usage ====&lt;br /&gt;
&lt;br /&gt;
* Check Application Documentation: Refer to the official documentation or user guides for memory-related information.&lt;br /&gt;
* Run a Small Test Job: Submit a smaller version of your job and monitor its memory usage with commands like &amp;lt;code&amp;gt;free -m&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;top&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;htop&amp;lt;/code&amp;gt;.&lt;br /&gt;
* Use Profiling Tools: Tools like &amp;lt;code&amp;gt;valgrind&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;gprof&amp;lt;/code&amp;gt;, or built-in profilers can help you understand memory usage.&lt;br /&gt;
* Analyze Previous Jobs: Review SLURM logs and job statistics for insights into memory consumption of past jobs.&lt;br /&gt;
* Consult with Peers or Experts: Ask colleagues or experts who have experience with similar workloads.&lt;br /&gt;
&lt;br /&gt;
==== Example: Monitoring Memory Usage ====&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
&lt;br /&gt;
#SBATCH --job-name=memory_test&lt;br /&gt;
#SBATCH --account=your_account&lt;br /&gt;
#SBATCH --partition=your_partition&lt;br /&gt;
#SBATCH --qos=your_qos&lt;br /&gt;
#SBATCH --time=01:00:00&lt;br /&gt;
#SBATCH --ntasks=1&lt;br /&gt;
#SBATCH --cpus-per-task=1&lt;br /&gt;
#SBATCH --mem=4G&lt;br /&gt;
#SBATCH --output=memory_test.out&lt;br /&gt;
#SBATCH --error=memory_test.err&lt;br /&gt;
&lt;br /&gt;
# Monitor memory usage&lt;br /&gt;
echo &amp;quot;Memory usage before running the job:&amp;quot;&lt;br /&gt;
free -m&lt;br /&gt;
&lt;br /&gt;
# Your application commands go here&lt;br /&gt;
# ./your_application&lt;br /&gt;
&lt;br /&gt;
# Monitor memory usage after running the job&lt;br /&gt;
echo &amp;quot;Memory usage after running the job:&amp;quot;&lt;br /&gt;
free -m&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== General Tips ====&lt;br /&gt;
&lt;br /&gt;
* Start Small: Begin with a conservative memory request and increase it based on observed usage.&lt;br /&gt;
* Consider Peak Usage: Plan for peak memory usage to avoid OOM errors.&lt;br /&gt;
* Use SLURM&amp;#039;s Memory Reporting: Use &amp;lt;code&amp;gt;sacct&amp;lt;/code&amp;gt; to view memory usage statistics.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sacct -j &amp;lt;job_id&amp;gt; --format=JobID,JobName,MaxRSS,Elapsed&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;/div&gt;</summary>
		<author><name>Dvory</name></author>
	</entry>
	<entry>
		<id>https://hpcguide.tau.ac.il/index.php?title=Submitting_a_job_to_a_slurm_queue&amp;diff=1533</id>
		<title>Submitting a job to a slurm queue</title>
		<link rel="alternate" type="text/html" href="https://hpcguide.tau.ac.il/index.php?title=Submitting_a_job_to_a_slurm_queue&amp;diff=1533"/>
		<updated>2025-10-23T06:42:50Z</updated>

		<summary type="html">&lt;p&gt;Dvory: /* Accessing the System */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Accessing the System ==&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;We have a ChatGPT page for the new QOS configuration; please see [https://chatgpt.com/g/g-68be7f9acfb88191978615c1693e2cff-hpc-helper-toolkit HPC-helper-toolkit]&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
To submit jobs to SLURM at Tel Aviv University, you need to access the system through one of the following login nodes:&lt;br /&gt;
&lt;br /&gt;
* slurmlogin.tau.ac.il&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Requirements for Access ===&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Group Membership&amp;#039;&amp;#039;&amp;#039;: You must be part of the &amp;quot;power&amp;quot; group to access the resources.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;University Credentials&amp;#039;&amp;#039;&amp;#039;: Use your Tel Aviv University username and password to log in.&lt;br /&gt;
&lt;br /&gt;
These login nodes are your starting point for submitting jobs, checking job status, and managing your SLURM tasks.&lt;br /&gt;
&lt;br /&gt;
=== SSH Example ===&lt;br /&gt;
&lt;br /&gt;
To access the system using SSH, use the following example:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Replace &amp;#039;your_username&amp;#039; with your actual Tel Aviv University username&lt;br /&gt;
ssh your_username@slurmlogin.tau.ac.il&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Your connection will be automatically routed to one of the login nodes:&lt;br /&gt;
powerslurm-login, powerslurm-login2, or powerslurm-login3.&lt;br /&gt;
&lt;br /&gt;
If you have an SSH key set up for password-less login, you can specify it like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Replace &amp;#039;your_username&amp;#039; and &amp;#039;/path/to/your/private_key&amp;#039; accordingly&lt;br /&gt;
ssh -i /path/to/your/private_key your_username@slurmlogin.tau.ac.il&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
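If you log in frequently, an entry in your &amp;lt;code&amp;gt;~/.ssh/config&amp;lt;/code&amp;gt; can shorten the command (a sketch; the alias name and key path are placeholders):&lt;br /&gt;

```
# Hypothetical ~/.ssh/config entry
Host tauhpc
    HostName slurmlogin.tau.ac.il
    User your_username
    IdentityFile ~/.ssh/id_ed25519
```

With this in place, &amp;lt;code&amp;gt;ssh tauhpc&amp;lt;/code&amp;gt; is enough.&lt;br /&gt;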
&lt;br /&gt;
== Environment Modules ==&lt;br /&gt;
&lt;br /&gt;
Environment Modules in SLURM allow users to dynamically modify their shell environment, providing an easy way to load and unload different software applications, libraries, and their dependencies. This system helps avoid conflicts between software versions and ensures the correct environment for running specific applications.&lt;br /&gt;
&lt;br /&gt;
Here are some common commands to work with environment modules:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#List Available Modules: To see all the modules available on the system, use:&lt;br /&gt;
module avail&lt;br /&gt;
&lt;br /&gt;
#To search for a specific module by name (e.g., `gcc`), use:&lt;br /&gt;
module avail gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#Get Detailed Information About a Module: The `module spider` command provides detailed information about a module, including versions, dependencies, and descriptions:&lt;br /&gt;
module spider gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#View Module Settings: To see what environment variables and settings will be modified by a module, use:&lt;br /&gt;
module show gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#Load a Module: To set up the environment for a specific software, use the `module load` command. For example, to load GCC version 12.1.0:&lt;br /&gt;
module load gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#List Loaded Modules: To view all currently loaded modules in your session, use:&lt;br /&gt;
module list&lt;br /&gt;
&lt;br /&gt;
#Unload a Module: To unload a specific module from your environment, use:&lt;br /&gt;
module unload gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#Unload All Modules: If you need to clear your environment of all loaded modules, use:&lt;br /&gt;
module purge&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;By using these commands, you can easily manage the software environments needed for different tasks, ensuring compatibility and reducing potential conflicts between software versions.&lt;br /&gt;
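Under the hood, loading a module mostly prepends the package&amp;#039;s directories to environment variables such as &amp;lt;code&amp;gt;PATH&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;LD_LIBRARY_PATH&amp;lt;/code&amp;gt;. A simplified illustration of the effect (not the real module internals; the install prefix is hypothetical):&lt;br /&gt;

```shell
# Roughly what `module load gcc/gcc-12.1.0` does (simplified illustration;
# the real tool also records loaded modules so they can be unloaded cleanly).
GCC_HOME=/opt/gcc-12.1.0            # hypothetical install prefix
PATH="$GCC_HOME/bin:$PATH"
LD_LIBRARY_PATH="$GCC_HOME/lib64:${LD_LIBRARY_PATH:-}"
export PATH LD_LIBRARY_PATH

echo "$PATH" | cut -d: -f1          # first PATH entry is now the module's bin dir
```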
&lt;br /&gt;
== Basic Job Submission Commands ==&lt;br /&gt;
&lt;br /&gt;
=== Finding Your Account and Partition ===&lt;br /&gt;
&lt;br /&gt;
Before submitting a job, you need to know which partitions you have permission to use.&lt;br /&gt;
&lt;br /&gt;
Run the command &amp;lt;code&amp;gt;check_my_partitions&amp;lt;/code&amp;gt; to view a list of all the partitions you have permission to submit jobs to.&lt;br /&gt;
&lt;br /&gt;
== Submitting Jobs==&lt;br /&gt;
sbatch: Submits a job script for batch processing.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example&amp;#039;&amp;#039;&amp;#039;:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
sbatch --ntasks=1 --time=10 -p power-general-shared-pool -A public-users_v2 --qos=public pre_process.bash&lt;br /&gt;
# This command submits pre_process.bash to the power-general-shared-pool partition with a 10-minute time limit.&lt;br /&gt;
&lt;br /&gt;
# With 1 GPU:&lt;br /&gt;
sbatch --gres=gpu:1 -p gpu-general-pool -A public-users_v2 --qos=public gpu_job.sh&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Submitting Multiple Jobs ===&lt;br /&gt;
&lt;br /&gt;
If you need to submit many similar jobs (hundreds or more), you should use a &amp;#039;&amp;#039;&amp;#039;SLURM job array&amp;#039;&amp;#039;&amp;#039;. Submitting each job individually with separate &amp;lt;code&amp;gt;sbatch&amp;lt;/code&amp;gt; commands places a heavy load on the scheduler, slowing down job processing across the cluster. Job arrays bundle many related jobs into a single submission, which is more efficient and easier to manage.&lt;br /&gt;
&lt;br /&gt;
Each task in the array runs independently like a separate job, but the array is submitted as a single job ID for scheduling and tracking purposes.&lt;br /&gt;
You can customize the behavior of each task using the environment variable &amp;lt;code&amp;gt;SLURM_ARRAY_TASK_ID&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
==== Script Example: Job Array ====&lt;br /&gt;
&lt;br /&gt;
This script submits a job array with 100 tasks, each processing a different input file. The array reduces scheduler load and simplifies job tracking.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --job-name=array_job            # Job name&lt;br /&gt;
#SBATCH --account=public-users_v2       # Account name&lt;br /&gt;
#SBATCH --partition=power-general-shared-pool       # Partition name&lt;br /&gt;
#SBATCH --qos=public                    # qos type&lt;br /&gt;
#SBATCH --time=02:00:00                 # Max run time (hh:mm:ss)&lt;br /&gt;
#SBATCH --ntasks=1                      # Number of tasks per array job&lt;br /&gt;
#SBATCH --nodes=1                       # Number of nodes&lt;br /&gt;
#SBATCH --cpus-per-task=1               # CPUs per task&lt;br /&gt;
#SBATCH --mem-per-cpu=4G                # Memory per CPU&lt;br /&gt;
#SBATCH --array=1-100                   # Array range: 100 tasks&lt;br /&gt;
#SBATCH --output=array_job_%A_%a.out    # Output file: Job ID and array task ID&lt;br /&gt;
#SBATCH --error=array_job_%A_%a.err     # Error file: Job ID and array task ID&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Starting SLURM array task&amp;quot;&lt;br /&gt;
echo &amp;quot;Job ID: $SLURM_JOB_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Array Task ID: $SLURM_ARRAY_TASK_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Running on node(s): $SLURM_JOB_NODELIST&amp;quot;&lt;br /&gt;
echo &amp;quot;Allocated CPUs: $SLURM_JOB_CPUS_PER_NODE&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Your application commands go here&lt;br /&gt;
# You can use $SLURM_ARRAY_TASK_ID to customize behavior per task&lt;br /&gt;
# ./my_program input_${SLURM_ARRAY_TASK_ID}.txt&lt;br /&gt;
echo &amp;quot;Task completed&amp;quot;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
In this example:&lt;br /&gt;
* The job array consists of 100 tasks.&lt;br /&gt;
* Each task runs the same script but with a different input file.&lt;br /&gt;
* You access the task ID using the environment variable &amp;lt;code&amp;gt;SLURM_ARRAY_TASK_ID&amp;lt;/code&amp;gt;.&lt;br /&gt;
* The output and error logs are separated per task using &amp;lt;code&amp;gt;%A&amp;lt;/code&amp;gt; (job ID) and &amp;lt;code&amp;gt;%a&amp;lt;/code&amp;gt; (array task ID).&lt;br /&gt;
&lt;br /&gt;
==== Script Example: Job Array with different parameters per task ====&lt;br /&gt;
&lt;br /&gt;
This script submits a job array with 3 tasks. Each task runs the same program with a different input file: `data1.txt`, `data2.txt`, and `data3.txt`.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --job-name=array_job            # Job name&lt;br /&gt;
#SBATCH --account=public-users_v2       # Account name&lt;br /&gt;
#SBATCH --partition=power-general-shared-pool       # Partition name&lt;br /&gt;
#SBATCH --qos=public                    # qos type&lt;br /&gt;
#SBATCH --time=01:00:00                 # Max run time (hh:mm:ss)&lt;br /&gt;
#SBATCH --ntasks=1                      # Number of tasks per array job&lt;br /&gt;
#SBATCH --nodes=1                       # Number of nodes&lt;br /&gt;
#SBATCH --cpus-per-task=1               # CPUs per task&lt;br /&gt;
#SBATCH --mem-per-cpu=2G                # Memory per CPU&lt;br /&gt;
#SBATCH --array=1-3                     # Run 3 tasks with IDs 1, 2, 3&lt;br /&gt;
#SBATCH --output=array_%A_%a.out        # Output file: Job ID and task ID&lt;br /&gt;
#SBATCH --error=array_%A_%a.err         # Error file: Job ID and task ID&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Starting SLURM array task&amp;quot;&lt;br /&gt;
echo &amp;quot;Job ID: $SLURM_JOB_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Array Task ID: $SLURM_ARRAY_TASK_ID&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Each task runs the program with a different input file&lt;br /&gt;
./my_program data${SLURM_ARRAY_TASK_ID}.txt&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Task completed&amp;quot;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
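Another common array pattern is to read each task&amp;#039;s parameters from one line of a text file, using the task ID as the line number. A sketch; &amp;lt;code&amp;gt;params.txt&amp;lt;/code&amp;gt; is a hypothetical file with one argument set per line (the script creates a small stand-in so it also runs outside SLURM):&lt;br /&gt;

```shell
# Map an array task ID to one line of a parameter file.
# params.txt is a stand-in here; in practice it holds one argument set per line.
printf 'alpha 1\nbeta 2\ngamma 3\n' > params.txt

SLURM_ARRAY_TASK_ID=${SLURM_ARRAY_TASK_ID:-2}   # SLURM sets this inside a real array job
params=$(sed -n "${SLURM_ARRAY_TASK_ID}p" params.txt)
echo "task $SLURM_ARRAY_TASK_ID -> $params"
```

In a real script the last two lines would be followed by something like &amp;lt;code&amp;gt;./my_program $params&amp;lt;/code&amp;gt;.&lt;br /&gt;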
===Writing Single SLURM Job Scripts===&lt;br /&gt;
&lt;br /&gt;
Here is a simple job script example:&lt;br /&gt;
&lt;br /&gt;
==== Basic Script====&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --job-name=my_job             # Job name&lt;br /&gt;
#SBATCH --account=public-users_v2     # Account name&lt;br /&gt;
#SBATCH --partition=power-general-shared-pool       # Partition name&lt;br /&gt;
#SBATCH --qos=public                  # qos type&lt;br /&gt;
#SBATCH --time=02:00:00               # Max run time (hh:mm:ss)&lt;br /&gt;
#SBATCH --ntasks=1                    # Number of tasks&lt;br /&gt;
#SBATCH --nodes=1                     # Number of nodes&lt;br /&gt;
#SBATCH --cpus-per-task=1             # CPUs per task&lt;br /&gt;
#SBATCH --mem-per-cpu=4G              # Memory per CPU&lt;br /&gt;
#SBATCH --output=my_job_%j.out        # Output file&lt;br /&gt;
#SBATCH --error=my_job_%j.err         # Error file&lt;br /&gt;
#SBATCH --mail-user=&amp;lt;your email&amp;gt;      # Your mail address to receive an email&lt;br /&gt;
#SBATCH --mail-type=END,FAIL          # The mail will be sent upon ending the script successfully or not&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Starting my SLURM job&amp;quot;&lt;br /&gt;
echo &amp;quot;Job ID: $SLURM_JOB_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Running on nodes: $SLURM_JOB_NODELIST&amp;quot;&lt;br /&gt;
echo &amp;quot;Allocated CPUs: $SLURM_JOB_CPUS_PER_NODE&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Your application commands go here&lt;br /&gt;
# ./my_program&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Job completed&amp;quot;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To ask for x cores interactively:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
srun --ntasks=1 --cpus-per-task=x  --partition=power-general-public-pool --account=public-users_v2 --qos=public --nodes=1 --pty bash&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
For now, you also need to set the following SLURM variables inside the script or within the interactive session, matching the number of cores you requested:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
export SLURM_TASKS_PER_NODE=48&lt;br /&gt;
export SLURM_CPUS_ON_NODE=48&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Script for 1 GPU ====&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --job-name=gpu_job             # Job name&lt;br /&gt;
#SBATCH --account=my_account           # Account name&lt;br /&gt;
#SBATCH --partition=gpu-general-pool   # Partition name&lt;br /&gt;
#SBATCH --qos=my_qos                   # qos type&lt;br /&gt;
#SBATCH --time=02:00:00                # Max run time&lt;br /&gt;
#SBATCH --ntasks=1                     # Number of tasks&lt;br /&gt;
#SBATCH --nodes=1                      # Number of nodes&lt;br /&gt;
#SBATCH --cpus-per-task=1              # CPUs per task&lt;br /&gt;
#SBATCH --gres=gpu:1                   # Number of GPUs&lt;br /&gt;
#SBATCH --mem-per-cpu=4G               # Memory per CPU&lt;br /&gt;
#SBATCH --output=my_job_%j.out         # Output file&lt;br /&gt;
#SBATCH --error=my_job_%j.err          # Error file&lt;br /&gt;
&lt;br /&gt;
module load python/python-3.8&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Starting GPU job&amp;quot;&lt;br /&gt;
echo &amp;quot;Job ID: $SLURM_JOB_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Running on nodes: $SLURM_JOB_NODELIST&amp;quot;&lt;br /&gt;
echo &amp;quot;Allocated CPUs: $SLURM_JOB_CPUS_PER_NODE&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Your GPU commands go here&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Job completed&amp;quot;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
To exclude specific nodes, add the following:&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#SBATCH --exclude=compute-0-[100-103],compute-0-67&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Importance of Correct RAM Usage in Jobs===&lt;br /&gt;
&lt;br /&gt;
When writing SLURM job scripts, it&amp;#039;s crucial to understand and correctly specify the memory requirements for your job. &lt;br /&gt;
&lt;br /&gt;
Proper memory allocation ensures efficient resource usage and prevents job failures due to out-of-memory (OOM) errors.&lt;br /&gt;
&lt;br /&gt;
==== Why Correct RAM Usage Matters ====&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Resource Efficiency&amp;#039;&amp;#039;&amp;#039;: Allocating the right amount of memory helps in optimal resource utilization, allowing more jobs to run simultaneously on the cluster.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Job Stability&amp;#039;&amp;#039;&amp;#039;: Underestimating memory requirements can lead to OOM errors, causing your job to fail and waste computational resources.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Performance&amp;#039;&amp;#039;&amp;#039;: Overestimating memory needs can lead to underutilization of resources, potentially delaying other jobs in the queue.&lt;br /&gt;
&lt;br /&gt;
==== How to Specify Memory in SLURM ====&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;--mem&amp;#039;&amp;#039;&amp;#039;: Specifies the total memory required for the job.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;--mem-per-cpu&amp;#039;&amp;#039;&amp;#039;: Specifies the memory required per CPU.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example&amp;#039;&amp;#039;&amp;#039; (use one option or the other, not both):&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#SBATCH --mem=4G              # Total memory for the job&lt;br /&gt;
#SBATCH --mem-per-cpu=2G      # Memory per CPU&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Interactive Jobs===&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#Start an interactive session:&lt;br /&gt;
srun --ntasks=1 -p power-general-shared-pool -A public-users_v2 --qos=public --pty bash&lt;br /&gt;
&lt;br /&gt;
#Specify a compute node:&lt;br /&gt;
srun --ntasks=1 -p power-general-shared-pool -A public-users_v2 --qos=public --nodelist=&amp;quot;compute-0-12&amp;quot; --pty bash&lt;br /&gt;
&lt;br /&gt;
#Using GUI:&lt;br /&gt;
srun --ntasks=1 -p power-general-shared-pool -A public-users_v2 --qos=public --x11 /bin/bash&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Submitting RELION Jobs===&lt;br /&gt;
&lt;br /&gt;
To submit a RELION job interactively on the &amp;lt;code&amp;gt;gpu-relion&amp;lt;/code&amp;gt; queue with X11 forwarding, use the following steps:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#Start an interactive session with X11:&lt;br /&gt;
srun --ntasks=1 -p gpu-relion-pool -A gpu-relion-users_v2 --qos=owner --x11 --pty bash&lt;br /&gt;
#Load the RELION module:&lt;br /&gt;
module load relion/relion-4.0.1&lt;br /&gt;
#Launch RELION:&lt;br /&gt;
relion&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Running a MATLAB example==&lt;br /&gt;
This example uses three files:&lt;br /&gt;
&lt;br /&gt;
myTable.m ⇒ This MATLAB script prints a table of computed values&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
fprintf(&amp;#039;=======================================\n&amp;#039;);&lt;br /&gt;
fprintf(&amp;#039; a             b             c              d             \n&amp;#039;);&lt;br /&gt;
fprintf(&amp;#039;=======================================\n&amp;#039;);&lt;br /&gt;
for j = 1:10&lt;br /&gt;
    a = sin(10*j);&lt;br /&gt;
    b = a*cos(10*j);&lt;br /&gt;
    c = a + b;&lt;br /&gt;
    d = a - b;&lt;br /&gt;
    fprintf(&amp;#039;%+6.5f   %+6.5f   %+6.5f   %+6.5f   \n&amp;#039;,a,b,c,d);&lt;br /&gt;
end&lt;br /&gt;
fprintf(&amp;#039;=======================================\n&amp;#039;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
my_table_script.sh ⇒ This script executes the MATLAB program; submit it with &amp;lt;code&amp;gt;sbatch&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
&lt;br /&gt;
#SBATCH --mem=50M&lt;br /&gt;
#SBATCH --partition power-general-shared-pool&lt;br /&gt;
#SBATCH -A public-users_v2&lt;br /&gt;
hostname&lt;br /&gt;
&lt;br /&gt;
cd /a/home/cc/tree/taucc/staff/dvory/matlab&lt;br /&gt;
&lt;br /&gt;
matlab -nodisplay -nosplash -nodesktop -r &amp;quot;myTable; exit;&amp;quot;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
run_in_loop.sh ⇒ This script submits many copies of the job in a loop&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
&lt;br /&gt;
for i in {1..100}&lt;br /&gt;
&lt;br /&gt;
do&lt;br /&gt;
&lt;br /&gt;
        sbatch my_table_script.sh&lt;br /&gt;
&lt;br /&gt;
done&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Run the jobs with the following command (after making the script executable with &amp;lt;code&amp;gt;chmod +x run_in_loop.sh&amp;lt;/code&amp;gt;):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
./run_in_loop.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==AlphaFold==&lt;br /&gt;
&lt;br /&gt;
AlphaFold is a deep learning tool designed for predicting protein structures.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Guides:&amp;#039;&amp;#039;&amp;#039;  &lt;br /&gt;
&lt;br /&gt;
[https://hpcguide.tau.ac.il/index.php?title=Alphafold AlphaFold Guide]&lt;br /&gt;
&lt;br /&gt;
[https://hpcguide.tau.ac.il/index.php?title=Alphafold3 AlphaFold3 Guide]&lt;br /&gt;
&lt;br /&gt;
==Common SLURM Commands==&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#View all queues (partitions):&lt;br /&gt;
sinfo&lt;br /&gt;
#View all jobs:&lt;br /&gt;
squeue&lt;br /&gt;
#View details of a specific job:&lt;br /&gt;
scontrol show job &amp;lt;job_number&amp;gt;&lt;br /&gt;
#Get information about partitions:&lt;br /&gt;
scontrol show partition&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Troubleshooting &amp;amp; Tips ==&lt;br /&gt;
&lt;br /&gt;
=== Common Errors ===&lt;br /&gt;
&lt;br /&gt;
# &amp;lt;code&amp;gt;srun: error: Unable to allocate resources: No partition specified or system default partition&amp;lt;/code&amp;gt;  &amp;lt;br /&amp;gt;&amp;#039;&amp;#039;&amp;#039;Solution:&amp;#039;&amp;#039;&amp;#039; Always specify a partition. Example:  &amp;lt;code&amp;gt;srun --pty -c 1 --mem=2G -p power-general /bin/bash&amp;lt;/code&amp;gt;&lt;br /&gt;
# A job failed, and running &amp;lt;code&amp;gt;scontrol show job &amp;lt;job_id&amp;gt;&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;sacct -j &amp;lt;job_id&amp;gt; -o JobID,JobName,State%20&amp;lt;/code&amp;gt; shows &amp;lt;br /&amp;gt;&amp;lt;code&amp;gt;JobState=OUT_OF_MEMORY Reason=OutOfMemory&amp;lt;/code&amp;gt; or:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
JobID           JobName                State &lt;br /&gt;
------------ ---------- -------------------- &lt;br /&gt;
71             oom_test        OUT_OF_MEMORY &lt;br /&gt;
71.batch          batch        OUT_OF_MEMORY &lt;br /&gt;
71.extern        extern            COMPLETED &lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;This means the RAM requested for the job was not enough; resubmit the job with a larger memory request. See [https://wikihpc.tau.ac.il/index.php?title=Slurm_user_guide#Estimating_RAM_Usage Estimating RAM Usage] below for help with understanding how much RAM your job may need.&lt;br /&gt;
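A common fix is to resubmit with roughly double the previous memory request, then adjust once actual usage is known. A minimal sketch (the partition, account, and script names are placeholders taken from other examples on this page; the 8G figure is an assumption):&lt;br /&gt;

```shell
# The previous run failed with OUT_OF_MEMORY at --mem=4G; double the
# request and retry (power-general / power-general-users and do_work.bash
# are placeholder names reused from other examples on this page).
sbatch --ntasks=1 --time=60 --mem=8G -p power-general -A power-general-users do_work.bash
```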
&lt;br /&gt;
=== Chain Jobs ===&lt;br /&gt;
Use the &amp;lt;code&amp;gt;--dependency&amp;lt;/code&amp;gt; flag to make a job start only after another job reaches a given state.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sbatch --ntasks=1 --time=60 -p power-general -A power-general-users --dependency=afterok:45001 do_work.bash&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
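Rather than hard-coding the job ID, you can capture it from the first submission and feed it to the dependent one. A sketch reusing the partition and account from the example above (the script names are placeholders):&lt;br /&gt;

```shell
# --parsable makes sbatch print only the job ID, so it can be captured
jid=$(sbatch --parsable --ntasks=1 -p power-general -A power-general-users pre_process.bash)

# afterok: start do_work.bash only if pre_process.bash exits successfully
sbatch --dependency=afterok:$jid -p power-general -A power-general-users do_work.bash
```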
&lt;br /&gt;
=== Always Specify Resources ===&lt;br /&gt;
When submitting jobs, ensure you include all required resources like partition, memory, and CPUs to avoid job failures.&lt;br /&gt;
&lt;br /&gt;
=== Attaching to Running Jobs ===&lt;br /&gt;
If you need to monitor or interact with a running job, use &amp;lt;code&amp;gt;sattach&amp;lt;/code&amp;gt;. This command attaches to a job&amp;#039;s input, output, and error streams in real time.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sattach &amp;lt;job_id&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To view job steps of a specific job, use the following command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
scontrol show job &amp;lt;job_id&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Look for sections labeled &amp;quot;StepId&amp;quot; within the output. &lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;For specific job steps, use:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sattach &amp;lt;job_id.step_id&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Note:&amp;#039;&amp;#039;&amp;#039; &amp;lt;code&amp;gt;sattach&amp;lt;/code&amp;gt; is particularly useful for interactive jobs, where you can provide input directly. For non-interactive jobs, it acts like &amp;lt;code&amp;gt;tail -f&amp;lt;/code&amp;gt;, allowing you to monitor the output stream.&lt;br /&gt;
&lt;br /&gt;
=== Estimating RAM Usage ===&lt;br /&gt;
&lt;br /&gt;
When writing SLURM job scripts, it&amp;#039;s crucial to understand and correctly specify the memory requirements for your job. Proper memory allocation ensures efficient resource usage and prevents job failures due to out-of-memory (OOM) errors.&lt;br /&gt;
&lt;br /&gt;
==== Tips for Estimating RAM Usage ====&lt;br /&gt;
&lt;br /&gt;
* Check Application Documentation: Refer to the official documentation or user guides for memory-related information.&lt;br /&gt;
* Run a Small Test Job: Submit a smaller version of your job and monitor its memory usage using commands like `free -m`, `top`, or `htop`.&lt;br /&gt;
* Use Profiling Tools: Tools like `valgrind`, `gprof`, or built-in profilers can help you understand memory usage.&lt;br /&gt;
* Analyze Previous Jobs: Review SLURM logs and job statistics for insights into memory consumption of past jobs.&lt;br /&gt;
* Consult with Peers or Experts: Ask colleagues or experts who have experience with similar workloads.&lt;br /&gt;
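As a sketch of the &amp;quot;analyze previous jobs&amp;quot; tip, the MaxRSS column that sacct reports can be converted to gigabytes with a short awk filter. Since sacct itself needs a live cluster, the example below runs on a captured sample of its output (the file name and job values are made up):&lt;br /&gt;

```shell
# Sample sacct output captured earlier (MaxRSS is in kilobytes here;
# the job ID and values are illustrative, not real cluster output)
printf 'JobID JobName MaxRSS Elapsed\n71.batch batch 3521204K 00:12:31\n' > sacct_sample.txt

# Convert MaxRSS to gigabytes to choose the next --mem request;
# awk's numeric coercion ignores the trailing K suffix
awk 'NR > 1 { printf "%s used %.1f GB\n", $1, $3 / (1024 * 1024) }' sacct_sample.txt
# prints: 71.batch used 3.4 GB
```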
&lt;br /&gt;
==== Example: Monitoring Memory Usage ====&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
&lt;br /&gt;
#SBATCH --job-name=memory_test&lt;br /&gt;
#SBATCH --account=your_account&lt;br /&gt;
#SBATCH --partition=your_partition&lt;br /&gt;
#SBATCH --qos=your_qos&lt;br /&gt;
#SBATCH --time=01:00:00&lt;br /&gt;
#SBATCH --ntasks=1&lt;br /&gt;
#SBATCH --cpus-per-task=1&lt;br /&gt;
#SBATCH --mem=4G&lt;br /&gt;
#SBATCH --output=memory_test.out&lt;br /&gt;
#SBATCH --error=memory_test.err&lt;br /&gt;
&lt;br /&gt;
# Monitor memory usage&lt;br /&gt;
echo &amp;quot;Memory usage before running the job:&amp;quot;&lt;br /&gt;
free -m&lt;br /&gt;
&lt;br /&gt;
# Your application commands go here&lt;br /&gt;
# ./your_application&lt;br /&gt;
&lt;br /&gt;
# Monitor memory usage after running the job&lt;br /&gt;
echo &amp;quot;Memory usage after running the job:&amp;quot;&lt;br /&gt;
free -m&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== General Tips ====&lt;br /&gt;
&lt;br /&gt;
* Start Small: Begin with a conservative memory request and increase it based on observed usage.&lt;br /&gt;
* Consider Peak Usage: Plan for peak memory usage to avoid OOM errors.&lt;br /&gt;
* Use SLURM&amp;#039;s Memory Reporting: Use `sacct` to view memory usage statistics.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sacct -j &amp;lt;job_id&amp;gt; --format=JobID,JobName,MaxRSS,Elapsed&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;/div&gt;</summary>
		<author><name>Dvory</name></author>
	</entry>
	<entry>
		<id>https://hpcguide.tau.ac.il/index.php?title=New_slurm_qos_usage&amp;diff=1532</id>
		<title>New slurm qos usage</title>
		<link rel="alternate" type="text/html" href="https://hpcguide.tau.ac.il/index.php?title=New_slurm_qos_usage&amp;diff=1532"/>
		<updated>2025-10-23T06:41:28Z</updated>

		<summary type="html">&lt;p&gt;Dvory: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;We have a ChatGPT page that explains all of this: see the [https://chatgpt.com/g/g-68be7f9acfb88191978615c1693e2cff-hpc-helper-toolkit HPC-helper-toolkit]&lt;br /&gt;
&lt;br /&gt;
Each partition (or “pool”) now has several QoS tiers that determine job priority and preemption behavior.&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|+ QOS types for each pool&lt;br /&gt;
|-&lt;br /&gt;
! QOS !! Purpose !! Preempts !! Can be preempted by&lt;br /&gt;
|-&lt;br /&gt;
| Share-type QoS (e.g. 0.125_48c_8g, 0.75_48c_8g) || For multi-owner pools; defines each owner’s guaranteed slice (CPU/GPU portion). || owner,public || --&lt;br /&gt;
|-&lt;br /&gt;
| owner|| Used on your lab’s pool to run above your guaranteed slice (higher than public). || public || share-type QoS&lt;br /&gt;
|-&lt;br /&gt;
| public (partition: power-general-shared-pool) || Used on cluster-wide shared pools for friendly or opportunistic runs || -- || owner, share-type QoS&lt;br /&gt;
|-&lt;br /&gt;
| public (partition: power-general-public-pool) || Used on a small cluster-wide shared group of nodes; jobs there cannot be preempted || -- || --&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
Preemption rule summary:&lt;br /&gt;
share-type QoS &amp;gt; owner &amp;gt; public&lt;br /&gt;
&lt;br /&gt;
This means:&lt;br /&gt;
&lt;br /&gt;
* A share-type QoS job can preempt owner or public jobs on the same pool.&lt;br /&gt;
* An owner job can preempt public jobs.&lt;br /&gt;
* Public jobs cannot preempt any other jobs.&lt;br /&gt;
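To see which accounts, partitions, and QoS values your user is actually allowed to combine, you can query the SLURM association table (a sketch; the exact columns shown depend on site configuration):&lt;br /&gt;

```shell
# List your associations: the QoS values allowed per account and partition
sacctmgr show associations user=$USER format=Account,Partition,QOS
```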
&lt;br /&gt;
&lt;br /&gt;
==How to Submit Jobs with the Correct QoS==&lt;br /&gt;
Below are examples of how to use the new QoS tiers with your account:&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Owner QoS (on your lab’s pool)&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
sbatch -A UIDHERE-users_v2 -p UIDHERE-pool --qos=owner --time=02:00:00 run.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Share-type QoS (on a multi-owner pool, for your guaranteed slice)&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
sbatch -A UIDHERE-users_v2 -p gpu-dudu-tzach-yoav-pool --qos=0.125_48c_8g --gres=gpu:A100:1 run.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Public QoS (friendly, cluster-wide)&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
sbatch -A UIDHERE-users_v2 -p power-general-shared-pool --qos=public --time=01:00:00 run.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;For the small, protected CPU pool&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
sbatch -A UIDHERE-users_v2 -p power-general-public-pool --qos=public --time=01:00:00 run.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
===Handy Checks During Usage===&lt;br /&gt;
You can monitor your jobs and see their QoS and reasons:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
squeue --me -O &amp;quot;JOBID,ACCOUNT,PARTITION,QOS,STATE,REASON&amp;quot;&lt;br /&gt;
sprio -w&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
If your job was preempted, check:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
sacct -j &amp;lt;jobid&amp;gt; --format=JobID,State,Reason&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>Dvory</name></author>
	</entry>
	<entry>
		<id>https://hpcguide.tau.ac.il/index.php?title=Main_Page&amp;diff=1528</id>
		<title>Main Page</title>
		<link rel="alternate" type="text/html" href="https://hpcguide.tau.ac.il/index.php?title=Main_Page&amp;diff=1528"/>
		<updated>2025-10-23T05:48:36Z</updated>

		<summary type="html">&lt;p&gt;Dvory: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;#039;&amp;#039;&amp;#039;Welcome to HPC Guide.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
[[Linux basic commands]]&lt;br /&gt;
&lt;br /&gt;
[[Public queues]]&lt;br /&gt;
&lt;br /&gt;
[[ New slurm qos usage]]&lt;br /&gt;
&lt;br /&gt;
[[Submitting a job to a queue]]&lt;br /&gt;
&lt;br /&gt;
[[Submitting a job to a slurm queue]]&lt;br /&gt;
&lt;br /&gt;
[[PBS-To-SLURM]]&lt;br /&gt;
&lt;br /&gt;
[[Creating and using conda environment]]&lt;br /&gt;
&lt;br /&gt;
[[Palo Alto VPN for linux]]&lt;br /&gt;
&lt;br /&gt;
[[Alphafold]]&lt;br /&gt;
&lt;br /&gt;
[[Alphafold3]]&lt;br /&gt;
&lt;br /&gt;
[[Using GPU]]&lt;br /&gt;
&lt;br /&gt;
[[security installations]]&lt;br /&gt;
&lt;br /&gt;
[[Install matlab on work station per matlab user]]&lt;br /&gt;
&lt;br /&gt;
[[Submitting vscode job on slurm]]&lt;br /&gt;
&lt;br /&gt;
[[Storage and scratch]]&lt;br /&gt;
&lt;br /&gt;
This HPC Tutorial is designed for researchers at TAU who need computational power (computer resources) and wish to explore and use our High Performance Computing (HPC) core facilities. &lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
The audience may be completely new to HPC concepts but should have a basic understanding of computers and computer programming.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;What is HPC?&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
“High Performance Computing” (HPC) is computing on a “Supercomputer”, &lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
a computer at the front line of contemporary processing capacity – particularly speed of calculation and available memory.&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
A computer cluster consists of a set of loosely or tightly connected computers that work together so that in many respects they can be viewed as a single system.&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
The components of a cluster are usually connected to each other through fast local area networks (“LAN”), with each node (a computer used as a server) running its own instance of an operating system. &lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
Computer clusters emerged as a result of convergence of a number of computing trends including the availability of low cost microprocessors, &lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
high-speed networks, and software for high performance distributed computing.&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
Compute clusters are usually deployed to improve performance and availability over that of a single computer, while typically being more cost-effective than single computers of comparable speed or availability.&lt;/div&gt;</summary>
		<author><name>Dvory</name></author>
	</entry>
	<entry>
		<id>https://hpcguide.tau.ac.il/index.php?title=Creating_and_using_conda_environment&amp;diff=1526</id>
		<title>Creating and using conda environment</title>
		<link rel="alternate" type="text/html" href="https://hpcguide.tau.ac.il/index.php?title=Creating_and_using_conda_environment&amp;diff=1526"/>
		<updated>2025-09-11T08:01:55Z</updated>

		<summary type="html">&lt;p&gt;Dvory: Created page with &amp;quot;In order to use conda or mamba, please load module miniconda/miniconda3-4.7.12-environmentally, using command: &amp;lt;pre&amp;gt; module load miniconda/miniconda3-4.7.12-environmentally &amp;lt;/...&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;In order to use conda or mamba, please load module miniconda/miniconda3-4.7.12-environmentally, using command:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load miniconda/miniconda3-4.7.12-environmentally&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Or&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load mamba/mamba-1.5.8&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To see all available environments, you may type the command below; there may already be an environment for what you need:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda env list&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
And to load an environment, please perform:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda activate &amp;lt;module path&amp;gt;/envs/&amp;lt;MODULENAME&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
E.g.:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda activate /powerapps/share/centos7/miniconda/miniconda3-4.7.12-environmentally/envs/jupyter&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
 &lt;br /&gt;
If you would like your own environment within this conda installation, where you can install packages yourself, run the following right after loading the environmentally module:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda create --prefix &amp;lt;path in your space&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
E.g.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda create --prefix /a/home/cc/staff/dvory/envs/new_env&lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
Then activate it with &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda activate /a/home/cc/staff/dvory/envs/new_env&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To install packages in your conda environment:&lt;br /&gt;
&lt;br /&gt;
First, make sure that your package cache is defined in a writable path, e.g. your home directory, by defining the following:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
export CONDA_PKGS_DIRS=$HOME/.conda/pkgs&lt;br /&gt;
export CONDA_ENVS_DIRS=$HOME/.conda/envs&lt;br /&gt;
export MAMBA_ROOT_PREFIX=$HOME/.mamba&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
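The cache directories named above must also exist and be writable; a small sketch that creates them and sets the variables in one go (append the export lines to your ~/.bashrc to make them permanent):&lt;br /&gt;

```shell
# Create the cache directories, then point conda and mamba at them
# for the current session
mkdir -p "$HOME/.conda/pkgs" "$HOME/.conda/envs" "$HOME/.mamba"
export CONDA_PKGS_DIRS="$HOME/.conda/pkgs"
export CONDA_ENVS_DIRS="$HOME/.conda/envs"
export MAMBA_ROOT_PREFIX="$HOME/.mamba"
```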
&lt;br /&gt;
Then you may install whatever you need&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda install &amp;lt;module&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To deactivate an environment:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda deactivate&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To unload a module:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module unload &amp;lt;module name&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>Dvory</name></author>
	</entry>
	<entry>
		<id>https://hpcguide.tau.ac.il/index.php?title=Submitting_a_job_to_a_slurm_queue&amp;diff=1522</id>
		<title>Submitting a job to a slurm queue</title>
		<link rel="alternate" type="text/html" href="https://hpcguide.tau.ac.il/index.php?title=Submitting_a_job_to_a_slurm_queue&amp;diff=1522"/>
		<updated>2025-05-26T14:16:31Z</updated>

		<summary type="html">&lt;p&gt;Dvory: /* Basic Script */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Accessing the System ==&lt;br /&gt;
&lt;br /&gt;
To submit jobs to SLURM at Tel Aviv University, you need to access the system through one of the following login nodes:&lt;br /&gt;
&lt;br /&gt;
* slurmlogin.tau.ac.il&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Requirements for Access ===&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Group Membership&amp;#039;&amp;#039;&amp;#039;: You must be part of the &amp;quot;power&amp;quot; group to access the resources.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;University Credentials&amp;#039;&amp;#039;&amp;#039;: Use your Tel Aviv University username and password to log in.&lt;br /&gt;
&lt;br /&gt;
These login nodes are your starting point for submitting jobs, checking job status, and managing your SLURM tasks.&lt;br /&gt;
&lt;br /&gt;
=== SSH Example ===&lt;br /&gt;
&lt;br /&gt;
To access the system using SSH, use the following example:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Replace &amp;#039;your_username&amp;#039; with your actual Tel Aviv University username&lt;br /&gt;
ssh your_username@slurmlogin.tau.ac.il&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Your connection will be automatically routed to one of the login nodes:&lt;br /&gt;
powerslurm-login, powerslurm-login2, or powerslurm-login3.&lt;br /&gt;
&lt;br /&gt;
If you have an SSH key set up for password-less login, you can specify it like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Replace &amp;#039;your_username&amp;#039; and &amp;#039;/path/to/your/private_key&amp;#039; accordingly&lt;br /&gt;
ssh -i /path/to/your/private_key your_username@slurmlogin.tau.ac.il&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
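To avoid retyping these options, the host can be given an alias in your ~/.ssh/config (a sketch; the alias name tau-hpc is an arbitrary choice, and the key path is a placeholder):&lt;br /&gt;

```
Host tau-hpc
    HostName slurmlogin.tau.ac.il
    User your_username
    IdentityFile /path/to/your/private_key
```

After this, connecting is simply a matter of running ssh tau-hpc.&lt;br /&gt;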
&lt;br /&gt;
&lt;br /&gt;
== Environment Modules ==&lt;br /&gt;
&lt;br /&gt;
Environment Modules in SLURM allow users to dynamically modify their shell environment, providing an easy way to load and unload different software applications, libraries, and their dependencies. This system helps avoid conflicts between software versions and ensures the correct environment for running specific applications.&lt;br /&gt;
&lt;br /&gt;
Here are some common commands to work with environment modules:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#List Available Modules: To see all the modules available on the system, use:&lt;br /&gt;
module avail&lt;br /&gt;
&lt;br /&gt;
#To search for a specific module by name (e.g., `gcc`), use:&lt;br /&gt;
module avail gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#Get Detailed Information About a Module: The `module spider` command provides detailed information about a module, including versions, dependencies, and descriptions:&lt;br /&gt;
module spider gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#View Module Settings: To see what environment variables and settings will be modified by a module, use:&lt;br /&gt;
module show gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#Load a Module: To set up the environment for a specific software, use the `module load` command. For example, to load GCC version 12.1.0:&lt;br /&gt;
module load gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#List Loaded Modules: To view all currently loaded modules in your session, use:&lt;br /&gt;
module list&lt;br /&gt;
&lt;br /&gt;
#Unload a Module: To unload a specific module from your environment, use:&lt;br /&gt;
module unload gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#Unload All Modules: If you need to clear your environment of all loaded modules, use:&lt;br /&gt;
module purge&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;By using these commands, you can easily manage the software environments needed for different tasks, ensuring compatibility and reducing potential conflicts between software versions.&lt;br /&gt;
&lt;br /&gt;
== Basic Job Submission Commands ==&lt;br /&gt;
&lt;br /&gt;
=== Finding Your Account and Partition ===&lt;br /&gt;
&lt;br /&gt;
Before submitting a job, you need to know which partitions you have permission to use.&lt;br /&gt;
&lt;br /&gt;
Run the command &amp;lt;code&amp;gt;check_my_partitions&amp;lt;/code&amp;gt; to view a list of all the partitions you have permission to send jobs to.&lt;br /&gt;
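If &amp;lt;code&amp;gt;check_my_partitions&amp;lt;/code&amp;gt; is not available in your session, a standard SLURM alternative is to query the accounting database directly (a sketch; the exact fields shown depend on how the cluster's associations are configured):&lt;br /&gt;

```shell
# List the accounts and partitions your user is associated with.
# Field widths (%25, %20) are adjustable.
sacctmgr show associations user=$USER format=Account%25,Partition%20
```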
&lt;br /&gt;
== Submitting Jobs==&lt;br /&gt;
sbatch: Submits a job script for batch processing.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example&amp;#039;&amp;#039;&amp;#039;:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Submit pre_process.bash to the power-general partition for 10 minutes:&lt;br /&gt;
sbatch --ntasks=1 --time=10 -p power-general -A power-general-users pre_process.bash&lt;br /&gt;
&lt;br /&gt;
# With 1 GPU:&lt;br /&gt;
sbatch --gres=gpu:1 -p gpu-general -A gpu-general-users gpu_job.sh&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Submitting Multiple Jobs ===&lt;br /&gt;
&lt;br /&gt;
If you need to submit many similar jobs (hundreds or more), use a &amp;#039;&amp;#039;&amp;#039;Slurm job array&amp;#039;&amp;#039;&amp;#039;. Submitting each job individually with separate &amp;lt;code&amp;gt;sbatch&amp;lt;/code&amp;gt; commands places a heavy load on the scheduler, slowing down job processing across the cluster. Job arrays bundle many related jobs into a single submission, which is more efficient and easier to manage.&lt;br /&gt;
&lt;br /&gt;
Each task in the array runs independently like a separate job, but the array is submitted as a single job ID for scheduling and tracking purposes.&lt;br /&gt;
You can customize the behavior of each task using the environment variable &amp;lt;code&amp;gt;SLURM_ARRAY_TASK_ID&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
==== Script Example: Job Array ====&lt;br /&gt;
&lt;br /&gt;
This script submits a job array with 100 tasks, each processing a different input file. The array reduces scheduler load and simplifies job tracking.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --job-name=array_job            # Job name&lt;br /&gt;
#SBATCH --account=power-general-users   # Account name&lt;br /&gt;
#SBATCH --partition=power-general       # Partition name&lt;br /&gt;
#SBATCH --time=02:00:00                 # Max run time (hh:mm:ss)&lt;br /&gt;
#SBATCH --ntasks=1                      # Number of tasks per array job&lt;br /&gt;
#SBATCH --nodes=1                       # Number of nodes&lt;br /&gt;
#SBATCH --cpus-per-task=1               # CPUs per task&lt;br /&gt;
#SBATCH --mem-per-cpu=4G                # Memory per CPU&lt;br /&gt;
#SBATCH --array=1-100                   # Array range: 100 tasks&lt;br /&gt;
#SBATCH --output=array_job_%A_%a.out    # Output file: Job ID and array task ID&lt;br /&gt;
#SBATCH --error=array_job_%A_%a.err     # Error file: Job ID and array task ID&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Starting SLURM array task&amp;quot;&lt;br /&gt;
echo &amp;quot;Job ID: $SLURM_JOB_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Array Task ID: $SLURM_ARRAY_TASK_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Running on node(s): $SLURM_JOB_NODELIST&amp;quot;&lt;br /&gt;
echo &amp;quot;Allocated CPUs: $SLURM_JOB_CPUS_PER_NODE&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Your application commands go here&lt;br /&gt;
# You can use $SLURM_ARRAY_TASK_ID to customize behavior per task&lt;br /&gt;
# ./my_program input_${SLURM_ARRAY_TASK_ID}.txt&lt;br /&gt;
echo &amp;quot;Task completed&amp;quot;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
In this example:&lt;br /&gt;
* The job array consists of 100 tasks.&lt;br /&gt;
* Each task runs the same script but with a different input file.&lt;br /&gt;
* You access the task ID using the environment variable &amp;lt;code&amp;gt;SLURM_ARRAY_TASK_ID&amp;lt;/code&amp;gt;.&lt;br /&gt;
* The output and error logs are separated per task using &amp;lt;code&amp;gt;%A&amp;lt;/code&amp;gt; (job ID) and &amp;lt;code&amp;gt;%a&amp;lt;/code&amp;gt; (array task ID).&lt;br /&gt;
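A common pattern for mapping each array task to its own input is to keep one input path per line in a list file and select the line matching the task ID. The snippet below is a sketch; &amp;lt;code&amp;gt;inputs.txt&amp;lt;/code&amp;gt; and the sample file names are hypothetical.&lt;br /&gt;

```shell
# Build a sample input list (one path per line); in practice you would
# create this once with the real input files.
printf 'sample_a.txt\nsample_b.txt\nsample_c.txt\n' > inputs.txt

# Inside a job array, SLURM_ARRAY_TASK_ID is set by Slurm;
# default to 1 here so the sketch also runs outside a job.
TASK_ID=${SLURM_ARRAY_TASK_ID:-1}

# Pick the line of inputs.txt corresponding to this task.
INPUT=$(sed -n "${TASK_ID}p" inputs.txt)
echo "Task ${TASK_ID} will process: ${INPUT}"
```

With `--array=1-3`, task 1 gets the first line, task 2 the second, and so on, so the submission script never has to be edited per input.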
&lt;br /&gt;
==== Script Example: Job Array with different parameters per task ====&lt;br /&gt;
&lt;br /&gt;
This script submits a job array with 3 tasks. Each task runs the same program with a different input file: `data1.txt`, `data2.txt`, and `data3.txt`.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --job-name=array_job            # Job name&lt;br /&gt;
#SBATCH --account=power-general-users   # Account name&lt;br /&gt;
#SBATCH --partition=power-general       # Partition name&lt;br /&gt;
#SBATCH --time=01:00:00                 # Max run time (hh:mm:ss)&lt;br /&gt;
#SBATCH --ntasks=1                      # Number of tasks per array job&lt;br /&gt;
#SBATCH --nodes=1                       # Number of nodes&lt;br /&gt;
#SBATCH --cpus-per-task=1               # CPUs per task&lt;br /&gt;
#SBATCH --mem-per-cpu=2G                # Memory per CPU&lt;br /&gt;
#SBATCH --array=1-3                     # Run 3 tasks with IDs 1, 2, 3&lt;br /&gt;
#SBATCH --output=array_%A_%a.out        # Output file: Job ID and task ID&lt;br /&gt;
#SBATCH --error=array_%A_%a.err         # Error file: Job ID and task ID&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Starting SLURM array task&amp;quot;&lt;br /&gt;
echo &amp;quot;Job ID: $SLURM_JOB_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Array Task ID: $SLURM_ARRAY_TASK_ID&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Each task runs the program with a different input file&lt;br /&gt;
./my_program data${SLURM_ARRAY_TASK_ID}.txt&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Task completed&amp;quot;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
===Writing Single SLURM Job Scripts===&lt;br /&gt;
&lt;br /&gt;
Here is a simple job script example:&lt;br /&gt;
&lt;br /&gt;
==== Basic Script====&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --job-name=my_job             # Job name&lt;br /&gt;
#SBATCH --account=power-general-users # Account name&lt;br /&gt;
#SBATCH --partition=power-general     # Partition name&lt;br /&gt;
#SBATCH --time=02:00:00               # Max run time (hh:mm:ss)&lt;br /&gt;
#SBATCH --ntasks=1                    # Number of tasks&lt;br /&gt;
#SBATCH --nodes=1                     # Number of nodes&lt;br /&gt;
#SBATCH --cpus-per-task=1             # CPUs per task&lt;br /&gt;
#SBATCH --mem-per-cpu=4G              # Memory per CPU&lt;br /&gt;
#SBATCH --output=my_job_%j.out        # Output file&lt;br /&gt;
#SBATCH --error=my_job_%j.err         # Error file&lt;br /&gt;
#SBATCH --mail-user=&amp;lt;your email&amp;gt;      # Your mail address to receive an email&lt;br /&gt;
#SBATCH --mail-type=END,FAIL          # The mail will be sent upon ending the script successfully or not&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Starting my SLURM job&amp;quot;&lt;br /&gt;
echo &amp;quot;Job ID: $SLURM_JOB_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Running on nodes: $SLURM_JOB_NODELIST&amp;quot;&lt;br /&gt;
echo &amp;quot;Allocated CPUs: $SLURM_JOB_CPUS_PER_NODE&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Your application commands go here&lt;br /&gt;
# ./my_program&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Job completed&amp;quot;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To ask for x cores interactively:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
srun --ntasks=1 --cpus-per-task=x  --partition=power-general --nodes=1 --pty bash&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
However, for now you also need to set the following SLURM environment variables inside the script, or within the interactive job (adjust the values to match the number of cores you requested):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
export SLURM_TASKS_PER_NODE=48&lt;br /&gt;
export SLURM_CPUS_ON_NODE=48&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Script for 1 GPU ====&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --job-name=gpu_job             # Job name&lt;br /&gt;
#SBATCH --account=my_account           # Account name&lt;br /&gt;
#SBATCH --partition=gpu-general        # Partition name&lt;br /&gt;
#SBATCH --time=02:00:00                # Max run time&lt;br /&gt;
#SBATCH --ntasks=1                     # Number of tasks&lt;br /&gt;
#SBATCH --nodes=1                      # Number of nodes&lt;br /&gt;
#SBATCH --cpus-per-task=1              # CPUs per task&lt;br /&gt;
#SBATCH --gres=gpu:1                   # Number of GPUs&lt;br /&gt;
#SBATCH --mem-per-cpu=4G               # Memory per CPU&lt;br /&gt;
#SBATCH --output=my_job_%j.out         # Output file&lt;br /&gt;
#SBATCH --error=my_job_%j.err          # Error file&lt;br /&gt;
&lt;br /&gt;
module load python/python-3.8&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Starting GPU job&amp;quot;&lt;br /&gt;
echo &amp;quot;Job ID: $SLURM_JOB_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Running on nodes: $SLURM_JOB_NODELIST&amp;quot;&lt;br /&gt;
echo &amp;quot;Allocated CPUs: $SLURM_JOB_CPUS_PER_NODE&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Your GPU commands go here&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Job completed&amp;quot;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
To exclude specific nodes, add the following directive:&lt;br /&gt;
&amp;lt;syntaxhighlight&amp;gt;&lt;br /&gt;
#SBATCH --exclude=compute-0-[100-103],compute-0-67&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Importance of Correct RAM Usage in Jobs===&lt;br /&gt;
&lt;br /&gt;
When writing SLURM job scripts, it&amp;#039;s crucial to understand and correctly specify the memory requirements for your job. &lt;br /&gt;
&lt;br /&gt;
Proper memory allocation ensures efficient resource usage and prevents job failures due to out-of-memory (OOM) errors.&lt;br /&gt;
&lt;br /&gt;
==== Why Correct RAM Usage Matters ====&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Resource Efficiency&amp;#039;&amp;#039;&amp;#039;: Allocating the right amount of memory helps in optimal resource utilization, allowing more jobs to run simultaneously on the cluster.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Job Stability&amp;#039;&amp;#039;&amp;#039;: Underestimating memory requirements can lead to OOM errors, causing your job to fail and waste computational resources.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Performance&amp;#039;&amp;#039;&amp;#039;: Overestimating memory needs can lead to underutilization of resources, potentially delaying other jobs in the queue.&lt;br /&gt;
&lt;br /&gt;
==== How to Specify Memory in SLURM ====&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;--mem&amp;#039;&amp;#039;&amp;#039;: Specifies the total memory required for the job.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;--mem-per-cpu&amp;#039;&amp;#039;&amp;#039;: Specifies the memory required per CPU.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example&amp;#039;&amp;#039;&amp;#039; (choose one of the two; they are mutually exclusive):&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#SBATCH --mem=4G              # Total memory for the job&lt;br /&gt;
#SBATCH --mem-per-cpu=2G      # Memory per allocated CPU&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Interactive Jobs===&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#Start an interactive session:&lt;br /&gt;
srun --ntasks=1 -p power-general -A power-general-users --pty bash&lt;br /&gt;
&lt;br /&gt;
#Specify a compute node:&lt;br /&gt;
srun --ntasks=1 -p power-general -A power-general-users --nodelist=&amp;quot;compute-0-12&amp;quot; --pty bash&lt;br /&gt;
&lt;br /&gt;
#Using GUI:&lt;br /&gt;
srun --ntasks=1 -p power-general -A power-general-users --x11 /bin/bash&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Submitting RELION Jobs===&lt;br /&gt;
&lt;br /&gt;
To submit a RELION job interactively on the &amp;lt;code&amp;gt;gpu-relion&amp;lt;/code&amp;gt; queue with X11 forwarding, use the following steps:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#Start an interactive session with X11:&lt;br /&gt;
srun --ntasks=1 -p gpu-relion -A your_account --x11 --pty bash&lt;br /&gt;
#Load the RELION module:&lt;br /&gt;
module load relion/relion-4.0.1&lt;br /&gt;
#Launch RELION:&lt;br /&gt;
relion&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Running matlab example==&lt;br /&gt;
In this example there are 3 files:&lt;br /&gt;
&lt;br /&gt;
myTable.m ⇒ This matlab file calculates something&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
fprintf(&amp;#039;=======================================\n&amp;#039;);&lt;br /&gt;
fprintf(&amp;#039; a             b             c              d             \n&amp;#039;);&lt;br /&gt;
fprintf(&amp;#039;=======================================\n&amp;#039;);&lt;br /&gt;
while 1&lt;br /&gt;
                for j = 1:10&lt;br /&gt;
                                a = sin(10*j);&lt;br /&gt;
                                b = a*cos(10*j);&lt;br /&gt;
                                c = a + b;&lt;br /&gt;
                                d = a - b;&lt;br /&gt;
                                fprintf(&amp;#039;%+6.5f   %+6.5f   %+6.5f   %+6.5f   \n&amp;#039;,a,b,c,d);&lt;br /&gt;
                end&lt;br /&gt;
end&lt;br /&gt;
fprintf(&amp;#039;=======================================\n&amp;#039;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
my_table_script.sh ⇒ This script executes the matlab program. Just submit it with sbatch.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
&lt;br /&gt;
#SBATCH --mem=50M&lt;br /&gt;
#SBATCH --partition power-general&lt;br /&gt;
#SBATCH -A power-general-users&lt;br /&gt;
hostname&lt;br /&gt;
&lt;br /&gt;
cd /a/home/cc/tree/taucc/staff/dvory/matlab&lt;br /&gt;
&lt;br /&gt;
matlab -nodisplay -nosplash -nodesktop -r &amp;quot;myTable; exit;&amp;quot;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
run_in_loop.sh ⇒ This script submits the job many times in a loop (for large numbers of similar jobs, prefer a job array as described above)&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
&lt;br /&gt;
for i in {1..100}&lt;br /&gt;
do&lt;br /&gt;
        sbatch my_table_script.sh&lt;br /&gt;
done&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Run it with the following command (after making it executable with chmod +x run_in_loop.sh):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
./run_in_loop.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==AlphaFold==&lt;br /&gt;
&lt;br /&gt;
AlphaFold is a deep learning tool designed for predicting protein structures.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Guides:&amp;#039;&amp;#039;&amp;#039;  &lt;br /&gt;
&lt;br /&gt;
[https://hpcguide.tau.ac.il/index.php?title=Alphafold AlphaFold Guide]&lt;br /&gt;
&lt;br /&gt;
[https://hpcguide.tau.ac.il/index.php?title=Alphafold3 AlphaFold3 Guide]&lt;br /&gt;
&lt;br /&gt;
==Common SLURM Commands==&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#View all queues (partitions):&lt;br /&gt;
sinfo&lt;br /&gt;
#View all jobs:&lt;br /&gt;
squeue&lt;br /&gt;
#View details of a specific job:&lt;br /&gt;
scontrol show job &amp;lt;job_number&amp;gt;&lt;br /&gt;
#Get information about partitions:&lt;br /&gt;
scontrol show partition&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
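For day-to-day monitoring, a formatted &amp;lt;code&amp;gt;squeue&amp;lt;/code&amp;gt; view of just your own jobs is often more readable than the full queue. A sketch (the field widths are adjustable):&lt;br /&gt;

```shell
# Show only your jobs: ID, name, partition, state, elapsed time,
# node count, and the node list (or pending reason).
squeue --me --format="%.10i %.15j %.12P %.8T %.10M %.6D %R"
```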
&lt;br /&gt;
== Troubleshooting &amp;amp; Tips ==&lt;br /&gt;
&lt;br /&gt;
=== Common Errors ===&lt;br /&gt;
&lt;br /&gt;
# &amp;lt;code&amp;gt;srun: error: Unable to allocate resources: No partition specified or system default partition&amp;lt;/code&amp;gt;  &amp;lt;br /&amp;gt;&amp;#039;&amp;#039;&amp;#039;Solution:&amp;#039;&amp;#039;&amp;#039; Always specify a partition. Example:  &amp;lt;code&amp;gt;srun --pty -c 1 --mem=2G -p power-general /bin/bash&amp;lt;/code&amp;gt;&lt;br /&gt;
# Job failed, and when running &amp;lt;code&amp;gt;scontrol show job job_id&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;sacct -j job_id -o JobID,JobName,State%20&amp;lt;/code&amp;gt;  &amp;lt;br /&amp;gt;you see &amp;lt;code&amp;gt;JobState=OUT_OF_MEMORY Reason=OutOfMemory&amp;lt;/code&amp;gt; or:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
JobID           JobName                State &lt;br /&gt;
------------ ---------- -------------------- &lt;br /&gt;
71             oom_test        OUT_OF_MEMORY &lt;br /&gt;
71.batch          batch        OUT_OF_MEMORY &lt;br /&gt;
71.extern        extern            COMPLETED &lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;This means the RAM requested for the job was not enough; resubmit the job with more RAM. See [https://wikihpc.tau.ac.il/index.php?title=Slurm_user_guide#Estimating_RAM_Usage below] for help estimating how much RAM your job may need.&lt;br /&gt;
&lt;br /&gt;
=== Chain Jobs ===&lt;br /&gt;
Use the &amp;lt;code&amp;gt;--dependency&amp;lt;/code&amp;gt; flag to make one job wait for another.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sbatch --ntasks=1 --time=60 -p power-general -A power-general-users --dependency=afterok:45001 do_work.bash&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
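To chain jobs without copying job IDs by hand, &amp;lt;code&amp;gt;sbatch --parsable&amp;lt;/code&amp;gt; prints just the job ID, which can be captured in a variable. A sketch using the scripts from the examples above:&lt;br /&gt;

```shell
# Submit the first job and capture its ID (--parsable prints only the ID).
jid=$(sbatch --parsable --ntasks=1 -p power-general -A power-general-users pre_process.bash)

# The second job starts only if the first one completes successfully
# (use afterany: instead of afterok: to run regardless of exit status).
sbatch --ntasks=1 -p power-general -A power-general-users \
       --dependency=afterok:${jid} do_work.bash
```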
&lt;br /&gt;
=== Always Specify Resources ===&lt;br /&gt;
When submitting jobs, ensure you include all required resources like partition, memory, and CPUs to avoid job failures.&lt;br /&gt;
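For example, a fully specified interactive request might look like the following (a sketch; adjust the values to your workload):&lt;br /&gt;

```shell
# Partition, account, CPUs, memory, and a time limit are all stated
# explicitly, so the scheduler never has to guess.
srun -p power-general -A power-general-users \
     --ntasks=1 --cpus-per-task=4 --mem=8G --time=01:00:00 --pty bash
```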
&lt;br /&gt;
=== Attaching to Running Jobs ===&lt;br /&gt;
If you need to monitor or interact with a running job, use &amp;lt;code&amp;gt;sattach&amp;lt;/code&amp;gt;. This command allows you to attach to a job&amp;#039;s input, output, and error streams in real-time.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sattach &amp;lt;job_id&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To view job steps of a specific job, use the following command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
scontrol show job &amp;lt;job_id&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Look for sections labeled &amp;quot;StepId&amp;quot; within the output. &lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;For specific job steps, use:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sattach &amp;lt;job_id.step_id&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Note:&amp;#039;&amp;#039;&amp;#039; &amp;lt;code&amp;gt;sattach&amp;lt;/code&amp;gt; is particularly useful for interactive jobs, where you can provide input directly. For non-interactive jobs, it acts like &amp;lt;code&amp;gt;tail -f&amp;lt;/code&amp;gt;, allowing you to monitor the output stream.&lt;br /&gt;
&lt;br /&gt;
=== Estimating RAM Usage ===&lt;br /&gt;
&lt;br /&gt;
When writing SLURM job scripts, it&amp;#039;s crucial to understand and correctly specify the memory requirements for your job. Proper memory allocation ensures efficient resource usage and prevents job failures due to out-of-memory (OOM) errors.&lt;br /&gt;
&lt;br /&gt;
==== Tips for Estimating RAM Usage ====&lt;br /&gt;
&lt;br /&gt;
* Check Application Documentation: Refer to the official documentation or user guides for memory-related information.&lt;br /&gt;
* Run a Small Test Job: Submit a smaller version of your job and monitor its memory usage using commands like `free -m`, `top`, or `htop`.&lt;br /&gt;
* Use Profiling Tools: Tools like `valgrind`, `gprof`, or built-in profilers can help you understand memory usage.&lt;br /&gt;
* Analyze Previous Jobs: Review SLURM logs and job statistics for insights into memory consumption of past jobs.&lt;br /&gt;
* Consult with Peers or Experts: Ask colleagues or experts who have experience with similar workloads.&lt;br /&gt;
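One quick way to measure peak memory of a small test run is GNU time, which reports the maximum resident set size. A sketch: &amp;lt;code&amp;gt;my_program&amp;lt;/code&amp;gt; is a placeholder for your application, and the &amp;lt;code&amp;gt;/usr/bin/time&amp;lt;/code&amp;gt; path may vary between systems.&lt;br /&gt;

```shell
# -v makes GNU time print detailed statistics to stderr, including
# "Maximum resident set size (kbytes)".
/usr/bin/time -v ./my_program input_small.txt 2> time_report.txt
grep "Maximum resident set size" time_report.txt
```

Add a safety margin (e.g. 20-30%) on top of the measured peak when setting &lt;code&gt;--mem&lt;/code&gt; for the full-size job.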
&lt;br /&gt;
==== Example: Monitoring Memory Usage ====&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
&lt;br /&gt;
#SBATCH --job-name=memory_test&lt;br /&gt;
#SBATCH --account=your_account&lt;br /&gt;
#SBATCH --partition=your_partition&lt;br /&gt;
#SBATCH --time=01:00:00&lt;br /&gt;
#SBATCH --ntasks=1&lt;br /&gt;
#SBATCH --cpus-per-task=1&lt;br /&gt;
#SBATCH --mem=4G&lt;br /&gt;
#SBATCH --output=memory_test.out&lt;br /&gt;
#SBATCH --error=memory_test.err&lt;br /&gt;
&lt;br /&gt;
# Monitor memory usage&lt;br /&gt;
echo &amp;quot;Memory usage before running the job:&amp;quot;&lt;br /&gt;
free -m&lt;br /&gt;
&lt;br /&gt;
# Your application commands go here&lt;br /&gt;
# ./your_application&lt;br /&gt;
&lt;br /&gt;
# Monitor memory usage after running the job&lt;br /&gt;
echo &amp;quot;Memory usage after running the job:&amp;quot;&lt;br /&gt;
free -m&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== General Tips ====&lt;br /&gt;
&lt;br /&gt;
* Start Small: Begin with a conservative memory request and increase it based on observed usage.&lt;br /&gt;
* Consider Peak Usage: Plan for peak memory usage to avoid OOM errors.&lt;br /&gt;
* Use SLURM&amp;#039;s Memory Reporting: Use `sacct` to view memory usage statistics.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sacct -j &amp;lt;job_id&amp;gt; --format=JobID,JobName,MaxRSS,Elapsed&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;/div&gt;</summary>
		<author><name>Dvory</name></author>
	</entry>
	<entry>
		<id>https://hpcguide.tau.ac.il/index.php?title=Submitting_vscode_job_on_slurm&amp;diff=1518</id>
		<title>Submitting vscode job on slurm</title>
		<link rel="alternate" type="text/html" href="https://hpcguide.tau.ac.il/index.php?title=Submitting_vscode_job_on_slurm&amp;diff=1518"/>
		<updated>2025-04-03T12:31:38Z</updated>

		<summary type="html">&lt;p&gt;Dvory: Created page with &amp;quot;2 options hereby presented to use vscode  1. vscode via using an interactive job, and loading vscode module  2. vscode via ssh proxyjump  =vscode module= This method may be to...&amp;quot;&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Two options are presented here for using VS Code:&lt;br /&gt;
&lt;br /&gt;
1. VS Code via an interactive job, loading the vscode module&lt;br /&gt;
&lt;br /&gt;
2. VS Code via SSH ProxyJump&lt;br /&gt;
&lt;br /&gt;
=vscode module=&lt;br /&gt;
This method works, but may be too slow for some users (as noted by user luxembourg).&lt;br /&gt;
&lt;br /&gt;
User needs to:&lt;br /&gt;
&lt;br /&gt;
Log in to powerslurm with &amp;#039;-X&amp;#039; (X11 forwarding enabled).&lt;br /&gt;
Then request an interactive job with GUI support:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
srun --partition power-general --x11 --pty bash&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Then, on the compute node, load the module:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load vscode/vscode-1.98.2&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
You also need to define a cache directory; this can be done in one of the scratch filesystems, e.g.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
export XDG_RUNTIME_DIR=/scratch200/dvory&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Then, within the interactive job, launch VS Code by typing:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
code&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=vscode via ssh proxyjump=&lt;br /&gt;
&lt;br /&gt;
Both &amp;#039;&amp;#039;&amp;#039;vscode&amp;#039;&amp;#039;&amp;#039; and &amp;#039;&amp;#039;&amp;#039;cursor&amp;#039;&amp;#039;&amp;#039; behave similarly, and this approach works for both of them.&lt;br /&gt;
&lt;br /&gt;
With this setup, a Windows machine can run VS Code/Cursor on a compute node in the cluster, with the same SLURM environment variables as defined in a batch script.&lt;br /&gt;
&lt;br /&gt;
* Any user can configure this; no root access is needed&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==In powerslurm-login node==&lt;br /&gt;
===One time setup===&lt;br /&gt;
Install dropbear in userspace.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;dropbear&amp;#039;&amp;#039;&amp;#039; is used to launch a private SSH server inside the job&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
cd ~&lt;br /&gt;
mkdir -p dropbear/install&lt;br /&gt;
&lt;br /&gt;
# Build it from source&lt;br /&gt;
curl -LO https://matt.ucc.asn.au/dropbear/releases/dropbear-2022.83.tar.bz2&lt;br /&gt;
tar -xjf dropbear-2022.83.tar.bz2&lt;br /&gt;
cd dropbear-2022.83&lt;br /&gt;
./configure --prefix=$HOME/dropbear/install&lt;br /&gt;
make &amp;amp;&amp;amp; make install&lt;br /&gt;
&lt;br /&gt;
# Generate host key&lt;br /&gt;
$HOME/dropbear/install/bin/dropbearkey -t rsa -f $HOME/dropbear/install/server-key&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Build a script===&lt;br /&gt;
The script may be named, for example, vscode_slurm_job.sh:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --job-name=vscode&lt;br /&gt;
#SBATCH --partition=power-general&lt;br /&gt;
#SBATCH --cpus-per-task=8&lt;br /&gt;
#SBATCH --mem=64G&lt;br /&gt;
#SBATCH --output=slurm_vscode.log&lt;br /&gt;
&lt;br /&gt;
DROPBEAR=&amp;quot;$HOME/dropbear/install&amp;quot;&lt;br /&gt;
SLURM_ENV_FILE=&amp;quot;$HOME/.slurm-env.bash&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# 🧠 Derive unique port from SLURM_JOB_ID&lt;br /&gt;
PORT=$(( 40000 + SLURM_JOB_ID % 10000 ))&lt;br /&gt;
&lt;br /&gt;
# Save SLURM-related env vars to a file&lt;br /&gt;
env | awk -F= &amp;#039;$1~/^(SLURM|CUDA|NVIDIA_)/ { print &amp;quot;export &amp;quot; $0 }&amp;#039; &amp;gt; &amp;quot;$SLURM_ENV_FILE&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Write VS Code shell script&lt;br /&gt;
cat &amp;lt;&amp;lt;EOF &amp;gt; &amp;quot;$HOME/.vscode-shell&amp;quot;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
source $SLURM_ENV_FILE&lt;br /&gt;
export XDG_RUNTIME_DIR=/tmp/\$USER-vscode-runtime&lt;br /&gt;
mkdir -p \$XDG_RUNTIME_DIR&lt;br /&gt;
chmod 700 \$XDG_RUNTIME_DIR&lt;br /&gt;
exec bash&lt;br /&gt;
EOF&lt;br /&gt;
&lt;br /&gt;
chmod +x &amp;quot;$HOME/.vscode-shell&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# 📝 Write SSH connection instructions to a file&lt;br /&gt;
HOSTNAME=$(hostname -f)&lt;br /&gt;
echo &amp;quot;VS Code SSH setup info for this job:&amp;quot; &amp;gt; $HOME/vscode-connection-info.txt&lt;br /&gt;
echo &amp;quot;--------------------------------------&amp;quot; &amp;gt;&amp;gt; $HOME/vscode-connection-info.txt&lt;br /&gt;
echo &amp;quot;HostName: $HOSTNAME&amp;quot; &amp;gt;&amp;gt; $HOME/vscode-connection-info.txt&lt;br /&gt;
echo &amp;quot;Port: $PORT&amp;quot; &amp;gt;&amp;gt; $HOME/vscode-connection-info.txt&lt;br /&gt;
echo &amp;quot;Username: $USER&amp;quot; &amp;gt;&amp;gt; $HOME/vscode-connection-info.txt&lt;br /&gt;
echo &amp;quot;RemoteCommand: ~/.vscode-shell&amp;quot; &amp;gt;&amp;gt; $HOME/vscode-connection-info.txt&lt;br /&gt;
echo &amp;quot;ProxyJump: $USER@powerslurm-login.tau.ac.il&amp;quot; &amp;gt;&amp;gt; $HOME/vscode-connection-info.txt&lt;br /&gt;
echo &amp;quot;&amp;quot; &amp;gt;&amp;gt; $HOME/vscode-connection-info.txt&lt;br /&gt;
echo &amp;quot;Suggested ~/.ssh/config entry:&amp;quot; &amp;gt;&amp;gt; $HOME/vscode-connection-info.txt&lt;br /&gt;
cat &amp;lt;&amp;lt;EOF &amp;gt;&amp;gt; $HOME/vscode-connection-info.txt&lt;br /&gt;
&lt;br /&gt;
Host slurm-vscode&lt;br /&gt;
  HostName $HOSTNAME&lt;br /&gt;
  User $USER&lt;br /&gt;
  Port $PORT&lt;br /&gt;
  ProxyJump $USER@powerslurm-login.tau.ac.il&lt;br /&gt;
  RemoteCommand ~/.vscode-shell&lt;br /&gt;
  IdentityFile ~/.ssh/your_key&lt;br /&gt;
  StrictHostKeyChecking no&lt;br /&gt;
  UserKnownHostsFile /dev/null&lt;br /&gt;
EOF&lt;br /&gt;
&lt;br /&gt;
# Cleanup on exit&lt;br /&gt;
cleanup() {&lt;br /&gt;
    rm -f &amp;quot;$SLURM_ENV_FILE&amp;quot;&lt;br /&gt;
}&lt;br /&gt;
trap &amp;#039;cleanup&amp;#039; SIGTERM SIGINT EXIT&lt;br /&gt;
&lt;br /&gt;
# Launch dropbear&lt;br /&gt;
$DROPBEAR/sbin/dropbear -r $DROPBEAR/server-key -F -E -w -s -p $PORT&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==In windows==&lt;br /&gt;
===Generate an SSH key===&lt;br /&gt;
Open PowerShell or Git Bash on Windows, and run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
ssh-keygen -t rsa -b 4096&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
    Accept the default path (C:\Users\YourName\.ssh\id_rsa)&lt;br /&gt;
&lt;br /&gt;
    Choose a passphrase or leave empty&lt;br /&gt;
&lt;br /&gt;
You now have:&lt;br /&gt;
&lt;br /&gt;
    Private key: C:\Users\YourName\.ssh\id_rsa ← Keep this secure!&lt;br /&gt;
&lt;br /&gt;
    Public key: C:\Users\YourName\.ssh\id_rsa.pub&lt;br /&gt;
&lt;br /&gt;
Copy the content of id_rsa.pub into ~/.ssh/authorized_keys on powerslurm-login:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
vim ~/.ssh/authorized_keys&lt;br /&gt;
# Paste the key and save&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
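Instead of pasting by hand, &amp;lt;code&amp;gt;ssh-copy-id&amp;lt;/code&amp;gt; (available in Git Bash on Windows) can install the key for you. A sketch; replace &amp;lt;code&amp;gt;your_username&amp;lt;/code&amp;gt; with your actual cluster username:&lt;br /&gt;

```shell
# Appends the public key to ~/.ssh/authorized_keys on the login node
# and fixes the file permissions in one step.
ssh-copy-id -i ~/.ssh/id_rsa.pub your_username@powerslurm-login.tau.ac.il
```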
This key may also be copied to the compute node that is allocated for the job&lt;br /&gt;
&lt;br /&gt;
===Create ~\.ssh\config file===&lt;br /&gt;
Set up ~/.ssh/config:&lt;br /&gt;
&lt;br /&gt;
Create or edit C:\Users\&amp;lt;YourUsername&amp;gt;\.ssh\config, containing:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Host slurm-vscode&lt;br /&gt;
  HostName compute-0-xxx     # Replace with the compute node&lt;br /&gt;
  User dvory                 # Replace with actual username&lt;br /&gt;
  ProxyJump dvory@powerslurm-login.tau.ac.il&lt;br /&gt;
  Port 64321&lt;br /&gt;
  IdentityFile ~/.ssh/id_rsa      # Or correct private key&lt;br /&gt;
  StrictHostKeyChecking no&lt;br /&gt;
  UserKnownHostsFile /dev/null&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Set permissions of .ssh/config===&lt;br /&gt;
Open cmd and run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
icacls &amp;quot;%USERPROFILE%\.ssh\id_rsa&amp;quot; /reset&lt;br /&gt;
icacls &amp;quot;%USERPROFILE%\.ssh\id_rsa&amp;quot; /inheritance:r&lt;br /&gt;
icacls &amp;quot;%USERPROFILE%\.ssh\id_rsa&amp;quot; /grant:r &amp;quot;%USERNAME%:F&amp;quot;&lt;br /&gt;
icacls &amp;quot;%USERPROFILE%\.ssh\config&amp;quot; /reset&lt;br /&gt;
icacls &amp;quot;%USERPROFILE%\.ssh\config&amp;quot; /inheritance:r&lt;br /&gt;
icacls &amp;quot;%USERPROFILE%\.ssh\config&amp;quot; /grant:r &amp;quot;%USERNAME%:F&amp;quot;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Download &amp;#039;&amp;#039;&amp;#039;VS Code&amp;#039;&amp;#039;&amp;#039; (https://code.visualstudio.com/) or &amp;#039;&amp;#039;&amp;#039;Cursor&amp;#039;&amp;#039;&amp;#039; (https://www.cursor.com/)&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Open the application and make sure that the Remote-SSH extension is installed.&lt;br /&gt;
&lt;br /&gt;
===Installing Remote-SSH extension===&lt;br /&gt;
Launch your VS Code / Cursor app.&lt;br /&gt;
To install the extension:&lt;br /&gt;
&lt;br /&gt;
* Open the Extensions sidebar: click the square icon on the left sidebar, or press Ctrl+Shift+X.&lt;br /&gt;
* In the search box, type: &amp;quot;Remote - SSH&amp;quot;&lt;br /&gt;
* Find &amp;quot;Remote - SSH&amp;quot; by Microsoft and click Install.&lt;br /&gt;
&lt;br /&gt;
A green install bar appears. Once it is done, a new green &amp;quot;&amp;gt;&amp;lt;&amp;quot; icon appears in the bottom-left corner; that is the remote connection control.&lt;br /&gt;
&lt;br /&gt;
===Test connection===&lt;br /&gt;
In VS Code or Cursor:&lt;br /&gt;
&lt;br /&gt;
* Press F1, then type and select &amp;quot;Remote-SSH: Connect to Host...&amp;quot;&lt;br /&gt;
* Select &amp;#039;&amp;#039;&amp;#039;slurm-vscode&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
You should now be connected to the compute node inside the running Slurm job.&lt;br /&gt;
&lt;br /&gt;
==Running vscode==&lt;br /&gt;
===On powerslurm-login===&lt;br /&gt;
Submit the script:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
sbatch vscode_slurm_job.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Check that the job is running, and find the allocated node:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
squeue --me&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The submitted script generates a file named ~/vscode-connection-info.txt, containing all the information needed for the ProxyJump setup.&lt;br /&gt;
&lt;br /&gt;
Once the job is running, check the allocated compute node and port:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
cat ~/vscode-connection-info.txt&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
You will see an output like:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
VS Code SSH setup info for this job:&lt;br /&gt;
--------------------------------------&lt;br /&gt;
HostName: compute-0-499.power5&lt;br /&gt;
Port: 43104&lt;br /&gt;
Username: dvory&lt;br /&gt;
RemoteCommand: ~/.vscode-shell&lt;br /&gt;
ProxyJump: dvory@powerslurm-login.tau.ac.il&lt;br /&gt;
&lt;br /&gt;
Suggested ~/.ssh/config entry:&lt;br /&gt;
&lt;br /&gt;
Host slurm-vscode&lt;br /&gt;
  HostName compute-0-499.power5&lt;br /&gt;
  User dvory&lt;br /&gt;
  Port 43104&lt;br /&gt;
  ProxyJump dvory@powerslurm-login.tau.ac.il&lt;br /&gt;
  RemoteCommand ~/.vscode-shell&lt;br /&gt;
  IdentityFile ~/.ssh/your_key&lt;br /&gt;
  StrictHostKeyChecking no&lt;br /&gt;
  UserKnownHostsFile /dev/null&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
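Rather than copying the node and port by hand each time, the values can be extracted from the file with awk. This is a hypothetical convenience and assumes the file keeps the HostName:/Port: layout shown above; the sketch works on a sample copy in /tmp so it can be tried anywhere.&lt;br /&gt;

```shell
# Sample of the connection-info file (on the cluster, read
# ~/vscode-connection-info.txt, written by the job script).
printf 'HostName: compute-0-499.power5\nPort: 43104\nUsername: dvory\n' > /tmp/vscode-connection-info.txt

# Pull out the node and port fields.
node=$(awk '/^HostName:/ {print $2}' /tmp/vscode-connection-info.txt)
port=$(awk '/^Port:/ {print $2}' /tmp/vscode-connection-info.txt)
echo "node=$node port=$port"
```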
&lt;br /&gt;
Update the ~/.ssh/config file on Windows accordingly.&lt;br /&gt;
&lt;br /&gt;
In what follows, assume the job is running on node compute-0-499, port 43104, as returned in the example above.&lt;br /&gt;
===On Windows===&lt;br /&gt;
Edit ~/.ssh/config to match the compute node and port that were actually allocated to the job:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Host slurm-vscode&lt;br /&gt;
  HostName compute-0-499     # Replace with the compute node&lt;br /&gt;
  User dvory                 # Replace with actual username&lt;br /&gt;
  ProxyJump dvory@powerslurm-login.tau.ac.il&lt;br /&gt;
  Port 43104                 # Replace with actual port&lt;br /&gt;
  IdentityFile ~/.ssh/id_rsa # Or correct private key&lt;br /&gt;
  StrictHostKeyChecking no&lt;br /&gt;
  UserKnownHostsFile /dev/null&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
In VS Code or Cursor:&lt;br /&gt;
&lt;br /&gt;
Press F1 → type and select:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Remote-SSH: Connect to Host...&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Or click the green &amp;quot;&amp;gt;&amp;lt;&amp;quot; icon (bottom-left)&lt;br /&gt;
&lt;br /&gt;
Choose a host defined in your ~/.ssh/config (e.g., slurm-vscode)&lt;/div&gt;</summary>
		<author><name>Dvory</name></author>
	</entry>
	<entry>
		<id>https://hpcguide.tau.ac.il/index.php?title=Main_Page&amp;diff=1517</id>
		<title>Main Page</title>
		<link rel="alternate" type="text/html" href="https://hpcguide.tau.ac.il/index.php?title=Main_Page&amp;diff=1517"/>
		<updated>2025-04-03T12:30:14Z</updated>

		<summary type="html">&lt;p&gt;Dvory: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;#039;&amp;#039;&amp;#039;Welcome to HPC Guide.&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
[[Linux basic commands]]&lt;br /&gt;
&lt;br /&gt;
[[Public queues]]&lt;br /&gt;
&lt;br /&gt;
[[Submitting a job to a queue]]&lt;br /&gt;
&lt;br /&gt;
[[Submitting a job to a slurm queue]]&lt;br /&gt;
&lt;br /&gt;
[[PBS-To-SLURM]]&lt;br /&gt;
&lt;br /&gt;
[[Creaing and using conda environment]]&lt;br /&gt;
&lt;br /&gt;
[[Palo Alto VPN for linux]]&lt;br /&gt;
&lt;br /&gt;
[[Alphafold]]&lt;br /&gt;
&lt;br /&gt;
[[Alphafold3]]&lt;br /&gt;
&lt;br /&gt;
[[Using GPU]]&lt;br /&gt;
&lt;br /&gt;
[[security installations]]&lt;br /&gt;
&lt;br /&gt;
[[Install matlab on work station per matlab user]]&lt;br /&gt;
&lt;br /&gt;
[[Submitting vscode job on slurm]]&lt;br /&gt;
&lt;br /&gt;
[[Storage and scratch]]&lt;br /&gt;
&lt;br /&gt;
This HPC Tutorial is designed for researchers at TAU who are in need of computational power (computer resources) and wish to explore and use our High Performance Computing (HPC) core facilities. &lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
The audience may be completely unaware of the HPC concepts but must have some basic understanding of computers and computer programming.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;What is HPC?&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
“High Performance Computing” (HPC) is computing on a “Supercomputer”, &lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
a computer at the front line of contemporary processing capacity – particularly speed of calculation and available memory.&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
A computer cluster consists of a set of loosely or tightly connected computers that work together so that in many respects they can be viewed as a single system.&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
The components of a cluster are usually connected to each other through fast local area networks (“LAN”), with each node (a computer used as a server) running its own instance of an operating system.&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
Computer clusters emerged as a result of convergence of a number of computing trends including the availability of low cost microprocessors, &lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
high-speed networks, and software for high performance distributed computing.&lt;br /&gt;
&amp;lt;br&amp;gt;&lt;br /&gt;
Compute clusters are usually deployed to improve performance and availability over that of a single computer, while typically being more cost-effective than single computers of comparable speed or availability.&lt;/div&gt;</summary>
		<author><name>Dvory</name></author>
	</entry>
	<entry>
		<id>https://hpcguide.tau.ac.il/index.php?title=Creaing_and_using_conda_environment&amp;diff=1516</id>
		<title>Creaing and using conda environment</title>
		<link rel="alternate" type="text/html" href="https://hpcguide.tau.ac.il/index.php?title=Creaing_and_using_conda_environment&amp;diff=1516"/>
		<updated>2025-03-31T11:07:14Z</updated>

		<summary type="html">&lt;p&gt;Dvory: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;To use conda or mamba, load one of the following modules:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load miniconda/miniconda3-4.7.12-environmentally&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Or&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module load mamba/mamba-1.5.8&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To list all available environments, run the command below; an environment for what you need may already exist.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda env list&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To activate an environment, run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda activate &amp;lt;module path&amp;gt;/envs/&amp;lt;MODULENAME&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
E.g.:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda activate /powerapps/share/centos7/miniconda/miniconda3-4.7.12-environmentally/envs/jupyter&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
 &lt;br /&gt;
If you would like your own environment within this conda, where you can install packages yourself, run the following right after loading the environmentally module:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda create --prefix &amp;lt;path in your space&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
E.g.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda create --prefix /a/home/cc/staff/dvory/envs/new_env&lt;br /&gt;
&amp;lt;/pre&amp;gt; &lt;br /&gt;
Then activate it with &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda activate /a/home/cc/staff/dvory/envs/new_env&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
To install packages in your conda environment, first make sure the package cache is located in a writable path, e.g. your home directory, by defining the following:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
export CONDA_PKGS_DIRS=$HOME/.conda/pkgs&lt;br /&gt;
export CONDA_ENVS_DIRS=$HOME/.conda/envs&lt;br /&gt;
export MAMBA_ROOT_PREFIX=$HOME/.mamba&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
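To make these settings persistent, you can add lines along these sketched ones to your ~/.bashrc; the mkdir step also ensures the directories actually exist before the first install (same paths as above).&lt;br /&gt;

```shell
# Point conda/mamba caches at writable locations under $HOME.
export CONDA_PKGS_DIRS="$HOME/.conda/pkgs"
export CONDA_ENVS_DIRS="$HOME/.conda/envs"
export MAMBA_ROOT_PREFIX="$HOME/.mamba"

# Create the directories if they do not exist yet.
mkdir -p "$CONDA_PKGS_DIRS" "$CONDA_ENVS_DIRS" "$MAMBA_ROOT_PREFIX"
echo "conda caches under: $CONDA_PKGS_DIRS"
```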
&lt;br /&gt;
Then you can install whatever you need:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda install &amp;lt;module&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To deactivate an environment:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
conda deactivate&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To unload a module:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
module unload &amp;lt;module name&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;/div&gt;</summary>
		<author><name>Dvory</name></author>
	</entry>
	<entry>
		<id>https://hpcguide.tau.ac.il/index.php?title=Submitting_a_job_to_a_slurm_queue&amp;diff=1507</id>
		<title>Submitting a job to a slurm queue</title>
		<link rel="alternate" type="text/html" href="https://hpcguide.tau.ac.il/index.php?title=Submitting_a_job_to_a_slurm_queue&amp;diff=1507"/>
		<updated>2025-03-20T06:21:09Z</updated>

		<summary type="html">&lt;p&gt;Dvory: /* Writing SLURM Job Scripts */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Accessing the System ==&lt;br /&gt;
&lt;br /&gt;
To submit jobs to SLURM at Tel Aviv University, you need to access the system through one of the following login nodes:&lt;br /&gt;
&lt;br /&gt;
* powerslurm-login.tau.ac.il&lt;br /&gt;
* powerslurm-login2.tau.ac.il&lt;br /&gt;
&lt;br /&gt;
=== Requirements for Access ===&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Group Membership&amp;#039;&amp;#039;&amp;#039;: You must be part of the &amp;quot;power&amp;quot; group to access the resources.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;University Credentials&amp;#039;&amp;#039;&amp;#039;: Use your Tel Aviv University username and password to log in.&lt;br /&gt;
&lt;br /&gt;
These login nodes are your starting point for submitting jobs, checking job status, and managing your SLURM tasks.&lt;br /&gt;
&lt;br /&gt;
=== SSH Example ===&lt;br /&gt;
&lt;br /&gt;
To access the system using SSH, use the following example:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Replace &amp;#039;your_username&amp;#039; with your actual Tel Aviv University username&lt;br /&gt;
ssh your_username@powerslurm-login.tau.ac.il&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you want to connect to the second login node, use:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Replace &amp;#039;your_username&amp;#039; with your actual Tel Aviv University username&lt;br /&gt;
ssh your_username@powerslurm-login2.tau.ac.il&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you have an SSH key set up for password-less login, you can specify it like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Replace &amp;#039;your_username&amp;#039; and &amp;#039;/path/to/your/private_key&amp;#039; accordingly&lt;br /&gt;
ssh -i /path/to/your/private_key your_username@powerslurm-login.tau.ac.il&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
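For convenience, the login node can be given a short alias in ~/.ssh/config, so that ssh power is enough. The alias name &amp;quot;power&amp;quot; is arbitrary, and the sketch writes the entry to a demo file in /tmp; in practice, append it to your real ~/.ssh/config.&lt;br /&gt;

```shell
# Demo: write the entry to a temp file; in practice append to ~/.ssh/config.
printf 'Host power\n  HostName powerslurm-login.tau.ac.il\n  User your_username\n  IdentityFile ~/.ssh/id_rsa\n' > /tmp/ssh_config_demo
cat /tmp/ssh_config_demo
```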
&lt;br /&gt;
== Environment Modules ==&lt;br /&gt;
&lt;br /&gt;
Environment Modules in SLURM allow users to dynamically modify their shell environment, providing an easy way to load and unload different software applications, libraries, and their dependencies. This system helps avoid conflicts between software versions and ensures the correct environment for running specific applications.&lt;br /&gt;
&lt;br /&gt;
Here are some common commands to work with environment modules:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#List Available Modules: To see all the modules available on the system, use:&lt;br /&gt;
module avail&lt;br /&gt;
&lt;br /&gt;
#To search for a specific module by name (e.g., `gcc`), use:&lt;br /&gt;
module avail gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#Get Detailed Information About a Module: The `module spider` command provides detailed information about a module, including versions, dependencies, and descriptions:&lt;br /&gt;
module spider gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#View Module Settings: To see what environment variables and settings will be modified by a module, use:&lt;br /&gt;
module show gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#Load a Module: To set up the environment for a specific software, use the `module load` command. For example, to load GCC version 12.1.0:&lt;br /&gt;
module load gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#List Loaded Modules: To view all currently loaded modules in your session, use:&lt;br /&gt;
module list&lt;br /&gt;
&lt;br /&gt;
#Unload a Module: To unload a specific module from your environment, use:&lt;br /&gt;
module unload gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#Unload All Modules: If you need to clear your environment of all loaded modules, use:&lt;br /&gt;
module purge&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;By using these commands, you can easily manage the software environments needed for different tasks, ensuring compatibility and reducing potential conflicts between software versions.&lt;br /&gt;
&lt;br /&gt;
== Basic Job Submission Commands ==&lt;br /&gt;
&lt;br /&gt;
=== Finding Your Account and Partition ===&lt;br /&gt;
&lt;br /&gt;
Before submitting a job, you need to know which partitions you have permission to use.&lt;br /&gt;
&lt;br /&gt;
Run the command &amp;lt;code&amp;gt;check_my_partitions&amp;lt;/code&amp;gt; to view a list of all the partitions you have permission to send jobs to.&lt;br /&gt;
&lt;br /&gt;
== Submitting Jobs==&lt;br /&gt;
sbatch: Submits a job script for batch processing.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example&amp;#039;&amp;#039;&amp;#039;:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Submit pre_process.bash to the power-general partition for 10 minutes:&lt;br /&gt;
sbatch --ntasks=1 --time=10 -p power-general -A power-general-users pre_process.bash&lt;br /&gt;
&lt;br /&gt;
# The same idea with 1 GPU, on the gpu-general partition:&lt;br /&gt;
sbatch --gres=gpu:1 -p gpu-general -A gpu-general-users gpu_job.sh&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Writing SLURM Job Scripts===&lt;br /&gt;
&lt;br /&gt;
Here is a simple job script example:&lt;br /&gt;
&lt;br /&gt;
==== Basic Script====&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --job-name=my_job             # Job name&lt;br /&gt;
#SBATCH --account=power-general-users # Account name&lt;br /&gt;
#SBATCH --partition=power-general     # Partition name&lt;br /&gt;
#SBATCH --time=02:00:00               # Max run time (hh:mm:ss)&lt;br /&gt;
#SBATCH --ntasks=1                    # Number of tasks&lt;br /&gt;
#SBATCH --nodes=1                     # Number of nodes&lt;br /&gt;
#SBATCH --cpus-per-task=1             # CPUs per task&lt;br /&gt;
#SBATCH --mem-per-cpu=4G              # Memory per CPU&lt;br /&gt;
#SBATCH --output=my_job_%j.out        # Output file&lt;br /&gt;
#SBATCH --error=my_job_%j.err         # Error file&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Starting my SLURM job&amp;quot;&lt;br /&gt;
echo &amp;quot;Job ID: $SLURM_JOB_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Running on nodes: $SLURM_JOB_NODELIST&amp;quot;&lt;br /&gt;
echo &amp;quot;Allocated CPUs: $SLURM_JOB_CPUS_PER_NODE&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Your application commands go here&lt;br /&gt;
# ./my_program&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Job completed&amp;quot;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To ask for x cores interactively:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
srun --ntasks=1 --cpus-per-task=x  --partition=power-general --nodes=1 --pty bash&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
For now, you also need to set the following SLURM parameters inside the script, or within the interactive job (shown here for 48 cores):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
export SLURM_TASKS_PER_NODE=48&lt;br /&gt;
export SLURM_CPUS_ON_NODE=48&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To define a job array, add:&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#SBATCH --array=1-300&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
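Inside each array task, Slurm sets the variable SLURM_ARRAY_TASK_ID, which is typically used to pick that task&amp;#039;s input. The sketch below falls back to a fixed value so it can run off-cluster; under Slurm the variable is set automatically.&lt;br /&gt;

```shell
# Under Slurm, SLURM_ARRAY_TASK_ID is set per array task (1..300 above).
# Off-cluster simulation: fall back to a fixed value.
SLURM_ARRAY_TASK_ID=${SLURM_ARRAY_TASK_ID:-7}

# A common pattern: derive this task's input file from the task ID.
input="data_${SLURM_ARRAY_TASK_ID}.txt"
echo "task ${SLURM_ARRAY_TASK_ID} processes ${input}"
```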
&lt;br /&gt;
==== Script for 1 GPU ====&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --job-name=gpu_job             # Job name&lt;br /&gt;
#SBATCH --account=my_account           # Account name&lt;br /&gt;
#SBATCH --partition=gpu-general        # Partition name&lt;br /&gt;
#SBATCH --time=02:00:00                # Max run time&lt;br /&gt;
#SBATCH --ntasks=1                     # Number of tasks&lt;br /&gt;
#SBATCH --nodes=1                      # Number of nodes&lt;br /&gt;
#SBATCH --cpus-per-task=1              # CPUs per task&lt;br /&gt;
#SBATCH --gres=gpu:1                   # Number of GPUs&lt;br /&gt;
#SBATCH --mem-per-cpu=4G               # Memory per CPU&lt;br /&gt;
#SBATCH --output=my_job_%j.out         # Output file&lt;br /&gt;
#SBATCH --error=my_job_%j.err          # Error file&lt;br /&gt;
&lt;br /&gt;
module load python/python-3.8&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Starting GPU job&amp;quot;&lt;br /&gt;
echo &amp;quot;Job ID: $SLURM_JOB_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Running on nodes: $SLURM_JOB_NODELIST&amp;quot;&lt;br /&gt;
echo &amp;quot;Allocated CPUs: $SLURM_JOB_CPUS_PER_NODE&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Your GPU commands go here&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Job completed&amp;quot;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
To exclude specific nodes, add the following:&lt;br /&gt;
&amp;lt;syntaxhighlight&amp;gt;&lt;br /&gt;
#SBATCH --exclude=compute-0-[100-103],compute-0-67&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Importance of Correct RAM Usage in Jobs===&lt;br /&gt;
&lt;br /&gt;
When writing SLURM job scripts, it&amp;#039;s crucial to understand and correctly specify the memory requirements for your job. &lt;br /&gt;
&lt;br /&gt;
Proper memory allocation ensures efficient resource usage and prevents job failures due to out-of-memory (OOM) errors.&lt;br /&gt;
&lt;br /&gt;
==== Why Correct RAM Usage Matters ====&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Resource Efficiency&amp;#039;&amp;#039;&amp;#039;: Allocating the right amount of memory helps in optimal resource utilization, allowing more jobs to run simultaneously on the cluster.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Job Stability&amp;#039;&amp;#039;&amp;#039;: Underestimating memory requirements can lead to OOM errors, causing your job to fail and waste computational resources.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Performance&amp;#039;&amp;#039;&amp;#039;: Overestimating memory needs can lead to underutilization of resources, potentially delaying other jobs in the queue.&lt;br /&gt;
&lt;br /&gt;
==== How to Specify Memory in SLURM ====&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;--mem&amp;#039;&amp;#039;&amp;#039;: Specifies the total memory required for the job.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;--mem-per-cpu&amp;#039;&amp;#039;&amp;#039;: Specifies the memory required per CPU.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example&amp;#039;&amp;#039;&amp;#039;:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#SBATCH --mem=4G              # Total memory for the job&lt;br /&gt;
#SBATCH --mem-per-cpu=2G      # Memory per CPU&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Interactive Jobs===&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#Start an interactive session:&lt;br /&gt;
srun --ntasks=1 -p power-general -A power-general-users --pty bash&lt;br /&gt;
&lt;br /&gt;
#Specify a compute node:&lt;br /&gt;
srun --ntasks=1 -p power-general -A power-general-users --nodelist=&amp;quot;compute-0-12&amp;quot; --pty bash&lt;br /&gt;
&lt;br /&gt;
#Using GUI:&lt;br /&gt;
srun --ntasks=1 -p power-general -A power-general-users --x11 /bin/bash&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Submitting RELION Jobs===&lt;br /&gt;
&lt;br /&gt;
To submit a RELION job interactively on the &amp;lt;code&amp;gt;gpu-relion&amp;lt;/code&amp;gt; queue with X11 forwarding, use the following steps:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#Start an interactive session with X11:&lt;br /&gt;
srun --ntasks=1 -p gpu-relion -A your_account --x11 --pty bash&lt;br /&gt;
#Load the RELION module:&lt;br /&gt;
module load relion/relion-4.0.1&lt;br /&gt;
#Launch RELION:&lt;br /&gt;
relion&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Running matlab example==&lt;br /&gt;
In this example there are 3 files:&lt;br /&gt;
&lt;br /&gt;
myTable.m ⇒ This MATLAB script repeatedly computes and prints a small table of values (note the infinite while loop: the job runs until it is cancelled or hits its time limit)&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
fprintf(&amp;#039;=======================================\n&amp;#039;);&lt;br /&gt;
fprintf(&amp;#039; a             b             c              d             \n&amp;#039;);&lt;br /&gt;
fprintf(&amp;#039;=======================================\n&amp;#039;);&lt;br /&gt;
while 1&lt;br /&gt;
                for j = 1:10&lt;br /&gt;
                                a = sin(10*j);&lt;br /&gt;
                                b = a*cos(10*j);&lt;br /&gt;
                                c = a + b;&lt;br /&gt;
                                d = a - b;&lt;br /&gt;
                                fprintf(&amp;#039;%+6.5f   %+6.5f   %+6.5f   %+6.5f   \n&amp;#039;,a,b,c,d);&lt;br /&gt;
                end&lt;br /&gt;
end&lt;br /&gt;
fprintf(&amp;#039;=======================================\n&amp;#039;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
my_table_script.sh ⇒ This script runs the MATLAB program; submit it with sbatch&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
&lt;br /&gt;
#SBATCH --mem=50M&lt;br /&gt;
#SBATCH --partition power-general&lt;br /&gt;
#SBATCH -A power-general-users&lt;br /&gt;
hostname&lt;br /&gt;
&lt;br /&gt;
cd /a/home/cc/tree/taucc/staff/dvory/matlab&lt;br /&gt;
&lt;br /&gt;
matlab -nodisplay -nosplash -nodesktop -r &amp;quot;myTable; exit;&amp;quot;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
run_in_loop.sh ⇒ This script submits many copies of the job, one sbatch call per iteration&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
&lt;br /&gt;
for i in {1..100}&lt;br /&gt;
&lt;br /&gt;
do&lt;br /&gt;
&lt;br /&gt;
        sbatch my_table_script.sh&lt;br /&gt;
&lt;br /&gt;
done&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Run the jobs with the following command (after making the script executable with chmod +x run_in_loop.sh):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
./run_in_loop.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==AlphaFold==&lt;br /&gt;
&lt;br /&gt;
AlphaFold is a deep learning tool designed for predicting protein structures.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Guide:&amp;#039;&amp;#039;&amp;#039;  &lt;br /&gt;
[https://hpcguide.tau.ac.il/index.php?title=Alphafold AlphaFold Guide]&lt;br /&gt;
&lt;br /&gt;
==Common SLURM Commands==&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#View all queues (partitions):&lt;br /&gt;
sinfo&lt;br /&gt;
#View all jobs:&lt;br /&gt;
squeue&lt;br /&gt;
#View details of a specific job:&lt;br /&gt;
scontrol show job &amp;lt;job_number&amp;gt;&lt;br /&gt;
#Get information about partitions:&lt;br /&gt;
scontrol show partition&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Troubleshooting &amp;amp; Tips ==&lt;br /&gt;
&lt;br /&gt;
=== Common Errors ===&lt;br /&gt;
&lt;br /&gt;
# &amp;lt;code&amp;gt;srun: error: Unable to allocate resources: No partition specified or system default partition&amp;lt;/code&amp;gt;  &amp;lt;br /&amp;gt;&amp;#039;&amp;#039;&amp;#039;Solution:&amp;#039;&amp;#039;&amp;#039; Always specify a partition. Example:  &amp;lt;code&amp;gt;srun --pty -c 1 --mem=2G -p power-general /bin/bash&amp;lt;/code&amp;gt;&lt;br /&gt;
# A job failed, and &amp;lt;code&amp;gt;scontrol show job job_id&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;sacct -j job_id -o JobID,JobName,State%20&amp;lt;/code&amp;gt; shows &amp;lt;code&amp;gt;JobState=OUT_OF_MEMORY Reason=OutOfMemory&amp;lt;/code&amp;gt;, or:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
JobID           JobName                State &lt;br /&gt;
------------ ---------- -------------------- &lt;br /&gt;
71             oom_test        OUT_OF_MEMORY &lt;br /&gt;
71.batch          batch        OUT_OF_MEMORY &lt;br /&gt;
71.extern        extern            COMPLETED &lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;This means the RAM requested for the job was not enough; resubmit the job with more RAM. See [https://wikihpc.tau.ac.il/index.php?title=Slurm_user_guide#Estimating_RAM_Usage Estimating RAM Usage] for help with understanding how much RAM your job may need.&lt;br /&gt;
&lt;br /&gt;
=== Chain Jobs ===&lt;br /&gt;
Use the &amp;lt;code&amp;gt;--depend&amp;lt;/code&amp;gt; flag to set job dependencies.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sbatch --ntasks=1 --time=60 -p power-general -A power-general-users --depend=afterok:45001 do_work.bash&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
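To chain jobs without reading job IDs by eye, sbatch&amp;#039;s --parsable flag prints just the job ID, which a script can capture. The sketch below stubs out sbatch with a shell function so it can run off-cluster and only echoes the dependent submission; on the cluster, delete the stub and run the real sbatch commands.&lt;br /&gt;

```shell
# Stub so the sketch runs off-cluster; remove this line on the real cluster.
sbatch() { echo 45001; }

# Capture the first job's ID, then make the second job wait for its success.
first_id=$(sbatch --parsable pre_process.bash)
echo "would run: sbatch --depend=afterok:${first_id} do_work.bash"
```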
&lt;br /&gt;
=== Always Specify Resources ===&lt;br /&gt;
When submitting jobs, ensure you include all required resources like partition, memory, and CPUs to avoid job failures.&lt;br /&gt;
&lt;br /&gt;
=== Attaching to Running Jobs ===&lt;br /&gt;
If you need to monitor or interact with a running job, use &amp;lt;code&amp;gt;sattach&amp;lt;/code&amp;gt;. This command allows you to attach to a job&amp;#039;s input, output, and error streams in real-time.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sattach &amp;lt;job_id&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To view job steps of a specific job, use the following command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
scontrol show job &amp;lt;job_id&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Look for sections labeled &amp;quot;StepId&amp;quot; within the output. &lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;For specific job steps, use:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sattach &amp;lt;job_id.step_id&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Note:&amp;#039;&amp;#039;&amp;#039; &amp;lt;code&amp;gt;sattach&amp;lt;/code&amp;gt; is particularly useful for interactive jobs, where you can provide input directly. For non-interactive jobs, it acts like &amp;lt;code&amp;gt;tail -f&amp;lt;/code&amp;gt;, allowing you to monitor the output stream.&lt;br /&gt;
&lt;br /&gt;
=== Estimating RAM Usage ===&lt;br /&gt;
&lt;br /&gt;
When writing SLURM job scripts, it&amp;#039;s crucial to understand and correctly specify the memory requirements for your job. Proper memory allocation ensures efficient resource usage and prevents job failures due to out-of-memory (OOM) errors.&lt;br /&gt;
&lt;br /&gt;
==== Tips for Estimating RAM Usage ====&lt;br /&gt;
&lt;br /&gt;
* Check Application Documentation: Refer to the official documentation or user guides for memory-related information.&lt;br /&gt;
* Run a Small Test Job: Submit a smaller version of your job and monitor its memory usage using commands like `free -m`, `top`, or `htop`.&lt;br /&gt;
* Use Profiling Tools: Tools like `valgrind`, `gprof`, or built-in profilers can help you understand memory usage.&lt;br /&gt;
* Analyze Previous Jobs: Review SLURM logs and job statistics for insights into memory consumption of past jobs.&lt;br /&gt;
* Consult with Peers or Experts: Ask colleagues or experts who have experience with similar workloads.&lt;br /&gt;
&lt;br /&gt;
==== Example: Monitoring Memory Usage ====&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
&lt;br /&gt;
#SBATCH --job-name=memory_test&lt;br /&gt;
#SBATCH --account=your_account&lt;br /&gt;
#SBATCH --partition=your_partition&lt;br /&gt;
#SBATCH --time=01:00:00&lt;br /&gt;
#SBATCH --ntasks=1&lt;br /&gt;
#SBATCH --cpus-per-task=1&lt;br /&gt;
#SBATCH --mem=4G&lt;br /&gt;
#SBATCH --output=memory_test.out&lt;br /&gt;
#SBATCH --error=memory_test.err&lt;br /&gt;
&lt;br /&gt;
# Monitor memory usage&lt;br /&gt;
echo &amp;quot;Memory usage before running the job:&amp;quot;&lt;br /&gt;
free -m&lt;br /&gt;
&lt;br /&gt;
# Your application commands go here&lt;br /&gt;
# ./your_application&lt;br /&gt;
&lt;br /&gt;
# Monitor memory usage after running the job&lt;br /&gt;
echo &amp;quot;Memory usage after running the job:&amp;quot;&lt;br /&gt;
free -m&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
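Besides free, the Linux kernel tracks each process&amp;#039;s peak resident memory, readable from /proc. This is a quick sketch that reads the value for the current process; for your program, read /proc/PID/status while it runs, or prefer the sacct command shown below for whole-job statistics.&lt;br /&gt;

```shell
# VmHWM ("high water mark") is the peak resident set size of a process.
grep VmHWM /proc/self/status
```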
&lt;br /&gt;
==== General Tips ====&lt;br /&gt;
&lt;br /&gt;
* Start Small: Begin with a conservative memory request and increase it based on observed usage.&lt;br /&gt;
* Consider Peak Usage: Plan for peak memory usage to avoid OOM errors.&lt;br /&gt;
* Use SLURM&amp;#039;s Memory Reporting: Use `sacct` to view memory usage statistics.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sacct -j &amp;lt;job_id&amp;gt; --format=JobID,JobName,MaxRSS,Elapsed&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;/div&gt;</summary>
		<author><name>Dvory</name></author>
	</entry>
	<entry>
		<id>https://hpcguide.tau.ac.il/index.php?title=Submitting_a_job_to_a_slurm_queue&amp;diff=1506</id>
		<title>Submitting a job to a slurm queue</title>
		<link rel="alternate" type="text/html" href="https://hpcguide.tau.ac.il/index.php?title=Submitting_a_job_to_a_slurm_queue&amp;diff=1506"/>
		<updated>2025-03-20T05:30:06Z</updated>

		<summary type="html">&lt;p&gt;Dvory: /* Basic Script */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Accessing the System ==&lt;br /&gt;
&lt;br /&gt;
To submit jobs to SLURM at Tel Aviv University, you need to access the system through one of the following login nodes:&lt;br /&gt;
&lt;br /&gt;
* powerslurm-login.tau.ac.il&lt;br /&gt;
* powerslurm-login2.tau.ac.il&lt;br /&gt;
&lt;br /&gt;
=== Requirements for Access ===&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Group Membership&amp;#039;&amp;#039;&amp;#039;: You must be part of the &amp;quot;power&amp;quot; group to access the resources.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;University Credentials&amp;#039;&amp;#039;&amp;#039;: Use your Tel Aviv University username and password to log in.&lt;br /&gt;
&lt;br /&gt;
These login nodes are your starting point for submitting jobs, checking job status, and managing your SLURM tasks.&lt;br /&gt;
&lt;br /&gt;
=== SSH Example ===&lt;br /&gt;
&lt;br /&gt;
To access the system using SSH, use the following example:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Replace &amp;#039;your_username&amp;#039; with your actual Tel Aviv University username&lt;br /&gt;
ssh your_username@powerslurm-login.tau.ac.il&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you want to connect to the second login node, use:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Replace &amp;#039;your_username&amp;#039; with your actual Tel Aviv University username&lt;br /&gt;
ssh your_username@powerslurm-login2.tau.ac.il&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you have an SSH key set up for password-less login, you can specify it like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Replace &amp;#039;your_username&amp;#039; and &amp;#039;/path/to/your/private_key&amp;#039; accordingly&lt;br /&gt;
ssh -i /path/to/your/private_key your_username@powerslurm-login.tau.ac.il&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Environment Modules ==&lt;br /&gt;
&lt;br /&gt;
Environment Modules in SLURM allow users to dynamically modify their shell environment, providing an easy way to load and unload different software applications, libraries, and their dependencies. This system helps avoid conflicts between software versions and ensures the correct environment for running specific applications.&lt;br /&gt;
&lt;br /&gt;
Here are some common commands to work with environment modules:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#List Available Modules: To see all the modules available on the system, use:&lt;br /&gt;
module avail&lt;br /&gt;
&lt;br /&gt;
#To search for all available versions of a module by name (e.g., `gcc`), use:&lt;br /&gt;
module avail gcc&lt;br /&gt;
&lt;br /&gt;
#Get Detailed Information About a Module: The `module spider` command provides detailed information about a module, including versions, dependencies, and descriptions:&lt;br /&gt;
module spider gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#View Module Settings: To see what environment variables and settings will be modified by a module, use:&lt;br /&gt;
module show gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#Load a Module: To set up the environment for a specific software, use the `module load` command. For example, to load GCC version 12.1.0:&lt;br /&gt;
module load gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#List Loaded Modules: To view all currently loaded modules in your session, use:&lt;br /&gt;
module list&lt;br /&gt;
&lt;br /&gt;
#Unload a Module: To unload a specific module from your environment, use:&lt;br /&gt;
module unload gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#Unload All Modules: If you need to clear your environment of all loaded modules, use:&lt;br /&gt;
module purge&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;By using these commands, you can easily manage the software environments needed for different tasks, ensuring compatibility and reducing potential conflicts between software versions.&lt;br /&gt;
&lt;br /&gt;
== Basic Job Submission Commands ==&lt;br /&gt;
&lt;br /&gt;
=== Finding Your Account and Partition ===&lt;br /&gt;
&lt;br /&gt;
Before submitting a job, you need to know which partitions you have permission to use.&lt;br /&gt;
&lt;br /&gt;
Run the command &amp;lt;code&amp;gt;check_my_partitions&amp;lt;/code&amp;gt; to view a list of all the partitions you have permission to send jobs to.&lt;br /&gt;
&lt;br /&gt;
== Submitting Jobs==&lt;br /&gt;
sbatch: Submits a job script for batch processing.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example&amp;#039;&amp;#039;&amp;#039;:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Submit pre_process.bash to the power-general partition for 10 minutes:&lt;br /&gt;
sbatch --ntasks=1 --time=10 -p power-general -A power-general-users pre_process.bash&lt;br /&gt;
&lt;br /&gt;
# Submit a job that requests 1 GPU:&lt;br /&gt;
sbatch --gres=gpu:1 -p gpu-general -A gpu-general-users gpu_job.sh&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Writing SLURM Job Scripts===&lt;br /&gt;
&lt;br /&gt;
Here is a simple job script example:&lt;br /&gt;
&lt;br /&gt;
==== Basic Script====&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --job-name=my_job             # Job name&lt;br /&gt;
#SBATCH --account=power-general-users # Account name&lt;br /&gt;
#SBATCH --partition=power-general     # Partition name&lt;br /&gt;
#SBATCH --time=02:00:00               # Max run time (hh:mm:ss)&lt;br /&gt;
#SBATCH --ntasks=1                    # Number of tasks&lt;br /&gt;
#SBATCH --nodes=1                     # Number of nodes&lt;br /&gt;
#SBATCH --cpus-per-task=1             # CPUs per task&lt;br /&gt;
#SBATCH --mem-per-cpu=4G              # Memory per CPU&lt;br /&gt;
#SBATCH --output=my_job_%j.out        # Output file&lt;br /&gt;
#SBATCH --error=my_job_%j.err         # Error file&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Starting my SLURM job&amp;quot;&lt;br /&gt;
echo &amp;quot;Job ID: $SLURM_JOB_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Running on nodes: $SLURM_JOB_NODELIST&amp;quot;&lt;br /&gt;
echo &amp;quot;Allocated CPUs: $SLURM_JOB_CPUS_PER_NODE&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Your application commands go here&lt;br /&gt;
# ./my_program&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Job completed&amp;quot;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To request x CPU cores interactively:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
srun --ntasks=1 --cpus-per-task=x  --partition=power-general --nodes=1 --pty bash&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To define a job array, add:&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#SBATCH --array=1-300&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
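&lt;br /&gt;
Inside an array job, each task can read its own index from the standard &amp;lt;code&amp;gt;SLURM_ARRAY_TASK_ID&amp;lt;/code&amp;gt; environment variable, e.g. to pick a different input file per task. A minimal sketch (program and file names here are illustrative):&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
echo &amp;quot;Running array task ${SLURM_ARRAY_TASK_ID}&amp;quot;&lt;br /&gt;
# ./my_program input_${SLURM_ARRAY_TASK_ID}.dat&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;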
&lt;br /&gt;
==== Script for 1 GPU ====&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --job-name=gpu_job             # Job name&lt;br /&gt;
#SBATCH --account=my_account           # Account name&lt;br /&gt;
#SBATCH --partition=gpu-general        # Partition name&lt;br /&gt;
#SBATCH --time=02:00:00                # Max run time&lt;br /&gt;
#SBATCH --ntasks=1                     # Number of tasks&lt;br /&gt;
#SBATCH --nodes=1                      # Number of nodes&lt;br /&gt;
#SBATCH --cpus-per-task=1              # CPUs per task&lt;br /&gt;
#SBATCH --gres=gpu:1                   # Number of GPUs&lt;br /&gt;
#SBATCH --mem-per-cpu=4G               # Memory per CPU&lt;br /&gt;
#SBATCH --output=my_job_%j.out         # Output file&lt;br /&gt;
#SBATCH --error=my_job_%j.err          # Error file&lt;br /&gt;
&lt;br /&gt;
module load python/python-3.8&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Starting GPU job&amp;quot;&lt;br /&gt;
echo &amp;quot;Job ID: $SLURM_JOB_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Running on nodes: $SLURM_JOB_NODELIST&amp;quot;&lt;br /&gt;
echo &amp;quot;Allocated CPUs: $SLURM_JOB_CPUS_PER_NODE&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Your GPU commands go here&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Job completed&amp;quot;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
To exclude specific nodes, add the following directive:&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#SBATCH --exclude=compute-0-[100-103],compute-0-67&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Importance of Correct RAM Usage in Jobs===&lt;br /&gt;
&lt;br /&gt;
When writing SLURM job scripts, it&amp;#039;s crucial to understand and correctly specify the memory requirements for your job. &lt;br /&gt;
&lt;br /&gt;
Proper memory allocation ensures efficient resource usage and prevents job failures due to out-of-memory (OOM) errors.&lt;br /&gt;
&lt;br /&gt;
==== Why Correct RAM Usage Matters ====&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Resource Efficiency&amp;#039;&amp;#039;&amp;#039;: Allocating the right amount of memory helps in optimal resource utilization, allowing more jobs to run simultaneously on the cluster.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Job Stability&amp;#039;&amp;#039;&amp;#039;: Underestimating memory requirements can lead to OOM errors, causing your job to fail and waste computational resources.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Performance&amp;#039;&amp;#039;&amp;#039;: Overestimating memory needs can lead to underutilization of resources, potentially delaying other jobs in the queue.&lt;br /&gt;
&lt;br /&gt;
==== How to Specify Memory in SLURM ====&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;--mem&amp;#039;&amp;#039;&amp;#039;: Specifies the total memory required for the job.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;--mem-per-cpu&amp;#039;&amp;#039;&amp;#039;: Specifies the memory required per CPU.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example&amp;#039;&amp;#039;&amp;#039; (choose one; the two options are mutually exclusive):&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#SBATCH --mem=4G              # Total memory for the job, or:&lt;br /&gt;
#SBATCH --mem-per-cpu=2G      # Memory per CPU&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Interactive Jobs===&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#Start an interactive session:&lt;br /&gt;
srun --ntasks=1 -p power-general -A power-general-users --pty bash&lt;br /&gt;
&lt;br /&gt;
#Specify a compute node:&lt;br /&gt;
srun --ntasks=1 -p power-general -A power-general-users --nodelist=&amp;quot;compute-0-12&amp;quot; --pty bash&lt;br /&gt;
&lt;br /&gt;
#Start a session with X11 forwarding (for GUI applications):&lt;br /&gt;
srun --ntasks=1 -p power-general -A power-general-users --x11 --pty /bin/bash&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Submitting RELION Jobs===&lt;br /&gt;
&lt;br /&gt;
To submit a RELION job interactively on the &amp;lt;code&amp;gt;gpu-relion&amp;lt;/code&amp;gt; queue with X11 forwarding, use the following steps:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#Start an interactive session with X11:&lt;br /&gt;
srun --ntasks=1 -p gpu-relion -A your_account --x11 --pty bash&lt;br /&gt;
#Load the RELION module:&lt;br /&gt;
module load relion/relion-4.0.1&lt;br /&gt;
#Launch RELION:&lt;br /&gt;
relion&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Running a MATLAB example==&lt;br /&gt;
This example uses three files:&lt;br /&gt;
&lt;br /&gt;
myTable.m ⇒ a MATLAB script that prints a table of trigonometric values&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
fprintf(&amp;#039;=======================================\n&amp;#039;);&lt;br /&gt;
fprintf(&amp;#039; a             b             c              d             \n&amp;#039;);&lt;br /&gt;
fprintf(&amp;#039;=======================================\n&amp;#039;);&lt;br /&gt;
for j = 1:10&lt;br /&gt;
    a = sin(10*j);&lt;br /&gt;
    b = a*cos(10*j);&lt;br /&gt;
    c = a + b;&lt;br /&gt;
    d = a - b;&lt;br /&gt;
    fprintf(&amp;#039;%+6.5f   %+6.5f   %+6.5f   %+6.5f   \n&amp;#039;,a,b,c,d);&lt;br /&gt;
end&lt;br /&gt;
fprintf(&amp;#039;=======================================\n&amp;#039;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
my_table_script.sh ⇒ a batch script that runs the MATLAB program; submit it with &amp;lt;code&amp;gt;sbatch&amp;lt;/code&amp;gt;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
&lt;br /&gt;
#SBATCH --mem=50M&lt;br /&gt;
#SBATCH --partition power-general&lt;br /&gt;
#SBATCH -A power-general-users&lt;br /&gt;
hostname&lt;br /&gt;
&lt;br /&gt;
cd /a/home/cc/tree/taucc/staff/dvory/matlab&lt;br /&gt;
&lt;br /&gt;
matlab -nodisplay -nosplash -nodesktop -r &amp;quot;run(&amp;#039;myTable.m&amp;#039;); exit;&amp;quot;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
run_in_loop.sh ⇒ a helper script that submits the job 100 times&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
&lt;br /&gt;
for i in {1..100}&lt;br /&gt;
do&lt;br /&gt;
    sbatch my_table_script.sh&lt;br /&gt;
done&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To run the jobs, make the script executable (&amp;lt;code&amp;gt;chmod +x run_in_loop.sh&amp;lt;/code&amp;gt;) and execute it:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
./run_in_loop.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==AlphaFold==&lt;br /&gt;
&lt;br /&gt;
AlphaFold is a deep learning tool designed for predicting protein structures.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Guide:&amp;#039;&amp;#039;&amp;#039;  &lt;br /&gt;
[https://hpcguide.tau.ac.il/index.php?title=Alphafold AlphaFold Guide]&lt;br /&gt;
&lt;br /&gt;
==Common SLURM Commands==&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#View all queues (partitions):&lt;br /&gt;
sinfo&lt;br /&gt;
#View all jobs:&lt;br /&gt;
squeue&lt;br /&gt;
#View details of a specific job:&lt;br /&gt;
scontrol show job &amp;lt;job_number&amp;gt;&lt;br /&gt;
#Get information about partitions:&lt;br /&gt;
scontrol show partition&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
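&lt;br /&gt;
Two further standard SLURM commands are often useful: filtering the queue to your own jobs, and cancelling a job:&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#View only your own jobs:&lt;br /&gt;
squeue -u $USER&lt;br /&gt;
#Cancel a job:&lt;br /&gt;
scancel &amp;lt;job_number&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;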
&lt;br /&gt;
== Troubleshooting &amp;amp; Tips ==&lt;br /&gt;
&lt;br /&gt;
=== Common Errors ===&lt;br /&gt;
&lt;br /&gt;
# &amp;lt;code&amp;gt;srun: error: Unable to allocate resources: No partition specified or system default partition&amp;lt;/code&amp;gt;  &amp;lt;br /&amp;gt;&amp;#039;&amp;#039;&amp;#039;Solution:&amp;#039;&amp;#039;&amp;#039; Always specify a partition. Example:  &amp;lt;code&amp;gt;srun --pty -c 1 --mem=2G -p power-general /bin/bash&amp;lt;/code&amp;gt;&lt;br /&gt;
# Job failed, and running &amp;lt;code&amp;gt;scontrol show job job_id&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;sacct -j job_id -o JobID,JobName,State%20&amp;lt;/code&amp;gt; &amp;lt;br /&amp;gt;shows &amp;lt;code&amp;gt;JobState=OUT_OF_MEMORY Reason=OutOfMemory&amp;lt;/code&amp;gt; or:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
JobID           JobName                State &lt;br /&gt;
------------ ---------- -------------------- &lt;br /&gt;
71             oom_test        OUT_OF_MEMORY &lt;br /&gt;
71.batch          batch        OUT_OF_MEMORY &lt;br /&gt;
71.extern        extern            COMPLETED &lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;it means the RAM requested for the job was insufficient. Resubmit the job with a larger memory request; see [https://wikihpc.tau.ac.il/index.php?title=Slurm_user_guide#Estimating_RAM_Usage below] for help estimating how much RAM your job may need.&lt;br /&gt;
&lt;br /&gt;
=== Chain Jobs ===&lt;br /&gt;
Use the &amp;lt;code&amp;gt;--dependency&amp;lt;/code&amp;gt; flag to set job dependencies.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sbatch --ntasks=1 --time=60 -p power-general -A power-general-users --dependency=afterok:45001 do_work.bash&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
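&lt;br /&gt;
To chain jobs in a script, the first job&amp;#039;s ID can be captured with the standard &amp;lt;code&amp;gt;--parsable&amp;lt;/code&amp;gt; flag and passed to the dependent submission. A minimal sketch (the script names are placeholders):&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# --parsable makes sbatch print only the job ID&lt;br /&gt;
jobid=$(sbatch --parsable -p power-general -A power-general-users pre_process.bash)&lt;br /&gt;
# Start do_work.bash only after the first job completes successfully&lt;br /&gt;
sbatch --dependency=afterok:$jobid -p power-general -A power-general-users do_work.bash&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;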
&lt;br /&gt;
=== Always Specify Resources ===&lt;br /&gt;
When submitting jobs, ensure you include all required resources like partition, memory, and CPUs to avoid job failures.&lt;br /&gt;
&lt;br /&gt;
=== Attaching to Running Jobs ===&lt;br /&gt;
If you need to monitor or interact with a running job, use &amp;lt;code&amp;gt;sattach&amp;lt;/code&amp;gt;. This command allows you to attach to a job&amp;#039;s input, output, and error streams in real-time.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sattach &amp;lt;job_id&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To view job steps of a specific job, use the following command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
scontrol show job &amp;lt;job_id&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Look for sections labeled &amp;quot;StepId&amp;quot; within the output. &lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;For specific job steps, use:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sattach &amp;lt;job_id.step_id&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Note:&amp;#039;&amp;#039;&amp;#039; &amp;lt;code&amp;gt;sattach&amp;lt;/code&amp;gt; is particularly useful for interactive jobs, where you can provide input directly. For non-interactive jobs, it acts like &amp;lt;code&amp;gt;tail -f&amp;lt;/code&amp;gt;, allowing you to monitor the output stream.&lt;br /&gt;
&lt;br /&gt;
=== Estimating RAM Usage ===&lt;br /&gt;
&lt;br /&gt;
When writing SLURM job scripts, it&amp;#039;s crucial to understand and correctly specify the memory requirements for your job. Proper memory allocation ensures efficient resource usage and prevents job failures due to out-of-memory (OOM) errors.&lt;br /&gt;
&lt;br /&gt;
==== Tips for Estimating RAM Usage ====&lt;br /&gt;
&lt;br /&gt;
* Check Application Documentation: Refer to the official documentation or user guides for memory-related information.&lt;br /&gt;
* Run a Small Test Job: Submit a smaller version of your job and monitor its memory usage using commands like `free -m`, `top`, or `htop`.&lt;br /&gt;
* Use Profiling Tools: Tools like `valgrind`, `gprof`, or built-in profilers can help you understand memory usage.&lt;br /&gt;
* Analyze Previous Jobs: Review SLURM logs and job statistics for insights into memory consumption of past jobs.&lt;br /&gt;
* Consult with Peers or Experts: Ask colleagues or experts who have experience with similar workloads.&lt;br /&gt;
&lt;br /&gt;
==== Example: Monitoring Memory Usage ====&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
&lt;br /&gt;
#SBATCH --job-name=memory_test&lt;br /&gt;
#SBATCH --account=your_account&lt;br /&gt;
#SBATCH --partition=your_partition&lt;br /&gt;
#SBATCH --time=01:00:00&lt;br /&gt;
#SBATCH --ntasks=1&lt;br /&gt;
#SBATCH --cpus-per-task=1&lt;br /&gt;
#SBATCH --mem=4G&lt;br /&gt;
#SBATCH --output=memory_test.out&lt;br /&gt;
#SBATCH --error=memory_test.err&lt;br /&gt;
&lt;br /&gt;
# Monitor memory usage&lt;br /&gt;
echo &amp;quot;Memory usage before running the job:&amp;quot;&lt;br /&gt;
free -m&lt;br /&gt;
&lt;br /&gt;
# Your application commands go here&lt;br /&gt;
# ./your_application&lt;br /&gt;
&lt;br /&gt;
# Monitor memory usage after running the job&lt;br /&gt;
echo &amp;quot;Memory usage after running the job:&amp;quot;&lt;br /&gt;
free -m&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== General Tips ====&lt;br /&gt;
&lt;br /&gt;
* Start Small: Begin with a conservative memory request and increase it based on observed usage.&lt;br /&gt;
* Consider Peak Usage: Plan for peak memory usage to avoid OOM errors.&lt;br /&gt;
* Use SLURM&amp;#039;s Memory Reporting: Use `sacct` to view memory usage statistics.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sacct -j &amp;lt;job_id&amp;gt; --format=JobID,JobName,MaxRSS,Elapsed&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;/div&gt;</summary>
		<author><name>Dvory</name></author>
	</entry>
	<entry>
		<id>https://hpcguide.tau.ac.il/index.php?title=Submitting_a_job_to_a_slurm_queue&amp;diff=1505</id>
		<title>Submitting a job to a slurm queue</title>
		<link rel="alternate" type="text/html" href="https://hpcguide.tau.ac.il/index.php?title=Submitting_a_job_to_a_slurm_queue&amp;diff=1505"/>
		<updated>2025-03-19T14:37:20Z</updated>

		<summary type="html">&lt;p&gt;Dvory: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Accessing the System ==&lt;br /&gt;
&lt;br /&gt;
To submit jobs to SLURM at Tel Aviv University, you need to access the system through one of the following login nodes:&lt;br /&gt;
&lt;br /&gt;
* powerslurm-login.tau.ac.il&lt;br /&gt;
* powerslurm-login2.tau.ac.il&lt;br /&gt;
&lt;br /&gt;
=== Requirements for Access ===&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Group Membership&amp;#039;&amp;#039;&amp;#039;: You must be part of the &amp;quot;power&amp;quot; group to access the resources.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;University Credentials&amp;#039;&amp;#039;&amp;#039;: Use your Tel Aviv University username and password to log in.&lt;br /&gt;
&lt;br /&gt;
These login nodes are your starting point for submitting jobs, checking job status, and managing your SLURM tasks.&lt;br /&gt;
&lt;br /&gt;
=== SSH Example ===&lt;br /&gt;
&lt;br /&gt;
To access the system using SSH, use the following example:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Replace &amp;#039;your_username&amp;#039; with your actual Tel Aviv University username&lt;br /&gt;
ssh your_username@powerslurm-login.tau.ac.il&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you want to connect to the second login node, use:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Replace &amp;#039;your_username&amp;#039; with your actual Tel Aviv University username&lt;br /&gt;
ssh your_username@powerslurm-login2.tau.ac.il&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you have an SSH key set up for password-less login, you can specify it like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Replace &amp;#039;your_username&amp;#039; and &amp;#039;/path/to/your/private_key&amp;#039; accordingly&lt;br /&gt;
ssh -i /path/to/your/private_key your_username@powerslurm-login.tau.ac.il&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Environment Modules ==&lt;br /&gt;
&lt;br /&gt;
Environment Modules in SLURM allow users to dynamically modify their shell environment, providing an easy way to load and unload different software applications, libraries, and their dependencies. This system helps avoid conflicts between software versions and ensures the correct environment for running specific applications.&lt;br /&gt;
&lt;br /&gt;
Here are some common commands to work with environment modules:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#List Available Modules: To see all the modules available on the system, use:&lt;br /&gt;
module avail&lt;br /&gt;
&lt;br /&gt;
#To search for all available versions of a module by name (e.g., `gcc`), use:&lt;br /&gt;
module avail gcc&lt;br /&gt;
&lt;br /&gt;
#Get Detailed Information About a Module: The `module spider` command provides detailed information about a module, including versions, dependencies, and descriptions:&lt;br /&gt;
module spider gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#View Module Settings: To see what environment variables and settings will be modified by a module, use:&lt;br /&gt;
module show gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#Load a Module: To set up the environment for a specific software, use the `module load` command. For example, to load GCC version 12.1.0:&lt;br /&gt;
module load gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#List Loaded Modules: To view all currently loaded modules in your session, use:&lt;br /&gt;
module list&lt;br /&gt;
&lt;br /&gt;
#Unload a Module: To unload a specific module from your environment, use:&lt;br /&gt;
module unload gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#Unload All Modules: If you need to clear your environment of all loaded modules, use:&lt;br /&gt;
module purge&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;By using these commands, you can easily manage the software environments needed for different tasks, ensuring compatibility and reducing potential conflicts between software versions.&lt;br /&gt;
&lt;br /&gt;
== Basic Job Submission Commands ==&lt;br /&gt;
&lt;br /&gt;
=== Finding Your Account and Partition ===&lt;br /&gt;
&lt;br /&gt;
Before submitting a job, you need to know which partitions you have permission to use.&lt;br /&gt;
&lt;br /&gt;
Run the command &amp;lt;code&amp;gt;check_my_partitions&amp;lt;/code&amp;gt; to view a list of all the partitions you have permission to send jobs to.&lt;br /&gt;
&lt;br /&gt;
== Submitting Jobs==&lt;br /&gt;
sbatch: Submits a job script for batch processing.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example&amp;#039;&amp;#039;&amp;#039;:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Submit pre_process.bash to the power-general partition for 10 minutes:&lt;br /&gt;
sbatch --ntasks=1 --time=10 -p power-general -A power-general-users pre_process.bash&lt;br /&gt;
&lt;br /&gt;
# Submit a job that requests 1 GPU:&lt;br /&gt;
sbatch --gres=gpu:1 -p gpu-general -A gpu-general-users gpu_job.sh&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Writing SLURM Job Scripts===&lt;br /&gt;
&lt;br /&gt;
Here is a simple job script example:&lt;br /&gt;
&lt;br /&gt;
==== Basic Script====&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --job-name=my_job             # Job name&lt;br /&gt;
#SBATCH --account=power-general-users # Account name&lt;br /&gt;
#SBATCH --partition=power-general     # Partition name&lt;br /&gt;
#SBATCH --time=02:00:00               # Max run time (hh:mm:ss)&lt;br /&gt;
#SBATCH --ntasks=1                    # Number of tasks&lt;br /&gt;
#SBATCH --nodes=1                     # Number of nodes&lt;br /&gt;
#SBATCH --cpus-per-task=1             # CPUs per task&lt;br /&gt;
#SBATCH --mem-per-cpu=4G              # Memory per CPU&lt;br /&gt;
#SBATCH --output=my_job_%j.out        # Output file&lt;br /&gt;
#SBATCH --error=my_job_%j.err         # Error file&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Starting my SLURM job&amp;quot;&lt;br /&gt;
echo &amp;quot;Job ID: $SLURM_JOB_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Running on nodes: $SLURM_JOB_NODELIST&amp;quot;&lt;br /&gt;
echo &amp;quot;Allocated CPUs: $SLURM_JOB_CPUS_PER_NODE&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Your application commands go here&lt;br /&gt;
# ./my_program&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Job completed&amp;quot;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To define a job array, add:&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#SBATCH --array=1-300&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
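&lt;br /&gt;
Inside an array job, each task can read its own index from the standard &amp;lt;code&amp;gt;SLURM_ARRAY_TASK_ID&amp;lt;/code&amp;gt; environment variable, e.g. to pick a different input file per task. A minimal sketch (program and file names here are illustrative):&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
echo &amp;quot;Running array task ${SLURM_ARRAY_TASK_ID}&amp;quot;&lt;br /&gt;
# ./my_program input_${SLURM_ARRAY_TASK_ID}.dat&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;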
&lt;br /&gt;
==== Script for 1 GPU ====&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --job-name=gpu_job             # Job name&lt;br /&gt;
#SBATCH --account=my_account           # Account name&lt;br /&gt;
#SBATCH --partition=gpu-general        # Partition name&lt;br /&gt;
#SBATCH --time=02:00:00                # Max run time&lt;br /&gt;
#SBATCH --ntasks=1                     # Number of tasks&lt;br /&gt;
#SBATCH --nodes=1                      # Number of nodes&lt;br /&gt;
#SBATCH --cpus-per-task=1              # CPUs per task&lt;br /&gt;
#SBATCH --gres=gpu:1                   # Number of GPUs&lt;br /&gt;
#SBATCH --mem-per-cpu=4G               # Memory per CPU&lt;br /&gt;
#SBATCH --output=my_job_%j.out         # Output file&lt;br /&gt;
#SBATCH --error=my_job_%j.err          # Error file&lt;br /&gt;
&lt;br /&gt;
module load python/python-3.8&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Starting GPU job&amp;quot;&lt;br /&gt;
echo &amp;quot;Job ID: $SLURM_JOB_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Running on nodes: $SLURM_JOB_NODELIST&amp;quot;&lt;br /&gt;
echo &amp;quot;Allocated CPUs: $SLURM_JOB_CPUS_PER_NODE&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Your GPU commands go here&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Job completed&amp;quot;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
To exclude specific nodes, add the following directive:&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#SBATCH --exclude=compute-0-[100-103],compute-0-67&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Importance of Correct RAM Usage in Jobs===&lt;br /&gt;
&lt;br /&gt;
When writing SLURM job scripts, it&amp;#039;s crucial to understand and correctly specify the memory requirements for your job. &lt;br /&gt;
&lt;br /&gt;
Proper memory allocation ensures efficient resource usage and prevents job failures due to out-of-memory (OOM) errors.&lt;br /&gt;
&lt;br /&gt;
==== Why Correct RAM Usage Matters ====&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Resource Efficiency&amp;#039;&amp;#039;&amp;#039;: Allocating the right amount of memory helps in optimal resource utilization, allowing more jobs to run simultaneously on the cluster.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Job Stability&amp;#039;&amp;#039;&amp;#039;: Underestimating memory requirements can lead to OOM errors, causing your job to fail and waste computational resources.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Performance&amp;#039;&amp;#039;&amp;#039;: Overestimating memory needs can lead to underutilization of resources, potentially delaying other jobs in the queue.&lt;br /&gt;
&lt;br /&gt;
==== How to Specify Memory in SLURM ====&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;--mem&amp;#039;&amp;#039;&amp;#039;: Specifies the total memory required for the job.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;--mem-per-cpu&amp;#039;&amp;#039;&amp;#039;: Specifies the memory required per CPU.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example&amp;#039;&amp;#039;&amp;#039; (choose one; the two options are mutually exclusive):&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#SBATCH --mem=4G              # Total memory for the job, or:&lt;br /&gt;
#SBATCH --mem-per-cpu=2G      # Memory per CPU&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Interactive Jobs===&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#Start an interactive session:&lt;br /&gt;
srun --ntasks=1 -p power-general -A power-general-users --pty bash&lt;br /&gt;
&lt;br /&gt;
#Specify a compute node:&lt;br /&gt;
srun --ntasks=1 -p power-general -A power-general-users --nodelist=&amp;quot;compute-0-12&amp;quot; --pty bash&lt;br /&gt;
&lt;br /&gt;
#Start a session with X11 forwarding (for GUI applications):&lt;br /&gt;
srun --ntasks=1 -p power-general -A power-general-users --x11 --pty /bin/bash&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Submitting RELION Jobs===&lt;br /&gt;
&lt;br /&gt;
To submit a RELION job interactively on the &amp;lt;code&amp;gt;gpu-relion&amp;lt;/code&amp;gt; queue with X11 forwarding, use the following steps:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#Start an interactive session with X11:&lt;br /&gt;
srun --ntasks=1 -p gpu-relion -A your_account --x11 --pty bash&lt;br /&gt;
#Load the RELION module:&lt;br /&gt;
module load relion/relion-4.0.1&lt;br /&gt;
#Launch RELION:&lt;br /&gt;
relion&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Running matlab example==&lt;br /&gt;
In this example there are 3 files:&lt;br /&gt;
&lt;br /&gt;
myTable.m ⇒ This MATLAB script computes and prints a small table of values&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
fprintf(&amp;#039;=======================================\n&amp;#039;);&lt;br /&gt;
fprintf(&amp;#039; a             b             c              d             \n&amp;#039;);&lt;br /&gt;
fprintf(&amp;#039;=======================================\n&amp;#039;);&lt;br /&gt;
for j = 1:10&lt;br /&gt;
    a = sin(10*j);&lt;br /&gt;
    b = a*cos(10*j);&lt;br /&gt;
    c = a + b;&lt;br /&gt;
    d = a - b;&lt;br /&gt;
    fprintf(&amp;#039;%+6.5f   %+6.5f   %+6.5f   %+6.5f   \n&amp;#039;,a,b,c,d);&lt;br /&gt;
end&lt;br /&gt;
fprintf(&amp;#039;=======================================\n&amp;#039;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
my_table_script.sh ⇒ This script runs the MATLAB program; submit it with sbatch&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
&lt;br /&gt;
#SBATCH --mem=4G&lt;br /&gt;
#SBATCH --partition=power-general&lt;br /&gt;
#SBATCH -A power-general-users&lt;br /&gt;
hostname&lt;br /&gt;
&lt;br /&gt;
cd /a/home/cc/tree/taucc/staff/dvory/matlab&lt;br /&gt;
&lt;br /&gt;
matlab -nodisplay -nosplash -nodesktop -r &amp;quot;myTable; exit;&amp;quot;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
run_in_loop.sh ⇒ This script submits many copies of the job in a loop&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
&lt;br /&gt;
for i in {1..100}&lt;br /&gt;
do&lt;br /&gt;
    sbatch my_table_script.sh&lt;br /&gt;
done&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Run the jobs with the following command (after making the script executable with chmod +x run_in_loop.sh):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
./run_in_loop.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==AlphaFold==&lt;br /&gt;
&lt;br /&gt;
AlphaFold is a deep learning tool designed for predicting protein structures.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Guide:&amp;#039;&amp;#039;&amp;#039;  &lt;br /&gt;
[https://hpcguide.tau.ac.il/index.php?title=Alphafold AlphaFold Guide]&lt;br /&gt;
&lt;br /&gt;
==Common SLURM Commands==&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#View all queues (partitions):&lt;br /&gt;
sinfo&lt;br /&gt;
#View all jobs:&lt;br /&gt;
squeue&lt;br /&gt;
#View details of a specific job:&lt;br /&gt;
scontrol show job &amp;lt;job_number&amp;gt;&lt;br /&gt;
#Get information about partitions:&lt;br /&gt;
scontrol show partition&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
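A plain &amp;lt;code&amp;gt;squeue&amp;lt;/code&amp;gt; lists every job on the cluster. A sketch for viewing only your own jobs (guarded so the snippet is inert on machines without SLURM):&lt;br /&gt;

```shell
# List only your own jobs (plain squeue shows everyone's).
# The command -v guard keeps this runnable on machines without SLURM.
me="${USER:-$(whoami)}"
if command -v squeue >/dev/null 2>&1; then
    squeue -u "$me"
fi
```
&lt;br /&gt;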
== Troubleshooting &amp;amp; Tips ==&lt;br /&gt;
&lt;br /&gt;
=== Common Errors ===&lt;br /&gt;
&lt;br /&gt;
# &amp;lt;code&amp;gt;srun: error: Unable to allocate resources: No partition specified or system default partition&amp;lt;/code&amp;gt;  &amp;lt;br /&amp;gt;&amp;#039;&amp;#039;&amp;#039;Solution:&amp;#039;&amp;#039;&amp;#039; Always specify a partition. Example:  &amp;lt;code&amp;gt;srun --pty -c 1 --mem=2G -p power-general /bin/bash&amp;lt;/code&amp;gt;&lt;br /&gt;
# Job failed, and &amp;lt;code&amp;gt;scontrol show job job_id&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;sacct -j job_id -o JobID,JobName,State%20&amp;lt;/code&amp;gt; shows &amp;lt;code&amp;gt;JobState=OUT_OF_MEMORY Reason=OutOfMemory&amp;lt;/code&amp;gt; or:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
JobID           JobName                State &lt;br /&gt;
------------ ---------- -------------------- &lt;br /&gt;
71             oom_test        OUT_OF_MEMORY &lt;br /&gt;
71.batch          batch        OUT_OF_MEMORY &lt;br /&gt;
71.extern        extern            COMPLETED &lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;This means the job was not given enough RAM; resubmit it with a larger memory request. See [https://wikihpc.tau.ac.il/index.php?title=Slurm_user_guide#Estimating_RAM_Usage Estimating RAM Usage] for help with determining how much RAM your job may need.&lt;br /&gt;
&lt;br /&gt;
=== Chain Jobs ===&lt;br /&gt;
Use the &amp;lt;code&amp;gt;--dependency&amp;lt;/code&amp;gt; flag to make a job wait for another job, e.g. &amp;lt;code&amp;gt;afterok:&amp;lt;job_id&amp;gt;&amp;lt;/code&amp;gt; starts the new job only after the listed job completes successfully.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sbatch --ntasks=1 --time=60 -p power-general -A power-general-users --dependency=afterok:45001 do_work.bash&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Always Specify Resources ===&lt;br /&gt;
When submitting jobs, ensure you include all required resources like partition, memory, and CPUs to avoid job failures.&lt;br /&gt;
&lt;br /&gt;
=== Attaching to Running Jobs ===&lt;br /&gt;
If you need to monitor or interact with a running job, use &amp;lt;code&amp;gt;sattach&amp;lt;/code&amp;gt;. This command allows you to attach to a job&amp;#039;s input, output, and error streams in real-time.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sattach &amp;lt;job_id&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To view job steps of a specific job, use the following command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
scontrol show job &amp;lt;job_id&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Look for sections labeled &amp;quot;StepId&amp;quot; within the output. &lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;For specific job steps, use:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sattach &amp;lt;job_id.step_id&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Note:&amp;#039;&amp;#039;&amp;#039; &amp;lt;code&amp;gt;sattach&amp;lt;/code&amp;gt; is particularly useful for interactive jobs, where you can provide input directly. For non-interactive jobs, it acts like &amp;lt;code&amp;gt;tail -f&amp;lt;/code&amp;gt;, allowing you to monitor the output stream.&lt;br /&gt;
&lt;br /&gt;
=== Estimating RAM Usage ===&lt;br /&gt;
&lt;br /&gt;
When writing SLURM job scripts, it&amp;#039;s crucial to understand and correctly specify the memory requirements for your job. Proper memory allocation ensures efficient resource usage and prevents job failures due to out-of-memory (OOM) errors.&lt;br /&gt;
&lt;br /&gt;
==== Tips for Estimating RAM Usage ====&lt;br /&gt;
&lt;br /&gt;
* Check Application Documentation: Refer to the official documentation or user guides for memory-related information.&lt;br /&gt;
* Run a Small Test Job: Submit a smaller version of your job and monitor its memory usage with commands such as &amp;lt;code&amp;gt;free -m&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;top&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;htop&amp;lt;/code&amp;gt;.&lt;br /&gt;
* Use Profiling Tools: Tools like &amp;lt;code&amp;gt;valgrind&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;gprof&amp;lt;/code&amp;gt;, or built-in profilers can help you understand memory usage.&lt;br /&gt;
* Analyze Previous Jobs: Review SLURM logs and job statistics for insights into memory consumption of past jobs.&lt;br /&gt;
* Consult with Peers or Experts: Ask colleagues or experts who have experience with similar workloads.&lt;br /&gt;
&lt;br /&gt;
==== Example: Monitoring Memory Usage ====&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
&lt;br /&gt;
#SBATCH --job-name=memory_test&lt;br /&gt;
#SBATCH --account=your_account&lt;br /&gt;
#SBATCH --partition=your_partition&lt;br /&gt;
#SBATCH --time=01:00:00&lt;br /&gt;
#SBATCH --ntasks=1&lt;br /&gt;
#SBATCH --cpus-per-task=1&lt;br /&gt;
#SBATCH --mem=4G&lt;br /&gt;
#SBATCH --output=memory_test.out&lt;br /&gt;
#SBATCH --error=memory_test.err&lt;br /&gt;
&lt;br /&gt;
# Monitor memory usage&lt;br /&gt;
echo &amp;quot;Memory usage before running the job:&amp;quot;&lt;br /&gt;
free -m&lt;br /&gt;
&lt;br /&gt;
# Your application commands go here&lt;br /&gt;
# ./your_application&lt;br /&gt;
&lt;br /&gt;
# Monitor memory usage after running the job&lt;br /&gt;
echo &amp;quot;Memory usage after running the job:&amp;quot;&lt;br /&gt;
free -m&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== General Tips ====&lt;br /&gt;
&lt;br /&gt;
* Start Small: Begin with a conservative memory request and increase it based on observed usage.&lt;br /&gt;
* Consider Peak Usage: Plan for peak memory usage to avoid OOM errors.&lt;br /&gt;
* Use SLURM&amp;#039;s Memory Reporting: Use &amp;lt;code&amp;gt;sacct&amp;lt;/code&amp;gt; to view memory usage statistics.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sacct -j &amp;lt;job_id&amp;gt; --format=JobID,JobName,MaxRSS,Elapsed&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;/div&gt;</summary>
		<author><name>Dvory</name></author>
	</entry>
	<entry>
		<id>https://hpcguide.tau.ac.il/index.php?title=Submitting_a_job_to_a_slurm_queue&amp;diff=1504</id>
		<title>Submitting a job to a slurm queue</title>
		<link rel="alternate" type="text/html" href="https://hpcguide.tau.ac.il/index.php?title=Submitting_a_job_to_a_slurm_queue&amp;diff=1504"/>
		<updated>2025-03-19T14:36:22Z</updated>

		<summary type="html">&lt;p&gt;Dvory: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Accessing the System ==&lt;br /&gt;
&lt;br /&gt;
To submit jobs to SLURM at Tel Aviv University, you need to access the system through one of the following login nodes:&lt;br /&gt;
&lt;br /&gt;
* powerslurm-login.tau.ac.il&lt;br /&gt;
* powerslurm-login2.tau.ac.il&lt;br /&gt;
&lt;br /&gt;
=== Requirements for Access ===&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Group Membership&amp;#039;&amp;#039;&amp;#039;: You must be part of the &amp;quot;power&amp;quot; group to access the resources.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;University Credentials&amp;#039;&amp;#039;&amp;#039;: Use your Tel Aviv University username and password to log in.&lt;br /&gt;
&lt;br /&gt;
These login nodes are your starting point for submitting jobs, checking job status, and managing your SLURM tasks.&lt;br /&gt;
&lt;br /&gt;
=== SSH Example ===&lt;br /&gt;
&lt;br /&gt;
To access the system using SSH, use the following example:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Replace &amp;#039;your_username&amp;#039; with your actual Tel Aviv University username&lt;br /&gt;
ssh your_username@powerslurm-login.tau.ac.il&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you want to connect to the second login node, use:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Replace &amp;#039;your_username&amp;#039; with your actual Tel Aviv University username&lt;br /&gt;
ssh your_username@powerslurm-login2.tau.ac.il&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you have an SSH key set up for password-less login, you can specify it like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Replace &amp;#039;your_username&amp;#039; and &amp;#039;/path/to/your/private_key&amp;#039; accordingly&lt;br /&gt;
ssh -i /path/to/your/private_key your_username@powerslurm-login.tau.ac.il&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
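If you connect frequently, an entry in &amp;lt;code&amp;gt;~/.ssh/config&amp;lt;/code&amp;gt; saves retyping the options. A sketch (the alias &amp;lt;code&amp;gt;tau-hpc&amp;lt;/code&amp;gt; is an arbitrary name; the username and key path are placeholders, as above):&lt;br /&gt;

```
Host tau-hpc
    HostName powerslurm-login.tau.ac.il
    User your_username
    IdentityFile /path/to/your/private_key
```

Then &amp;lt;code&amp;gt;ssh tau-hpc&amp;lt;/code&amp;gt; connects directly.&lt;br /&gt;
&lt;br /&gt;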
== Environment Modules ==&lt;br /&gt;
&lt;br /&gt;
Environment Modules in SLURM allow users to dynamically modify their shell environment, providing an easy way to load and unload different software applications, libraries, and their dependencies. This system helps avoid conflicts between software versions and ensures the correct environment for running specific applications.&lt;br /&gt;
&lt;br /&gt;
Here are some common commands to work with environment modules:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#List Available Modules: To see all the modules available on the system, use:&lt;br /&gt;
module avail&lt;br /&gt;
&lt;br /&gt;
#To search for a specific module by name (e.g., `gcc`), use:&lt;br /&gt;
module avail gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#Get Detailed Information About a Module: The `module spider` command provides detailed information about a module, including versions, dependencies, and descriptions:&lt;br /&gt;
module spider gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#View Module Settings: To see what environment variables and settings will be modified by a module, use:&lt;br /&gt;
module show gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#Load a Module: To set up the environment for a specific software, use the `module load` command. For example, to load GCC version 12.1.0:&lt;br /&gt;
module load gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#List Loaded Modules: To view all currently loaded modules in your session, use:&lt;br /&gt;
module list&lt;br /&gt;
&lt;br /&gt;
#Unload a Module: To unload a specific module from your environment, use:&lt;br /&gt;
module unload gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#Unload All Modules: If you need to clear your environment of all loaded modules, use:&lt;br /&gt;
module purge&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;By using these commands, you can easily manage the software environments needed for different tasks, ensuring compatibility and reducing potential conflicts between software versions.&lt;br /&gt;
&lt;br /&gt;
== Basic Job Submission Commands ==&lt;br /&gt;
&lt;br /&gt;
=== Finding Your Account and Partition ===&lt;br /&gt;
&lt;br /&gt;
Before submitting a job, you need to know which partitions you have permission to use.&lt;br /&gt;
&lt;br /&gt;
Run the command &amp;lt;code&amp;gt;check_my_partitions&amp;lt;/code&amp;gt; to view a list of all the partitions you have permission to send jobs to.&lt;br /&gt;
&lt;br /&gt;
== Submitting Jobs==&lt;br /&gt;
sbatch: Submits a job script for batch processing.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example&amp;#039;&amp;#039;&amp;#039;:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Submit pre_process.bash to the power-general partition with a 10-minute time limit:&lt;br /&gt;
sbatch --ntasks=1 --time=10 -p power-general -A power-general-users pre_process.bash&lt;br /&gt;
&lt;br /&gt;
# With 1 GPU:&lt;br /&gt;
sbatch --gres=gpu:1 -p gpu-general -A gpu-general-users gpu_job.sh&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Writing SLURM Job Scripts===&lt;br /&gt;
&lt;br /&gt;
Here is a simple job script example:&lt;br /&gt;
&lt;br /&gt;
==== Basic Script====&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --job-name=my_job             # Job name&lt;br /&gt;
#SBATCH --account=power-general-users # Account name&lt;br /&gt;
#SBATCH --partition=power-general     # Partition name&lt;br /&gt;
#SBATCH --time=02:00:00               # Max run time (hh:mm:ss)&lt;br /&gt;
#SBATCH --ntasks=1                    # Number of tasks&lt;br /&gt;
#SBATCH --nodes=1                     # Number of nodes&lt;br /&gt;
#SBATCH --cpus-per-task=1             # CPUs per task&lt;br /&gt;
#SBATCH --mem-per-cpu=4G              # Memory per CPU&lt;br /&gt;
#SBATCH --output=my_job_%j.out        # Output file&lt;br /&gt;
#SBATCH --error=my_job_%j.err         # Error file&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Starting my SLURM job&amp;quot;&lt;br /&gt;
echo &amp;quot;Job ID: $SLURM_JOB_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Running on nodes: $SLURM_JOB_NODELIST&amp;quot;&lt;br /&gt;
echo &amp;quot;Allocated CPUs: $SLURM_JOB_CPUS_PER_NODE&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Your application commands go here&lt;br /&gt;
# ./my_program&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Job completed&amp;quot;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To define a job array, add:&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#SBATCH --array=1-300&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
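Inside each array task, SLURM sets the environment variable &amp;lt;code&amp;gt;SLURM_ARRAY_TASK_ID&amp;lt;/code&amp;gt;, which is the usual way to give each task its own work item. A minimal sketch (the &amp;lt;code&amp;gt;input_N.txt&amp;lt;/code&amp;gt; naming is hypothetical):&lt;br /&gt;

```shell
# Hypothetical pattern: each array task processes its own input file.
# SLURM exports SLURM_ARRAY_TASK_ID inside each task; default to 1 here
# so the snippet also runs outside SLURM.
task_id="${SLURM_ARRAY_TASK_ID:-1}"
input_file="input_${task_id}.txt"
echo "array task ${task_id} processes ${input_file}"
```
&lt;br /&gt;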
==== Script for 1 GPU ====&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --job-name=gpu_job             # Job name&lt;br /&gt;
#SBATCH --account=my_account           # Account name&lt;br /&gt;
#SBATCH --partition=gpu-general        # Partition name&lt;br /&gt;
#SBATCH --time=02:00:00                # Max run time&lt;br /&gt;
#SBATCH --ntasks=1                     # Number of tasks&lt;br /&gt;
#SBATCH --nodes=1                      # Number of nodes&lt;br /&gt;
#SBATCH --cpus-per-task=1              # CPUs per task&lt;br /&gt;
#SBATCH --gres=gpu:1                   # Number of GPUs&lt;br /&gt;
#SBATCH --mem-per-cpu=4G               # Memory per CPU&lt;br /&gt;
#SBATCH --output=my_job_%j.out         # Output file&lt;br /&gt;
#SBATCH --error=my_job_%j.err          # Error file&lt;br /&gt;
&lt;br /&gt;
module load python/python-3.8&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Starting GPU job&amp;quot;&lt;br /&gt;
echo &amp;quot;Job ID: $SLURM_JOB_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Running on nodes: $SLURM_JOB_NODELIST&amp;quot;&lt;br /&gt;
echo &amp;quot;Allocated CPUs: $SLURM_JOB_CPUS_PER_NODE&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Your GPU commands go here&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Job completed&amp;quot;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
To exclude specific nodes, add the following:&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#SBATCH --exclude=compute-0-[100-103],compute-0-67&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Importance of Correct RAM Usage in Jobs===&lt;br /&gt;
&lt;br /&gt;
When writing SLURM job scripts, it&amp;#039;s crucial to understand and correctly specify the memory requirements for your job. &lt;br /&gt;
&lt;br /&gt;
Proper memory allocation ensures efficient resource usage and prevents job failures due to out-of-memory (OOM) errors.&lt;br /&gt;
&lt;br /&gt;
==== Why Correct RAM Usage Matters ====&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Resource Efficiency&amp;#039;&amp;#039;&amp;#039;: Allocating the right amount of memory helps in optimal resource utilization, allowing more jobs to run simultaneously on the cluster.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Job Stability&amp;#039;&amp;#039;&amp;#039;: Underestimating memory requirements can lead to OOM errors, causing your job to fail and waste computational resources.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Performance&amp;#039;&amp;#039;&amp;#039;: Overestimating memory needs can lead to underutilization of resources, potentially delaying other jobs in the queue.&lt;br /&gt;
&lt;br /&gt;
==== How to Specify Memory in SLURM ====&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;--mem&amp;#039;&amp;#039;&amp;#039;: Specifies the total memory required for the job.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;--mem-per-cpu&amp;#039;&amp;#039;&amp;#039;: Specifies the memory required per CPU. The two options are mutually exclusive; use one or the other.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example&amp;#039;&amp;#039;&amp;#039;:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#SBATCH --mem=4G              # Total memory for the job, OR:&lt;br /&gt;
#SBATCH --mem-per-cpu=2G      # Memory per CPU (do not combine with --mem)&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
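Note that with &amp;lt;code&amp;gt;--mem-per-cpu&amp;lt;/code&amp;gt; the total request grows with the CPU count. A plain-shell illustration of that arithmetic (the values are examples only, mirroring &amp;lt;code&amp;gt;--cpus-per-task=4&amp;lt;/code&amp;gt; with &amp;lt;code&amp;gt;--mem-per-cpu=2G&amp;lt;/code&amp;gt;):&lt;br /&gt;

```shell
# Illustration of how --mem-per-cpu scales with --cpus-per-task.
mem_per_cpu_gb=2
cpus_per_task=4
total_mem_gb=$(( mem_per_cpu_gb * cpus_per_task ))
echo "${cpus_per_task} CPUs x ${mem_per_cpu_gb}G per CPU = ${total_mem_gb}G total"
```
&lt;br /&gt;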
===Interactive Jobs===&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#Start an interactive session:&lt;br /&gt;
srun --ntasks=1 -p power-general -A power-general-users --pty bash&lt;br /&gt;
&lt;br /&gt;
#Specify a compute node:&lt;br /&gt;
srun --ntasks=1 -p power-general -A power-general-users --nodelist=&amp;quot;compute-0-12&amp;quot; --pty bash&lt;br /&gt;
&lt;br /&gt;
#With X11 forwarding for GUI applications:&lt;br /&gt;
srun --ntasks=1 -p power-general -A power-general-users --x11 --pty bash&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Submitting RELION Jobs===&lt;br /&gt;
&lt;br /&gt;
To submit a RELION job interactively on the &amp;lt;code&amp;gt;gpu-relion&amp;lt;/code&amp;gt; queue with X11 forwarding, use the following steps:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#Start an interactive session with X11:&lt;br /&gt;
srun --ntasks=1 -p gpu-relion -A your_account --x11 --pty bash&lt;br /&gt;
#Load the RELION module:&lt;br /&gt;
module load relion/relion-4.0.1&lt;br /&gt;
#Launch RELION:&lt;br /&gt;
relion&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Running matlab example==&lt;br /&gt;
In this example there are 3 files:&lt;br /&gt;
&lt;br /&gt;
myTable.m ⇒ This MATLAB script computes and prints a small table of values&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
fprintf(&amp;#039;=======================================\n&amp;#039;);&lt;br /&gt;
fprintf(&amp;#039; a             b             c              d             \n&amp;#039;);&lt;br /&gt;
fprintf(&amp;#039;=======================================\n&amp;#039;);&lt;br /&gt;
for j = 1:10&lt;br /&gt;
    a = sin(10*j);&lt;br /&gt;
    b = a*cos(10*j);&lt;br /&gt;
    c = a + b;&lt;br /&gt;
    d = a - b;&lt;br /&gt;
    fprintf(&amp;#039;%+6.5f   %+6.5f   %+6.5f   %+6.5f   \n&amp;#039;,a,b,c,d);&lt;br /&gt;
end&lt;br /&gt;
fprintf(&amp;#039;=======================================\n&amp;#039;);&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
my_table_script.sh ⇒ This script runs the MATLAB program; submit it with sbatch&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
&lt;br /&gt;
#SBATCH --mem=4G&lt;br /&gt;
#SBATCH --partition=power-general&lt;br /&gt;
#SBATCH -A power-general-users&lt;br /&gt;
hostname&lt;br /&gt;
&lt;br /&gt;
cd /a/home/cc/tree/taucc/staff/dvory/matlab&lt;br /&gt;
&lt;br /&gt;
matlab -nodisplay -nosplash -nodesktop -r &amp;quot;myTable; exit;&amp;quot;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
run_in_loop.sh ⇒ This script submits many copies of the job in a loop&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
&lt;br /&gt;
for i in {1..100}&lt;br /&gt;
do&lt;br /&gt;
    sbatch my_table_script.sh&lt;br /&gt;
done&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Run the jobs with the following command (after making the script executable with chmod +x run_in_loop.sh):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
./run_in_loop.sh&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
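The loop above submits 100 independent jobs; a single array submission achieves the same and is usually easier on the scheduler. A sketch (the command is built as a string so the snippet runs anywhere; on the cluster you would execute it directly):&lt;br /&gt;

```shell
# One array job with 100 tasks replaces the 100-iteration submit loop.
# Built as a string here so the sketch is runnable off-cluster.
submit_cmd="sbatch --array=1-100 my_table_script.sh"
echo "${submit_cmd}"
```
&lt;br /&gt;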
==AlphaFold==&lt;br /&gt;
&lt;br /&gt;
AlphaFold is a deep learning tool designed for predicting protein structures.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Guide:&amp;#039;&amp;#039;&amp;#039;  &lt;br /&gt;
[https://hpcguide.tau.ac.il/index.php?title=Alphafold AlphaFold Guide]&lt;br /&gt;
&lt;br /&gt;
==Common SLURM Commands==&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#View all queues (partitions):&lt;br /&gt;
sinfo&lt;br /&gt;
#View all jobs:&lt;br /&gt;
squeue&lt;br /&gt;
#View details of a specific job:&lt;br /&gt;
scontrol show job &amp;lt;job_number&amp;gt;&lt;br /&gt;
#Get information about partitions:&lt;br /&gt;
scontrol show partition&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Troubleshooting &amp;amp; Tips ==&lt;br /&gt;
&lt;br /&gt;
=== Common Errors ===&lt;br /&gt;
&lt;br /&gt;
# &amp;lt;code&amp;gt;srun: error: Unable to allocate resources: No partition specified or system default partition&amp;lt;/code&amp;gt;  &amp;lt;br /&amp;gt;&amp;#039;&amp;#039;&amp;#039;Solution:&amp;#039;&amp;#039;&amp;#039; Always specify a partition. Example:  &amp;lt;code&amp;gt;srun --pty -c 1 --mem=2G -p power-general /bin/bash&amp;lt;/code&amp;gt;&lt;br /&gt;
# Job failed, and &amp;lt;code&amp;gt;scontrol show job job_id&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;sacct -j job_id -o JobID,JobName,State%20&amp;lt;/code&amp;gt; shows &amp;lt;code&amp;gt;JobState=OUT_OF_MEMORY Reason=OutOfMemory&amp;lt;/code&amp;gt; or:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
JobID           JobName                State &lt;br /&gt;
------------ ---------- -------------------- &lt;br /&gt;
71             oom_test        OUT_OF_MEMORY &lt;br /&gt;
71.batch          batch        OUT_OF_MEMORY &lt;br /&gt;
71.extern        extern            COMPLETED &lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;This means the job was not given enough RAM; resubmit it with a larger memory request. See [https://wikihpc.tau.ac.il/index.php?title=Slurm_user_guide#Estimating_RAM_Usage Estimating RAM Usage] for help with determining how much RAM your job may need.&lt;br /&gt;
&lt;br /&gt;
=== Chain Jobs ===&lt;br /&gt;
Use the &amp;lt;code&amp;gt;--dependency&amp;lt;/code&amp;gt; flag to make a job wait for another job, e.g. &amp;lt;code&amp;gt;afterok:&amp;lt;job_id&amp;gt;&amp;lt;/code&amp;gt; starts the new job only after the listed job completes successfully.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sbatch --ntasks=1 --time=60 -p power-general -A power-general-users --dependency=afterok:45001 do_work.bash&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
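&amp;lt;code&amp;gt;sbatch --parsable&amp;lt;/code&amp;gt; prints only the job id, which makes chaining scriptable. A sketch with &amp;lt;code&amp;gt;sbatch&amp;lt;/code&amp;gt; mocked as a shell function so the pattern can be run off-cluster (script names are placeholders; on the login node, delete the mock and the real sbatch is used):&lt;br /&gt;

```shell
# Mock: the real 'sbatch --parsable' prints the new job's id.
sbatch() { echo "45001"; }

# Capture the first job's id:
jid=$(sbatch --parsable first_step.bash)

# Submit the second job to start only after the first succeeds:
sbatch --dependency=afterok:"${jid}" do_work.bash
```
&lt;br /&gt;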
=== Always Specify Resources ===&lt;br /&gt;
When submitting jobs, ensure you include all required resources like partition, memory, and CPUs to avoid job failures.&lt;br /&gt;
&lt;br /&gt;
=== Attaching to Running Jobs ===&lt;br /&gt;
If you need to monitor or interact with a running job, use &amp;lt;code&amp;gt;sattach&amp;lt;/code&amp;gt;. This command allows you to attach to a job&amp;#039;s input, output, and error streams in real-time.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sattach &amp;lt;job_id&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To view job steps of a specific job, use the following command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
scontrol show job &amp;lt;job_id&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Look for sections labeled &amp;quot;StepId&amp;quot; within the output. &lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;For specific job steps, use:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sattach &amp;lt;job_id.step_id&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Note:&amp;#039;&amp;#039;&amp;#039; &amp;lt;code&amp;gt;sattach&amp;lt;/code&amp;gt; is particularly useful for interactive jobs, where you can provide input directly. For non-interactive jobs, it acts like &amp;lt;code&amp;gt;tail -f&amp;lt;/code&amp;gt;, allowing you to monitor the output stream.&lt;br /&gt;
&lt;br /&gt;
=== Estimating RAM Usage ===&lt;br /&gt;
&lt;br /&gt;
When writing SLURM job scripts, it&amp;#039;s crucial to understand and correctly specify the memory requirements for your job. Proper memory allocation ensures efficient resource usage and prevents job failures due to out-of-memory (OOM) errors.&lt;br /&gt;
&lt;br /&gt;
==== Tips for Estimating RAM Usage ====&lt;br /&gt;
&lt;br /&gt;
* Check Application Documentation: Refer to the official documentation or user guides for memory-related information.&lt;br /&gt;
* Run a Small Test Job: Submit a smaller version of your job and monitor its memory usage with commands such as &amp;lt;code&amp;gt;free -m&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;top&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;htop&amp;lt;/code&amp;gt;.&lt;br /&gt;
* Use Profiling Tools: Tools like &amp;lt;code&amp;gt;valgrind&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;gprof&amp;lt;/code&amp;gt;, or built-in profilers can help you understand memory usage.&lt;br /&gt;
* Analyze Previous Jobs: Review SLURM logs and job statistics for insights into memory consumption of past jobs.&lt;br /&gt;
* Consult with Peers or Experts: Ask colleagues or experts who have experience with similar workloads.&lt;br /&gt;
&lt;br /&gt;
==== Example: Monitoring Memory Usage ====&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
&lt;br /&gt;
#SBATCH --job-name=memory_test&lt;br /&gt;
#SBATCH --account=your_account&lt;br /&gt;
#SBATCH --partition=your_partition&lt;br /&gt;
#SBATCH --time=01:00:00&lt;br /&gt;
#SBATCH --ntasks=1&lt;br /&gt;
#SBATCH --cpus-per-task=1&lt;br /&gt;
#SBATCH --mem=4G&lt;br /&gt;
#SBATCH --output=memory_test.out&lt;br /&gt;
#SBATCH --error=memory_test.err&lt;br /&gt;
&lt;br /&gt;
# Monitor memory usage&lt;br /&gt;
echo &amp;quot;Memory usage before running the job:&amp;quot;&lt;br /&gt;
free -m&lt;br /&gt;
&lt;br /&gt;
# Your application commands go here&lt;br /&gt;
# ./your_application&lt;br /&gt;
&lt;br /&gt;
# Monitor memory usage after running the job&lt;br /&gt;
echo &amp;quot;Memory usage after running the job:&amp;quot;&lt;br /&gt;
free -m&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
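&amp;lt;code&amp;gt;sacct&amp;lt;/code&amp;gt; reports MaxRSS with a unit suffix such as K, M, or G. A small hypothetical helper to normalize integer MaxRSS values to megabytes, to make runs easier to compare:&lt;br /&gt;

```shell
# Normalize a sacct MaxRSS string (e.g. "524288K", "512M", "2G") to
# megabytes. Assumes integer values; unknown suffixes pass through as-is.
maxrss_to_mb() {
    local v="$1" n unit
    n="${v%[KMG]}"
    unit="${v#"$n"}"
    case "$unit" in
        K) echo $(( n / 1024 )) ;;
        M) echo "$n" ;;
        G) echo $(( n * 1024 )) ;;
        *) echo "$n" ;;
    esac
}

maxrss_to_mb 524288K
```
&lt;br /&gt;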
==== General Tips ====&lt;br /&gt;
&lt;br /&gt;
* Start Small: Begin with a conservative memory request and increase it based on observed usage.&lt;br /&gt;
* Consider Peak Usage: Plan for peak memory usage to avoid OOM errors.&lt;br /&gt;
* Use SLURM&amp;#039;s Memory Reporting: Use &amp;lt;code&amp;gt;sacct&amp;lt;/code&amp;gt; to view memory usage statistics.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sacct -j &amp;lt;job_id&amp;gt; --format=JobID,JobName,MaxRSS,Elapsed&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;/div&gt;</summary>
		<author><name>Dvory</name></author>
	</entry>
	<entry>
		<id>https://hpcguide.tau.ac.il/index.php?title=Submitting_a_job_to_a_slurm_queue&amp;diff=1503</id>
		<title>Submitting a job to a slurm queue</title>
		<link rel="alternate" type="text/html" href="https://hpcguide.tau.ac.il/index.php?title=Submitting_a_job_to_a_slurm_queue&amp;diff=1503"/>
		<updated>2025-02-11T08:14:24Z</updated>

		<summary type="html">&lt;p&gt;Dvory: /* Basic Script */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Accessing the System ==&lt;br /&gt;
&lt;br /&gt;
To submit jobs to SLURM at Tel Aviv University, you need to access the system through one of the following login nodes:&lt;br /&gt;
&lt;br /&gt;
* powerslurm-login.tau.ac.il&lt;br /&gt;
* powerslurm-login2.tau.ac.il&lt;br /&gt;
&lt;br /&gt;
=== Requirements for Access ===&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Group Membership&amp;#039;&amp;#039;&amp;#039;: You must be part of the &amp;quot;power&amp;quot; group to access the resources.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;University Credentials&amp;#039;&amp;#039;&amp;#039;: Use your Tel Aviv University username and password to log in.&lt;br /&gt;
&lt;br /&gt;
These login nodes are your starting point for submitting jobs, checking job status, and managing your SLURM tasks.&lt;br /&gt;
&lt;br /&gt;
=== SSH Example ===&lt;br /&gt;
&lt;br /&gt;
To access the system using SSH, use the following example:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Replace &amp;#039;your_username&amp;#039; with your actual Tel Aviv University username&lt;br /&gt;
ssh your_username@powerslurm-login.tau.ac.il&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you want to connect to the second login node, use:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Replace &amp;#039;your_username&amp;#039; with your actual Tel Aviv University username&lt;br /&gt;
ssh your_username@powerslurm-login2.tau.ac.il&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you have an SSH key set up for password-less login, you can specify it like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Replace &amp;#039;your_username&amp;#039; and &amp;#039;/path/to/your/private_key&amp;#039; accordingly&lt;br /&gt;
ssh -i /path/to/your/private_key your_username@powerslurm-login.tau.ac.il&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Environment Modules ==&lt;br /&gt;
&lt;br /&gt;
Environment Modules in SLURM allow users to dynamically modify their shell environment, providing an easy way to load and unload different software applications, libraries, and their dependencies. This system helps avoid conflicts between software versions and ensures the correct environment for running specific applications.&lt;br /&gt;
&lt;br /&gt;
Here are some common commands to work with environment modules:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#List Available Modules: To see all the modules available on the system, use:&lt;br /&gt;
module avail&lt;br /&gt;
&lt;br /&gt;
#To search for a specific module by name (e.g., `gcc`), use:&lt;br /&gt;
module avail gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#Get Detailed Information About a Module: The `module spider` command provides detailed information about a module, including versions, dependencies, and descriptions:&lt;br /&gt;
module spider gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#View Module Settings: To see what environment variables and settings will be modified by a module, use:&lt;br /&gt;
module show gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#Load a Module: To set up the environment for a specific software, use the `module load` command. For example, to load GCC version 12.1.0:&lt;br /&gt;
module load gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#List Loaded Modules: To view all currently loaded modules in your session, use:&lt;br /&gt;
module list&lt;br /&gt;
&lt;br /&gt;
#Unload a Module: To unload a specific module from your environment, use:&lt;br /&gt;
module unload gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#Unload All Modules: If you need to clear your environment of all loaded modules, use:&lt;br /&gt;
module purge&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;By using these commands, you can easily manage the software environments needed for different tasks, ensuring compatibility and reducing potential conflicts between software versions.&lt;br /&gt;
&lt;br /&gt;
== Basic Job Submission Commands ==&lt;br /&gt;
&lt;br /&gt;
=== Finding Your Account and Partition ===&lt;br /&gt;
&lt;br /&gt;
Before submitting a job, you need to know which partitions you have permission to use.&lt;br /&gt;
&lt;br /&gt;
Run the command &amp;lt;code&amp;gt;check_my_partitions&amp;lt;/code&amp;gt; to view a list of all the partitions you have permission to send jobs to.&lt;br /&gt;
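&lt;br /&gt;
If &amp;lt;code&amp;gt;check_my_partitions&amp;lt;/code&amp;gt; is not available in your session, the underlying association data can usually be queried directly with &amp;lt;code&amp;gt;sacctmgr&amp;lt;/code&amp;gt;, a standard SLURM client (the exact columns shown depend on site configuration):&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# List the accounts and partitions associated with your user&lt;br /&gt;
if command -v sacctmgr &amp;gt;/dev/null; then&lt;br /&gt;
  OUT=$(sacctmgr -n show associations user=&amp;quot;$USER&amp;quot; format=Account%20,Partition%20)&lt;br /&gt;
else&lt;br /&gt;
  OUT=&amp;quot;sacctmgr not found; run this on a login node&amp;quot;&lt;br /&gt;
fi&lt;br /&gt;
echo &amp;quot;$OUT&amp;quot;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;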
&lt;br /&gt;
== Submitting Jobs==&lt;br /&gt;
sbatch: Submits a job script for batch processing.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example&amp;#039;&amp;#039;&amp;#039;:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Submit pre_process.bash to the power-general partition with a 10-minute time limit&lt;br /&gt;
sbatch --ntasks=1 --time=10 -p power-general -A power-general-users pre_process.bash&lt;br /&gt;
&lt;br /&gt;
# The same submission with 1 GPU on the gpu-general partition&lt;br /&gt;
sbatch --gres=gpu:1 -p gpu-general -A gpu-general-users gpu_job.sh&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Writing SLURM Job Scripts===&lt;br /&gt;
&lt;br /&gt;
Here is a simple job script example:&lt;br /&gt;
&lt;br /&gt;
==== Basic Script====&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --job-name=my_job             # Job name&lt;br /&gt;
#SBATCH --account=power-general-users # Account name&lt;br /&gt;
#SBATCH --partition=power-general     # Partition name&lt;br /&gt;
#SBATCH --time=02:00:00               # Max run time (hh:mm:ss)&lt;br /&gt;
#SBATCH --ntasks=1                    # Number of tasks&lt;br /&gt;
#SBATCH --nodes=1                     # Number of nodes&lt;br /&gt;
#SBATCH --cpus-per-task=1             # CPUs per task&lt;br /&gt;
#SBATCH --mem-per-cpu=4G              # Memory per CPU&lt;br /&gt;
#SBATCH --output=my_job_%j.out        # Output file&lt;br /&gt;
#SBATCH --error=my_job_%j.err         # Error file&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Starting my SLURM job&amp;quot;&lt;br /&gt;
echo &amp;quot;Job ID: $SLURM_JOB_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Running on nodes: $SLURM_JOB_NODELIST&amp;quot;&lt;br /&gt;
echo &amp;quot;Allocated CPUs: $SLURM_JOB_CPUS_PER_NODE&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Your application commands go here&lt;br /&gt;
# ./my_program&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Job completed&amp;quot;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To define a job array, add:&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#SBATCH --array=1-300&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
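&lt;br /&gt;
Within an array job, each task reads its own index from &amp;lt;code&amp;gt;$SLURM_ARRAY_TASK_ID&amp;lt;/code&amp;gt; and can use it, for example, to pick a per-task input file. A minimal sketch (the input file naming here is only illustrative):&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# SLURM sets SLURM_ARRAY_TASK_ID for each array task; default to 1 outside SLURM&lt;br /&gt;
: &amp;quot;${SLURM_ARRAY_TASK_ID:=1}&amp;quot;&lt;br /&gt;
INPUT=&amp;quot;input_${SLURM_ARRAY_TASK_ID}.txt&amp;quot;&lt;br /&gt;
echo &amp;quot;Task ${SLURM_ARRAY_TASK_ID} processes ${INPUT}&amp;quot;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;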
&lt;br /&gt;
==== Script for 1 GPU ====&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --job-name=gpu_job             # Job name&lt;br /&gt;
#SBATCH --account=my_account           # Account name&lt;br /&gt;
#SBATCH --partition=gpu-general        # Partition name&lt;br /&gt;
#SBATCH --time=02:00:00                # Max run time&lt;br /&gt;
#SBATCH --ntasks=1                     # Number of tasks&lt;br /&gt;
#SBATCH --nodes=1                      # Number of nodes&lt;br /&gt;
#SBATCH --cpus-per-task=1              # CPUs per task&lt;br /&gt;
#SBATCH --gres=gpu:1                   # Number of GPUs&lt;br /&gt;
#SBATCH --mem-per-cpu=4G               # Memory per CPU&lt;br /&gt;
#SBATCH --output=my_job_%j.out         # Output file&lt;br /&gt;
#SBATCH --error=my_job_%j.err          # Error file&lt;br /&gt;
&lt;br /&gt;
module load python/python-3.8&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Starting GPU job&amp;quot;&lt;br /&gt;
echo &amp;quot;Job ID: $SLURM_JOB_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Running on nodes: $SLURM_JOB_NODELIST&amp;quot;&lt;br /&gt;
echo &amp;quot;Allocated CPUs: $SLURM_JOB_CPUS_PER_NODE&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Your GPU commands go here&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Job completed&amp;quot;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
To exclude specific nodes, add the following:&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#SBATCH --exclude=compute-0-[100-103],compute-0-67&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Importance of Correct RAM Usage in Jobs===&lt;br /&gt;
&lt;br /&gt;
When writing SLURM job scripts, it&amp;#039;s crucial to understand and correctly specify the memory requirements for your job. &lt;br /&gt;
&lt;br /&gt;
Proper memory allocation ensures efficient resource usage and prevents job failures due to out-of-memory (OOM) errors.&lt;br /&gt;
&lt;br /&gt;
==== Why Correct RAM Usage Matters ====&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Resource Efficiency&amp;#039;&amp;#039;&amp;#039;: Allocating the right amount of memory helps in optimal resource utilization, allowing more jobs to run simultaneously on the cluster.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Job Stability&amp;#039;&amp;#039;&amp;#039;: Underestimating memory requirements can lead to OOM errors, causing your job to fail and waste computational resources.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Performance&amp;#039;&amp;#039;&amp;#039;: Overestimating memory needs can lead to underutilization of resources, potentially delaying other jobs in the queue.&lt;br /&gt;
&lt;br /&gt;
==== How to Specify Memory in SLURM ====&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;--mem&amp;#039;&amp;#039;&amp;#039;: Specifies the total memory required for the job.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;--mem-per-cpu&amp;#039;&amp;#039;&amp;#039;: Specifies the memory required per allocated CPU. Note that &amp;lt;code&amp;gt;--mem&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;--mem-per-cpu&amp;lt;/code&amp;gt; are mutually exclusive; specify one or the other.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example&amp;#039;&amp;#039;&amp;#039;:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Use one of the following, not both:&lt;br /&gt;
#SBATCH --mem=4G              # Total memory for the job&lt;br /&gt;
#SBATCH --mem-per-cpu=2G      # Memory per CPU (alternative to --mem)&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Interactive Jobs===&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#Start an interactive session:&lt;br /&gt;
srun --ntasks=1 -p power-general -A power-general-users --pty bash&lt;br /&gt;
&lt;br /&gt;
#Specify a compute node:&lt;br /&gt;
srun --ntasks=1 -p power-general -A power-general-users --nodelist=&amp;quot;compute-0-12&amp;quot; --pty bash&lt;br /&gt;
&lt;br /&gt;
#Start an interactive session with X11 forwarding (for GUI applications):&lt;br /&gt;
srun --ntasks=1 -p power-general -A power-general-users --x11 /bin/bash&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Submitting RELION Jobs===&lt;br /&gt;
&lt;br /&gt;
To submit a RELION job interactively on the &amp;lt;code&amp;gt;gpu-relion&amp;lt;/code&amp;gt; queue with X11 forwarding, use the following steps:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#Start an interactive session with X11:&lt;br /&gt;
srun --ntasks=1 -p gpu-relion -A your_account --x11 --pty bash&lt;br /&gt;
#Load the RELION module:&lt;br /&gt;
module load relion/relion-4.0.1&lt;br /&gt;
#Launch RELION:&lt;br /&gt;
relion&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==AlphaFold==&lt;br /&gt;
&lt;br /&gt;
AlphaFold is a deep learning tool designed for predicting protein structures.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Guide:&amp;#039;&amp;#039;&amp;#039;  &lt;br /&gt;
[https://hpcguide.tau.ac.il/index.php?title=Alphafold AlphaFold Guide]&lt;br /&gt;
&lt;br /&gt;
==Common SLURM Commands==&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#View all queues (partitions):&lt;br /&gt;
sinfo&lt;br /&gt;
#View all jobs:&lt;br /&gt;
squeue&lt;br /&gt;
#View details of a specific job:&lt;br /&gt;
scontrol show job &amp;lt;job_number&amp;gt;&lt;br /&gt;
#Get information about partitions:&lt;br /&gt;
scontrol show partition&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
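&lt;br /&gt;
On a busy cluster the full &amp;lt;code&amp;gt;squeue&amp;lt;/code&amp;gt; listing is noisy; a common refinement (using standard squeue options) is to show only your own jobs with a custom column layout:&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# -u filters by user; -o sets the output format (job ID, name, state, runtime, nodes)&lt;br /&gt;
if command -v squeue &amp;gt;/dev/null; then&lt;br /&gt;
  OUT=$(squeue -u &amp;quot;$USER&amp;quot; -o &amp;quot;%.10i %.20j %.8T %.10M %.6D&amp;quot;)&lt;br /&gt;
else&lt;br /&gt;
  OUT=&amp;quot;squeue not available here; run this on a login node&amp;quot;&lt;br /&gt;
fi&lt;br /&gt;
echo &amp;quot;$OUT&amp;quot;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;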
&lt;br /&gt;
== Troubleshooting &amp;amp; Tips ==&lt;br /&gt;
&lt;br /&gt;
=== Common Errors ===&lt;br /&gt;
&lt;br /&gt;
# &amp;lt;code&amp;gt;srun: error: Unable to allocate resources: No partition specified or system default partition&amp;lt;/code&amp;gt;  &amp;lt;br /&amp;gt;&amp;#039;&amp;#039;&amp;#039;Solution:&amp;#039;&amp;#039;&amp;#039; Always specify a partition. Example:  &amp;lt;code&amp;gt;srun --pty -c 1 --mem=2G -p power-general /bin/bash&amp;lt;/code&amp;gt;&lt;br /&gt;
# Job failed, and &amp;lt;code&amp;gt;scontrol show job &amp;lt;job_id&amp;gt;&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;sacct -j &amp;lt;job_id&amp;gt; -o JobID,JobName,State%20&amp;lt;/code&amp;gt; shows &amp;lt;code&amp;gt;JobState=OUT_OF_MEMORY Reason=OutOfMemory&amp;lt;/code&amp;gt; or:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
JobID           JobName                State &lt;br /&gt;
------------ ---------- -------------------- &lt;br /&gt;
71             oom_test        OUT_OF_MEMORY &lt;br /&gt;
71.batch          batch        OUT_OF_MEMORY &lt;br /&gt;
71.extern        extern            COMPLETED &lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;This means the RAM requested for the job was not enough; resubmit the job with more RAM. See [https://wikihpc.tau.ac.il/index.php?title=Slurm_user_guide#Estimating_RAM_Usage Estimating RAM Usage] for help estimating how much RAM your job may need.&lt;br /&gt;
&lt;br /&gt;
=== Chain Jobs ===&lt;br /&gt;
Use the &amp;lt;code&amp;gt;--dependency&amp;lt;/code&amp;gt; flag to set job dependencies.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sbatch --ntasks=1 --time=60 -p power-general -A power-general-users --dependency=afterok:45001 do_work.bash&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
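&lt;br /&gt;
Named dependency types (standard &amp;lt;code&amp;gt;sbatch --dependency&amp;lt;/code&amp;gt; syntax) make the intended behaviour explicit; the job ID below is the placeholder from the example above:&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# afterok:    start only if the listed job finished successfully&lt;br /&gt;
# afterany:   start once the listed job ends, regardless of its exit status&lt;br /&gt;
# afternotok: start only if the listed job failed&lt;br /&gt;
DEP=&amp;quot;afterok:45001&amp;quot;&lt;br /&gt;
echo &amp;quot;sbatch --dependency=${DEP} do_work.bash&amp;quot;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;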
&lt;br /&gt;
=== Always Specify Resources ===&lt;br /&gt;
When submitting jobs, ensure you include all required resources like partition, memory, and CPUs to avoid job failures.&lt;br /&gt;
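&lt;br /&gt;
Putting this into practice, a fully specified batch script header looks like the following (partition and account names follow the earlier examples):&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
# Partition, account, CPUs, memory and walltime are all explicit,&lt;br /&gt;
# so the scheduler never falls back to defaults.&lt;br /&gt;
#SBATCH --partition=power-general&lt;br /&gt;
#SBATCH --account=power-general-users&lt;br /&gt;
#SBATCH --ntasks=1&lt;br /&gt;
#SBATCH --cpus-per-task=4&lt;br /&gt;
#SBATCH --mem=8G&lt;br /&gt;
#SBATCH --time=01:00:00&lt;br /&gt;
REQUEST=&amp;quot;4 CPUs, 8G RAM, 1h walltime&amp;quot;&lt;br /&gt;
echo &amp;quot;Requested: ${REQUEST}&amp;quot;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;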
&lt;br /&gt;
=== Attaching to Running Jobs ===&lt;br /&gt;
If you need to monitor or interact with a running job, use &amp;lt;code&amp;gt;sattach&amp;lt;/code&amp;gt;. This command allows you to attach to a job&amp;#039;s input, output, and error streams in real-time.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sattach &amp;lt;job_id&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To view job steps of a specific job, use the following command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
scontrol show job &amp;lt;job_id&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Look for sections labeled &amp;quot;StepId&amp;quot; within the output. &lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;For specific job steps, use:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sattach &amp;lt;job_id.step_id&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Note:&amp;#039;&amp;#039;&amp;#039; &amp;lt;code&amp;gt;sattach&amp;lt;/code&amp;gt; is particularly useful for interactive jobs, where you can provide input directly. For non-interactive jobs, it acts like &amp;lt;code&amp;gt;tail -f&amp;lt;/code&amp;gt;, allowing you to monitor the output stream.&lt;br /&gt;
&lt;br /&gt;
=== Estimating RAM Usage ===&lt;br /&gt;
&lt;br /&gt;
When writing SLURM job scripts, it&amp;#039;s crucial to understand and correctly specify the memory requirements for your job. Proper memory allocation ensures efficient resource usage and prevents job failures due to out-of-memory (OOM) errors.&lt;br /&gt;
&lt;br /&gt;
==== Tips for Estimating RAM Usage ====&lt;br /&gt;
&lt;br /&gt;
* Check Application Documentation: Refer to the official documentation or user guides for memory-related information.&lt;br /&gt;
* Run a Small Test Job: Submit a smaller version of your job and monitor its memory usage using commands like `free -m`, `top`, or `htop`.&lt;br /&gt;
* Use Profiling Tools: Tools like `valgrind`, `gprof`, or built-in profilers can help you understand memory usage.&lt;br /&gt;
* Analyze Previous Jobs: Review SLURM logs and job statistics for insights into memory consumption of past jobs.&lt;br /&gt;
* Consult with Peers or Experts: Ask colleagues or experts who have experience with similar workloads.&lt;br /&gt;
&lt;br /&gt;
==== Example: Monitoring Memory Usage ====&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
&lt;br /&gt;
#SBATCH --job-name=memory_test&lt;br /&gt;
#SBATCH --account=your_account&lt;br /&gt;
#SBATCH --partition=your_partition&lt;br /&gt;
#SBATCH --time=01:00:00&lt;br /&gt;
#SBATCH --ntasks=1&lt;br /&gt;
#SBATCH --cpus-per-task=1&lt;br /&gt;
#SBATCH --mem=4G&lt;br /&gt;
#SBATCH --output=memory_test.out&lt;br /&gt;
#SBATCH --error=memory_test.err&lt;br /&gt;
&lt;br /&gt;
# Monitor memory usage&lt;br /&gt;
echo &amp;quot;Memory usage before running the job:&amp;quot;&lt;br /&gt;
free -m&lt;br /&gt;
&lt;br /&gt;
# Your application commands go here&lt;br /&gt;
# ./your_application&lt;br /&gt;
&lt;br /&gt;
# Monitor memory usage after running the job&lt;br /&gt;
echo &amp;quot;Memory usage after running the job:&amp;quot;&lt;br /&gt;
free -m&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== General Tips ====&lt;br /&gt;
&lt;br /&gt;
* Start Small: Begin with a conservative memory request and increase it based on observed usage.&lt;br /&gt;
* Consider Peak Usage: Plan for peak memory usage to avoid OOM errors.&lt;br /&gt;
* Use SLURM&amp;#039;s Memory Reporting: Use `sacct` to view memory usage statistics.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sacct -j &amp;lt;job_id&amp;gt; --format=JobID,JobName,MaxRSS,Elapsed&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;/div&gt;</summary>
		<author><name>Dvory</name></author>
	</entry>
	<entry>
		<id>https://hpcguide.tau.ac.il/index.php?title=Submitting_a_job_to_a_slurm_queue&amp;diff=1502</id>
		<title>Submitting a job to a slurm queue</title>
		<link rel="alternate" type="text/html" href="https://hpcguide.tau.ac.il/index.php?title=Submitting_a_job_to_a_slurm_queue&amp;diff=1502"/>
		<updated>2025-02-11T08:13:44Z</updated>

		<summary type="html">&lt;p&gt;Dvory: /* Writing SLURM Job Scripts */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Accessing the System ==&lt;br /&gt;
&lt;br /&gt;
To submit jobs to SLURM at Tel Aviv University, you need to access the system through one of the following login nodes:&lt;br /&gt;
&lt;br /&gt;
* powerslurm-login.tau.ac.il&lt;br /&gt;
* powerslurm-login2.tau.ac.il&lt;br /&gt;
&lt;br /&gt;
=== Requirements for Access ===&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Group Membership&amp;#039;&amp;#039;&amp;#039;: You must be part of the &amp;quot;power&amp;quot; group to access the resources.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;University Credentials&amp;#039;&amp;#039;&amp;#039;: Use your Tel Aviv University username and password to log in.&lt;br /&gt;
&lt;br /&gt;
These login nodes are your starting point for submitting jobs, checking job status, and managing your SLURM tasks.&lt;br /&gt;
&lt;br /&gt;
=== SSH Example ===&lt;br /&gt;
&lt;br /&gt;
To access the system using SSH, use the following example:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Replace &amp;#039;your_username&amp;#039; with your actual Tel Aviv University username&lt;br /&gt;
ssh your_username@powerslurm-login.tau.ac.il&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you want to connect to the second login node, use:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Replace &amp;#039;your_username&amp;#039; with your actual Tel Aviv University username&lt;br /&gt;
ssh your_username@powerslurm-login2.tau.ac.il&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you have an SSH key set up for password-less login, you can specify it like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Replace &amp;#039;your_username&amp;#039; and &amp;#039;/path/to/your/private_key&amp;#039; accordingly&lt;br /&gt;
ssh -i /path/to/your/private_key your_username@powerslurm-login.tau.ac.il&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Environment Modules ==&lt;br /&gt;
&lt;br /&gt;
Environment Modules in SLURM allow users to dynamically modify their shell environment, providing an easy way to load and unload different software applications, libraries, and their dependencies. This system helps avoid conflicts between software versions and ensures the correct environment for running specific applications.&lt;br /&gt;
&lt;br /&gt;
Here are some common commands to work with environment modules:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#List Available Modules: To see all the modules available on the system, use:&lt;br /&gt;
module avail&lt;br /&gt;
&lt;br /&gt;
#To search for a specific module by name (e.g., `gcc`), use:&lt;br /&gt;
module avail gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#Get Detailed Information About a Module: The `module spider` command provides detailed information about a module, including versions, dependencies, and descriptions:&lt;br /&gt;
module spider gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#View Module Settings: To see what environment variables and settings will be modified by a module, use:&lt;br /&gt;
module show gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#Load a Module: To set up the environment for a specific software, use the `module load` command. For example, to load GCC version 12.1.0:&lt;br /&gt;
module load gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#List Loaded Modules: To view all currently loaded modules in your session, use:&lt;br /&gt;
module list&lt;br /&gt;
&lt;br /&gt;
#Unload a Module: To unload a specific module from your environment, use:&lt;br /&gt;
module unload gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#Unload All Modules: If you need to clear your environment of all loaded modules, use:&lt;br /&gt;
module purge&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;By using these commands, you can easily manage the software environments needed for different tasks, ensuring compatibility and reducing potential conflicts between software versions.&lt;br /&gt;
&lt;br /&gt;
== Basic Job Submission Commands ==&lt;br /&gt;
&lt;br /&gt;
=== Finding Your Account and Partition ===&lt;br /&gt;
&lt;br /&gt;
Before submitting a job, you need to know which partitions you have permission to use.&lt;br /&gt;
&lt;br /&gt;
Run the command &amp;lt;code&amp;gt;check_my_partitions&amp;lt;/code&amp;gt; to view a list of all the partitions you have permission to send jobs to.&lt;br /&gt;
&lt;br /&gt;
== Submitting Jobs==&lt;br /&gt;
sbatch: Submits a job script for batch processing.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example&amp;#039;&amp;#039;&amp;#039;:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Submit pre_process.bash to the power-general partition with a 10-minute time limit&lt;br /&gt;
sbatch --ntasks=1 --time=10 -p power-general -A power-general-users pre_process.bash&lt;br /&gt;
&lt;br /&gt;
# The same submission with 1 GPU on the gpu-general partition&lt;br /&gt;
sbatch --gres=gpu:1 -p gpu-general -A gpu-general-users gpu_job.sh&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Writing SLURM Job Scripts===&lt;br /&gt;
&lt;br /&gt;
Here is a simple job script example:&lt;br /&gt;
&lt;br /&gt;
==== Basic Script====&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --job-name=my_job             # Job name&lt;br /&gt;
#SBATCH --account=power-general-users # Account name&lt;br /&gt;
#SBATCH --partition=power-general     # Partition name&lt;br /&gt;
#SBATCH --time=02:00:00               # Max run time (hh:mm:ss)&lt;br /&gt;
#SBATCH --ntasks=1                    # Number of tasks&lt;br /&gt;
#SBATCH --nodes=1                     # Number of nodes&lt;br /&gt;
#SBATCH --cpus-per-task=1             # CPUs per task&lt;br /&gt;
#SBATCH --mem-per-cpu=4G              # Memory per CPU&lt;br /&gt;
#SBATCH --output=my_job_%j.out        # Output file&lt;br /&gt;
#SBATCH --error=my_job_%j.err         # Error file&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Starting my SLURM job&amp;quot;&lt;br /&gt;
echo &amp;quot;Job ID: $SLURM_JOB_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Running on nodes: $SLURM_JOB_NODELIST&amp;quot;&lt;br /&gt;
echo &amp;quot;Allocated CPUs: $SLURM_JOB_CPUS_PER_NODE&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Your application commands go here&lt;br /&gt;
# ./my_program&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Job completed&amp;quot;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To define a job array, add:&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#SBATCH --array=1-300&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==== Script for 1 GPU ====&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --job-name=gpu_job             # Job name&lt;br /&gt;
#SBATCH --account=my_account           # Account name&lt;br /&gt;
#SBATCH --partition=gpu-general        # Partition name&lt;br /&gt;
#SBATCH --time=02:00:00                # Max run time&lt;br /&gt;
#SBATCH --ntasks=1                     # Number of tasks&lt;br /&gt;
#SBATCH --nodes=1                      # Number of nodes&lt;br /&gt;
#SBATCH --cpus-per-task=1              # CPUs per task&lt;br /&gt;
#SBATCH --gres=gpu:1                   # Number of GPUs&lt;br /&gt;
#SBATCH --mem-per-cpu=4G               # Memory per CPU&lt;br /&gt;
#SBATCH --output=my_job_%j.out         # Output file&lt;br /&gt;
#SBATCH --error=my_job_%j.err          # Error file&lt;br /&gt;
&lt;br /&gt;
module load python/python-3.8&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Starting GPU job&amp;quot;&lt;br /&gt;
echo &amp;quot;Job ID: $SLURM_JOB_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Running on nodes: $SLURM_JOB_NODELIST&amp;quot;&lt;br /&gt;
echo &amp;quot;Allocated CPUs: $SLURM_JOB_CPUS_PER_NODE&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Your GPU commands go here&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Job completed&amp;quot;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
To exclude specific nodes, add the following:&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#SBATCH --exclude=compute-0-[100-103],compute-0-67&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Importance of Correct RAM Usage in Jobs===&lt;br /&gt;
&lt;br /&gt;
When writing SLURM job scripts, it&amp;#039;s crucial to understand and correctly specify the memory requirements for your job. &lt;br /&gt;
&lt;br /&gt;
Proper memory allocation ensures efficient resource usage and prevents job failures due to out-of-memory (OOM) errors.&lt;br /&gt;
&lt;br /&gt;
==== Why Correct RAM Usage Matters ====&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Resource Efficiency&amp;#039;&amp;#039;&amp;#039;: Allocating the right amount of memory helps in optimal resource utilization, allowing more jobs to run simultaneously on the cluster.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Job Stability&amp;#039;&amp;#039;&amp;#039;: Underestimating memory requirements can lead to OOM errors, causing your job to fail and waste computational resources.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Performance&amp;#039;&amp;#039;&amp;#039;: Overestimating memory needs can lead to underutilization of resources, potentially delaying other jobs in the queue.&lt;br /&gt;
&lt;br /&gt;
==== How to Specify Memory in SLURM ====&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;--mem&amp;#039;&amp;#039;&amp;#039;: Specifies the total memory required for the job.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;--mem-per-cpu&amp;#039;&amp;#039;&amp;#039;: Specifies the memory required per allocated CPU. Note that &amp;lt;code&amp;gt;--mem&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;--mem-per-cpu&amp;lt;/code&amp;gt; are mutually exclusive; specify one or the other.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example&amp;#039;&amp;#039;&amp;#039;:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Use one of the following, not both:&lt;br /&gt;
#SBATCH --mem=4G              # Total memory for the job&lt;br /&gt;
#SBATCH --mem-per-cpu=2G      # Memory per CPU (alternative to --mem)&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Interactive Jobs===&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#Start an interactive session:&lt;br /&gt;
srun --ntasks=1 -p power-general -A power-general-users --pty bash&lt;br /&gt;
&lt;br /&gt;
#Specify a compute node:&lt;br /&gt;
srun --ntasks=1 -p power-general -A power-general-users --nodelist=&amp;quot;compute-0-12&amp;quot; --pty bash&lt;br /&gt;
&lt;br /&gt;
#Start an interactive session with X11 forwarding (for GUI applications):&lt;br /&gt;
srun --ntasks=1 -p power-general -A power-general-users --x11 /bin/bash&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Submitting RELION Jobs===&lt;br /&gt;
&lt;br /&gt;
To submit a RELION job interactively on the &amp;lt;code&amp;gt;gpu-relion&amp;lt;/code&amp;gt; queue with X11 forwarding, use the following steps:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#Start an interactive session with X11:&lt;br /&gt;
srun --ntasks=1 -p gpu-relion -A your_account --x11 --pty bash&lt;br /&gt;
#Load the RELION module:&lt;br /&gt;
module load relion/relion-4.0.1&lt;br /&gt;
#Launch RELION:&lt;br /&gt;
relion&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==AlphaFold==&lt;br /&gt;
&lt;br /&gt;
AlphaFold is a deep learning tool designed for predicting protein structures.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Guide:&amp;#039;&amp;#039;&amp;#039;  &lt;br /&gt;
[https://hpcguide.tau.ac.il/index.php?title=Alphafold AlphaFold Guide]&lt;br /&gt;
&lt;br /&gt;
==Common SLURM Commands==&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#View all queues (partitions):&lt;br /&gt;
sinfo&lt;br /&gt;
#View all jobs:&lt;br /&gt;
squeue&lt;br /&gt;
#View details of a specific job:&lt;br /&gt;
scontrol show job &amp;lt;job_number&amp;gt;&lt;br /&gt;
#Get information about partitions:&lt;br /&gt;
scontrol show partition&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Troubleshooting &amp;amp; Tips ==&lt;br /&gt;
&lt;br /&gt;
=== Common Errors ===&lt;br /&gt;
&lt;br /&gt;
# &amp;lt;code&amp;gt;srun: error: Unable to allocate resources: No partition specified or system default partition&amp;lt;/code&amp;gt;  &amp;lt;br /&amp;gt;&amp;#039;&amp;#039;&amp;#039;Solution:&amp;#039;&amp;#039;&amp;#039; Always specify a partition. Example:  &amp;lt;code&amp;gt;srun --pty -c 1 --mem=2G -p power-general /bin/bash&amp;lt;/code&amp;gt;&lt;br /&gt;
# Job failed, and &amp;lt;code&amp;gt;scontrol show job &amp;lt;job_id&amp;gt;&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;sacct -j &amp;lt;job_id&amp;gt; -o JobID,JobName,State%20&amp;lt;/code&amp;gt; shows &amp;lt;code&amp;gt;JobState=OUT_OF_MEMORY Reason=OutOfMemory&amp;lt;/code&amp;gt; or:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
JobID           JobName                State &lt;br /&gt;
------------ ---------- -------------------- &lt;br /&gt;
71             oom_test        OUT_OF_MEMORY &lt;br /&gt;
71.batch          batch        OUT_OF_MEMORY &lt;br /&gt;
71.extern        extern            COMPLETED &lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;This means the RAM requested for the job was not enough; resubmit the job with more RAM. See [https://wikihpc.tau.ac.il/index.php?title=Slurm_user_guide#Estimating_RAM_Usage Estimating RAM Usage] for help estimating how much RAM your job may need.&lt;br /&gt;
&lt;br /&gt;
=== Chain Jobs ===&lt;br /&gt;
Use the &amp;lt;code&amp;gt;--dependency&amp;lt;/code&amp;gt; flag to set job dependencies.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sbatch --ntasks=1 --time=60 -p power-general -A power-general-users --dependency=afterok:45001 do_work.bash&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
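In scripts it is convenient to capture the first job's ID and feed it into the dependency flag. &amp;lt;code&amp;gt;sbatch --parsable&amp;lt;/code&amp;gt; prints just the job ID; without it, sbatch prints &amp;lt;code&amp;gt;Submitted batch job &amp;lt;id&amp;gt;&amp;lt;/code&amp;gt; and the number has to be extracted. A sketch of both patterns (the script names follow the examples above; the extraction helper is illustrative):

```shell
# Typical chaining pattern (requires a SLURM cluster to actually run):
#   jid=$(sbatch --parsable --ntasks=1 -p power-general -A power-general-users pre_process.bash)
#   sbatch --dependency=afterok:"$jid" --ntasks=1 -p power-general -A power-general-users do_work.bash

# Without --parsable, sbatch prints "Submitted batch job <id>";
# extract_job_id pulls the numeric ID out of that line (illustrative helper).
extract_job_id() {
  echo "$1" | awk '/^Submitted batch job [0-9]+$/ { print $4 }'
}

extract_job_id "Submitted batch job 45001"   # prints 45001
```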
&lt;br /&gt;
=== Always Specify Resources ===&lt;br /&gt;
When submitting jobs, ensure you include all required resources like partition, memory, and CPUs to avoid job failures.&lt;br /&gt;
&lt;br /&gt;
=== Attaching to Running Jobs ===&lt;br /&gt;
If you need to monitor or interact with a running job, use &amp;lt;code&amp;gt;sattach&amp;lt;/code&amp;gt;. This command attaches to a job&amp;#039;s input, output, and error streams in real time.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sattach &amp;lt;job_id&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To view job steps of a specific job, use the following command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
scontrol show job &amp;lt;job_id&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Look for sections labeled &amp;quot;StepId&amp;quot; within the output. &lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;For specific job steps, use:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sattach &amp;lt;job_id.step_id&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Note:&amp;#039;&amp;#039;&amp;#039; &amp;lt;code&amp;gt;sattach&amp;lt;/code&amp;gt; is particularly useful for interactive jobs, where you can provide input directly. For non-interactive jobs, it acts like &amp;lt;code&amp;gt;tail -f&amp;lt;/code&amp;gt;, allowing you to monitor the output stream.&lt;br /&gt;
&lt;br /&gt;
=== Estimating RAM Usage ===&lt;br /&gt;
&lt;br /&gt;
When writing SLURM job scripts, it&amp;#039;s crucial to understand and correctly specify the memory requirements for your job. Proper memory allocation ensures efficient resource usage and prevents job failures due to out-of-memory (OOM) errors.&lt;br /&gt;
&lt;br /&gt;
==== Tips for Estimating RAM Usage ====&lt;br /&gt;
&lt;br /&gt;
* Check Application Documentation: Refer to the official documentation or user guides for memory-related information.&lt;br /&gt;
* Run a Small Test Job: Submit a smaller version of your job and monitor its memory usage using commands like &amp;lt;code&amp;gt;free -m&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;top&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;htop&amp;lt;/code&amp;gt;.&lt;br /&gt;
* Use Profiling Tools: Tools like &amp;lt;code&amp;gt;valgrind&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;gprof&amp;lt;/code&amp;gt;, or built-in profilers can help you understand memory usage.&lt;br /&gt;
* Analyze Previous Jobs: Review SLURM logs and job statistics for insights into memory consumption of past jobs.&lt;br /&gt;
* Consult with Peers or Experts: Ask colleagues or experts who have experience with similar workloads.&lt;br /&gt;
&lt;br /&gt;
==== Example: Monitoring Memory Usage ====&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
&lt;br /&gt;
#SBATCH --job-name=memory_test&lt;br /&gt;
#SBATCH --account=your_account&lt;br /&gt;
#SBATCH --partition=your_partition&lt;br /&gt;
#SBATCH --time=01:00:00&lt;br /&gt;
#SBATCH --ntasks=1&lt;br /&gt;
#SBATCH --cpus-per-task=1&lt;br /&gt;
#SBATCH --mem=4G&lt;br /&gt;
#SBATCH --output=memory_test.out&lt;br /&gt;
#SBATCH --error=memory_test.err&lt;br /&gt;
&lt;br /&gt;
# Monitor memory usage&lt;br /&gt;
echo &amp;quot;Memory usage before running the job:&amp;quot;&lt;br /&gt;
free -m&lt;br /&gt;
&lt;br /&gt;
# Your application commands go here&lt;br /&gt;
# ./your_application&lt;br /&gt;
&lt;br /&gt;
# Monitor memory usage after running the job&lt;br /&gt;
echo &amp;quot;Memory usage after running the job:&amp;quot;&lt;br /&gt;
free -m&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== General Tips ====&lt;br /&gt;
&lt;br /&gt;
* Start Small: Begin with a conservative memory request and increase it based on observed usage.&lt;br /&gt;
* Consider Peak Usage: Plan for peak memory usage to avoid OOM errors.&lt;br /&gt;
* Use SLURM&amp;#039;s Memory Reporting: Use &amp;lt;code&amp;gt;sacct&amp;lt;/code&amp;gt; to view memory usage statistics.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sacct -j &amp;lt;job_id&amp;gt; --format=JobID,JobName,MaxRSS,Elapsed&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;/div&gt;</summary>
		<author><name>Dvory</name></author>
	</entry>
	<entry>
		<id>https://hpcguide.tau.ac.il/index.php?title=Submitting_a_job_to_a_slurm_queue&amp;diff=1501</id>
		<title>Submitting a job to a slurm queue</title>
		<link rel="alternate" type="text/html" href="https://hpcguide.tau.ac.il/index.php?title=Submitting_a_job_to_a_slurm_queue&amp;diff=1501"/>
		<updated>2025-01-14T06:39:24Z</updated>

		<summary type="html">&lt;p&gt;Dvory: /* Writing SLURM Job Scripts */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Accessing the System ==&lt;br /&gt;
&lt;br /&gt;
To submit jobs to SLURM at Tel Aviv University, you need to access the system through one of the following login nodes:&lt;br /&gt;
&lt;br /&gt;
* powerslurm-login.tau.ac.il&lt;br /&gt;
* powerslurm-login2.tau.ac.il&lt;br /&gt;
&lt;br /&gt;
=== Requirements for Access ===&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Group Membership&amp;#039;&amp;#039;&amp;#039;: You must be part of the &amp;quot;power&amp;quot; group to access the resources.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;University Credentials&amp;#039;&amp;#039;&amp;#039;: Use your Tel Aviv University username and password to log in.&lt;br /&gt;
&lt;br /&gt;
These login nodes are your starting point for submitting jobs, checking job status, and managing your SLURM tasks.&lt;br /&gt;
&lt;br /&gt;
=== SSH Example ===&lt;br /&gt;
&lt;br /&gt;
To access the system using SSH, use the following example:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Replace &amp;#039;your_username&amp;#039; with your actual Tel Aviv University username&lt;br /&gt;
ssh your_username@powerslurm-login.tau.ac.il&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you want to connect to the second login node, use:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Replace &amp;#039;your_username&amp;#039; with your actual Tel Aviv University username&lt;br /&gt;
ssh your_username@powerslurm-login2.tau.ac.il&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you have an SSH key set up for password-less login, you can specify it like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Replace &amp;#039;your_username&amp;#039; and &amp;#039;/path/to/your/private_key&amp;#039; accordingly&lt;br /&gt;
ssh -i /path/to/your/private_key your_username@powerslurm-login.tau.ac.il&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Environment Modules ==&lt;br /&gt;
&lt;br /&gt;
Environment Modules in SLURM allow users to dynamically modify their shell environment, providing an easy way to load and unload different software applications, libraries, and their dependencies. This system helps avoid conflicts between software versions and ensures the correct environment for running specific applications.&lt;br /&gt;
&lt;br /&gt;
Here are some common commands to work with environment modules:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#List Available Modules: To see all the modules available on the system, use:&lt;br /&gt;
module avail&lt;br /&gt;
&lt;br /&gt;
#To search for a specific module by name (e.g., `gcc`), use:&lt;br /&gt;
module avail gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#Get Detailed Information About a Module: The `module spider` command provides detailed information about a module, including versions, dependencies, and descriptions:&lt;br /&gt;
module spider gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#View Module Settings: To see what environment variables and settings will be modified by a module, use:&lt;br /&gt;
module show gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#Load a Module: To set up the environment for a specific software, use the `module load` command. For example, to load GCC version 12.1.0:&lt;br /&gt;
module load gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#List Loaded Modules: To view all currently loaded modules in your session, use:&lt;br /&gt;
module list&lt;br /&gt;
&lt;br /&gt;
#Unload a Module: To unload a specific module from your environment, use:&lt;br /&gt;
module unload gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#Unload All Modules: If you need to clear your environment of all loaded modules, use:&lt;br /&gt;
module purge&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;By using these commands, you can easily manage the software environments needed for different tasks, ensuring compatibility and reducing potential conflicts between software versions.&lt;br /&gt;
&lt;br /&gt;
== Basic Job Submission Commands ==&lt;br /&gt;
&lt;br /&gt;
=== Finding Your Account and Partition ===&lt;br /&gt;
&lt;br /&gt;
Before submitting a job, you need to know which partitions you have permission to use.&lt;br /&gt;
&lt;br /&gt;
Run the command &amp;lt;code&amp;gt;check_my_partitions&amp;lt;/code&amp;gt; to view a list of all the partitions you have permission to send jobs to.&lt;br /&gt;
&lt;br /&gt;
== Submitting Jobs==&lt;br /&gt;
sbatch: Submits a job script for batch processing.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example&amp;#039;&amp;#039;&amp;#039;:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Submit pre_process.bash to the power-general partition with a 10-minute time limit:&lt;br /&gt;
sbatch --ntasks=1 --time=10 -p power-general -A power-general-users pre_process.bash&lt;br /&gt;
&lt;br /&gt;
# Submit a job that requests 1 GPU:&lt;br /&gt;
sbatch --gres=gpu:1 -p gpu-general -A gpu-general-users gpu_job.sh&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
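&amp;lt;code&amp;gt;--time&amp;lt;/code&amp;gt; accepts several formats (&amp;lt;code&amp;gt;minutes&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;hours:minutes:seconds&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;days-hours&amp;lt;/code&amp;gt;, among others), so &amp;lt;code&amp;gt;--time=10&amp;lt;/code&amp;gt; above means ten minutes, while the job scripts below use &amp;lt;code&amp;gt;02:00:00&amp;lt;/code&amp;gt; for two hours. A small sketch that converts an hh:mm:ss limit to total minutes, handy for sanity-checking requests (the helper name is illustrative):

```shell
# hms_to_minutes: convert an hh:mm:ss time limit to whole minutes (rounded down).
# Illustrative helper for sanity-checking --time values, not a SLURM command.
hms_to_minutes() {
  echo "$1" | awk -F: '{ print ($1 * 60) + $2 + int($3 / 60) }'
}

hms_to_minutes 02:00:00   # prints 120
hms_to_minutes 01:30:00   # prints 90
```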
&lt;br /&gt;
===Writing SLURM Job Scripts===&lt;br /&gt;
&lt;br /&gt;
Here is a simple job script example:&lt;br /&gt;
&lt;br /&gt;
==== Basic Script====&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --job-name=my_job             # Job name&lt;br /&gt;
#SBATCH --account=power-general-users # Account name&lt;br /&gt;
#SBATCH --partition=power-general     # Partition name&lt;br /&gt;
#SBATCH --time=02:00:00               # Max run time (hh:mm:ss)&lt;br /&gt;
#SBATCH --ntasks=1                    # Number of tasks&lt;br /&gt;
#SBATCH --nodes=1                     # Number of nodes&lt;br /&gt;
#SBATCH --cpus-per-task=1             # CPUs per task&lt;br /&gt;
#SBATCH --mem-per-cpu=4G              # Memory per CPU&lt;br /&gt;
#SBATCH --output=my_job_%j.out        # Output file&lt;br /&gt;
#SBATCH --error=my_job_%j.err         # Error file&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Starting my SLURM job&amp;quot;&lt;br /&gt;
echo &amp;quot;Job ID: $SLURM_JOB_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Running on nodes: $SLURM_JOB_NODELIST&amp;quot;&lt;br /&gt;
echo &amp;quot;Allocated CPUs: $SLURM_JOB_CPUS_PER_NODE&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Your application commands go here&lt;br /&gt;
# ./my_program&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Job completed&amp;quot;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Script for 1 GPU ====&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --job-name=gpu_job             # Job name&lt;br /&gt;
#SBATCH --account=my_account           # Account name&lt;br /&gt;
#SBATCH --partition=gpu-general        # Partition name&lt;br /&gt;
#SBATCH --time=02:00:00                # Max run time&lt;br /&gt;
#SBATCH --ntasks=1                     # Number of tasks&lt;br /&gt;
#SBATCH --nodes=1                      # Number of nodes&lt;br /&gt;
#SBATCH --cpus-per-task=1              # CPUs per task&lt;br /&gt;
#SBATCH --gres=gpu:1                   # Number of GPUs&lt;br /&gt;
#SBATCH --mem-per-cpu=4G               # Memory per CPU&lt;br /&gt;
#SBATCH --output=my_job_%j.out         # Output file&lt;br /&gt;
#SBATCH --error=my_job_%j.err          # Error file&lt;br /&gt;
&lt;br /&gt;
module load python/python-3.8&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Starting GPU job&amp;quot;&lt;br /&gt;
echo &amp;quot;Job ID: $SLURM_JOB_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Running on nodes: $SLURM_JOB_NODELIST&amp;quot;&lt;br /&gt;
echo &amp;quot;Allocated CPUs: $SLURM_JOB_CPUS_PER_NODE&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Your GPU commands go here&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Job completed&amp;quot;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
To exclude specific nodes, add the following:&lt;br /&gt;
&amp;lt;syntaxhighlight&amp;gt;&lt;br /&gt;
#SBATCH --exclude=compute-0-[100-103],compute-0-67&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Importance of Correct RAM Usage in Jobs===&lt;br /&gt;
&lt;br /&gt;
When writing SLURM job scripts, it&amp;#039;s crucial to understand and correctly specify the memory requirements for your job. &lt;br /&gt;
&lt;br /&gt;
Proper memory allocation ensures efficient resource usage and prevents job failures due to out-of-memory (OOM) errors.&lt;br /&gt;
&lt;br /&gt;
==== Why Correct RAM Usage Matters ====&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Resource Efficiency&amp;#039;&amp;#039;&amp;#039;: Allocating the right amount of memory helps in optimal resource utilization, allowing more jobs to run simultaneously on the cluster.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Job Stability&amp;#039;&amp;#039;&amp;#039;: Underestimating memory requirements can lead to OOM errors, causing your job to fail and waste computational resources.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Performance&amp;#039;&amp;#039;&amp;#039;: Overestimating memory needs can lead to underutilization of resources, potentially delaying other jobs in the queue.&lt;br /&gt;
&lt;br /&gt;
==== How to Specify Memory in SLURM ====&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;--mem&amp;#039;&amp;#039;&amp;#039;: Specifies the total memory required for the job.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;--mem-per-cpu&amp;#039;&amp;#039;&amp;#039;: Specifies the memory required per CPU.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example&amp;#039;&amp;#039;&amp;#039;:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#SBATCH --mem=4G              # Total memory for the job&lt;br /&gt;
#SBATCH --mem-per-cpu=2G      # Memory per CPU&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
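Note that &amp;lt;code&amp;gt;--mem&amp;lt;/code&amp;gt; requests memory per node, while &amp;lt;code&amp;gt;--mem-per-cpu&amp;lt;/code&amp;gt; is multiplied by the number of CPUs allocated on the node (the two flags are mutually exclusive). A minimal sketch of that arithmetic, with an illustrative helper name:

```shell
# total_mem_gb: per-node memory implied by --mem-per-cpu, i.e.
# mem-per-cpu (in GB) multiplied by the CPUs allocated on that node.
# Illustrative helper, not a SLURM command.
total_mem_gb() {
  mem_per_cpu_gb=$1
  cpus=$2
  echo $(( mem_per_cpu_gb * cpus ))
}

# 8 CPUs at 2G each is equivalent to --mem=16G on that node:
total_mem_gb 2 8   # prints 16
```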
&lt;br /&gt;
===Interactive Jobs===&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#Start an interactive session:&lt;br /&gt;
srun --ntasks=1 -p power-general -A power-general-users --pty bash&lt;br /&gt;
&lt;br /&gt;
#Specify a compute node:&lt;br /&gt;
srun --ntasks=1 -p power-general -A power-general-users --nodelist=&amp;quot;compute-0-12&amp;quot; --pty bash&lt;br /&gt;
&lt;br /&gt;
#Using GUI:&lt;br /&gt;
srun --ntasks=1 -p power-general -A power-general-users --x11 --pty /bin/bash&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Submitting RELION Jobs===&lt;br /&gt;
&lt;br /&gt;
To submit a RELION job interactively on the &amp;lt;code&amp;gt;gpu-relion&amp;lt;/code&amp;gt; queue with X11 forwarding, use the following steps:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#Start an interactive session with X11:&lt;br /&gt;
srun --ntasks=1 -p gpu-relion -A your_account --x11 --pty bash&lt;br /&gt;
#Load the RELION module:&lt;br /&gt;
module load relion/relion-4.0.1&lt;br /&gt;
#Launch RELION:&lt;br /&gt;
relion&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==AlphaFold==&lt;br /&gt;
&lt;br /&gt;
AlphaFold is a deep learning tool designed for predicting protein structures.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Guide:&amp;#039;&amp;#039;&amp;#039;  &lt;br /&gt;
[https://hpcguide.tau.ac.il/index.php?title=Alphafold AlphaFold Guide]&lt;br /&gt;
&lt;br /&gt;
==Common SLURM Commands==&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#View all queues (partitions):&lt;br /&gt;
sinfo&lt;br /&gt;
#View all jobs:&lt;br /&gt;
squeue&lt;br /&gt;
#View details of a specific job:&lt;br /&gt;
scontrol show job &amp;lt;job_number&amp;gt;&lt;br /&gt;
#Get information about partitions:&lt;br /&gt;
scontrol show partition&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Troubleshooting &amp;amp; Tips ==&lt;br /&gt;
&lt;br /&gt;
=== Common Errors ===&lt;br /&gt;
&lt;br /&gt;
# &amp;lt;code&amp;gt;srun: error: Unable to allocate resources: No partition specified or system default partition&amp;lt;/code&amp;gt;  &amp;lt;br /&amp;gt;&amp;#039;&amp;#039;&amp;#039;Solution:&amp;#039;&amp;#039;&amp;#039; Always specify a partition. Example:  &amp;lt;code&amp;gt;srun --pty -c 1 --mem=2G -p power-general /bin/bash&amp;lt;/code&amp;gt;&lt;br /&gt;
# Job failed, and &amp;lt;code&amp;gt;scontrol show job &amp;lt;job_id&amp;gt;&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;sacct -j &amp;lt;job_id&amp;gt; -o JobID,JobName,State%20&amp;lt;/code&amp;gt; &amp;lt;br /&amp;gt;shows &amp;lt;code&amp;gt;JobState=OUT_OF_MEMORY Reason=OutOfMemory&amp;lt;/code&amp;gt; or:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
JobID           JobName                State &lt;br /&gt;
------------ ---------- -------------------- &lt;br /&gt;
71             oom_test        OUT_OF_MEMORY &lt;br /&gt;
71.batch          batch        OUT_OF_MEMORY &lt;br /&gt;
71.extern        extern            COMPLETED &lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;it means the RAM requested for the job was insufficient. Resubmit the job with a larger memory request. See [https://wikihpc.tau.ac.il/index.php?title=Slurm_user_guide#Estimating_RAM_Usage Estimating RAM Usage] for help determining how much RAM your job may need.&lt;br /&gt;
&lt;br /&gt;
=== Chain Jobs ===&lt;br /&gt;
Use the &amp;lt;code&amp;gt;--dependency&amp;lt;/code&amp;gt; (&amp;lt;code&amp;gt;-d&amp;lt;/code&amp;gt;) flag to set job dependencies, for example &amp;lt;code&amp;gt;--dependency=afterok:&amp;lt;job_id&amp;gt;&amp;lt;/code&amp;gt; to start a job only after another job completes successfully.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sbatch --ntasks=1 --time=60 -p power-general -A power-general-users --dependency=afterok:45001 do_work.bash&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Always Specify Resources ===&lt;br /&gt;
When submitting jobs, ensure you include all required resources like partition, memory, and CPUs to avoid job failures.&lt;br /&gt;
&lt;br /&gt;
=== Attaching to Running Jobs ===&lt;br /&gt;
If you need to monitor or interact with a running job, use &amp;lt;code&amp;gt;sattach&amp;lt;/code&amp;gt;. This command attaches to a job&amp;#039;s input, output, and error streams in real time.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sattach &amp;lt;job_id&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To view job steps of a specific job, use the following command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
scontrol show job &amp;lt;job_id&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Look for sections labeled &amp;quot;StepId&amp;quot; within the output. &lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;For specific job steps, use:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sattach &amp;lt;job_id.step_id&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Note:&amp;#039;&amp;#039;&amp;#039; &amp;lt;code&amp;gt;sattach&amp;lt;/code&amp;gt; is particularly useful for interactive jobs, where you can provide input directly. For non-interactive jobs, it acts like &amp;lt;code&amp;gt;tail -f&amp;lt;/code&amp;gt;, allowing you to monitor the output stream.&lt;br /&gt;
&lt;br /&gt;
=== Estimating RAM Usage ===&lt;br /&gt;
&lt;br /&gt;
When writing SLURM job scripts, it&amp;#039;s crucial to understand and correctly specify the memory requirements for your job. Proper memory allocation ensures efficient resource usage and prevents job failures due to out-of-memory (OOM) errors.&lt;br /&gt;
&lt;br /&gt;
==== Tips for Estimating RAM Usage ====&lt;br /&gt;
&lt;br /&gt;
* Check Application Documentation: Refer to the official documentation or user guides for memory-related information.&lt;br /&gt;
* Run a Small Test Job: Submit a smaller version of your job and monitor its memory usage using commands like &amp;lt;code&amp;gt;free -m&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;top&amp;lt;/code&amp;gt;, or &amp;lt;code&amp;gt;htop&amp;lt;/code&amp;gt;.&lt;br /&gt;
* Use Profiling Tools: Tools like &amp;lt;code&amp;gt;valgrind&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;gprof&amp;lt;/code&amp;gt;, or built-in profilers can help you understand memory usage.&lt;br /&gt;
* Analyze Previous Jobs: Review SLURM logs and job statistics for insights into memory consumption of past jobs.&lt;br /&gt;
* Consult with Peers or Experts: Ask colleagues or experts who have experience with similar workloads.&lt;br /&gt;
&lt;br /&gt;
==== Example: Monitoring Memory Usage ====&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
&lt;br /&gt;
#SBATCH --job-name=memory_test&lt;br /&gt;
#SBATCH --account=your_account&lt;br /&gt;
#SBATCH --partition=your_partition&lt;br /&gt;
#SBATCH --time=01:00:00&lt;br /&gt;
#SBATCH --ntasks=1&lt;br /&gt;
#SBATCH --cpus-per-task=1&lt;br /&gt;
#SBATCH --mem=4G&lt;br /&gt;
#SBATCH --output=memory_test.out&lt;br /&gt;
#SBATCH --error=memory_test.err&lt;br /&gt;
&lt;br /&gt;
# Monitor memory usage&lt;br /&gt;
echo &amp;quot;Memory usage before running the job:&amp;quot;&lt;br /&gt;
free -m&lt;br /&gt;
&lt;br /&gt;
# Your application commands go here&lt;br /&gt;
# ./your_application&lt;br /&gt;
&lt;br /&gt;
# Monitor memory usage after running the job&lt;br /&gt;
echo &amp;quot;Memory usage after running the job:&amp;quot;&lt;br /&gt;
free -m&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== General Tips ====&lt;br /&gt;
&lt;br /&gt;
* Start Small: Begin with a conservative memory request and increase it based on observed usage.&lt;br /&gt;
* Consider Peak Usage: Plan for peak memory usage to avoid OOM errors.&lt;br /&gt;
* Use SLURM&amp;#039;s Memory Reporting: Use &amp;lt;code&amp;gt;sacct&amp;lt;/code&amp;gt; to view memory usage statistics.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sacct -j &amp;lt;job_id&amp;gt; --format=JobID,JobName,MaxRSS,Elapsed&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;/div&gt;</summary>
		<author><name>Dvory</name></author>
	</entry>
	<entry>
		<id>https://hpcguide.tau.ac.il/index.php?title=Submitting_a_job_to_a_slurm_queue&amp;diff=1500</id>
		<title>Submitting a job to a slurm queue</title>
		<link rel="alternate" type="text/html" href="https://hpcguide.tau.ac.il/index.php?title=Submitting_a_job_to_a_slurm_queue&amp;diff=1500"/>
		<updated>2024-12-26T09:17:08Z</updated>

		<summary type="html">&lt;p&gt;Dvory: /* Script for 1 GPU */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Accessing the System ==&lt;br /&gt;
&lt;br /&gt;
To submit jobs to SLURM at Tel Aviv University, you need to access the system through one of the following login nodes:&lt;br /&gt;
&lt;br /&gt;
* powerslurm-login.tau.ac.il&lt;br /&gt;
* powerslurm-login2.tau.ac.il&lt;br /&gt;
&lt;br /&gt;
=== Requirements for Access ===&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Group Membership&amp;#039;&amp;#039;&amp;#039;: You must be part of the &amp;quot;power&amp;quot; group to access the resources.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;University Credentials&amp;#039;&amp;#039;&amp;#039;: Use your Tel Aviv University username and password to log in.&lt;br /&gt;
&lt;br /&gt;
These login nodes are your starting point for submitting jobs, checking job status, and managing your SLURM tasks.&lt;br /&gt;
&lt;br /&gt;
=== SSH Example ===&lt;br /&gt;
&lt;br /&gt;
To access the system using SSH, use the following example:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Replace &amp;#039;your_username&amp;#039; with your actual Tel Aviv University username&lt;br /&gt;
ssh your_username@powerslurm-login.tau.ac.il&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you want to connect to the second login node, use:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Replace &amp;#039;your_username&amp;#039; with your actual Tel Aviv University username&lt;br /&gt;
ssh your_username@powerslurm-login2.tau.ac.il&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you have an SSH key set up for password-less login, you can specify it like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Replace &amp;#039;your_username&amp;#039; and &amp;#039;/path/to/your/private_key&amp;#039; accordingly&lt;br /&gt;
ssh -i /path/to/your/private_key your_username@powerslurm-login.tau.ac.il&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Environment Modules ==&lt;br /&gt;
&lt;br /&gt;
Environment Modules in SLURM allow users to dynamically modify their shell environment, providing an easy way to load and unload different software applications, libraries, and their dependencies. This system helps avoid conflicts between software versions and ensures the correct environment for running specific applications.&lt;br /&gt;
&lt;br /&gt;
Here are some common commands to work with environment modules:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#List Available Modules: To see all the modules available on the system, use:&lt;br /&gt;
module avail&lt;br /&gt;
&lt;br /&gt;
#To search for a specific module by name (e.g., `gcc`), use:&lt;br /&gt;
module avail gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#Get Detailed Information About a Module: The `module spider` command provides detailed information about a module, including versions, dependencies, and descriptions:&lt;br /&gt;
module spider gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#View Module Settings: To see what environment variables and settings will be modified by a module, use:&lt;br /&gt;
module show gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#Load a Module: To set up the environment for a specific software, use the `module load` command. For example, to load GCC version 12.1.0:&lt;br /&gt;
module load gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#List Loaded Modules: To view all currently loaded modules in your session, use:&lt;br /&gt;
module list&lt;br /&gt;
&lt;br /&gt;
#Unload a Module: To unload a specific module from your environment, use:&lt;br /&gt;
module unload gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#Unload All Modules: If you need to clear your environment of all loaded modules, use:&lt;br /&gt;
module purge&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;By using these commands, you can easily manage the software environments needed for different tasks, ensuring compatibility and reducing potential conflicts between software versions.&lt;br /&gt;
&lt;br /&gt;
== Basic Job Submission Commands ==&lt;br /&gt;
&lt;br /&gt;
=== Finding Your Account and Partition ===&lt;br /&gt;
&lt;br /&gt;
Before submitting a job, you need to know which partitions you have permission to use.&lt;br /&gt;
&lt;br /&gt;
Run the command &amp;lt;code&amp;gt;check_my_partitions&amp;lt;/code&amp;gt; to view a list of all the partitions you have permission to send jobs to.&lt;br /&gt;
&lt;br /&gt;
== Submitting Jobs==&lt;br /&gt;
sbatch: Submits a job script for batch processing.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example&amp;#039;&amp;#039;&amp;#039;:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Submit pre_process.bash to the power-general partition with a 10-minute time limit:&lt;br /&gt;
sbatch --ntasks=1 --time=10 -p power-general -A power-general-users pre_process.bash&lt;br /&gt;
&lt;br /&gt;
# Submit a job that requests 1 GPU:&lt;br /&gt;
sbatch --gres=gpu:1 -p gpu-general -A gpu-general-users gpu_job.sh&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Writing SLURM Job Scripts===&lt;br /&gt;
&lt;br /&gt;
Here is a simple job script example:&lt;br /&gt;
&lt;br /&gt;
==== Basic Script====&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --job-name=my_job             # Job name&lt;br /&gt;
#SBATCH --account=power-general-users # Account name&lt;br /&gt;
#SBATCH --partition=power-general     # Partition name&lt;br /&gt;
#SBATCH --time=02:00:00               # Max run time (hh:mm:ss)&lt;br /&gt;
#SBATCH --ntasks=1                    # Number of tasks&lt;br /&gt;
#SBATCH --nodes=1                     # Number of nodes&lt;br /&gt;
#SBATCH --cpus-per-task=1             # CPUs per task&lt;br /&gt;
#SBATCH --mem-per-cpu=4G              # Memory per CPU&lt;br /&gt;
#SBATCH --output=my_job_%j.out        # Output file&lt;br /&gt;
#SBATCH --error=my_job_%j.err         # Error file&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Starting my SLURM job&amp;quot;&lt;br /&gt;
echo &amp;quot;Job ID: $SLURM_JOB_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Running on nodes: $SLURM_JOB_NODELIST&amp;quot;&lt;br /&gt;
echo &amp;quot;Allocated CPUs: $SLURM_JOB_CPUS_PER_NODE&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Your application commands go here&lt;br /&gt;
# ./my_program&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Job completed&amp;quot;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Script for 1 GPU ====&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --job-name=gpu_job             # Job name&lt;br /&gt;
#SBATCH --account=my_account           # Account name&lt;br /&gt;
#SBATCH --partition=gpu-general        # Partition name&lt;br /&gt;
#SBATCH --time=02:00:00                # Max run time&lt;br /&gt;
#SBATCH --ntasks=1                     # Number of tasks&lt;br /&gt;
#SBATCH --nodes=1                      # Number of nodes&lt;br /&gt;
#SBATCH --cpus-per-task=1              # CPUs per task&lt;br /&gt;
#SBATCH --gres=gpu:1                   # Number of GPUs&lt;br /&gt;
#SBATCH --mem-per-cpu=4G               # Memory per CPU&lt;br /&gt;
#SBATCH --output=my_job_%j.out         # Output file&lt;br /&gt;
#SBATCH --error=my_job_%j.err          # Error file&lt;br /&gt;
&lt;br /&gt;
module load python/python-3.8&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Starting GPU job&amp;quot;&lt;br /&gt;
echo &amp;quot;Job ID: $SLURM_JOB_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Running on nodes: $SLURM_JOB_NODELIST&amp;quot;&lt;br /&gt;
echo &amp;quot;Allocated CPUs: $SLURM_JOB_CPUS_PER_NODE&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Your GPU commands go here&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Job completed&amp;quot;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Importance of Correct RAM Usage in Jobs===&lt;br /&gt;
&lt;br /&gt;
When writing SLURM job scripts, it&amp;#039;s crucial to understand and correctly specify the memory requirements for your job. &lt;br /&gt;
&lt;br /&gt;
Proper memory allocation ensures efficient resource usage and prevents job failures due to out-of-memory (OOM) errors.&lt;br /&gt;
&lt;br /&gt;
==== Why Correct RAM Usage Matters ====&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Resource Efficiency&amp;#039;&amp;#039;&amp;#039;: Allocating the right amount of memory helps in optimal resource utilization, allowing more jobs to run simultaneously on the cluster.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Job Stability&amp;#039;&amp;#039;&amp;#039;: Underestimating memory requirements can lead to OOM errors, causing your job to fail and waste computational resources.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Performance&amp;#039;&amp;#039;&amp;#039;: Overestimating memory needs can lead to underutilization of resources, potentially delaying other jobs in the queue.&lt;br /&gt;
&lt;br /&gt;
==== How to Specify Memory in SLURM ====&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;--mem&amp;#039;&amp;#039;&amp;#039;: Specifies the total memory required for the job.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;--mem-per-cpu&amp;#039;&amp;#039;&amp;#039;: Specifies the memory required per CPU.&lt;br /&gt;
&lt;br /&gt;
Note that &amp;lt;code&amp;gt;--mem&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;--mem-per-cpu&amp;lt;/code&amp;gt; are mutually exclusive; specify one or the other, not both.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example&amp;#039;&amp;#039;&amp;#039;:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#SBATCH --mem=4G              # Total memory for the job, OR:&lt;br /&gt;
#SBATCH --mem-per-cpu=2G      # Memory per allocated CPU&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
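With &amp;lt;code&amp;gt;--mem-per-cpu&amp;lt;/code&amp;gt;, the total memory granted to the job is the per-CPU value multiplied by the number of allocated CPUs. A minimal shell sketch of that arithmetic (the request sizes below are illustrative, not read from SLURM):&lt;br /&gt;

```shell
#!/bin/sh
# Illustrative request: 4 tasks, 2 CPUs per task, 2G per CPU
# (example values, not read from an actual SLURM allocation)
ntasks=4
cpus_per_task=2
mem_per_cpu_gb=2
# Total memory SLURM would grant under --mem-per-cpu
total_gb=$((ntasks * cpus_per_task * mem_per_cpu_gb))
echo "Total job memory: ${total_gb}G"
```

This request would be granted 16G in total, so for this job &amp;lt;code&amp;gt;--mem=16G&amp;lt;/code&amp;gt; would be equivalent.&lt;br /&gt;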
&lt;br /&gt;
===Interactive Jobs===&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#Start an interactive session:&lt;br /&gt;
srun --ntasks=1 -p power-general -A power-general-users --pty bash&lt;br /&gt;
&lt;br /&gt;
#Specify a compute node:&lt;br /&gt;
srun --ntasks=1 -p power-general -A power-general-users --nodelist=&amp;quot;compute-0-12&amp;quot; --pty bash&lt;br /&gt;
&lt;br /&gt;
#Using GUI:&lt;br /&gt;
srun --ntasks=1 -p power-general -A power-general-users --x11 --pty bash&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Submitting RELION Jobs===&lt;br /&gt;
&lt;br /&gt;
To submit a RELION job interactively on the &amp;lt;code&amp;gt;gpu-relion&amp;lt;/code&amp;gt; queue with X11 forwarding, use the following steps:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#Start an interactive session with X11:&lt;br /&gt;
srun --ntasks=1 -p gpu-relion -A your_account --x11 --pty bash&lt;br /&gt;
#Load the RELION module:&lt;br /&gt;
module load relion/relion-4.0.1&lt;br /&gt;
#Launch RELION:&lt;br /&gt;
relion&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==AlphaFold==&lt;br /&gt;
&lt;br /&gt;
AlphaFold is a deep learning tool designed for predicting protein structures.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Guide:&amp;#039;&amp;#039;&amp;#039;  &lt;br /&gt;
[https://hpcguide.tau.ac.il/index.php?title=Alphafold AlphaFold Guide]&lt;br /&gt;
&lt;br /&gt;
==Common SLURM Commands==&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#View all queues (partitions):&lt;br /&gt;
sinfo&lt;br /&gt;
#View all jobs:&lt;br /&gt;
squeue&lt;br /&gt;
#View details of a specific job:&lt;br /&gt;
scontrol show job &amp;lt;job_number&amp;gt;&lt;br /&gt;
#Get information about partitions:&lt;br /&gt;
scontrol show partition&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
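Beyond the basics above, a few everyday filters are worth knowing (the partition name below is an example; these run on the cluster login nodes):&lt;br /&gt;

```
# View only your own jobs
squeue -u $USER
# View jobs waiting or running in one partition
squeue -p power-general
# Node states for one partition
sinfo -p power-general
```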
&lt;br /&gt;
== Troubleshooting &amp;amp; Tips ==&lt;br /&gt;
&lt;br /&gt;
=== Common Errors ===&lt;br /&gt;
&lt;br /&gt;
# &amp;lt;code&amp;gt;srun: error: Unable to allocate resources: No partition specified or system default partition&amp;lt;/code&amp;gt;  &amp;lt;br /&amp;gt;&amp;#039;&amp;#039;&amp;#039;Solution:&amp;#039;&amp;#039;&amp;#039; Always specify a partition. Example:  &amp;lt;code&amp;gt;srun --pty -c 1 --mem=2G -p power-general /bin/bash&amp;lt;/code&amp;gt;&lt;br /&gt;
# Job failed, and &amp;lt;code&amp;gt;scontrol show job &amp;lt;job_id&amp;gt;&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;sacct -j &amp;lt;job_id&amp;gt; -o JobID,JobName,State%20&amp;lt;/code&amp;gt; shows &amp;lt;code&amp;gt;JobState=OUT_OF_MEMORY Reason=OutOfMemory&amp;lt;/code&amp;gt; or:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
JobID           JobName                State &lt;br /&gt;
------------ ---------- -------------------- &lt;br /&gt;
71             oom_test        OUT_OF_MEMORY &lt;br /&gt;
71.batch          batch        OUT_OF_MEMORY &lt;br /&gt;
71.extern        extern            COMPLETED &lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;This means the RAM requested for the job was insufficient. Resubmit the job with a larger memory request; see [https://wikihpc.tau.ac.il/index.php?title=Slurm_user_guide#Estimating_RAM_Usage below] for help estimating how much RAM your job needs.&lt;br /&gt;
&lt;br /&gt;
=== Chain Jobs ===&lt;br /&gt;
Use the &amp;lt;code&amp;gt;--dependency&amp;lt;/code&amp;gt; flag to set job dependencies, e.g. &amp;lt;code&amp;gt;afterok:&amp;lt;job_id&amp;gt;&amp;lt;/code&amp;gt; to start a job only after the listed job completes successfully.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sbatch --ntasks=1 --time=60 -p power-general -A power-general-users --dependency=afterok:45001 do_work.bash&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
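Rather than hard-coding the job ID, you can capture it at submission time with &amp;lt;code&amp;gt;--parsable&amp;lt;/code&amp;gt;, which makes &amp;lt;code&amp;gt;sbatch&amp;lt;/code&amp;gt; print only the job ID. A sketch reusing the script names from the examples above (runs on the cluster, not locally):&lt;br /&gt;

```
# Submit the first job and capture its ID
jid=$(sbatch --parsable --ntasks=1 -p power-general -A power-general-users pre_process.bash)
# Start the second job only if the first completes successfully
sbatch --ntasks=1 -p power-general -A power-general-users --dependency=afterok:${jid} do_work.bash
```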
&lt;br /&gt;
=== Always Specify Resources ===&lt;br /&gt;
When submitting jobs, ensure you include all required resources like partition, memory, and CPUs to avoid job failures.&lt;br /&gt;
&lt;br /&gt;
=== Attaching to Running Jobs ===&lt;br /&gt;
If you need to monitor or interact with a running job, use &amp;lt;code&amp;gt;sattach&amp;lt;/code&amp;gt;. This command allows you to attach to a job&amp;#039;s input, output, and error streams in real-time.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sattach &amp;lt;job_id&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To view job steps of a specific job, use the following command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
scontrol show job &amp;lt;job_id&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Look for sections labeled &amp;quot;StepId&amp;quot; within the output. &lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;For specific job steps, use:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sattach &amp;lt;job_id.step_id&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Note:&amp;#039;&amp;#039;&amp;#039; &amp;lt;code&amp;gt;sattach&amp;lt;/code&amp;gt; is particularly useful for interactive jobs, where you can provide input directly. For non-interactive jobs, it acts like &amp;lt;code&amp;gt;tail -f&amp;lt;/code&amp;gt;, allowing you to monitor the output stream.&lt;br /&gt;
&lt;br /&gt;
=== Estimating RAM Usage ===&lt;br /&gt;
&lt;br /&gt;
Before requesting memory, try to estimate how much your job will actually use; the tips below help you arrive at a reasonable number.&lt;br /&gt;
&lt;br /&gt;
==== Tips for Estimating RAM Usage ====&lt;br /&gt;
&lt;br /&gt;
* Check Application Documentation: Refer to the official documentation or user guides for memory-related information.&lt;br /&gt;
* Run a Small Test Job: Submit a smaller version of your job and monitor its memory usage using commands like `free -m`, `top`, or `htop`.&lt;br /&gt;
* Use Profiling Tools: Tools like `valgrind`, `gprof`, or built-in profilers can help you understand memory usage.&lt;br /&gt;
* Analyze Previous Jobs: Review SLURM logs and job statistics for insights into memory consumption of past jobs.&lt;br /&gt;
* Consult with Peers or Experts: Ask colleagues or experts who have experience with similar workloads.&lt;br /&gt;
&lt;br /&gt;
==== Example: Monitoring Memory Usage ====&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
&lt;br /&gt;
#SBATCH --job-name=memory_test&lt;br /&gt;
#SBATCH --account=your_account&lt;br /&gt;
#SBATCH --partition=your_partition&lt;br /&gt;
#SBATCH --time=01:00:00&lt;br /&gt;
#SBATCH --ntasks=1&lt;br /&gt;
#SBATCH --cpus-per-task=1&lt;br /&gt;
#SBATCH --mem=4G&lt;br /&gt;
#SBATCH --output=memory_test.out&lt;br /&gt;
#SBATCH --error=memory_test.err&lt;br /&gt;
&lt;br /&gt;
# Monitor memory usage&lt;br /&gt;
echo &amp;quot;Memory usage before running the job:&amp;quot;&lt;br /&gt;
free -m&lt;br /&gt;
&lt;br /&gt;
# Your application commands go here&lt;br /&gt;
# ./your_application&lt;br /&gt;
&lt;br /&gt;
# Monitor memory usage after running the job&lt;br /&gt;
echo &amp;quot;Memory usage after running the job:&amp;quot;&lt;br /&gt;
free -m&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
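Note that &amp;lt;code&amp;gt;free -m&amp;lt;/code&amp;gt; reports node-wide memory, not just your job&amp;#039;s. For per-job peak usage, the MaxRSS column of &amp;lt;code&amp;gt;sacct&amp;lt;/code&amp;gt; reports kibibytes with a &amp;lt;code&amp;gt;K&amp;lt;/code&amp;gt; suffix by default; a small shell sketch of the unit conversion (the sample value is illustrative, not real job data):&lt;br /&gt;

```shell
#!/bin/sh
# Sample MaxRSS value as sacct prints it; illustrative only
maxrss="1234567K"
# Strip the K suffix, then convert KiB to MiB (integer division)
kib=${maxrss%K}
mib=$((kib / 1024))
echo "Peak memory: ${mib} MiB"
```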
==== General Tips ====&lt;br /&gt;
&lt;br /&gt;
* Start Small: Begin with a conservative memory request and increase it based on observed usage.&lt;br /&gt;
* Consider Peak Usage: Plan for peak memory usage to avoid OOM errors.&lt;br /&gt;
* Use SLURM&amp;#039;s Memory Reporting: Use `sacct` to view memory usage statistics.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sacct -j &amp;lt;job_id&amp;gt; --format=JobID,JobName,MaxRSS,Elapsed&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;/div&gt;</summary>
		<author><name>Dvory</name></author>
	</entry>
	<entry>
		<id>https://hpcguide.tau.ac.il/index.php?title=Submitting_a_job_to_a_slurm_queue&amp;diff=1499</id>
		<title>Submitting a job to a slurm queue</title>
		<link rel="alternate" type="text/html" href="https://hpcguide.tau.ac.il/index.php?title=Submitting_a_job_to_a_slurm_queue&amp;diff=1499"/>
		<updated>2024-12-26T09:16:34Z</updated>

		<summary type="html">&lt;p&gt;Dvory: /* Basic Script */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Accessing the System ==&lt;br /&gt;
&lt;br /&gt;
To submit jobs to SLURM at Tel Aviv University, you need to access the system through one of the following login nodes:&lt;br /&gt;
&lt;br /&gt;
* powerslurm-login.tau.ac.il&lt;br /&gt;
* powerslurm-login2.tau.ac.il&lt;br /&gt;
&lt;br /&gt;
=== Requirements for Access ===&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Group Membership&amp;#039;&amp;#039;&amp;#039;: You must be part of the &amp;quot;power&amp;quot; group to access the resources.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;University Credentials&amp;#039;&amp;#039;&amp;#039;: Use your Tel Aviv University username and password to log in.&lt;br /&gt;
&lt;br /&gt;
These login nodes are your starting point for submitting jobs, checking job status, and managing your SLURM tasks.&lt;br /&gt;
&lt;br /&gt;
=== SSH Example ===&lt;br /&gt;
&lt;br /&gt;
To access the system using SSH, use the following example:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Replace &amp;#039;your_username&amp;#039; with your actual Tel Aviv University username&lt;br /&gt;
ssh your_username@powerslurm-login.tau.ac.il&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you want to connect to the second login node, use:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Replace &amp;#039;your_username&amp;#039; with your actual Tel Aviv University username&lt;br /&gt;
ssh your_username@powerslurm-login2.tau.ac.il&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you have an SSH key set up for password-less login, you can specify it like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Replace &amp;#039;your_username&amp;#039; and &amp;#039;/path/to/your/private_key&amp;#039; accordingly&lt;br /&gt;
ssh -i /path/to/your/private_key your_username@powerslurm-login.tau.ac.il&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
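If you connect often, an entry in &amp;lt;code&amp;gt;~/.ssh/config&amp;lt;/code&amp;gt; saves retyping the host, username, and key path; a sketch (the &amp;lt;code&amp;gt;hpc&amp;lt;/code&amp;gt; alias and key path are illustrative):&lt;br /&gt;

```
# ~/.ssh/config  (the 'hpc' alias and key path are examples)
Host hpc
    HostName powerslurm-login.tau.ac.il
    User your_username
    IdentityFile ~/.ssh/id_ed25519
```

After this, &amp;lt;code&amp;gt;ssh hpc&amp;lt;/code&amp;gt; connects to the first login node.&lt;br /&gt;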
&lt;br /&gt;
== Environment Modules ==&lt;br /&gt;
&lt;br /&gt;
Environment Modules in SLURM allow users to dynamically modify their shell environment, providing an easy way to load and unload different software applications, libraries, and their dependencies. This system helps avoid conflicts between software versions and ensures the correct environment for running specific applications.&lt;br /&gt;
&lt;br /&gt;
Here are some common commands to work with environment modules:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#List Available Modules: To see all the modules available on the system, use:&lt;br /&gt;
module avail&lt;br /&gt;
&lt;br /&gt;
#To search for a specific module by name (e.g., `gcc`), use:&lt;br /&gt;
module avail gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#Get Detailed Information About a Module: The `module spider` command provides detailed information about a module, including versions, dependencies, and descriptions:&lt;br /&gt;
module spider gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#View Module Settings: To see what environment variables and settings will be modified by a module, use:&lt;br /&gt;
module show gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#Load a Module: To set up the environment for a specific software, use the `module load` command. For example, to load GCC version 12.1.0:&lt;br /&gt;
module load gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#List Loaded Modules: To view all currently loaded modules in your session, use:&lt;br /&gt;
module list&lt;br /&gt;
&lt;br /&gt;
#Unload a Module: To unload a specific module from your environment, use:&lt;br /&gt;
module unload gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#Unload All Modules: If you need to clear your environment of all loaded modules, use:&lt;br /&gt;
module purge&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;By using these commands, you can easily manage the software environments needed for different tasks, ensuring compatibility and reducing potential conflicts between software versions.&lt;br /&gt;
&lt;br /&gt;
== Basic Job Submission Commands ==&lt;br /&gt;
&lt;br /&gt;
=== Finding Your Account and Partition ===&lt;br /&gt;
&lt;br /&gt;
Before submitting a job, you need to know which partitions you have permission to use.&lt;br /&gt;
&lt;br /&gt;
Run the command &amp;lt;code&amp;gt;check_my_partitions&amp;lt;/code&amp;gt; to view a list of all the partitions you have permission to send jobs to.&lt;br /&gt;
&lt;br /&gt;
== Submitting Jobs==&lt;br /&gt;
sbatch: Submits a job script for batch processing.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example&amp;#039;&amp;#039;&amp;#039;:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# This command submits pre_process.bash to the power-general partition for 10 minutes:&lt;br /&gt;
sbatch --ntasks=1 --time=10 -p power-general -A power-general-users pre_process.bash&lt;br /&gt;
&lt;br /&gt;
# With 1 GPU:&lt;br /&gt;
sbatch --gres=gpu:1 -p gpu-general -A gpu-general-users gpu_job.sh&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Writing SLURM Job Scripts===&lt;br /&gt;
&lt;br /&gt;
Here is a simple job script example:&lt;br /&gt;
&lt;br /&gt;
==== Basic Script====&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --job-name=my_job             # Job name&lt;br /&gt;
#SBATCH --account=power-general-users # Account name&lt;br /&gt;
#SBATCH --partition=power-general     # Partition name&lt;br /&gt;
#SBATCH --time=02:00:00               # Max run time (hh:mm:ss)&lt;br /&gt;
#SBATCH --ntasks=1                    # Number of tasks&lt;br /&gt;
#SBATCH --nodes=1                     # Number of nodes&lt;br /&gt;
#SBATCH --cpus-per-task=1             # CPUs per task&lt;br /&gt;
#SBATCH --mem-per-cpu=4G              # Memory per CPU&lt;br /&gt;
#SBATCH --output=my_job_%j.out        # Output file&lt;br /&gt;
#SBATCH --error=my_job_%j.err         # Error file&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Starting my SLURM job&amp;quot;&lt;br /&gt;
echo &amp;quot;Job ID: $SLURM_JOB_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Running on nodes: $SLURM_JOB_NODELIST&amp;quot;&lt;br /&gt;
echo &amp;quot;Allocated CPUs: $SLURM_JOB_CPUS_PER_NODE&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Your application commands go here&lt;br /&gt;
# ./my_program&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Job completed&amp;quot;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Script for 1 GPU ====&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --job-name=gpu_job             # Job name&lt;br /&gt;
#SBATCH --account=my_account           # Account name&lt;br /&gt;
#SBATCH --partition=gpu-general        # Partition name&lt;br /&gt;
#SBATCH --time=02:00:00                # Max run time&lt;br /&gt;
#SBATCH --ntasks=1                     # Number of tasks&lt;br /&gt;
#SBATCH --cpus-per-task=1              # CPUs per task&lt;br /&gt;
#SBATCH --gres=gpu:1                   # Number of GPUs&lt;br /&gt;
#SBATCH --mem-per-cpu=4G               # Memory per CPU&lt;br /&gt;
#SBATCH --output=my_job_%j.out         # Output file&lt;br /&gt;
#SBATCH --error=my_job_%j.err          # Error file&lt;br /&gt;
&lt;br /&gt;
module load python/python-3.8&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Starting GPU job&amp;quot;&lt;br /&gt;
echo &amp;quot;Job ID: $SLURM_JOB_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Running on nodes: $SLURM_JOB_NODELIST&amp;quot;&lt;br /&gt;
echo &amp;quot;Allocated CPUs: $SLURM_JOB_CPUS_PER_NODE&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Your GPU commands go here&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Job completed&amp;quot;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Importance of Correct RAM Usage in Jobs===&lt;br /&gt;
&lt;br /&gt;
When writing SLURM job scripts, it&amp;#039;s crucial to understand and correctly specify the memory requirements for your job. &lt;br /&gt;
&lt;br /&gt;
Proper memory allocation ensures efficient resource usage and prevents job failures due to out-of-memory (OOM) errors.&lt;br /&gt;
&lt;br /&gt;
==== Why Correct RAM Usage Matters ====&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Resource Efficiency&amp;#039;&amp;#039;&amp;#039;: Allocating the right amount of memory helps in optimal resource utilization, allowing more jobs to run simultaneously on the cluster.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Job Stability&amp;#039;&amp;#039;&amp;#039;: Underestimating memory requirements can lead to OOM errors, causing your job to fail and waste computational resources.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Performance&amp;#039;&amp;#039;&amp;#039;: Overestimating memory needs can lead to underutilization of resources, potentially delaying other jobs in the queue.&lt;br /&gt;
&lt;br /&gt;
==== How to Specify Memory in SLURM ====&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;--mem&amp;#039;&amp;#039;&amp;#039;: Specifies the total memory required for the job.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;--mem-per-cpu&amp;#039;&amp;#039;&amp;#039;: Specifies the memory required per CPU.&lt;br /&gt;
&lt;br /&gt;
Note that &amp;lt;code&amp;gt;--mem&amp;lt;/code&amp;gt; and &amp;lt;code&amp;gt;--mem-per-cpu&amp;lt;/code&amp;gt; are mutually exclusive; specify one or the other, not both.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example&amp;#039;&amp;#039;&amp;#039;:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#SBATCH --mem=4G              # Total memory for the job, OR:&lt;br /&gt;
#SBATCH --mem-per-cpu=2G      # Memory per allocated CPU&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Interactive Jobs===&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#Start an interactive session:&lt;br /&gt;
srun --ntasks=1 -p power-general -A power-general-users --pty bash&lt;br /&gt;
&lt;br /&gt;
#Specify a compute node:&lt;br /&gt;
srun --ntasks=1 -p power-general -A power-general-users --nodelist=&amp;quot;compute-0-12&amp;quot; --pty bash&lt;br /&gt;
&lt;br /&gt;
#Using GUI:&lt;br /&gt;
srun --ntasks=1 -p power-general -A power-general-users --x11 --pty bash&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Submitting RELION Jobs===&lt;br /&gt;
&lt;br /&gt;
To submit a RELION job interactively on the &amp;lt;code&amp;gt;gpu-relion&amp;lt;/code&amp;gt; queue with X11 forwarding, use the following steps:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#Start an interactive session with X11:&lt;br /&gt;
srun --ntasks=1 -p gpu-relion -A your_account --x11 --pty bash&lt;br /&gt;
#Load the RELION module:&lt;br /&gt;
module load relion/relion-4.0.1&lt;br /&gt;
#Launch RELION:&lt;br /&gt;
relion&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==AlphaFold==&lt;br /&gt;
&lt;br /&gt;
AlphaFold is a deep learning tool designed for predicting protein structures.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Guide:&amp;#039;&amp;#039;&amp;#039;  &lt;br /&gt;
[https://hpcguide.tau.ac.il/index.php?title=Alphafold AlphaFold Guide]&lt;br /&gt;
&lt;br /&gt;
==Common SLURM Commands==&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#View all queues (partitions):&lt;br /&gt;
sinfo&lt;br /&gt;
#View all jobs:&lt;br /&gt;
squeue&lt;br /&gt;
#View details of a specific job:&lt;br /&gt;
scontrol show job &amp;lt;job_number&amp;gt;&lt;br /&gt;
#Get information about partitions:&lt;br /&gt;
scontrol show partition&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Troubleshooting &amp;amp; Tips ==&lt;br /&gt;
&lt;br /&gt;
=== Common Errors ===&lt;br /&gt;
&lt;br /&gt;
# &amp;lt;code&amp;gt;srun: error: Unable to allocate resources: No partition specified or system default partition&amp;lt;/code&amp;gt;  &amp;lt;br /&amp;gt;&amp;#039;&amp;#039;&amp;#039;Solution:&amp;#039;&amp;#039;&amp;#039; Always specify a partition. Example:  &amp;lt;code&amp;gt;srun --pty -c 1 --mem=2G -p power-general /bin/bash&amp;lt;/code&amp;gt;&lt;br /&gt;
# Job failed, and &amp;lt;code&amp;gt;scontrol show job &amp;lt;job_id&amp;gt;&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;sacct -j &amp;lt;job_id&amp;gt; -o JobID,JobName,State%20&amp;lt;/code&amp;gt; shows &amp;lt;code&amp;gt;JobState=OUT_OF_MEMORY Reason=OutOfMemory&amp;lt;/code&amp;gt; or:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
JobID           JobName                State &lt;br /&gt;
------------ ---------- -------------------- &lt;br /&gt;
71             oom_test        OUT_OF_MEMORY &lt;br /&gt;
71.batch          batch        OUT_OF_MEMORY &lt;br /&gt;
71.extern        extern            COMPLETED &lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;This means the RAM requested for the job was insufficient. Resubmit the job with a larger memory request; see [https://wikihpc.tau.ac.il/index.php?title=Slurm_user_guide#Estimating_RAM_Usage below] for help estimating how much RAM your job needs.&lt;br /&gt;
&lt;br /&gt;
=== Chain Jobs ===&lt;br /&gt;
Use the &amp;lt;code&amp;gt;--dependency&amp;lt;/code&amp;gt; flag to set job dependencies, e.g. &amp;lt;code&amp;gt;afterok:&amp;lt;job_id&amp;gt;&amp;lt;/code&amp;gt; to start a job only after the listed job completes successfully.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sbatch --ntasks=1 --time=60 -p power-general -A power-general-users --dependency=afterok:45001 do_work.bash&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Always Specify Resources ===&lt;br /&gt;
When submitting jobs, ensure you include all required resources like partition, memory, and CPUs to avoid job failures.&lt;br /&gt;
&lt;br /&gt;
=== Attaching to Running Jobs ===&lt;br /&gt;
If you need to monitor or interact with a running job, use &amp;lt;code&amp;gt;sattach&amp;lt;/code&amp;gt;. This command allows you to attach to a job&amp;#039;s input, output, and error streams in real-time.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sattach &amp;lt;job_id&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To view job steps of a specific job, use the following command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
scontrol show job &amp;lt;job_id&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Look for sections labeled &amp;quot;StepId&amp;quot; within the output. &lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;For specific job steps, use:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sattach &amp;lt;job_id.step_id&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Note:&amp;#039;&amp;#039;&amp;#039; &amp;lt;code&amp;gt;sattach&amp;lt;/code&amp;gt; is particularly useful for interactive jobs, where you can provide input directly. For non-interactive jobs, it acts like &amp;lt;code&amp;gt;tail -f&amp;lt;/code&amp;gt;, allowing you to monitor the output stream.&lt;br /&gt;
&lt;br /&gt;
=== Estimating RAM Usage ===&lt;br /&gt;
&lt;br /&gt;
Before requesting memory, try to estimate how much your job will actually use; the tips below help you arrive at a reasonable number.&lt;br /&gt;
&lt;br /&gt;
==== Tips for Estimating RAM Usage ====&lt;br /&gt;
&lt;br /&gt;
* Check Application Documentation: Refer to the official documentation or user guides for memory-related information.&lt;br /&gt;
* Run a Small Test Job: Submit a smaller version of your job and monitor its memory usage using commands like `free -m`, `top`, or `htop`.&lt;br /&gt;
* Use Profiling Tools: Tools like `valgrind`, `gprof`, or built-in profilers can help you understand memory usage.&lt;br /&gt;
* Analyze Previous Jobs: Review SLURM logs and job statistics for insights into memory consumption of past jobs.&lt;br /&gt;
* Consult with Peers or Experts: Ask colleagues or experts who have experience with similar workloads.&lt;br /&gt;
&lt;br /&gt;
==== Example: Monitoring Memory Usage ====&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
&lt;br /&gt;
#SBATCH --job-name=memory_test&lt;br /&gt;
#SBATCH --account=your_account&lt;br /&gt;
#SBATCH --partition=your_partition&lt;br /&gt;
#SBATCH --time=01:00:00&lt;br /&gt;
#SBATCH --ntasks=1&lt;br /&gt;
#SBATCH --cpus-per-task=1&lt;br /&gt;
#SBATCH --mem=4G&lt;br /&gt;
#SBATCH --output=memory_test.out&lt;br /&gt;
#SBATCH --error=memory_test.err&lt;br /&gt;
&lt;br /&gt;
# Monitor memory usage&lt;br /&gt;
echo &amp;quot;Memory usage before running the job:&amp;quot;&lt;br /&gt;
free -m&lt;br /&gt;
&lt;br /&gt;
# Your application commands go here&lt;br /&gt;
# ./your_application&lt;br /&gt;
&lt;br /&gt;
# Monitor memory usage after running the job&lt;br /&gt;
echo &amp;quot;Memory usage after running the job:&amp;quot;&lt;br /&gt;
free -m&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== General Tips ====&lt;br /&gt;
&lt;br /&gt;
* Start Small: Begin with a conservative memory request and increase it based on observed usage.&lt;br /&gt;
* Consider Peak Usage: Plan for peak memory usage to avoid OOM errors.&lt;br /&gt;
* Use SLURM&amp;#039;s Memory Reporting: Use `sacct` to view memory usage statistics.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sacct -j &amp;lt;job_id&amp;gt; --format=JobID,JobName,MaxRSS,Elapsed&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;/div&gt;</summary>
		<author><name>Dvory</name></author>
	</entry>
	<entry>
		<id>https://hpcguide.tau.ac.il/index.php?title=Submitting_a_job_to_a_slurm_queue&amp;diff=1479</id>
		<title>Submitting a job to a slurm queue</title>
		<link rel="alternate" type="text/html" href="https://hpcguide.tau.ac.il/index.php?title=Submitting_a_job_to_a_slurm_queue&amp;diff=1479"/>
		<updated>2024-09-22T13:49:07Z</updated>

		<summary type="html">&lt;p&gt;Dvory: /* Basic Job Submission Commands */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Accessing the System ==&lt;br /&gt;
&lt;br /&gt;
To submit jobs to SLURM at Tel Aviv University, you need to access the system through one of the following login nodes:&lt;br /&gt;
&lt;br /&gt;
* powerslurm-login.tau.ac.il&lt;br /&gt;
* powerslurm-login2.tau.ac.il&lt;br /&gt;
&lt;br /&gt;
=== Requirements for Access ===&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Group Membership&amp;#039;&amp;#039;&amp;#039;: You must be part of the &amp;quot;power&amp;quot; group to access the resources.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;University Credentials&amp;#039;&amp;#039;&amp;#039;: Use your Tel Aviv University username and password to log in.&lt;br /&gt;
&lt;br /&gt;
These login nodes are your starting point for submitting jobs, checking job status, and managing your SLURM tasks.&lt;br /&gt;
&lt;br /&gt;
=== SSH Example ===&lt;br /&gt;
&lt;br /&gt;
To access the system using SSH, use the following example:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Replace &amp;#039;your_username&amp;#039; with your actual Tel Aviv University username&lt;br /&gt;
ssh your_username@powerslurm-login.tau.ac.il&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you want to connect to the second login node, use:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Replace &amp;#039;your_username&amp;#039; with your actual Tel Aviv University username&lt;br /&gt;
ssh your_username@powerslurm-login2.tau.ac.il&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you have an SSH key set up for password-less login, you can specify it like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Replace &amp;#039;your_username&amp;#039; and &amp;#039;/path/to/your/private_key&amp;#039; accordingly&lt;br /&gt;
ssh -i /path/to/your/private_key your_username@powerslurm-login.tau.ac.il&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Environment Modules ==&lt;br /&gt;
&lt;br /&gt;
Environment Modules in SLURM allow users to dynamically modify their shell environment, providing an easy way to load and unload different software applications, libraries, and their dependencies. This system helps avoid conflicts between software versions and ensures the correct environment for running specific applications.&lt;br /&gt;
&lt;br /&gt;
Here are some common commands to work with environment modules:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#List Available Modules: To see all the modules available on the system, use:&lt;br /&gt;
module avail&lt;br /&gt;
&lt;br /&gt;
#To search for a specific module by name (e.g., `gcc`), use:&lt;br /&gt;
module avail gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#Get Detailed Information About a Module: The `module spider` command provides detailed information about a module, including versions, dependencies, and descriptions:&lt;br /&gt;
module spider gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#View Module Settings: To see what environment variables and settings will be modified by a module, use:&lt;br /&gt;
module show gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#Load a Module: To set up the environment for a specific software, use the `module load` command. For example, to load GCC version 12.1.0:&lt;br /&gt;
module load gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#List Loaded Modules: To view all currently loaded modules in your session, use:&lt;br /&gt;
module list&lt;br /&gt;
&lt;br /&gt;
#Unload a Module: To unload a specific module from your environment, use:&lt;br /&gt;
module unload gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#Unload All Modules: If you need to clear your environment of all loaded modules, use:&lt;br /&gt;
module purge&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;By using these commands, you can easily manage the software environments needed for different tasks, ensuring compatibility and reducing potential conflicts between software versions.&lt;br /&gt;
&lt;br /&gt;
== Basic Job Submission Commands ==&lt;br /&gt;
&lt;br /&gt;
=== Finding Your Account and Partition ===&lt;br /&gt;
&lt;br /&gt;
Before submitting a job, you need to know which partitions you have permission to use.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Account&amp;#039;&amp;#039;&amp;#039; is the SLURM account of the research group (PI) you belong to.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Partition&amp;#039;&amp;#039;&amp;#039; is the group of compute nodes your jobs are allowed to run on.&lt;br /&gt;
&lt;br /&gt;
Run the command &amp;lt;code&amp;gt;check_my_partitions&amp;lt;/code&amp;gt; to view a list of all the partitions you have permission to send jobs to.&lt;br /&gt;
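&lt;br /&gt;
As a cross-check, standard SLURM commands can also report this information; a minimal sketch, assuming job accounting is enabled on the cluster:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#List the accounts and partitions associated with your user:&lt;br /&gt;
sacctmgr show associations user=$USER format=Account,Partition,QOS&lt;br /&gt;
&lt;br /&gt;
#List the partitions visible to you, with their time limits, node counts, and generic resources:&lt;br /&gt;
sinfo -o &amp;quot;%P %l %D %G&amp;quot;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;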
&lt;br /&gt;
== Submitting Jobs==&lt;br /&gt;
sbatch: Submits a job script for batch processing.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example&amp;#039;&amp;#039;&amp;#039;:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Submit pre_process.bash to the power-general partition with 1 task for 10 minutes:&lt;br /&gt;
sbatch --ntasks=1 --time=10 -p power-general -A power-general-users pre_process.bash&lt;br /&gt;
&lt;br /&gt;
# Submit a job that requests 1 GPU:&lt;br /&gt;
sbatch --gres=gpu:1 -p gpu-general -A gpu-general-users gpu_job.sh&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Writing SLURM Job Scripts===&lt;br /&gt;
&lt;br /&gt;
Here is a simple job script example:&lt;br /&gt;
&lt;br /&gt;
==== Basic Script====&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --job-name=my_job             # Job name&lt;br /&gt;
#SBATCH --account=power-general-users # Account name&lt;br /&gt;
#SBATCH --partition=power-general     # Partition name&lt;br /&gt;
#SBATCH --time=02:00:00               # Max run time (hh:mm:ss)&lt;br /&gt;
#SBATCH --ntasks=1                    # Number of tasks&lt;br /&gt;
#SBATCH --cpus-per-task=1             # CPUs per task&lt;br /&gt;
#SBATCH --mem-per-cpu=4G              # Memory per CPU&lt;br /&gt;
#SBATCH --output=my_job_%j.out        # Output file&lt;br /&gt;
#SBATCH --error=my_job_%j.err         # Error file&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Starting my SLURM job&amp;quot;&lt;br /&gt;
echo &amp;quot;Job ID: $SLURM_JOB_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Running on nodes: $SLURM_JOB_NODELIST&amp;quot;&lt;br /&gt;
echo &amp;quot;Allocated CPUs: $SLURM_JOB_CPUS_PER_NODE&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Your application commands go here&lt;br /&gt;
# ./my_program&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Job completed&amp;quot;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Script for 1 GPU ====&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --job-name=gpu_job             # Job name&lt;br /&gt;
#SBATCH --account=my_account           # Account name&lt;br /&gt;
#SBATCH --partition=gpu-general        # Partition name&lt;br /&gt;
#SBATCH --time=02:00:00                # Max run time&lt;br /&gt;
#SBATCH --ntasks=1                     # Number of tasks&lt;br /&gt;
#SBATCH --cpus-per-task=1              # CPUs per task&lt;br /&gt;
#SBATCH --gres=gpu:1                   # Number of GPUs&lt;br /&gt;
#SBATCH --mem-per-cpu=4G               # Memory per CPU&lt;br /&gt;
#SBATCH --output=my_job_%j.out         # Output file&lt;br /&gt;
#SBATCH --error=my_job_%j.err          # Error file&lt;br /&gt;
&lt;br /&gt;
module load python/python-3.8&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Starting GPU job&amp;quot;&lt;br /&gt;
echo &amp;quot;Job ID: $SLURM_JOB_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Running on nodes: $SLURM_JOB_NODELIST&amp;quot;&lt;br /&gt;
echo &amp;quot;Allocated CPUs: $SLURM_JOB_CPUS_PER_NODE&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Your GPU commands go here&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Job completed&amp;quot;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Interactive Jobs===&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#Start an interactive session:&lt;br /&gt;
srun --ntasks=1 -p power-general -A power-general-users --pty bash&lt;br /&gt;
&lt;br /&gt;
#Specify a compute node:&lt;br /&gt;
srun --ntasks=1 -p power-general -A power-general-users --nodelist=&amp;quot;compute-0-12&amp;quot; --pty bash&lt;br /&gt;
&lt;br /&gt;
#Start an interactive session with X11 forwarding (for GUI applications):&lt;br /&gt;
srun --ntasks=1 -p power-general -A power-general-users --x11 --pty /bin/bash&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Submitting RELION Jobs===&lt;br /&gt;
&lt;br /&gt;
To submit a RELION job interactively on the &amp;lt;code&amp;gt;gpu-relion&amp;lt;/code&amp;gt; queue with X11 forwarding, use the following steps:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#Start an interactive session with X11:&lt;br /&gt;
srun --ntasks=1 -p gpu-relion -A your_account --x11 --pty bash&lt;br /&gt;
#Load the RELION module:&lt;br /&gt;
module load relion/relion-4.0.1&lt;br /&gt;
#Launch RELION:&lt;br /&gt;
relion&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==AlphaFold==&lt;br /&gt;
&lt;br /&gt;
AlphaFold is a deep learning tool designed for predicting protein structures.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Guide:&amp;#039;&amp;#039;&amp;#039;  &lt;br /&gt;
[https://hpcguide.tau.ac.il/index.php?title=Alphafold AlphaFold Guide]&lt;br /&gt;
&lt;br /&gt;
==Common SLURM Commands==&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#View all queues (partitions):&lt;br /&gt;
sinfo&lt;br /&gt;
#View all jobs:&lt;br /&gt;
squeue&lt;br /&gt;
#View details of a specific job:&lt;br /&gt;
scontrol show job &amp;lt;job_number&amp;gt;&lt;br /&gt;
#Get information about partitions:&lt;br /&gt;
scontrol show partition&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
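&lt;br /&gt;
To narrow the output to your own jobs, the user filter is usually the first thing you need (the &amp;lt;code&amp;gt;sacct&amp;lt;/code&amp;gt; line assumes job accounting is enabled):&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#View only your jobs:&lt;br /&gt;
squeue -u $USER&lt;br /&gt;
#View accounting information for your jobs started today:&lt;br /&gt;
sacct -u $USER --starttime=today&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;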
&lt;br /&gt;
== Troubleshooting &amp;amp; Tips ==&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Common Error:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
srun: error: Unable to allocate resources: No partition specified or system default partition&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Solution:&amp;#039;&amp;#039;&amp;#039; Always specify a partition. Example:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
srun --pty -c 1 --mem=2G -p power-general /bin/bash&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Chain Jobs:&amp;#039;&amp;#039;&amp;#039; Use the &amp;lt;code&amp;gt;--depend&amp;lt;/code&amp;gt; flag to set job dependencies.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Start do_work.bash only after job 45001 completes successfully:&lt;br /&gt;
sbatch --ntasks=1 --time=60 -p power-general -A power-general-users --depend=afterok:45001 do_work.bash&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Always Specify Resources:&amp;#039;&amp;#039;&amp;#039; When submitting jobs, ensure you include all required resources like partition, memory, and CPUs to avoid job failures.&lt;br /&gt;
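&lt;br /&gt;
As an illustration, a fully specified submission might look like the following sketch (partition and account taken from the examples above; &amp;lt;code&amp;gt;my_job.bash&amp;lt;/code&amp;gt; is a placeholder):&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#Partition, account, CPUs, memory, and wall time stated explicitly:&lt;br /&gt;
sbatch -p power-general -A power-general-users --ntasks=1 --cpus-per-task=2 --mem-per-cpu=4G --time=01:00:00 my_job.bash&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;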
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Attaching to Running Jobs:&amp;#039;&amp;#039;&amp;#039; If you need to monitor or interact with a running job, use &amp;lt;code&amp;gt;sattach&amp;lt;/code&amp;gt;. This command allows you to attach to a job&amp;#039;s input, output, and error streams in real-time.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sattach &amp;lt;job_id&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To view job steps of a specific job, use the following command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
scontrol show job &amp;lt;job_id&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Look for sections labeled &amp;quot;StepId&amp;quot; within the output. &lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;For specific job steps, use:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sattach &amp;lt;job_id.step_id&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Note:&amp;#039;&amp;#039;&amp;#039; &amp;lt;code&amp;gt;sattach&amp;lt;/code&amp;gt; is particularly useful for interactive jobs, where you can provide input directly. For non-interactive jobs, it acts like &amp;lt;code&amp;gt;tail -f&amp;lt;/code&amp;gt;, allowing you to monitor the output stream.&lt;br /&gt;
&lt;br /&gt;
This guide provides the essentials for new users to get started with SLURM. For more complex tasks, refer to the full SLURM documentation or contact your system administrator.&lt;/div&gt;</summary>
		<author><name>Dvory</name></author>
	</entry>
	<entry>
		<id>https://hpcguide.tau.ac.il/index.php?title=Submitting_a_job_to_a_slurm_queue&amp;diff=1478</id>
		<title>Submitting a job to a slurm queue</title>
		<link rel="alternate" type="text/html" href="https://hpcguide.tau.ac.il/index.php?title=Submitting_a_job_to_a_slurm_queue&amp;diff=1478"/>
		<updated>2024-09-19T10:42:12Z</updated>

		<summary type="html">&lt;p&gt;Dvory: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Accessing the System ==&lt;br /&gt;
&lt;br /&gt;
To submit jobs to SLURM at Tel Aviv University, you need to access the system through one of the following login nodes:&lt;br /&gt;
&lt;br /&gt;
* powerslurm-login.tau.ac.il&lt;br /&gt;
* powerslurm-login2.tau.ac.il&lt;br /&gt;
&lt;br /&gt;
=== Requirements for Access ===&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Group Membership&amp;#039;&amp;#039;&amp;#039;: You must be part of the &amp;quot;power&amp;quot; group to access the resources.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;University Credentials&amp;#039;&amp;#039;&amp;#039;: Use your Tel Aviv University username and password to log in.&lt;br /&gt;
&lt;br /&gt;
These login nodes are your starting point for submitting jobs, checking job status, and managing your SLURM tasks.&lt;br /&gt;
&lt;br /&gt;
=== SSH Example ===&lt;br /&gt;
&lt;br /&gt;
To access the system using SSH, use the following example:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Replace &amp;#039;your_username&amp;#039; with your actual Tel Aviv University username&lt;br /&gt;
ssh your_username@powerslurm-login.tau.ac.il&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you want to connect to the second login node, use:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Replace &amp;#039;your_username&amp;#039; with your actual Tel Aviv University username&lt;br /&gt;
ssh your_username@powerslurm-login2.tau.ac.il&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
If you have an SSH key set up for password-less login, you can specify it like this:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Replace &amp;#039;your_username&amp;#039; and &amp;#039;/path/to/your/private_key&amp;#039; accordingly&lt;br /&gt;
ssh -i /path/to/your/private_key your_username@powerslurm-login.tau.ac.il&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Environment Modules ==&lt;br /&gt;
&lt;br /&gt;
Environment Modules in SLURM allow users to dynamically modify their shell environment, providing an easy way to load and unload different software applications, libraries, and their dependencies. This system helps avoid conflicts between software versions and ensures the correct environment for running specific applications.&lt;br /&gt;
&lt;br /&gt;
Here are some common commands to work with environment modules:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#List Available Modules: To see all the modules available on the system, use:&lt;br /&gt;
module avail&lt;br /&gt;
&lt;br /&gt;
#To search for a specific module by name (e.g., `gcc`), use:&lt;br /&gt;
module avail gcc&lt;br /&gt;
&lt;br /&gt;
#Get Detailed Information About a Module: The `module spider` command provides detailed information about a module, including versions, dependencies, and descriptions:&lt;br /&gt;
module spider gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#View Module Settings: To see what environment variables and settings will be modified by a module, use:&lt;br /&gt;
module show gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#Load a Module: To set up the environment for a specific software, use the `module load` command. For example, to load GCC version 12.1.0:&lt;br /&gt;
module load gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#List Loaded Modules: To view all currently loaded modules in your session, use:&lt;br /&gt;
module list&lt;br /&gt;
&lt;br /&gt;
#Unload a Module: To unload a specific module from your environment, use:&lt;br /&gt;
module unload gcc/gcc-12.1.0&lt;br /&gt;
&lt;br /&gt;
#Unload All Modules: If you need to clear your environment of all loaded modules, use:&lt;br /&gt;
module purge&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;By using these commands, you can easily manage the software environments needed for different tasks, ensuring compatibility and reducing potential conflicts between software versions.&lt;br /&gt;
&lt;br /&gt;
== Basic Job Submission Commands ==&lt;br /&gt;
&lt;br /&gt;
=== Finding Your Account and Partition ===&lt;br /&gt;
&lt;br /&gt;
Before submitting a job, you need to know which partitions you have permission to use.&lt;br /&gt;
&lt;br /&gt;
Run the command &amp;lt;code&amp;gt;check_my_partitions&amp;lt;/code&amp;gt; to view a list of all the partitions you have permission to send jobs to.&lt;br /&gt;
&lt;br /&gt;
== Submitting Jobs==&lt;br /&gt;
sbatch: Submits a job script for batch processing.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example&amp;#039;&amp;#039;&amp;#039;:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Submit pre_process.bash to the power-general partition with 1 task for 10 minutes:&lt;br /&gt;
sbatch --ntasks=1 --time=10 -p power-general -A power-general-users pre_process.bash&lt;br /&gt;
&lt;br /&gt;
# Submit a job that requests 1 GPU:&lt;br /&gt;
sbatch --gres=gpu:1 -p gpu-general -A gpu-general-users gpu_job.sh&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Writing SLURM Job Scripts===&lt;br /&gt;
&lt;br /&gt;
Here is a simple job script example:&lt;br /&gt;
&lt;br /&gt;
==== Basic Script====&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --job-name=my_job             # Job name&lt;br /&gt;
#SBATCH --account=power-general-users # Account name&lt;br /&gt;
#SBATCH --partition=power-general     # Partition name&lt;br /&gt;
#SBATCH --time=02:00:00               # Max run time (hh:mm:ss)&lt;br /&gt;
#SBATCH --ntasks=1                    # Number of tasks&lt;br /&gt;
#SBATCH --cpus-per-task=1             # CPUs per task&lt;br /&gt;
#SBATCH --mem-per-cpu=4G              # Memory per CPU&lt;br /&gt;
#SBATCH --output=my_job_%j.out        # Output file&lt;br /&gt;
#SBATCH --error=my_job_%j.err         # Error file&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Starting my SLURM job&amp;quot;&lt;br /&gt;
echo &amp;quot;Job ID: $SLURM_JOB_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Running on nodes: $SLURM_JOB_NODELIST&amp;quot;&lt;br /&gt;
echo &amp;quot;Allocated CPUs: $SLURM_JOB_CPUS_PER_NODE&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Your application commands go here&lt;br /&gt;
# ./my_program&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Job completed&amp;quot;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Script for 1 GPU ====&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --job-name=gpu_job             # Job name&lt;br /&gt;
#SBATCH --account=my_account           # Account name&lt;br /&gt;
#SBATCH --partition=gpu-general        # Partition name&lt;br /&gt;
#SBATCH --time=02:00:00                # Max run time&lt;br /&gt;
#SBATCH --ntasks=1                     # Number of tasks&lt;br /&gt;
#SBATCH --cpus-per-task=1              # CPUs per task&lt;br /&gt;
#SBATCH --gres=gpu:1                   # Number of GPUs&lt;br /&gt;
#SBATCH --mem-per-cpu=4G               # Memory per CPU&lt;br /&gt;
#SBATCH --output=my_job_%j.out         # Output file&lt;br /&gt;
#SBATCH --error=my_job_%j.err          # Error file&lt;br /&gt;
&lt;br /&gt;
module load python/python-3.8&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Starting GPU job&amp;quot;&lt;br /&gt;
echo &amp;quot;Job ID: $SLURM_JOB_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Running on nodes: $SLURM_JOB_NODELIST&amp;quot;&lt;br /&gt;
echo &amp;quot;Allocated CPUs: $SLURM_JOB_CPUS_PER_NODE&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Your GPU commands go here&lt;br /&gt;
&lt;br /&gt;
echo &amp;quot;Job completed&amp;quot;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
===Interactive Jobs===&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#Start an interactive session:&lt;br /&gt;
srun --ntasks=1 -p power-general -A power-general-users --pty bash&lt;br /&gt;
&lt;br /&gt;
#Specify a compute node:&lt;br /&gt;
srun --ntasks=1 -p power-general -A power-general-users --nodelist=&amp;quot;compute-0-12&amp;quot; --pty bash&lt;br /&gt;
&lt;br /&gt;
#Start an interactive session with X11 forwarding (for GUI applications):&lt;br /&gt;
srun --ntasks=1 -p power-general -A power-general-users --x11 --pty /bin/bash&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Submitting RELION Jobs===&lt;br /&gt;
&lt;br /&gt;
To submit a RELION job interactively on the &amp;lt;code&amp;gt;gpu-relion&amp;lt;/code&amp;gt; queue with X11 forwarding, use the following steps:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#Start an interactive session with X11:&lt;br /&gt;
srun --ntasks=1 -p gpu-relion -A your_account --x11 --pty bash&lt;br /&gt;
#Load the RELION module:&lt;br /&gt;
module load relion/relion-4.0.1&lt;br /&gt;
#Launch RELION:&lt;br /&gt;
relion&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==AlphaFold==&lt;br /&gt;
&lt;br /&gt;
AlphaFold is a deep learning tool designed for predicting protein structures.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Guide:&amp;#039;&amp;#039;&amp;#039;  &lt;br /&gt;
[https://hpcguide.tau.ac.il/index.php?title=Alphafold AlphaFold Guide]&lt;br /&gt;
&lt;br /&gt;
==Common SLURM Commands==&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#View all queues (partitions):&lt;br /&gt;
sinfo&lt;br /&gt;
#View all jobs:&lt;br /&gt;
squeue&lt;br /&gt;
#View details of a specific job:&lt;br /&gt;
scontrol show job &amp;lt;job_number&amp;gt;&lt;br /&gt;
#Get information about partitions:&lt;br /&gt;
scontrol show partition&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Troubleshooting &amp;amp; Tips ==&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Common Error:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
srun: error: Unable to allocate resources: No partition specified or system default partition&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Solution:&amp;#039;&amp;#039;&amp;#039; Always specify a partition. Example:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
srun --pty -c 1 --mem=2G -p power-general /bin/bash&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Chain Jobs:&amp;#039;&amp;#039;&amp;#039; Use the &amp;lt;code&amp;gt;--depend&amp;lt;/code&amp;gt; flag to set job dependencies.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# Start do_work.bash only after job 45001 completes successfully:&lt;br /&gt;
sbatch --ntasks=1 --time=60 -p power-general -A power-general-users --depend=afterok:45001 do_work.bash&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Always Specify Resources:&amp;#039;&amp;#039;&amp;#039; When submitting jobs, ensure you include all required resources like partition, memory, and CPUs to avoid job failures.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Attaching to Running Jobs:&amp;#039;&amp;#039;&amp;#039; If you need to monitor or interact with a running job, use &amp;lt;code&amp;gt;sattach&amp;lt;/code&amp;gt;. This command allows you to attach to a job&amp;#039;s input, output, and error streams in real-time.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Example:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sattach &amp;lt;job_id&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
To view job steps of a specific job, use the following command:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
scontrol show job &amp;lt;job_id&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Look for sections labeled &amp;quot;StepId&amp;quot; within the output. &lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;For specific job steps, use:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
sattach &amp;lt;job_id.step_id&amp;gt;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Note:&amp;#039;&amp;#039;&amp;#039; &amp;lt;code&amp;gt;sattach&amp;lt;/code&amp;gt; is particularly useful for interactive jobs, where you can provide input directly. For non-interactive jobs, it acts like &amp;lt;code&amp;gt;tail -f&amp;lt;/code&amp;gt;, allowing you to monitor the output stream.&lt;br /&gt;
&lt;br /&gt;
This guide provides the essentials for new users to get started with SLURM. For more complex tasks, refer to the full SLURM documentation or contact your system administrator.&lt;/div&gt;</summary>
		<author><name>Dvory</name></author>
	</entry>
	<entry>
		<id>https://hpcguide.tau.ac.il/index.php?title=Alphafold&amp;diff=1477</id>
		<title>Alphafold</title>
		<link rel="alternate" type="text/html" href="https://hpcguide.tau.ac.il/index.php?title=Alphafold&amp;diff=1477"/>
		<updated>2024-09-19T10:40:41Z</updated>

		<summary type="html">&lt;p&gt;Dvory: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Alphafold ==&lt;br /&gt;
AlphaFold is an artificial intelligence (AI) program developed by DeepMind (part of Alphabet/Google) that predicts protein structures.&lt;br /&gt;
&lt;br /&gt;
=== Databases ===&lt;br /&gt;
The necessary databases are mounted on nodes with GPUs and are located at `/alphafold_storage/alphafold_db`.&lt;br /&gt;
&lt;br /&gt;
=== Usage ===&lt;br /&gt;
To run AlphaFold, use the `run_alphafold.sh` script located at `/powerapps/share/centos7/alphafold/alphafold-2.3.1/run_alphafold.sh`.&lt;br /&gt;
&lt;br /&gt;
===== &amp;#039;&amp;#039;&amp;#039;Required Parameters&amp;#039;&amp;#039;&amp;#039;: =====&lt;br /&gt;
* `-d &amp;lt;data_dir&amp;gt;`: Path to the directory of supporting data.&lt;br /&gt;
* `-o &amp;lt;output_dir&amp;gt;`: Path to a directory that will store the results.&lt;br /&gt;
* `-f &amp;lt;fasta_paths&amp;gt;`: Path to FASTA files containing sequences. If a single file contains multiple sequences, it is folded as a multimer. To fold several inputs one after another, separate the file paths with commas.&lt;br /&gt;
&lt;br /&gt;
* `-t &amp;lt;max_template_date&amp;gt;`: Maximum template release date to consider (ISO-8601 format, i.e., YYYY-MM-DD). This parameter helps in folding historical test sets.&lt;br /&gt;
&lt;br /&gt;
===== &amp;#039;&amp;#039;&amp;#039;Optional Parameters&amp;#039;&amp;#039;&amp;#039;: =====&lt;br /&gt;
* `-g &amp;lt;use_gpu&amp;gt;`: Enable NVIDIA runtime to run with GPUs (default: true).&lt;br /&gt;
* `-r &amp;lt;run_relax&amp;gt;`: Whether to run the final relaxation step on the predicted models (default: true).&lt;br /&gt;
* `-e &amp;lt;enable_gpu_relax&amp;gt;`: Run relax on GPU if GPU is enabled (default: true).&lt;br /&gt;
* `-n &amp;lt;openmm_threads&amp;gt;`: OpenMM threads (default: all available cores).&lt;br /&gt;
* `-a &amp;lt;gpu_devices&amp;gt;`: Comma-separated list of devices to pass to &amp;#039;CUDA_VISIBLE_DEVICES&amp;#039; (default: 0).&lt;br /&gt;
* `-m &amp;lt;model_preset&amp;gt;`: Choose preset model configuration: &amp;#039;monomer&amp;#039;, &amp;#039;monomer_casp14&amp;#039;, &amp;#039;monomer_ptm&amp;#039;, or &amp;#039;multimer&amp;#039; (default: &amp;#039;monomer&amp;#039;).&lt;br /&gt;
* `-c &amp;lt;db_preset&amp;gt;`: Choose preset MSA database configuration (&amp;#039;reduced_dbs&amp;#039; or &amp;#039;full_dbs&amp;#039;, default: &amp;#039;full_dbs&amp;#039;).&lt;br /&gt;
* `-p &amp;lt;use_precomputed_msas&amp;gt;`: Whether to read MSAs written to disk (default: &amp;#039;false&amp;#039;).&lt;br /&gt;
* `-l &amp;lt;num_multimer_predictions_per_model&amp;gt;`: Number of predictions per model when using `model_preset=multimer` (default: 5).&lt;br /&gt;
* `-b &amp;lt;benchmark&amp;gt;`: Run multiple JAX model evaluations to obtain a timing that excludes compilation time (default: &amp;#039;false&amp;#039;).&lt;br /&gt;
&lt;br /&gt;
==== Example Slurm Script ====&lt;br /&gt;
This script demonstrates how to submit an AlphaFold job using SLURM:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --job-name=AlphaFold-Multimer     # Job name&lt;br /&gt;
#SBATCH --partition=gpu2                  # Specify GPU partition&lt;br /&gt;
#SBATCH --nodes=1                         # Number of nodes&lt;br /&gt;
#SBATCH --ntasks=1                        # Number of tasks (processes)&lt;br /&gt;
#SBATCH --cpus-per-task=4                 # Number of CPU cores per task&lt;br /&gt;
#SBATCH --gres=gpu:1                      # Request 1 GPU&lt;br /&gt;
#SBATCH --output=alphafold_%j.out         # Standard output (with job ID)&lt;br /&gt;
#SBATCH --error=alphafold_%j.err          # Standard error (with job ID)&lt;br /&gt;
&lt;br /&gt;
# Description: AlphaFold-Multimer (Non-Docker) with auto-GPU selection&lt;br /&gt;
&lt;br /&gt;
# Load the required module/environment&lt;br /&gt;
module load alphafold/alphafold_non_docker_2.3.1&lt;br /&gt;
&lt;br /&gt;
# Run the AlphaFold script&lt;br /&gt;
bash $ALPHAFOLD_SCRIPT_PATH/run_alphafold.sh -d $ALPHAFOLD_DB_PATH -o ~/output_dir -f $ALPHAFOLD_SCRIPT_PATH/examples/query.fasta -t $(date +%Y-%m-%d)&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Important Notes ====&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Output Directory&amp;#039;&amp;#039;&amp;#039;: You can specify the output directory using the `-o` parameter to store the results. This directory can be anywhere you choose.&lt;br /&gt;
* The `-t` (max_template_date) parameter defines the maximum release date of templates to consider in the format `YYYY-MM-DD`. This is crucial when working with historical test sets, as it restricts the search for templates to those released on or before the specified date. You can use different dates depending on your requirements, such as the current date with `$(date +%Y-%m-%d)` or a specific historical date, like `-t 2021-12-31`.&lt;br /&gt;
&lt;br /&gt;
==== Additional Resources ====&lt;br /&gt;
* You can download the `dummy_test` folder for sample output from the [https://github.com/kalininalab/alphafold_non_docker alphafold_non_docker GitHub repository].&lt;br /&gt;
* For sample data, you can use `/home/alphafold_folder/alphafold_multimer_non_docker/example/query.fasta` or provide your own data for queries.&lt;/div&gt;</summary>
		<author><name>Dvory</name></author>
	</entry>
	<entry>
		<id>https://hpcguide.tau.ac.il/index.php?title=Alphafold&amp;diff=1476</id>
		<title>Alphafold</title>
		<link rel="alternate" type="text/html" href="https://hpcguide.tau.ac.il/index.php?title=Alphafold&amp;diff=1476"/>
		<updated>2024-09-19T10:40:23Z</updated>

		<summary type="html">&lt;p&gt;Dvory: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== Alphafold ==&lt;br /&gt;
AlphaFold is an artificial intelligence (AI) program developed by DeepMind (part of Alphabet/Google) that predicts protein structures.&lt;br /&gt;
&lt;br /&gt;
=== Databases ===&lt;br /&gt;
The necessary databases are mounted on nodes with GPUs and are located at `/alphafold_storage/alphafold_db`.&lt;br /&gt;
&lt;br /&gt;
=== Usage ===&lt;br /&gt;
To run AlphaFold, use the `run_alphafold.sh` script located at `/powerapps/share/centos7/alphafold/alphafold-2.3.1/run_alphafold.sh`.&lt;br /&gt;
&lt;br /&gt;
==== Script Reference ====&lt;br /&gt;
&lt;br /&gt;
===== &amp;#039;&amp;#039;&amp;#039;Required Parameters&amp;#039;&amp;#039;&amp;#039;: =====&lt;br /&gt;
* `-d &amp;lt;data_dir&amp;gt;`: Path to the directory of supporting data.&lt;br /&gt;
* `-o &amp;lt;output_dir&amp;gt;`: Path to a directory that will store the results.&lt;br /&gt;
* `-f &amp;lt;fasta_paths&amp;gt;`: Path to FASTA files containing sequences. If a single file contains multiple sequences, it is folded as a multimer. To fold several inputs one after another, separate the file paths with commas.&lt;br /&gt;
&lt;br /&gt;
* `-t &amp;lt;max_template_date&amp;gt;`: Maximum template release date to consider (ISO-8601 format, i.e., YYYY-MM-DD). This parameter helps in folding historical test sets.&lt;br /&gt;
&lt;br /&gt;
===== &amp;#039;&amp;#039;&amp;#039;Optional Parameters&amp;#039;&amp;#039;&amp;#039;: =====&lt;br /&gt;
* `-g &amp;lt;use_gpu&amp;gt;`: Enable NVIDIA runtime to run with GPUs (default: true).&lt;br /&gt;
* `-r &amp;lt;run_relax&amp;gt;`: Whether to run the final relaxation step on the predicted models (default: true).&lt;br /&gt;
* `-e &amp;lt;enable_gpu_relax&amp;gt;`: Run relax on GPU if GPU is enabled (default: true).&lt;br /&gt;
* `-n &amp;lt;openmm_threads&amp;gt;`: OpenMM threads (default: all available cores).&lt;br /&gt;
* `-a &amp;lt;gpu_devices&amp;gt;`: Comma-separated list of devices to pass to &amp;#039;CUDA_VISIBLE_DEVICES&amp;#039; (default: 0).&lt;br /&gt;
* `-m &amp;lt;model_preset&amp;gt;`: Choose preset model configuration: &amp;#039;monomer&amp;#039;, &amp;#039;monomer_casp14&amp;#039;, &amp;#039;monomer_ptm&amp;#039;, or &amp;#039;multimer&amp;#039; (default: &amp;#039;monomer&amp;#039;).&lt;br /&gt;
* `-c &amp;lt;db_preset&amp;gt;`: Choose preset MSA database configuration (&amp;#039;reduced_dbs&amp;#039; or &amp;#039;full_dbs&amp;#039;, default: &amp;#039;full_dbs&amp;#039;).&lt;br /&gt;
* `-p &amp;lt;use_precomputed_msas&amp;gt;`: Whether to read MSAs written to disk (default: &amp;#039;false&amp;#039;).&lt;br /&gt;
* `-l &amp;lt;num_multimer_predictions_per_model&amp;gt;`: Number of predictions per model when using `model_preset=multimer` (default: 5).&lt;br /&gt;
* `-b &amp;lt;benchmark&amp;gt;`: Run multiple JAX model evaluations to obtain a timing that excludes compilation time (default: &amp;#039;false&amp;#039;).&lt;br /&gt;
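A full command line simply combines the required and optional flags above. The sketch below is illustrative only: the input FASTA file and output directory are hypothetical, while the script and database paths are those given earlier on this page.&lt;br /&gt;

```shell
# Hypothetical multimer run: sequences in one FASTA file, reduced MSA databases,
# 3 predictions per model, templates released up to the end of 2021.
bash /powerapps/share/centos7/alphafold/alphafold-2.3.1/run_alphafold.sh \
    -d /alphafold_storage/alphafold_db \
    -o ~/af_multimer_out \
    -f ~/inputs/dimer.fasta \
    -t 2021-12-31 \
    -m multimer \
    -c reduced_dbs \
    -l 3
```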
&lt;br /&gt;
==== Example Slurm Script ====&lt;br /&gt;
This script demonstrates how to submit an AlphaFold job using SLURM:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
#SBATCH --job-name=AlphaFold-Multimer     # Job name&lt;br /&gt;
#SBATCH --partition=gpu2                  # Specify GPU partition&lt;br /&gt;
#SBATCH --nodes=1                         # Number of nodes&lt;br /&gt;
#SBATCH --ntasks=1                        # Number of tasks (processes)&lt;br /&gt;
#SBATCH --cpus-per-task=4                 # Number of CPU cores per task&lt;br /&gt;
#SBATCH --gres=gpu:1                      # Request 1 GPU&lt;br /&gt;
#SBATCH --output=alphafold_%j.out         # Standard output (with job ID)&lt;br /&gt;
#SBATCH --error=alphafold_%j.err          # Standard error (with job ID)&lt;br /&gt;
&lt;br /&gt;
# Description: AlphaFold-Multimer (Non-Docker) with auto-GPU selection&lt;br /&gt;
&lt;br /&gt;
# Load the required module/environment&lt;br /&gt;
module load alphafold/alphafold_non_docker_2.3.1&lt;br /&gt;
&lt;br /&gt;
# Run the AlphaFold script&lt;br /&gt;
bash $ALPHAFOLD_SCRIPT_PATH/run_alphafold.sh -d $ALPHAFOLD_DB_PATH -o ~/output_dir -f $ALPHAFOLD_SCRIPT_PATH/examples/query.fasta -t $(date +%Y-%m-%d)&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Important Notes ====&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Output Directory&amp;#039;&amp;#039;&amp;#039;: You can specify the output directory using the `-o` parameter to store the results. This directory can be anywhere you choose.&lt;br /&gt;
* The `-t` (max_template_date) parameter defines the maximum release date of templates to consider in the format `YYYY-MM-DD`. This is crucial when working with historical test sets, as it restricts the search for templates to those released on or before the specified date. You can use different dates depending on your requirements, such as the current date with `$(date +%Y-%m-%d)` or a specific historical date, like `-t 2021-12-31`.&lt;br /&gt;
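Since `-t` accepts any ISO-8601 date, you can compute or validate one in the shell before submitting. A small sketch (assuming GNU `date` and `grep`, as found on typical Linux login nodes):&lt;br /&gt;

```shell
# Today's date, for folding with all currently available templates:
today=$(date +%Y-%m-%d)

# A fixed historical cutoff, e.g. the CASP14 template cutoff used in AlphaFold's docs:
cutoff=2020-05-14

# Sanity-check the YYYY-MM-DD format before passing either value to -t:
echo "$today"  | grep -Eq '^[0-9]{4}-[0-9]{2}-[0-9]{2}$' && echo "today OK"
echo "$cutoff" | grep -Eq '^[0-9]{4}-[0-9]{2}-[0-9]{2}$' && echo "cutoff OK"
```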
&lt;br /&gt;
==== Additional Resources ====&lt;br /&gt;
* You can download the `dummy_test` folder with sample output from the [https://github.com/kalininalab/alphafold_non_docker alphafold_non_docker GitHub repository].&lt;br /&gt;
* For sample data, you can use `/home/alphafold_folder/alphafold_multimer_non_docker/example/query.fasta` or provide your own data for queries.&lt;/div&gt;</summary>
		<author><name>Dvory</name></author>
	</entry>
	<entry>
		<id>https://hpcguide.tau.ac.il/index.php?title=Submitting_a_job_to_a_slurm_queue&amp;diff=1473</id>
		<title>Submitting a job to a slurm queue</title>
		<link rel="alternate" type="text/html" href="https://hpcguide.tau.ac.il/index.php?title=Submitting_a_job_to_a_slurm_queue&amp;diff=1473"/>
		<updated>2024-09-11T12:45:23Z</updated>

		<summary type="html">&lt;p&gt;Dvory: /* Example for submitting jobs */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;SLURM (Simple Linux Utility for Resource Management) is a job scheduler used on many high-performance computing systems. It manages and allocates resources such as compute nodes and controls job execution.&lt;br /&gt;
&lt;br /&gt;
=== Accessing the System ===&lt;br /&gt;
To submit jobs to the SLURM scheduler at Tel Aviv University, you must access the system through one of the designated login nodes. These nodes act as the gateway for submitting and managing your SLURM jobs. The available login nodes are:&lt;br /&gt;
&lt;br /&gt;
* &amp;lt;code&amp;gt;powerslurm-login.tau.ac.il&amp;lt;/code&amp;gt;&lt;br /&gt;
* &amp;lt;code&amp;gt;powerslurm-login2.tau.ac.il&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Login Requirements: ====&lt;br /&gt;
&lt;br /&gt;
# Membership in the &amp;quot;power&amp;quot; group: Ensure you are a part of the &amp;quot;power&amp;quot; group which grants the necessary permissions for accessing the HPC resources.&lt;br /&gt;
# University Credentials: Log in using your Tel Aviv University credentials. This ensures secure access and that your job submissions are appropriately accounted for under your user profile.&lt;br /&gt;
&lt;br /&gt;
Remember, these login nodes are the initial point of contact for all your job management tasks, including job submission, monitoring, and other SLURM-related operations.&lt;br /&gt;
&lt;br /&gt;
=== Basic Job Submission Commands ===&lt;br /&gt;
====Finding your account and partition====&lt;br /&gt;
To submit jobs to SLURM, you need to know which accounts and partitions you belong to. Each account may belong to one or more partitions.&lt;br /&gt;
&lt;br /&gt;
To list the accounts you belong to, type (replace &amp;lt;code&amp;gt;dvory&amp;lt;/code&amp;gt; with your username):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
sacctmgr show associations where user=dvory format=Account%20&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
If you know your partition and would like to know which account to specify when using it, run (on powerslurm-login):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
check_allowed_account -p &amp;lt;partition&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
For example:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
check_allowed_account -p power-general&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
====Example for submitting jobs====&lt;br /&gt;
# sbatch: Submit a batch job script.&lt;br /&gt;
#* Example: &amp;lt;code&amp;gt;sbatch --ntasks=1 --time=10 -p power-general -A power-general-users pre_process.bash&amp;lt;/code&amp;gt;&lt;br /&gt;
#* This submits &amp;lt;code&amp;gt;pre_process.bash&amp;lt;/code&amp;gt; with 1 task for 10 minutes.&lt;br /&gt;
#* Example of chaining jobs: &amp;lt;code&amp;gt;sbatch --ntasks=128 --time=60 -p power-general -A power-general-users --depend=45001 do_work.bash&amp;lt;/code&amp;gt;&lt;br /&gt;
#* Example with GPU: &amp;lt;code&amp;gt;sbatch --ntasks=1 --time=10 --gres=gpu:2 -p gpu-general -A gpu-general-users pre_process.bash&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;sbatch --gres=gpu:1 -p gpu-general -A gpu-general-users gpu_job.sh&amp;lt;/code&amp;gt; &lt;br /&gt;
# srun: Submit an interactive job with MPI (Message Passing Interface), often called a &amp;quot;job step.&amp;quot;&lt;br /&gt;
#* Example: &amp;lt;code&amp;gt;srun --ntasks=2 -p power-general -A power-general-users --label hostname&amp;lt;/code&amp;gt;&lt;br /&gt;
#* With MPI: &amp;lt;code&amp;gt;srun --mpi=pmix --ntasks=2 -p power-general -A power-general-users --label hostname&amp;lt;/code&amp;gt;&lt;br /&gt;
# sattach: Attach stdin/out/err to an existing job or job step.&lt;br /&gt;
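Chaining can also be scripted by capturing the first job's ID with sbatch's --parsable flag, which prints only the job ID. A sketch, reusing the hypothetical script names from the examples above:&lt;br /&gt;

```shell
# Submit a preprocessing job and capture its job ID (--parsable prints just the ID).
jid=$(sbatch --parsable --ntasks=1 --time=10 -p power-general -A power-general-users pre_process.bash)

# Start the main job only after the first one has completed successfully.
sbatch --ntasks=128 --time=60 -p power-general -A power-general-users \
       --dependency=afterok:"$jid" do_work.bash
```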
&lt;br /&gt;
=== Interactive Job Examples ===&lt;br /&gt;
* Opening a bash shell: &amp;lt;code&amp;gt;srun --ntasks=56 -p power-general -A power-general-users  --pty bash&amp;lt;/code&amp;gt;&lt;br /&gt;
* Specifying compute nodes: &amp;lt;code&amp;gt;srun --ntasks=56 -p power-general -A power-general-users --nodelist=&amp;quot;compute-0-12&amp;quot; --pty bash&amp;lt;/code&amp;gt;&lt;br /&gt;
* Using GUI: &amp;lt;code&amp;gt;srun --ntasks=56 -p power-general -A power-general-users --x11 /bin/bash&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Script Examples: ===&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
&lt;br /&gt;
#SBATCH --job-name=my_job             # Job name&lt;br /&gt;
#SBATCH --account=power-general-users # Account name for billing&lt;br /&gt;
#SBATCH --partition=power-general     # Partition name&lt;br /&gt;
#SBATCH --time=02:00:00               # Time allotted for the job (hh:mm:ss)&lt;br /&gt;
#SBATCH --ntasks=4                    # Number of tasks (processes)&lt;br /&gt;
#SBATCH --cpus-per-task=1             # Number of CPU cores per task&lt;br /&gt;
#SBATCH --mem-per-cpu=4G              # Memory per CPU core&lt;br /&gt;
#SBATCH --output=my_job_%j.out        # Standard output and error log (%j expands to jobId)&lt;br /&gt;
#SBATCH --error=my_job_%j.err         # Separate file for standard error&lt;br /&gt;
&lt;br /&gt;
# Load modules or software if required&lt;br /&gt;
# module load python/3.8&lt;br /&gt;
&lt;br /&gt;
# Print some information about the job&lt;br /&gt;
echo &amp;quot;Starting my SLURM job&amp;quot;&lt;br /&gt;
echo &amp;quot;Job ID: $SLURM_JOB_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Running on nodes: $SLURM_JOB_NODELIST&amp;quot;&lt;br /&gt;
echo &amp;quot;Allocated CPUs: $SLURM_JOB_CPUS_PER_NODE&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Run your application, this could be anything from a custom script to standard applications&lt;br /&gt;
# ./my_program&lt;br /&gt;
# python my_script.py&lt;br /&gt;
&lt;br /&gt;
# End of script&lt;br /&gt;
echo &amp;quot;Job completed&amp;quot;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Script example with GPU ====&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
&lt;br /&gt;
#SBATCH --job-name=my_job             # Job name&lt;br /&gt;
#SBATCH --account=my_account          # Account name for billing&lt;br /&gt;
#SBATCH --partition=long              # Partition name&lt;br /&gt;
#SBATCH --time=02:00:00               # Time allotted for the job (hh:mm:ss)&lt;br /&gt;
#SBATCH --ntasks=4                    # Number of tasks (processes)&lt;br /&gt;
#SBATCH --cpus-per-task=1             # Number of CPU cores per task&lt;br /&gt;
#SBATCH --gres=gpu:NUMBER_OF_GPUS     # Number of GPUs to use in the job&lt;br /&gt;
#SBATCH --mem-per-cpu=4G              # Memory per CPU core&lt;br /&gt;
#SBATCH --output=my_job_%j.out        # Standard output and error log (%j expands to jobId)&lt;br /&gt;
#SBATCH --error=my_job_%j.err         # Separate file for standard error&lt;br /&gt;
&lt;br /&gt;
# Load modules or software if required&lt;br /&gt;
module load python/3.8&lt;br /&gt;
&lt;br /&gt;
# Print some information about the job&lt;br /&gt;
echo &amp;quot;Starting my SLURM job&amp;quot;&lt;br /&gt;
echo &amp;quot;Job ID: $SLURM_JOB_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Running on nodes: $SLURM_JOB_NODELIST&amp;quot;&lt;br /&gt;
echo &amp;quot;Allocated CPUs: $SLURM_JOB_CPUS_PER_NODE&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Run your application, this could be anything from a custom script to standard applications&lt;br /&gt;
# ./my_program&lt;br /&gt;
# python my_script.py&lt;br /&gt;
&lt;br /&gt;
# End of script&lt;br /&gt;
echo &amp;quot;Job completed&amp;quot;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Example with GUI ====&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
&lt;br /&gt;
srun  --pty -c 1 --mem=2G -p &amp;lt;power-xxx&amp;gt; --x11 /bin/bash&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Error Handling ===&lt;br /&gt;
* On some clusters, specifying resources is necessary. Without them, the job may fail.&lt;br /&gt;
** Example error: &amp;lt;code&amp;gt;srun: error: Unable to allocate resources: No partition specified or system default partition&amp;lt;/code&amp;gt;&lt;br /&gt;
** Correct usage: &amp;lt;code&amp;gt;srun --pty -c 1 --mem=2G -p power-yoren /bin/bash&amp;lt;/code&amp;gt;&lt;br /&gt;
* Specifying GPU resources (&amp;lt;code&amp;gt;--gres=gpu:N&amp;lt;/code&amp;gt;) is crucial for GPU jobs; without it, the job will not be allocated a GPU.&lt;br /&gt;
&lt;br /&gt;
=== SLURM Information Commands ===&lt;br /&gt;
&lt;br /&gt;
* sinfo: View all queues (partitions).&lt;br /&gt;
* squeue: View all jobs.&lt;br /&gt;
* scontrol show partition: View all partitions.&lt;br /&gt;
* scontrol show job &amp;lt;job_number&amp;gt;: View a job&amp;#039;s attributes.&lt;br /&gt;
&lt;br /&gt;
=== Tips for Managing SLURM Jobs ===&lt;br /&gt;
&lt;br /&gt;
* Chain jobs by using the &amp;lt;code&amp;gt;--depend&amp;lt;/code&amp;gt; flag in &amp;lt;code&amp;gt;sbatch&amp;lt;/code&amp;gt;.&lt;br /&gt;
* Use &amp;lt;code&amp;gt;salloc&amp;lt;/code&amp;gt; for interactive jobs that require specific resources for a limited time.&lt;br /&gt;
* &amp;lt;code&amp;gt;srun&amp;lt;/code&amp;gt; is versatile for both interactive and batch jobs, especially with MPI.&lt;br /&gt;
* Always specify necessary resources in clusters where defaults are not set.&lt;/div&gt;</summary>
		<author><name>Dvory</name></author>
	</entry>
	<entry>
		<id>https://hpcguide.tau.ac.il/index.php?title=Submitting_a_job_to_a_slurm_queue&amp;diff=1472</id>
		<title>Submitting a job to a slurm queue</title>
		<link rel="alternate" type="text/html" href="https://hpcguide.tau.ac.il/index.php?title=Submitting_a_job_to_a_slurm_queue&amp;diff=1472"/>
		<updated>2024-09-04T14:06:39Z</updated>

		<summary type="html">&lt;p&gt;Dvory: /* Interactive Job Examples */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;SLURM (Simple Linux Utility for Resource Management) is a job scheduler used on many high-performance computing systems. It manages and allocates resources such as compute nodes and controls job execution.&lt;br /&gt;
&lt;br /&gt;
=== Accessing the System ===&lt;br /&gt;
To submit jobs to the SLURM scheduler at Tel Aviv University, you must access the system through one of the designated login nodes. These nodes act as the gateway for submitting and managing your SLURM jobs. The available login nodes are:&lt;br /&gt;
&lt;br /&gt;
* &amp;lt;code&amp;gt;powerslurm-login.tau.ac.il&amp;lt;/code&amp;gt;&lt;br /&gt;
* &amp;lt;code&amp;gt;powerslurm-login2.tau.ac.il&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Login Requirements: ====&lt;br /&gt;
&lt;br /&gt;
# Membership in the &amp;quot;power&amp;quot; group: Ensure you are a part of the &amp;quot;power&amp;quot; group which grants the necessary permissions for accessing the HPC resources.&lt;br /&gt;
# University Credentials: Log in using your Tel Aviv University credentials. This ensures secure access and that your job submissions are appropriately accounted for under your user profile.&lt;br /&gt;
&lt;br /&gt;
Remember, these login nodes are the initial point of contact for all your job management tasks, including job submission, monitoring, and other SLURM-related operations.&lt;br /&gt;
&lt;br /&gt;
=== Basic Job Submission Commands ===&lt;br /&gt;
====Finding your account and partition====&lt;br /&gt;
To submit jobs to SLURM, you need to know which accounts and partitions you belong to. Each account may belong to one or more partitions.&lt;br /&gt;
&lt;br /&gt;
To list the accounts you belong to, type (replace &amp;lt;code&amp;gt;dvory&amp;lt;/code&amp;gt; with your username):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
sacctmgr show associations where user=dvory format=Account%20&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
If you know your partition and would like to know which account to specify when using it, run (on powerslurm-login):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
check_allowed_account -p &amp;lt;partition&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
For example:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
check_allowed_account -p power-general&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
====Example for submitting jobs====&lt;br /&gt;
# sbatch: Submit a batch job script.&lt;br /&gt;
#* Example: &amp;lt;code&amp;gt;sbatch --ntasks=1 --time=10 -p power-general -A power-general-users pre_process.bash&amp;lt;/code&amp;gt;&lt;br /&gt;
#* This submits &amp;lt;code&amp;gt;pre_process.bash&amp;lt;/code&amp;gt; with 1 task for 10 minutes.&lt;br /&gt;
#* Example of chaining jobs: &amp;lt;code&amp;gt;sbatch --ntasks=128 --time=60 -p power-general -A power-general-users --depend=45001 do_work.bash&amp;lt;/code&amp;gt;&lt;br /&gt;
#* Example with GPU: &amp;lt;code&amp;gt;sbatch --ntasks=1 --time=10 --gres=gpu:2 -p gpu-general -A gpu-general-users pre_process.bash&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;sbatch --gres=gpu:1 -p gpu-general -A gpu-general-users gpu_job.sh&amp;lt;/code&amp;gt; &lt;br /&gt;
# srun: Submit an interactive job with MPI (Message Passing Interface), often called a &amp;quot;job step.&amp;quot;&lt;br /&gt;
#* Example: &amp;lt;code&amp;gt;srun --ntasks=2 -p power-general -A power-general-users --label hostname&amp;lt;/code&amp;gt;&lt;br /&gt;
#* With MPI: &amp;lt;code&amp;gt;srun --mpi=pmix --ntasks=2 -p power-general -A power-general-users --label hostname&amp;lt;/code&amp;gt;&lt;br /&gt;
# sattach: Attach stdin/out/err to an existing job or job step.&lt;br /&gt;
&lt;br /&gt;
=== Interactive Job Examples ===&lt;br /&gt;
* Opening a bash shell: &amp;lt;code&amp;gt;srun --ntasks=56 -p power-general -A power-general-users  --pty bash&amp;lt;/code&amp;gt;&lt;br /&gt;
* Specifying compute nodes: &amp;lt;code&amp;gt;srun --ntasks=56 -p power-general -A power-general-users --nodelist=&amp;quot;compute-0-12&amp;quot; --pty bash&amp;lt;/code&amp;gt;&lt;br /&gt;
* Using GUI: &amp;lt;code&amp;gt;srun --ntasks=56 -p power-general -A power-general-users --x11 /bin/bash&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Script Examples: ===&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
&lt;br /&gt;
#SBATCH --job-name=my_job             # Job name&lt;br /&gt;
#SBATCH --account=power-general-users # Account name for billing&lt;br /&gt;
#SBATCH --partition=power-general     # Partition name&lt;br /&gt;
#SBATCH --time=02:00:00               # Time allotted for the job (hh:mm:ss)&lt;br /&gt;
#SBATCH --ntasks=4                    # Number of tasks (processes)&lt;br /&gt;
#SBATCH --cpus-per-task=1             # Number of CPU cores per task&lt;br /&gt;
#SBATCH --mem-per-cpu=4G              # Memory per CPU core&lt;br /&gt;
#SBATCH --output=my_job_%j.out        # Standard output and error log (%j expands to jobId)&lt;br /&gt;
#SBATCH --error=my_job_%j.err         # Separate file for standard error&lt;br /&gt;
&lt;br /&gt;
# Load modules or software if required&lt;br /&gt;
# module load python/3.8&lt;br /&gt;
&lt;br /&gt;
# Print some information about the job&lt;br /&gt;
echo &amp;quot;Starting my SLURM job&amp;quot;&lt;br /&gt;
echo &amp;quot;Job ID: $SLURM_JOB_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Running on nodes: $SLURM_JOB_NODELIST&amp;quot;&lt;br /&gt;
echo &amp;quot;Allocated CPUs: $SLURM_JOB_CPUS_PER_NODE&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Run your application, this could be anything from a custom script to standard applications&lt;br /&gt;
# ./my_program&lt;br /&gt;
# python my_script.py&lt;br /&gt;
&lt;br /&gt;
# End of script&lt;br /&gt;
echo &amp;quot;Job completed&amp;quot;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Script example with GPU ====&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
&lt;br /&gt;
#SBATCH --job-name=my_job             # Job name&lt;br /&gt;
#SBATCH --account=my_account          # Account name for billing&lt;br /&gt;
#SBATCH --partition=long              # Partition name&lt;br /&gt;
#SBATCH --time=02:00:00               # Time allotted for the job (hh:mm:ss)&lt;br /&gt;
#SBATCH --ntasks=4                    # Number of tasks (processes)&lt;br /&gt;
#SBATCH --cpus-per-task=1             # Number of CPU cores per task&lt;br /&gt;
#SBATCH --gres=gpu:NUMBER_OF_GPUS     # Number of GPUs to use in the job&lt;br /&gt;
#SBATCH --mem-per-cpu=4G              # Memory per CPU core&lt;br /&gt;
#SBATCH --output=my_job_%j.out        # Standard output and error log (%j expands to jobId)&lt;br /&gt;
#SBATCH --error=my_job_%j.err         # Separate file for standard error&lt;br /&gt;
&lt;br /&gt;
# Load modules or software if required&lt;br /&gt;
module load python/3.8&lt;br /&gt;
&lt;br /&gt;
# Print some information about the job&lt;br /&gt;
echo &amp;quot;Starting my SLURM job&amp;quot;&lt;br /&gt;
echo &amp;quot;Job ID: $SLURM_JOB_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Running on nodes: $SLURM_JOB_NODELIST&amp;quot;&lt;br /&gt;
echo &amp;quot;Allocated CPUs: $SLURM_JOB_CPUS_PER_NODE&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Run your application, this could be anything from a custom script to standard applications&lt;br /&gt;
# ./my_program&lt;br /&gt;
# python my_script.py&lt;br /&gt;
&lt;br /&gt;
# End of script&lt;br /&gt;
echo &amp;quot;Job completed&amp;quot;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Example with GUI ====&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
&lt;br /&gt;
srun  --pty -c 1 --mem=2G -p &amp;lt;power-xxx&amp;gt; --x11 /bin/bash&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Error Handling ===&lt;br /&gt;
* On some clusters, specifying resources is necessary. Without them, the job may fail.&lt;br /&gt;
** Example error: &amp;lt;code&amp;gt;srun: error: Unable to allocate resources: No partition specified or system default partition&amp;lt;/code&amp;gt;&lt;br /&gt;
** Correct usage: &amp;lt;code&amp;gt;srun --pty -c 1 --mem=2G -p power-yoren /bin/bash&amp;lt;/code&amp;gt;&lt;br /&gt;
* Specifying GPU resources (&amp;lt;code&amp;gt;--gres=gpu:N&amp;lt;/code&amp;gt;) is crucial for GPU jobs; without it, the job will not be allocated a GPU.&lt;br /&gt;
&lt;br /&gt;
=== SLURM Information Commands ===&lt;br /&gt;
&lt;br /&gt;
* sinfo: View all queues (partitions).&lt;br /&gt;
* squeue: View all jobs.&lt;br /&gt;
* scontrol show partition: View all partitions.&lt;br /&gt;
* scontrol show job &amp;lt;job_number&amp;gt;: View a job&amp;#039;s attributes.&lt;br /&gt;
&lt;br /&gt;
=== Tips for Managing SLURM Jobs ===&lt;br /&gt;
&lt;br /&gt;
* Chain jobs by using the &amp;lt;code&amp;gt;--depend&amp;lt;/code&amp;gt; flag in &amp;lt;code&amp;gt;sbatch&amp;lt;/code&amp;gt;.&lt;br /&gt;
* Use &amp;lt;code&amp;gt;salloc&amp;lt;/code&amp;gt; for interactive jobs that require specific resources for a limited time.&lt;br /&gt;
* &amp;lt;code&amp;gt;srun&amp;lt;/code&amp;gt; is versatile for both interactive and batch jobs, especially with MPI.&lt;br /&gt;
* Always specify necessary resources in clusters where defaults are not set.&lt;/div&gt;</summary>
		<author><name>Dvory</name></author>
	</entry>
	<entry>
		<id>https://hpcguide.tau.ac.il/index.php?title=Palo_Alto_VPN_for_linux&amp;diff=1471</id>
		<title>Palo Alto VPN for linux</title>
		<link rel="alternate" type="text/html" href="https://hpcguide.tau.ac.il/index.php?title=Palo_Alto_VPN_for_linux&amp;diff=1471"/>
		<updated>2024-06-05T05:46:24Z</updated>

		<summary type="html">&lt;p&gt;Dvory: /* Download */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;For security reasons, Tel Aviv University provides a VPN with two-factor authentication.&lt;br /&gt;
&lt;br /&gt;
To use it, verify or fill in your mobile phone number on the myTAU page&lt;br /&gt;
(https://mytau.tau.ac.il/GetResource.php) and enroll in the service.&lt;br /&gt;
Then install Google Authenticator on your mobile device and register it at TAU.&lt;br /&gt;
&lt;br /&gt;
After that you can download and install the Palo Alto GlobalProtect VPN client on your device (all&lt;br /&gt;
operating systems are supported: iOS, Android, Linux, macOS, and even Windows).&lt;br /&gt;
&lt;br /&gt;
The steps:&lt;br /&gt;
==Enrollment==&lt;br /&gt;
Go to https://mytau.tau.ac.il/GetResource.php&lt;br /&gt;
&lt;br /&gt;
Choose “1”, then “2”:&lt;br /&gt;
&lt;br /&gt;
You will then receive an SMS with a code valid for two minutes; enter it immediately into the field.&lt;br /&gt;
You will then be redirected to a QR code for setting up your Google Authenticator account.&lt;br /&gt;
Scan it with the Google Authenticator app on your mobile device (tap “+” in the bottom-right corner of the app),&lt;br /&gt;
then enter the generated code from Google Authenticator into the field and press the green button.&lt;br /&gt;
&lt;br /&gt;
==Download==&lt;br /&gt;
Download and install the VPN client. In your browser, go to one of:&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[https://hpcguide.tau.ac.il/vpn/PanGPLinux-5.3.4-c5.tgz GlobalProtect-5.3.4]&lt;br /&gt;
&lt;br /&gt;
[https://hpcguide.tau.ac.il/vpn/PanGPLinux-6.0.1-c6.tgz GlobalProtect-6.0.1]&lt;br /&gt;
&lt;br /&gt;
[https://hpcguide.tau.ac.il/vpn/PanGPLinux-6.1.1-c4.tgz GlobalProtect-6.1.1]&lt;br /&gt;
&lt;br /&gt;
[https://hpcguide.tau.ac.il/vpn/PanGPLinux-6.2.0-c10.tgz GlobalProtect-6.2.0]&lt;br /&gt;
&lt;br /&gt;
Extract the Linux package and install the version appropriate for your distribution:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Debian/Ubuntu&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;dpkg -i GlobalProtect_UI_deb-6.0.1.1-6.deb&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Redhat/Centos&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;yum localinstall GlobalProtect_UI_rpm-6.0.1.1-6.rpm&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==Configure==&lt;br /&gt;
&lt;br /&gt;
[[File:Paloalto3.PNG|thumb|right]]&lt;br /&gt;
&lt;br /&gt;
Run and configure the VPN client on Linux (other operating systems are similar):&lt;br /&gt;
&lt;br /&gt;
Open the client by clicking the relevant icon (&amp;quot;1&amp;quot; in the picture on the right)&lt;br /&gt;
&lt;br /&gt;
and enter the address &amp;#039;&amp;#039;&amp;#039;vpn.tau.ac.il&amp;#039;&amp;#039;&amp;#039; (&amp;quot;2&amp;quot; in the picture on the right).&lt;br /&gt;
&lt;br /&gt;
==Errors==&lt;br /&gt;
===SSL Error===&lt;br /&gt;
On the latest Ubuntu versions (e.g. Ubuntu 22.04), after installing and configuring the GlobalProtect VPN, you may get this error:&lt;br /&gt;
&lt;br /&gt;
[[File:784px-Vpn ssl error.png|none|thumb]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
====Fix only for globalprotect====&lt;br /&gt;
Create a new &amp;lt;code&amp;gt;ssl.conf&amp;lt;/code&amp;gt; file on your PC (for example, with &amp;lt;code&amp;gt;vim ~/ssl.conf&amp;lt;/code&amp;gt;) with the following content:&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
openssl_conf = openssl_init&lt;br /&gt;
[openssl_init]&lt;br /&gt;
ssl_conf = ssl_sect&lt;br /&gt;
[ssl_sect]&lt;br /&gt;
system_default = system_default_sect&lt;br /&gt;
[system_default_sect]&lt;br /&gt;
Options = UnsafeLegacyRenegotiation&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
Then locate the desktop file:&lt;br /&gt;
&amp;lt;code&amp;gt;sudo find / -name PanGPUI.desktop -type f&amp;lt;/code&amp;gt;&lt;br /&gt;
or&lt;br /&gt;
&amp;lt;code&amp;gt;locate PanGPUI.desktop&amp;lt;/code&amp;gt; (you may need to run &amp;lt;code&amp;gt;sudo updatedb&amp;lt;/code&amp;gt; first)&lt;br /&gt;
There should be at least two paths containing this file; ignore &amp;lt;code&amp;gt;/opt/paloaltonetworks/globalprotect/PanGPUI.desktop&amp;lt;/code&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
On Kubuntu 22.04, for example, the file is at &amp;lt;code&amp;gt;/etc/xdg/autostart/PanGPUI.desktop&amp;lt;/code&amp;gt;.&lt;br /&gt;
Edit this file and change it from:&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
[Desktop Entry]&lt;br /&gt;
Name=PanGPUI&lt;br /&gt;
Type=Application&lt;br /&gt;
Exec=/opt/paloaltonetworks/globalprotect/PanGPUI&lt;br /&gt;
Terminal=false&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
to &lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
[Desktop Entry]&lt;br /&gt;
Name=PanGPUI&lt;br /&gt;
Type=Application&lt;br /&gt;
# Exec lines do not expand ~ or VAR=value prefixes, so use env with an absolute path&lt;br /&gt;
Exec=env OPENSSL_CONF=/home/YOUR_USERNAME/ssl.conf /opt/paloaltonetworks/globalprotect/PanGPUI&lt;br /&gt;
Terminal=false&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
After restarting your PC, GlobalProtect will autostart with the custom SSL settings.&lt;br /&gt;
&lt;br /&gt;
====Global fix====&lt;br /&gt;
Here is how to work around it system-wide:&lt;br /&gt;
&lt;br /&gt;
open  &amp;lt;code&amp;gt;/usr/lib/ssl/openssl.cnf&amp;lt;/code&amp;gt; &lt;br /&gt;
&lt;br /&gt;
comment out this section:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
# [openssl_init]&lt;br /&gt;
&lt;br /&gt;
# providers = provider_sect&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;add this new section under the commented one from earlier:&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
[openssl_init]&lt;br /&gt;
ssl_conf = ssl_sect&lt;br /&gt;
&lt;br /&gt;
[ssl_sect]&lt;br /&gt;
system_default = system_default_sect&lt;br /&gt;
&lt;br /&gt;
[system_default_sect]&lt;br /&gt;
Options = UnsafeLegacyRenegotiation&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;Restart the GlobalProtect app and the error should be fixed.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;blockquote&amp;gt;source:https://bugs.launchpad.net/ubuntu/+source/openssl/+bug/1960268&amp;lt;/blockquote&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==TAU credentials==&lt;br /&gt;
[[File:Paloalto4.PNG|thumb|right]]&lt;br /&gt;
Fill in the pop-up window with your TAU credentials:&lt;br /&gt;
&lt;br /&gt;
Open Google Authenticator on your mobile device and enter the code from there.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Congratulations: you are done!&lt;/div&gt;</summary>
		<author><name>Dvory</name></author>
	</entry>
	<entry>
		<id>https://hpcguide.tau.ac.il/index.php?title=Submitting_a_job_to_a_slurm_queue&amp;diff=1470</id>
		<title>Submitting a job to a slurm queue</title>
		<link rel="alternate" type="text/html" href="https://hpcguide.tau.ac.il/index.php?title=Submitting_a_job_to_a_slurm_queue&amp;diff=1470"/>
		<updated>2024-05-08T06:40:19Z</updated>

		<summary type="html">&lt;p&gt;Dvory: /* Script example with GPU */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;SLURM (Simple Linux Utility for Resource Management) is a job scheduler used on many high-performance computing systems. It manages and allocates resources such as compute nodes and controls job execution.&lt;br /&gt;
&lt;br /&gt;
=== Accessing the System ===&lt;br /&gt;
To submit jobs to the SLURM scheduler at Tel Aviv University, you must access the system through one of the designated login nodes. These nodes act as the gateway for submitting and managing your SLURM jobs. The available login nodes are:&lt;br /&gt;
&lt;br /&gt;
* &amp;lt;code&amp;gt;powerslurm-login.tau.ac.il&amp;lt;/code&amp;gt;&lt;br /&gt;
* &amp;lt;code&amp;gt;powerslurm-login2.tau.ac.il&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
==== Login Requirements: ====&lt;br /&gt;
&lt;br /&gt;
# Membership in the &amp;quot;power&amp;quot; group: Ensure you are a part of the &amp;quot;power&amp;quot; group which grants the necessary permissions for accessing the HPC resources.&lt;br /&gt;
# University Credentials: Log in using your Tel Aviv University credentials. This ensures secure access and that your job submissions are appropriately accounted for under your user profile.&lt;br /&gt;
&lt;br /&gt;
Remember, these login nodes are the initial point of contact for all your job management tasks, including job submission, monitoring, and other SLURM-related operations.&lt;br /&gt;
&lt;br /&gt;
=== Basic Job Submission Commands ===&lt;br /&gt;
====Finding your account and partition====&lt;br /&gt;
To submit jobs to SLURM, you need to know which accounts and partitions you belong to. Each account may belong to one or more partitions.&lt;br /&gt;
&lt;br /&gt;
To list the accounts you belong to, type (replace &amp;lt;code&amp;gt;dvory&amp;lt;/code&amp;gt; with your username):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
sacctmgr show associations where user=dvory format=Account%20&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
If you know your partition and would like to know which account to specify when using it, run (on powerslurm-login):&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
check_allowed_account -p &amp;lt;partition&amp;gt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
For example:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
check_allowed_account -p power-general&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
====Example for submitting jobs====&lt;br /&gt;
# sbatch: Submit a batch job script.&lt;br /&gt;
#* Example: &amp;lt;code&amp;gt;sbatch --ntasks=1 --time=10 -p power-general -A power-general-users pre_process.bash&amp;lt;/code&amp;gt;&lt;br /&gt;
#* This submits &amp;lt;code&amp;gt;pre_process.bash&amp;lt;/code&amp;gt; with 1 task for 10 minutes.&lt;br /&gt;
#* Example of chaining jobs: &amp;lt;code&amp;gt;sbatch --ntasks=128 --time=60 -p power-general -A power-general-users --depend=afterok:45001 do_work.bash&amp;lt;/code&amp;gt;&lt;br /&gt;
#* Example with GPU: &amp;lt;code&amp;gt;sbatch --ntasks=1 --time=10 --gres=gpu:2 -p gpu-general -A gpu-general-users pre_process.bash&amp;lt;/code&amp;gt; or &amp;lt;code&amp;gt;sbatch --gres=gpu:1 -p gpu-general -A gpu-general-users gpu_job.sh&amp;lt;/code&amp;gt; &lt;br /&gt;
# srun: Submit an interactive job with MPI (Message Passing Interface), often called a &amp;quot;job step.&amp;quot;&lt;br /&gt;
#* Example: &amp;lt;code&amp;gt;srun --ntasks=2 -p power-general -A power-general-users --label hostname&amp;lt;/code&amp;gt;&lt;br /&gt;
#* With MPI: &amp;lt;code&amp;gt;srun --ntasks=2 -p power-general -A power-general-users --label hostname&amp;lt;/code&amp;gt;&lt;br /&gt;
# sattach: Attach stdin/out/err to an existing job or job step.&lt;br /&gt;
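The chaining example above hard-codes a job ID; in practice the first job&amp;#039;s ID can be captured with sbatch&amp;#039;s &amp;lt;code&amp;gt;--parsable&amp;lt;/code&amp;gt; flag and passed to &amp;lt;code&amp;gt;--depend&amp;lt;/code&amp;gt; (a sketch, assuming the same partition and scripts as in the examples above):&lt;br /&gt;

```shell
# Submit the first job; --parsable makes sbatch print only the job ID.
jid=$(sbatch --parsable --ntasks=1 --time=10 -p power-general -A power-general-users pre_process.bash)

# Start the second job only after the first finishes successfully.
sbatch --ntasks=128 --time=60 -p power-general -A power-general-users --depend=afterok:"$jid" do_work.bash
```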
&lt;br /&gt;
=== Interactive Job Examples ===&lt;br /&gt;
* Opening a bash shell: &amp;lt;code&amp;gt;srun --ntasks=56 -p power-general -A power-general-users --pty bash&amp;lt;/code&amp;gt;&lt;br /&gt;
* Specifying compute nodes: &amp;lt;code&amp;gt;srun --ntasks=56 -p power-general -A power-general-users --nodelist=&amp;quot;compute-0-12&amp;quot; --pty bash&amp;lt;/code&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Script Examples: ===&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
&lt;br /&gt;
#SBATCH --job-name=my_job             # Job name&lt;br /&gt;
#SBATCH --account=power-general-users # Account name for billing&lt;br /&gt;
#SBATCH --partition=power-general     # Partition name&lt;br /&gt;
#SBATCH --time=02:00:00               # Time allotted for the job (hh:mm:ss)&lt;br /&gt;
#SBATCH --ntasks=4                    # Number of tasks (processes)&lt;br /&gt;
#SBATCH --cpus-per-task=1             # Number of CPU cores per task&lt;br /&gt;
#SBATCH --mem-per-cpu=4G              # Memory per CPU core&lt;br /&gt;
#SBATCH --output=my_job_%j.out        # Standard output and error log (%j expands to jobId)&lt;br /&gt;
#SBATCH --error=my_job_%j.err         # Separate file for standard error&lt;br /&gt;
&lt;br /&gt;
# Load modules or software if required&lt;br /&gt;
# module load python/3.8&lt;br /&gt;
&lt;br /&gt;
# Print some information about the job&lt;br /&gt;
echo &amp;quot;Starting my SLURM job&amp;quot;&lt;br /&gt;
echo &amp;quot;Job ID: $SLURM_JOB_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Running on nodes: $SLURM_JOB_NODELIST&amp;quot;&lt;br /&gt;
echo &amp;quot;Allocated CPUs: $SLURM_JOB_CPUS_PER_NODE&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Run your application, this could be anything from a custom script to standard applications&lt;br /&gt;
# ./my_program&lt;br /&gt;
# python my_script.py&lt;br /&gt;
&lt;br /&gt;
# End of script&lt;br /&gt;
echo &amp;quot;Job completed&amp;quot;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
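Assuming the script above is saved as &amp;lt;code&amp;gt;my_job.sh&amp;lt;/code&amp;gt; (a hypothetical filename), it can be submitted and watched like this:&lt;br /&gt;

```shell
# Submit the batch script; sbatch prints the assigned job ID.
sbatch my_job.sh

# Watch your own jobs in the queue.
squeue -u $USER
```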
&lt;br /&gt;
==== Script example with GPU ====&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
#!/bin/bash&lt;br /&gt;
&lt;br /&gt;
#SBATCH --job-name=my_job             # Job name&lt;br /&gt;
#SBATCH --account=my_account          # Account name for billing&lt;br /&gt;
#SBATCH --partition=long              # Partition name&lt;br /&gt;
#SBATCH --time=02:00:00               # Time allotted for the job (hh:mm:ss)&lt;br /&gt;
#SBATCH --ntasks=4                    # Number of tasks (processes)&lt;br /&gt;
#SBATCH --cpus-per-task=1             # Number of CPU cores per task&lt;br /&gt;
#SBATCH --gres=gpu:NUMBER_OF_GPUS     # Number of GPUs to use in the job&lt;br /&gt;
#SBATCH --mem-per-cpu=4G              # Memory per CPU core&lt;br /&gt;
#SBATCH --output=my_job_%j.out        # Standard output and error log (%j expands to jobId)&lt;br /&gt;
#SBATCH --error=my_job_%j.err         # Separate file for standard error&lt;br /&gt;
&lt;br /&gt;
# Load modules or software if required&lt;br /&gt;
module load python/3.8&lt;br /&gt;
&lt;br /&gt;
# Print some information about the job&lt;br /&gt;
echo &amp;quot;Starting my SLURM job&amp;quot;&lt;br /&gt;
echo &amp;quot;Job ID: $SLURM_JOB_ID&amp;quot;&lt;br /&gt;
echo &amp;quot;Running on nodes: $SLURM_JOB_NODELIST&amp;quot;&lt;br /&gt;
echo &amp;quot;Allocated CPUs: $SLURM_JOB_CPUS_PER_NODE&amp;quot;&lt;br /&gt;
&lt;br /&gt;
# Run your application, this could be anything from a custom script to standard applications&lt;br /&gt;
# ./my_program&lt;br /&gt;
# python my_script.py&lt;br /&gt;
&lt;br /&gt;
# End of script&lt;br /&gt;
echo &amp;quot;Job completed&amp;quot;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
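Inside a GPU job it is worth verifying that the allocated GPUs are actually visible; a minimal check (assuming NVIDIA GPUs and the standard CUDA environment variable):&lt;br /&gt;

```shell
# GPUs granted by --gres are exposed through CUDA_VISIBLE_DEVICES.
echo "Visible GPUs: $CUDA_VISIBLE_DEVICES"

# nvidia-smi should list exactly the GPUs allocated to this job.
nvidia-smi
```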
&lt;br /&gt;
==== Example with GUI (X11 forwarding) ====&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;bash&amp;quot;&amp;gt;&lt;br /&gt;
&lt;br /&gt;
srun --pty -c 1 --mem=2G -p &amp;lt;power-xxx&amp;gt; --x11 /bin/bash&lt;br /&gt;
&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== Error Handling ===&lt;br /&gt;
* On some clusters, specifying resources is necessary. Without them, the job may fail.&lt;br /&gt;
** Example error: &amp;lt;code&amp;gt;srun: error: Unable to allocate resources: No partition specified or system default partition&amp;lt;/code&amp;gt;&lt;br /&gt;
** Correct usage: &amp;lt;code&amp;gt;srun --pty -c 1 --mem=2G -p power-yoren /bin/bash&amp;lt;/code&amp;gt;&lt;br /&gt;
* Be aware that specifying GPU resources (for example &amp;lt;code&amp;gt;--gres=gpu:1&amp;lt;/code&amp;gt;) is crucial: a job submitted without them will not be allocated any GPU.&lt;br /&gt;
&lt;br /&gt;
=== SLURM Information Commands ===&lt;br /&gt;
&lt;br /&gt;
* &amp;lt;code&amp;gt;sinfo&amp;lt;/code&amp;gt;: View all queues (partitions).&lt;br /&gt;
* &amp;lt;code&amp;gt;squeue&amp;lt;/code&amp;gt;: View all jobs.&lt;br /&gt;
* &amp;lt;code&amp;gt;scontrol show partition&amp;lt;/code&amp;gt;: View all partitions.&lt;br /&gt;
* &amp;lt;code&amp;gt;scontrol show job &amp;lt;job_number&amp;gt;&amp;lt;/code&amp;gt;: View a job&amp;#039;s attributes.&lt;br /&gt;
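These commands accept filters that keep the output manageable; a few common forms (standard SLURM options, shown as a sketch):&lt;br /&gt;

```shell
# Jobs in a given partition, with their state.
squeue -p power-general

# Nodes and availability of a single partition.
sinfo -p power-general
```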
&lt;br /&gt;
=== Tips for Managing SLURM Jobs ===&lt;br /&gt;
&lt;br /&gt;
* Chain jobs by using the &amp;lt;code&amp;gt;--depend&amp;lt;/code&amp;gt; flag in &amp;lt;code&amp;gt;sbatch&amp;lt;/code&amp;gt;.&lt;br /&gt;
* Use &amp;lt;code&amp;gt;salloc&amp;lt;/code&amp;gt; for interactive jobs that require specific resources for a limited time.&lt;br /&gt;
* &amp;lt;code&amp;gt;srun&amp;lt;/code&amp;gt; is versatile for both interactive and batch jobs, especially with MPI.&lt;br /&gt;
* Always specify necessary resources in clusters where defaults are not set.&lt;/div&gt;</summary>
		<author><name>Dvory</name></author>
	</entry>
	<entry>
		<id>https://hpcguide.tau.ac.il/index.php?title=Storage_and_scratch&amp;diff=1469</id>
		<title>Storage and scratch</title>
		<link rel="alternate" type="text/html" href="https://hpcguide.tau.ac.il/index.php?title=Storage_and_scratch&amp;diff=1469"/>
		<updated>2024-04-15T13:42:08Z</updated>

		<summary type="html">&lt;p&gt;Dvory: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;Besides each user&amp;#039;s home directory, one may purchase disk storage; prices are listed at https://view.monday.com/4073193937-33252df4e02cadb641ff891627342c96?r=use1&lt;br /&gt;
&lt;br /&gt;
NetApp storage usually includes snapshots (described at https://view.monday.com/4073193937-33252df4e02cadb641ff891627342c96?r=use1)&lt;br /&gt;
&lt;br /&gt;
The storage may also be backed up by the Legato server, as described at https://computing.tau.ac.il/infrastructure_backup&lt;br /&gt;
&lt;br /&gt;
Apart from this storage, there are /scratch partitions:&lt;br /&gt;
&lt;br /&gt;
/scratch100&lt;br /&gt;
&lt;br /&gt;
/scratch200&lt;br /&gt;
&lt;br /&gt;
/scratch300&lt;br /&gt;
&lt;br /&gt;
These may be used as scratch space, i.e. &amp;#039;&amp;#039;&amp;#039;for temporary usage only&amp;#039;&amp;#039;&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;span style=&amp;quot;color:red;&amp;quot;&amp;gt;&amp;#039;&amp;#039;&amp;#039;Scratches are not being backed up&amp;#039;&amp;#039;&amp;#039;&amp;lt;/span&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Users must make their own backups and must not store important data there.&lt;br /&gt;
&lt;br /&gt;
Some workstations and compute nodes also have a local /scratch folder, which is used in a similar way.&lt;/div&gt;</summary>
		<author><name>Dvory</name></author>
	</entry>
</feed>