Revision as of 15:18, 22 February 2022
Alphafold
AlphaFold is an artificial intelligence (AI) program developed by DeepMind, a subsidiary of Alphabet (Google's parent company), that predicts protein structures.
How to use
Use the run_alphafold.sh script located in /home/alphafold_folder/alphafold_multimer_non_docker.
Script reference:
Usage: run_alphafold.sh <OPTIONS>
Required Parameters:
    -d <data_dir>               Path to directory with supporting data: AlphaFold parameters and genetic and template databases. Set to the target of download_all_databases.sh.
    -o <output_dir>             Path to a directory that will store the results.
    -f <fasta_path>             Path to a FASTA file containing a single sequence.
    -t <max_template_date>      Maximum template release date to consider (ISO-8601 format: YYYY-MM-DD). Important if folding historical test sets.
Optional Parameters:
    -n <openmm_threads>         OpenMM threads (default: all available cores)
    -b <benchmark>              Run multiple JAX model evaluations to obtain a timing that excludes the compilation time, which should be more indicative of the time required for inferencing many proteins (default: false)
    -g <use_gpu>                Enable NVIDIA runtime to run with GPUs (default: true)
    -a <gpu_devices>            Comma separated list of devices to pass to 'CUDA_VISIBLE_DEVICES' (default: 0)
    -m <model_preset>           Choose preset model configuration - the monomer model (monomer), the monomer model with extra ensembling (monomer_casp14), monomer model with pTM head (monomer_ptm), or multimer model (multimer) (default: monomer)
    -p <db_preset>              Choose preset MSA database configuration - smaller genetic database config (reduced_dbs) or full genetic database config (full_dbs) (default: full_dbs)
    -u <use_precomputed_msas>   Whether to read MSAs that have been written to disk. WARNING: This will not check if the sequence, database or configuration have changed. (default: false)
    -r <remove_msas_after_use>  Whether, after structure prediction(s), to delete MSAs that have been written to disk to significantly free up storage space. (default: false)
    -i <is_prokaryote>          Optional for multimer system, not used by the single chain system. This should contain a boolean specifying true where the target complex is from a prokaryote, and false where it is not, or where the origin is unknown. These values determine the pairing method for the MSA (default: false)
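To illustrate how the flags in the reference combine, here is a minimal sketch that assembles a multimer run against the reduced databases. The function name build_af_cmd, the DRY_RUN variable, and the FASTA and output paths are hypothetical and exist only in this sketch; the flags themselves are taken from the usage text. By default it prints the command instead of running it.

```shell
#!/bin/bash
# Sketch: assemble a run_alphafold.sh call from the documented flags.
AF_DIR=/home/alphafold_folder/alphafold_multimer_non_docker
DATA_DIR=/home/alphafold_folder/alphafold_data

# build_af_cmd FASTA OUT_DIR   (hypothetical helper, not part of AlphaFold)
# Prints the command when DRY_RUN=1 (the default here), runs it otherwise.
build_af_cmd() {
    # Replace the function's own arguments with the full command line.
    set -- bash "$AF_DIR/run_alphafold.sh" \
        -d "$DATA_DIR" -o "$2" -f "$1" \
        -t 2020-05-14 -m multimer -p reduced_dbs
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical input and output locations.
build_af_cmd "$HOME/complex.fasta" "$HOME/alphafold_out"
```

Set DRY_RUN=0 to execute the command once the printed line looks right.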
Databases
We downloaded the databases to /home/alphafold_folder/alphafold_data on compute-0-300. You may use that copy directly, or copy it to your own storage and point to it with the -d flag of the run script. Alternatively, you may download the databases to your own storage with the download_all_data.sh script located in /home/alphafold_folder/alphafold_multimer_non_docker/scripts/.
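Since the shared copy lives on compute-0-300, it may not be visible from every node (an assumption about the cluster layout, not something stated above). The sketch below picks the directory to pass to -d: the shared copy if it exists and is non-empty, otherwise a personal copy. The helper pick_data_dir and the MY_DB location are hypothetical.

```shell
#!/bin/bash
# Sketch: choose the database directory to pass to the -d flag.
SHARED_DB=/home/alphafold_folder/alphafold_data

# pick_data_dir PREFERRED FALLBACK   (hypothetical helper)
# Prints the first directory that exists and is non-empty;
# returns 1 if neither qualifies.
pick_data_dir() {
    for dir in "$1" "$2"; do
        if [ -d "$dir" ] && [ -n "$(ls -A "$dir" 2>/dev/null)" ]; then
            echo "$dir"
            return 0
        fi
    done
    return 1
}

# MY_DB is a hypothetical location for a personal copy of the databases.
MY_DB="${MY_DB:-$HOME/alphafold_data}"
if DATA_DIR=$(pick_data_dir "$SHARED_DB" "$MY_DB"); then
    echo "Using databases in $DATA_DIR"
else
    echo "No database directory found; copy or download the databases first" >&2
fi
```

The check only confirms that the directory has content; it does not verify that the download is complete.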
Sample qsub script
#!/bin/bash
#PBS -l select=1:ncpus=4:ngpus=1
#PBS -q gpu
# load miniconda
module load miniconda/miniconda3-4.7.12-environmentally
# activate the AlphaFold conda environment
conda activate /powerapps/share/centos7/miniconda/miniconda3-4.7.12-environmentally/envs/alphafold_non_docker
# run alphafold
bash /home/alphafold_folder/alphafold_multimer_non_docker/run_alphafold.sh \
    -d /tzachi_storage/evgenyf/alphafold/alphafold_data \
    -o /home/alphafold_folder/alphafold_multimer_non_docker/dummy_test/ \
    -f /home/alphafold_folder/alphafold_multimer_non_docker/example/query.fasta \
    -t 2020-05-14

Alphafold - non_docker source: https://github.com/amorehead/alphafold_non_docker
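When several sequences need folding, the sample script can be generated once per FASTA file rather than edited by hand. The following is a sketch under assumptions: the helper write_job, the queries/ and results/ directories, and the job_*.pbs file names are hypothetical, while the PBS directives, module, environment, and run command are copied from the sample. It only writes the job files; submitting them with qsub is left as a manual step.

```shell
#!/bin/bash
# Sketch: write one PBS job script per FASTA file, reusing the sample's directives.
AF_DIR=/home/alphafold_folder/alphafold_multimer_non_docker
CONDA_ENV=/powerapps/share/centos7/miniconda/miniconda3-4.7.12-environmentally/envs/alphafold_non_docker

# write_job FASTA DATA_DIR OUT_DIR JOB_FILE   (hypothetical helper)
# The heredoc is unquoted, so the variables expand when the file is written.
write_job() {
    cat > "$4" <<EOF
#!/bin/bash
#PBS -l select=1:ncpus=4:ngpus=1
#PBS -q gpu
module load miniconda/miniconda3-4.7.12-environmentally
conda activate $CONDA_ENV
bash $AF_DIR/run_alphafold.sh -d "$2" -o "$3" -f "$1" -t 2020-05-14
EOF
}

# Hypothetical layout: FASTA files in ./queries, results under ./results.
for fasta in queries/*.fasta; do
    [ -e "$fasta" ] || continue   # no matches: the glob stays literal, so skip it
    name=$(basename "$fasta" .fasta)
    write_job "$PWD/$fasta" /home/alphafold_folder/alphafold_data \
        "$PWD/results/$name" "job_$name.pbs"
    echo "wrote job_$name.pbs (submit with: qsub job_$name.pbs)"
done
```

Writing the files first makes it easy to inspect one job before queueing the whole batch.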