Pre-distribution version, for Linux, Mac OS X and Windows. Contact Rob Nicholls for more information.
This software is under development - if any aspect of the functionality/implementation is considered undesirable, or if any bugs, strange behavior or unexpected results are encountered, please email - any comments, questions and suggestions are very much appreciated.
ProSMART (Procrustes Structural Matching Alignment and Restraint Tool) is a software tool designed for the conformation-independent structural comparison of protein chains. Currently, ProSMART has two components:
ProSMART ALIGN allows the pairwise alignment of chain-pairs, and also allows the batch-processing of multiple pairwise alignments. Similarity is based on the conservation of local structure, and is consequently independent of global conformation. For each chain-pair, multiple superpositions may be provided: a global superposition based on the achieved alignment, and a superposition for each identified common rigid substructure (e.g. domain), if any are identified. Output includes residue-based, sidechain-based and global similarity scores, thus providing a multi-resolution view of chain similarity. Residue-based and sidechain-based scores may be viewed in color, using the graphics software PyMOL. Experimental ProSMART analysis features are under development and available in the latest versions of CCP4mg.
ProSMART RESTRAIN uses the results from ProSMART ALIGN to generate restraints for a target protein, based on one or more homologous protein structures. Restraints may also be generated for individual fragments (e.g. secondary structure elements). The restraints from ProSMART are intended to improve refinement by REFMAC5. Note that parameter tweaking in REFMAC5 will almost certainly be required. If you have a well-refined higher-resolution structure, you can attempt to use information from it to improve the reliability of refinement, provided this "external reference" structure is sufficiently similar to the "target" structure that you are trying to refine. Generally, external structures should be identical or close homologs, although in theory even structures with low sequence similarity might be used under the employed formalism; the user must decide what is sensible/appropriate.
A basic understanding of directory navigation in the unix command line is assumed.
.tar.gz
file:
ProSMART is provided as a prosmart_source_XXX.tar.gz file (with the X's replaced by the appropriate date). After downloading this file, use the command line to navigate to the location where you downloaded it. It may be unpacked by typing (with the X's replaced, as appropriate):
tar zxf prosmart_source_XXX.tar.gz
A directory called ProSMART/ will have been created within the current working directory (this may be confirmed by typing ls to list the contents of the current working directory). In the rest of this installation guide, we shall refer to the ProSMART/ directory as the 'ProSMART directory'.
Installation of ProSMART is achieved using the makefile located in the ProSMART directory. On most systems, the default configuration settings will be suitable. However, on some systems it may be necessary or desirable to alter ProSMART's installation directories, particularly if you don't have sufficient permissions to install into the default directories (which may require superuser/administrator privileges).
To reconfigure ProSMART, open the makefile (located at ProSMART/makefile) using a text editor. Near the top of this file, you will see these lines (or similar, depending on the particular version):
BIN_DIR = /usr/local/bin/
LIB_DIR = /usr/local/share/
You may choose to change any of the locations specified by these variables. For example, to put binaries/library in CCP4 directories, set BIN_DIR = $(CBIN) and LIB_DIR = $(CLIBD) (sufficient privileges will be required to install). For more information, see the makefile (version 0.807 or later).
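As a rough sketch of the edit described above, the two assignments can also be rewritten non-interactively. The example below works on a stand-in file, demo_makefile, that contains only the two lines quoted above; the target directories under $HOME are hypothetical choices, not ProSMART defaults, and the real makefile may differ between versions.

```shell
# Stand-in makefile containing the two lines quoted above;
# in practice, MAKEFILE would point at ProSMART/makefile instead.
MAKEFILE=demo_makefile
printf 'BIN_DIR = /usr/local/bin/\nLIB_DIR = /usr/local/share/\n' > "$MAKEFILE"

# Hypothetical user-writable install locations (assumptions, not ProSMART defaults).
NEW_BIN="$HOME/bin/"
NEW_LIB="$HOME/share/"

# Rewrite both assignments; '|' is the sed delimiter because the paths contain '/'.
# Writing to a new file and renaming avoids the non-portable 'sed -i'.
sed -e "s|^BIN_DIR *=.*|BIN_DIR = $NEW_BIN|" \
    -e "s|^LIB_DIR *=.*|LIB_DIR = $NEW_LIB|" \
    "$MAKEFILE" > "$MAKEFILE.new" && mv "$MAKEFILE.new" "$MAKEFILE"

# Show the result for inspection.
cat "$MAKEFILE"
```

Editing the file by hand in a text editor achieves exactly the same thing; the scripted form is only convenient when the installation has to be repeated.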
The BIN_DIR variable specifies the location that the ProSMART binaries will be copied to upon installation. It is important that this directory is listed in the PATH environment variable, so that ProSMART can be executed from any directory. Type echo $PATH to list the directories specified in the PATH.
The LIB_DIR variable specifies the location that the ProSMART library will be copied to upon installation. This can be any directory that has appropriate permissions.
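Reading the output of echo $PATH by eye is error-prone when the list is long. The following is a minimal sketch of an automated check; the helper name dir_on_path is hypothetical and is not part of ProSMART.

```shell
# Report whether a directory appears in a colon-separated PATH-style list.
# Usage: dir_on_path DIR [LIST]   (LIST defaults to the current $PATH)
dir_on_path() {
    dir=${1%/}                  # ignore one trailing slash, e.g. /usr/local/bin/
    list=${2-$PATH}
    case ":$list:" in
        *":$dir:"*|*":$dir/:"*) echo "yes" ;;
        *) echo "no" ;;
    esac
}

# Check the default BIN_DIR against the current PATH.
dir_on_path /usr/local/bin/
```

If the answer is "no", either add the directory to PATH in your shell start-up file or choose a BIN_DIR that is already listed.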
Installation of ProSMART is done using the makefile located in the ProSMART directory.
Using the command line, navigate into the ProSMART directory (ProSMART/).
Type make and hit enter. This will compile ProSMART. Upon success, three binaries will be created in the ProSMART directory: prosmart, prosmart_align, and prosmart_restrain.
Type make install and hit enter. Upon success, this will copy the three binaries to BIN_DIR and copy the library to LIB_DIR (as specified in the makefile).
Optional: to clean up the ProSMART directory, type make clean and hit enter. This will remove all created object files and binaries from the current directory.
Upon successful installation, ProSMART can be executed by typing: prosmart. This will display a list of command line arguments, equivalent to typing prosmart -help.
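Before running ProSMART for the first time, it can be worth confirming that all three binaries ended up somewhere on the PATH. The sketch below does this with the standard command -v built-in; the function name missing_prosmart_binaries is hypothetical.

```shell
# Print the name of each ProSMART binary that cannot be found on PATH.
missing_prosmart_binaries() {
    for exe in prosmart prosmart_align prosmart_restrain; do
        command -v "$exe" >/dev/null 2>&1 || echo "$exe"
    done
}

missing=$(missing_prosmart_binaries)
if [ -z "$missing" ]; then
    echo "all three ProSMART binaries found on PATH"
else
    # $missing is deliberately unquoted so the names print on one line.
    echo "not found on PATH:" $missing
fi
```

If any binary is reported as missing, check that make install completed and that BIN_DIR is listed in PATH.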
If installation was unsuccessful, the most common problem is that you don't have sufficient privileges to write to the desired directories. In this case, you may wish to ask your system administrator for help, or obtain appropriate permissions (e.g. superuser or administrator privileges, as required; typing sudo make install may work, provided you have superuser privileges). Otherwise, you could reconfigure the makefile, changing the ProSMART installation directories to directories for which you do have appropriate permissions (see above).
The experimental Windows version of ProSMART has been tested on Windows 7, and assumes that the CCP4 suite is installed. For more information on installation without CCP4 (or on different versions of Windows), please contact the author. The Windows version of ProSMART does not support the simultaneous execution of multiple child processes (for better performance, try using the Linux/MacOSX version).
Unzip the prosmart_windows_XXX.zip file (in Windows 7, right click on the compressed folder and select the "Extract All..." option).
From within the unzipped folder, double-click the install.bat file (warning: do not attempt to run install.bat from within the compressed folder).
The install.bat batch file will copy the three ProSMART executables (prosmart.exe, prosmart_align.exe, and prosmart_restrain.exe) and the directory Prosmart_Library into the CCP4 directory for binaries (specified by the CBIN environment variable, e.g. C:\CCP4-Packages\ccp4-6.2.0\bin). Upon successful installation, ProSMART should be runnable from the Command Prompt (by typing prosmart), and from within CCP4i, if the appropriate CCP4i task interfaces have been installed. To uninstall ProSMART, double-click the supplied uninstall.bat file, which removes ProSMART from CBIN.
If any unknown errors occur, or if you would like assistance, please seek help from your system administrator or contact the author.
After downloading the .tar.gz file, open CCP4i.
In the "System Administration" drop-down menu on the right side of the screen, select "Install/uninstall Tasks", which should open a pop-up window.
Select "Run the Installation Manager to Install a new task", and "Perform automatic installation of tasks into user's local CCP4i area" (if wanting to install for all users, change "user's local CCP4i" to "main CCP4i").
In the "Task archive" field, browse to find the downloaded .tar.gz file (the name of the package and version number should automatically appear). Click "Apply".
Restart CCP4i.
Note: it may be advisable to uninstall any existing interface versions before attempting to install a new task interface.
Run ProSMART. From the command line, run the executable prosmart with the appropriate arguments (see below). Arguments can be passed to the program on the command line and/or via a configuration file (see the -f argument below).
Note: never run the prosmart_align or prosmart_restrain binaries directly - these are called automatically by prosmart.
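As an illustration of the configuration-file route, the sketch below writes a small arguments file and passes it to prosmart with -f. The file name prosmart_args.txt is a hypothetical choice, and the PDB file names are the placeholders used in the examples further down. The script only invokes ProSMART if the executable and both input files are actually present; otherwise it just prints the command.

```shell
# Write the arguments to a file; they may be space- and/or newline-separated.
cat > prosmart_args.txt <<'EOF'
-p1 mypdb1.pdb
-p2 mypdb2.pdb
-c1 A -c2 B
EOF

# Run ProSMART with the file if possible; otherwise only show the command.
if command -v prosmart >/dev/null 2>&1 && [ -f mypdb1.pdb ] && [ -f mypdb2.pdb ]; then
    prosmart -f prosmart_args.txt
else
    echo "would run: prosmart -f prosmart_args.txt"
fi
```

Keeping the arguments in a file makes a run easy to repeat and to record alongside the results.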
The residue alignment must be reasonable in order to successfully generate external restraints for use in refinement. Consequently, if external restraints are to be generated, it is advisable to check the ProSMART residue alignment manually, both by viewing the output PDB files in PyMOL with the output PyMOL color scripts and by inspecting the generated alignment file, to confirm that it is correct (or at least reasonable). If it is not, then try different alignment parameters or seek help.
Importantly, note also that the argument -id can be used to force the correct alignment of sequence-identical structures, which is particularly useful when intending to generate restraints using identical structures, as well as for other applications.
Output is communicated through an HTML-format results page, called ProSMART_Results.html, which may be found in the output directory. The log files automatically generated by ProSMART (prosmart_align_logfile.txt, prosmart_restrain_logfile.txt) provide useful information, indicate which files have been created, and may help with troubleshooting.
The strength and qualitative nature of the generated atomic bond restraints (and other parameters) can be adjusted within ProSMART (see below), and can also be adjusted within REFMAC5 (version 5.7.0005 or later). These parameters may need to be adjusted in order to achieve a successful refinement using the external restraints. Appropriate parameters will depend on your particular case: data quality, resolution, similarity of external structure, etc.
Suppose you are trying to refine a low-resolution structure mypdb1.pdb, and want to utilise information from a known higher-resolution structure mypdb2.pdb during refinement. The simplest example of aligning and generating restraints for all chains in mypdb1.pdb using all chains in mypdb2.pdb as external information is:
prosmart -p1 mypdb1.pdb -p2 mypdb2.pdb
To align mypdb1 chain A with mypdb2 chain B, and generate external restraints from mypdb2_B for use in the refinement of mypdb1_A, use:
prosmart -p1 mypdb1.pdb -p2 mypdb2.pdb -c1 A -c2 B
To perform alignment, but not generate restraints, use the -a argument:
prosmart -p1 mypdb1.pdb -p2 mypdb2.pdb -a
To generate restraints, but not perform alignment, use the -r argument (note that a valid alignment file must already exist; this functionality is useful if you want to edit the alignment before generating restraints based on it):
prosmart -p1 mypdb1.pdb -p2 mypdb2.pdb -r
The chains do not need to be specified - to perform pairwise alignment of all chains from mypdb1.pdb with all chains in mypdb2.pdb, then generate restraints for all chains in mypdb1.pdb, the following command would be appropriate:
prosmart -p1 mypdb1.pdb -p2 mypdb2.pdb
If the second PDB file is not specified, then an all-on-all alignment will be performed. All chains within mypdb1.pdb will be pairwise aligned, and restraints generated for each other, e.g. this is valid:
prosmart -p1 mypdb1.pdb
Note that, in an all-on-all alignment, each chain will not be aligned/restrained to itself by default.
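To make the default all-on-all behaviour concrete, the sketch below lists the chain pairs that would be considered if each unordered pair is taken once and no chain is paired with itself. This is only an illustration of that reading; the function name list_pairs is hypothetical and the code does not call ProSMART.

```shell
# Enumerate unordered chain pairs: each pair once, no self-pairs.
list_pairs() {
    i=0
    for a in "$@"; do
        i=$((i + 1))
        j=0
        for b in "$@"; do
            j=$((j + 1))
            # Keep only pairs where the second chain comes later in the list.
            if [ "$j" -gt "$i" ]; then echo "$a-$b"; fi
        done
    done
}

# For chains A, B and C this prints A-B, A-C and B-C, one per line.
list_pairs A B C
```

For n chains this gives n(n-1)/2 pairs, which is worth bearing in mind when a PDB file contains many chains.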
For fragment motif alignment, and generation of motif restraints, appropriate syntax may be:
prosmart -p1 mypdb1.pdb -helix
This command will identify sufficiently helical regions, and generate the corresponding restraints. Note that the default library consists of two entries: an ideal helix and a typical strand. These are located in the ProSMART library. This library (and alignment rules) may be edited and extended by the user. To run ProSMART using all fragments in the local library, use the -lib argument. Note also that use of the -lib or -helix arguments automatically overrides any secondary reference PDB files specified using the -p2 argument.
It is possible to specify multiple PDB files and chains in order to
automatically perform multiple pairwise alignments, e.g. this is valid:
prosmart -p1 mypdb1.pdb mypdb2.pdb -p2 mypdb3.pdb mypdb4.pdb mypdb5.pdb
If more than one PDB file is specified and chains are specified, then the PDB files must be repeated for each chain so that a one-to-one correspondence exists between the arguments of -p1 and -c1, and between -p2 and -c2. E.g. this is a valid command to perform an all-on-all alignment of mypdb1A, mypdb2A and mypdb2B:
prosmart -p1 mypdb1.pdb mypdb2.pdb mypdb2.pdb -c1 A A B -a
However, this is not:
prosmart -p1 mypdb1.pdb mypdb2.pdb -c1 A A B -a
If only one PDB file is specified then multiple chains may be selected, e.g. this is valid:
prosmart -p1 mypdb1.pdb -c1 A B C -a
These rules apply separately for the target (-p1 and -c1) and the secondary (-p2 and -c2) inputs.
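The pairing rule can be checked before a long batch run is launched. The following is a minimal sketch of such a pre-flight check, written from the rules as stated above; it is not ProSMART's own validation, and the function name check_pairing is hypothetical.

```shell
# Check the file/chain pairing rule for one side (-p1/-c1 or -p2/-c2).
# Usage: check_pairing "file1 file2 ..." "chain1 chain2 ..."
check_pairing() {
    set -f                      # disable globbing while word-splitting the lists
    n_files=0;  for f in $1; do n_files=$((n_files + 1)); done
    n_chains=0; for c in $2; do n_chains=$((n_chains + 1)); done
    set +f
    # Valid if no chains are given, if there is a single file,
    # or if files and chains correspond one-to-one.
    if [ "$n_chains" -eq 0 ] || [ "$n_files" -eq 1 ] || [ "$n_files" -eq "$n_chains" ]; then
        echo "valid"
    else
        echo "invalid"
    fi
}

check_pairing "mypdb1.pdb mypdb2.pdb mypdb2.pdb" "A A B"   # prints "valid"
check_pairing "mypdb1.pdb mypdb2.pdb" "A A B"              # prints "invalid"
check_pairing "mypdb1.pdb" "A B C"                         # prints "valid"
```

The three calls mirror the three example commands above, in the same order.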
Suppose you have two structures: the low-resolution structure that you
want to refine (target.pdb) and a sequence-identical higher-resolution
structure that you want to use as prior information (external.pdb). Then
to generate restraints for all chains in target.pdb, using all chains
from external.pdb, type:
prosmart -id -p1 target.pdb -p2 external.pdb
The "-id" keyword indicates that the structures are sequence-identical, so ProSMART will assume that the residues are equivalent, bypassing the alignment stage and assuming equivalence of residue numbering. This should be used for identical structures. However, do not use the "-id" keyword if the structures are homologous in structure but non-identical in sequence, as the generated restraints would be nonsensical and likely destructive during refinement.
You can do other things, such as generating restraints using only the best-scoring chains ("-restrain_best"). However, note that it may be reasonable to use the default all-on-all chain restraint generation in some cases, since REFMAC5 currently selects only the best restraint for each atom-pair when multiple restraints are generated.
An example of a possible execution of REFMAC5 with external restraints is:
refmac5 \
XYZIN pdb_in.pdb \
HKLIN mtz_in.mtz \
XYZOUT pdb_out.pdb \
HKLOUT mtz_out.mtz \
<<EOF
NCYC 20
EXTERNAL USE MAIN
EXTERNAL DMAX 4.2
EXTERNAL WEIGHT SCALE 10
EXTERNAL WEIGHT GMWT 0.15
@prosmart_restraints_file.txt
MONI DIST 1000000
END
EOF
Description of used keywords:
NCYC 20 |
Number of REFMAC5 refinement cycles. Note that external restraints will seem to have little effect if only a few refinement cycles are used (e.g. 5). Something like 20-40 cycles may be required, and even more if also using jelly-body restraints (see below). |
EXTERNAL USE MAIN |
Discards any side-chain restraints that may be present in the external restraints file (remove this keyword to keep side-chain restraints; the EXTERNAL USE ALL keyword can be used to ensure that all restraints will be used). This may or may not be appropriate, depending on the particular case. If used, side-chain restraints will generally have a larger effect on refinement (note that the optimal weight and Geman-McClure parameter values will be very different if using side-chain restraints). Recommendations: if at early/intermediate stages of refinement, and just trying to stabilise the conformation of a poorly-defined backbone in poor density, only use main-chain restraints at first (i.e. use EXTERNAL USE MAIN); otherwise, try using both main and side-chain restraints (i.e. remove the REFMAC5 keyword EXTERNAL USE MAIN, and make sure the ProSMART keyword -side is used). |
EXTERNAL DMAX 4.2 |
Maximum restraint interatomic distance. Highly recommended value: 4.2. |
EXTERNAL WEIGHT SCALE 10 |
External restraints weight. Increasing this weight increases the influence of external restraints during refinement. Optimal value: varies dramatically - note that the value 10 shown here is rather arbitrary. |
EXTERNAL WEIGHT GMWT 0.15 |
Geman-McClure parameter, which controls robustness to outliers. Increasing this value reduces the influence of outliers (i.e. restraints that are very different from the current interatomic distance). However, increasing this value too high results in too many restraints being considered outliers - this means that only the restraints that are very similar to the current interatomic distances will have much effect. Optimal value: varies dramatically - note that the value 0.15 shown here is rather arbitrary. Note also that the optimal value is highly correlated with the external restraints weight parameter. |
@prosmart_restraints_file.txt |
Specifies the location of the external restraints file generated by ProSMART, e.g. called prosmart_restraints_file.txt . This file will be located in the ProSMART output directory (by default: ProSMART_Output/ ). Only the EXTERNAL keywords supplied before specifying this file will be applied to the external restraints in this file.
|
MONI DIST 1000000 |
Only monitor distances (i.e. identify individual restraints as potential outliers in the log file) greater than this value relative to the restraint sigma (i.e. 1000000*sigma in this example). Consequently, since the default value is 10, many unnecessary extra lines will be written to the REFMAC5 log file by default when using external restraints - this is avoided by setting the value of MONI DIST to be arbitrarily high. Note that outliers are expected when using external restraints (and the use of the Geman-McClure robust estimator function allows true outliers to be dealt with automatically).
|
Appropriate values of EXTERNAL WEIGHT SCALE and EXTERNAL WEIGHT GMWT will depend on many factors, including the existing geometry weight, the resolution and quality of the data and reference structure(s), the number of NCS-restrained chains, and the local similarity between the target and external structures. Different values will generally have to be tried in order to have any chance of successfully using external restraints.
For more information about external restraints, including the external restraints weight and the Geman-McClure parameter, see Nicholls et al. (2012).
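Since several combinations of weight and Geman-McClure parameter usually have to be tried, it can help to generate one keyword file per combination. The sketch below does this for a small grid; the trial values and the file names refmac_keywords_w*_g*.txt are hypothetical, and the keywords themselves are copied from the example above.

```shell
# Hypothetical trial values; appropriate ranges are case-dependent.
for w in 5 10 20; do
    for g in 0.05 0.15; do
        # One keyword file per (weight, Geman-McClure parameter) combination.
        cat > "refmac_keywords_w${w}_g${g}.txt" <<EOF
NCYC 20
EXTERNAL USE MAIN
EXTERNAL DMAX 4.2
EXTERNAL WEIGHT SCALE $w
EXTERNAL WEIGHT GMWT $g
@prosmart_restraints_file.txt
MONI DIST 1000000
END
EOF
    done
done

# List the generated keyword files.
ls refmac_keywords_w*_g*.txt
```

Each file can then be supplied to refmac5 on standard input in place of the inline keyword block, using a different XYZOUT and HKLOUT name for each run so that the results can be compared.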
Note that the "RIDGE DISTANCE SIGMA 0.1" keyword can be used to apply jelly-body restraints (also called harmonic restraints) in regions where there are no external restraints. For more information, see Murshudov et al. (2011); for more information about the RIDGE keywords, see the REFMAC5 documentation.
-a |
run only ProSMART ALIGN, default alignment method. This uses all mainchain atoms for superposition and scoring. |
-a1 |
run only ProSMART ALIGN, alternative method 1 (faster). This uses only C-alpha atoms for superposition and scoring. |
-a2 |
run only ProSMART ALIGN, alternative method 2 (slower). This uses all mainchain atoms for superposition, then uses only C-alpha atoms for scoring. |
-r |
run only ProSMART RESTRAIN (assumes alignment exists in correct location). |
-o [dir path] |
directory for output files relative to current directory (default: ./ProSMART_Output/ ). Directories will be created if they do not exist. |
-f [file path] |
external text file used as an alternative way to provide arguments to the program. Arguments can be space and/or newline separated. |
-threads [integer] |
max number of threads - max number of child processes to be executed simultaneously by ProSMART. For optimal performance, this value should be equal to the number of logical cores (threads) in the CPU. Alternatively, this value may be reduced in order to free cpu resources, increasing system performance at the expense of ProSMART. |
-xml |
output a log file in XML format, called xmlout.xml , into the current working directory. |
-xml [file path] |
output a log file in XML format, to the location specified. |
-refmac [exec. name] |
name of REFMAC binary executable (default: refmac5 ). |
-nmr |
specifies that input structure is an NMR/MD ensemble (this functionality is experimental). Only one target PDB file should be specified, and no external reference. The first state (chain) will be used as target, and all other states as secondary references. |
-merge |
allows oligomers/complexes to be considered as individual structural units by merging multiple chains within a PDB file. This is achieved by combining all (or desired) chains into a single chain; all residue numbers are renamed accordingly. |
-p1 [file path] |
location of target PDB file(s) (required). |
-p2 [file path] |
location of secondary external reference PDB file(s). |
-c1 [char] |
chains of interest in PDB file(s) 1 (if unspecified, all will be used). |
-c2 [char] |
chains of interest in PDB file(s) 2 (if unspecified, all will be used). |
-id |
specifies that the chains to be aligned are sequence-identical. More specifically, assumes that the residue numbering is the same between the two structures. |
-allonall |
if only target chains are specified (i.e. -p2 is not
specified) then by default all-on-all comparison of target chains will
be performed, using only the upper triangular half-matrix of chain-pair
combinations. Specifying the -allonall argument will result in the full matrix of combinations being used. |
-align [pdb] [chain(s)] [resnum] [resnum] |
specify residue ranges, so that only a portion of the PDB files is used during alignment (applied during PDB file processing - simply removes residues outside this range). The -align keyword may be used multiple times, in order to specify multiple ranges. The resnum arguments must be integers (i.e. cannot specify an insertion code - combine with the -align_rm keyword for more flexibility with insertion codes). Can use the string ALL for the pdb and/or chain arguments, so that the filtering is applied to all files/chains, as desired. Multiple chains can be specified using the chain(s) argument, e.g. the string ABC would specify for the instance of -align to be applied to chains A, B, and C, if they exist, and not applied to any other chains. Note that different instances of the -align keyword may be used in order to filter different regions from different chains.
|
-align_rm [pdb] [chain(s)] [resnum(ins)] |
specify individual residues to be removed (applied during PDB file
processing - simply removes these residues from consideration prior to
structural alignment). The -align_rm keyword may be used multiple times, in order to remove multiple residues. The resnum argument may optionally be appended by an insertion code. Can use the string ALL for the pdb and/or chain arguments, so that the filtering is applied to all files/chains, as desired. Multiple chains can be specified using the chain(s) argument, e.g. the string ABC would specify for the instance of -align_rm to be applied to chains A , B , and C , if they exist, and not applied to any other chains. Note that different instances of the -align_rm keyword may be used in order to filter different regions from different chains.
|
-lib |
specifies to run alignment of all fragments in the ProSMART library
instead of pairwise alignment of two chains (any secondary PDB file(s)
specified using -p2 will be ignored).
|
-helix |
specifies to run fragment alignment of the helix entry in the
ProSMART library instead of pairwise alignment (any secondary PDB
file(s) specified using -p2 will be ignored). This is a subset of -lib , which uses only the representative helix fragment instead of all fragments in the library.
|
-strand |
specifies to run fragment alignment of the strand entry in the
ProSMART library instead of pairwise alignment (any secondary PDB
file(s) specified using -p2 will be ignored). This is a subset of -lib , which uses only the representative strand fragment instead of all fragments in the library.
|
-library_config [file] |
provide a library configuration file, to use instead of the default config.txt located in the ProSMART_Library/ directory.
|
-library [dir_path] |
provide the location of a library to be used instead of the default ProSMART_Library/ .
|
-lib_score [double] |
override all Procrustes score thresholds in the fragment library (warning - will be applied to all fragments). |
-lib_fraglen [int] |
override all fragment lengths in the fragment library (warning - will be applied to all fragments). |
-len [integer] |
length of fragment, in residues (default 9, but is overridden by values in the ProSMART library config.txt file if -lib is specified). Must be odd. |
-score [double] |
Procrustes score dissimilarity threshold. Can be used to remove regions from the alignment that are not locally conserved. By default, no threshold is used. |
-superpose [double] |
only fragments with Procrustes scores less than this value will be used for producing the global superposition. |
-helixcutoff [double] |
helix-sharing is used to help align pairs of structures. This value is the cutoff for defining helix similarity at the chosen fragment length (note: this has nothing to do with helix alignment). |
-helixpenalty [double] |
helix-sharing is used to help align pairs of structures. This value is the dynamic alignment penalty for pairs of helix fragments (note: this has nothing to do with helix alignment). |
-no_reward_seq |
specifies to perform pure structure-based alignment, ensuring that sequence conservation has no influence. |
-reward_seq [double] |
fragment-pairs whose corresponding residues are all sequence-identical are assigned a fixed dissimilarity score (which may be negative) instead of the ordinary Procrustes score during alignment. This effectively forces the alignment to obey sequence conservation, for structure-pairs with very high sequence identity. |
-skip_refine |
specifies that fragment-based alignment refinement and residue-based alignment optimisation should not be performed, taking the raw alignment resulting from dynamic programming. |
-main_dist |
specifies for all main chain atoms (N, CA, C and O) to contribute to side chain scores. Only CA is included by default. |
-num_dist [double] |
interatomic distance threshold for the 'NumDist' score, which is the number of corresponding side chain atoms that deviate more than the threshold, after local backbone superposition. |
-no_fix_errors |
specifies to not account for potential side chain flips during scoring. By default, flips are applied in order to achieve the lowest net interatomic distance after local superposition. |
-cluster_skip |
do not perform the rigid substructure identification functionality. |
-cluster_score [double] |
Procrustes dissimilarity score threshold; only fragments with scores below this value will be used for rigid substructure identification. |
-cluster_angle [double] |
intrafragment rotational dissimilarity score threshold (unit: angle in degrees); only fragments with scores below this value will be used for rigid substructure identification. |
-cluster_min [double] |
minimum number of fragments (size of cluster) when performing rigid substructure identification. |
-cluster_link [double] |
single linkage clustering threshold for rigid substructure identification (unit: cosine of angle). |
-cluster_rigid [double] |
controls final cluster rigidity of identified rigid substructures (unit: cosine of angle). Increasing this value will force the superposition to agree better with fewer fragments in the centre of the cluster, rather than being based on agreement with the whole substructure. Higher values are better for superposition, although may result in rigid substructures not being identified at all. |
-cluster_color [double] |
scales cluster colour resolution (i.e. dissimilarity threshold) in outputted PyMOL colour scripts. |
-output_dm |
output cluster distance matrices, for subsequent inspection using other software (e.g. R). |
-cosine |
display intrafragment rotation scores as a cosine distance ( = 1 - cos(theta) ), rather than as the default angle (degrees). |
-quick |
do not output superposed PDB files and PyMOL colour code scripts. |
-out_pdb |
output superposed PDB files. |
-out_pdb_full |
any output superposed PDB files will comprise all chains, rather than just the particular chain of interest (i.e. the one aligned and superposed). |
-out_color |
output PyMOL colour code scripts. |
-colour_score [double] |
adjusts the color resolution in the output PyMOL color files corresponding to main chain scores. |
-side_score [double] |
adjusts the color resolution in the output PyMOL color files corresponding to side chain scores. |
-col1 [d. d. d.] |
RGB colour code used for defining similarity. |
-col2 [d. d. d.] |
RGB colour code used for defining dissimilarity. |
-self_restrain |
generates generic self-restraints for the primary structures, ignoring any secondary input structures. This feature is generalised and may be applied to any molecules, e.g. can be used for DNA/RNA. |
-restrain_all |
generates restraints for all target chains, using all secondary chains as external reference structures. |
-restrain_best |
only use the best external reference chain. For each of the target chains, only one of the external reference chains is used for restraint generation. Which chain is the best is determined by the global alignment score, which is based on net agreement of local structure, independent of global conformation. |
-restrain_to_self |
by default, self-restraints are not generated (i.e. restraints for pdb1.pdb chain A will not be generated using pdb1.pdb chain A as an external reference). This parameter allows self-restraints to be generated. |
-rmax [double] |
max restraint distance - size of sphere around atom in which restraints can exist. |
-rmin [double] |
min restraint distance - restraints of length below this value are not included. |
-sigma [double] |
default sigma used to weight atomic distance restraints in external refinement (only if sigma parameter estimation is not used, or fails to find a suitable solution). |
-min_sigma [double] |
minimum possible value of sigma, which overrides sigma parameter estimation where appropriate. |
-sigmatype [integer] |
possible values are 0 , 1 or 2 . Description: 0 : use default constant sigma. 1 : estimate constant sigma by maximum likelihood estimation. 2 : generate distance-dependent sigmas by maximum likelihood estimation. |
-cutoff [double] |
alignment score cutoff for restraints (default 10.0 , which effectively turns this feature off) - filters out restraints corresponding to residues with worse Procrustes scores. |
-side_cutoff [double] |
"side chain average" score cutoff for restraints (default 10.0 , which effectively turns this feature off) - filters out restraints corresponding to residues with worse scores. |
-multiplier [double] |
value (default 10.0, which effectively turns this feature off) that is multiplied by the restraint sigma in order to determine a cutoff value with which to filter restraints that are too far from the original distance. This effectively removes extreme outliers. |
-weight [double] |
scales sigmas so that restraints have different weighting (default 1.0 ). The lower the value, the higher the influence that the restraints have during refinement. |
-rm_bonds |
specifies that restraints on bonds/angles will be removed (REFMAC5 may be executed to generate list of bonded atom-pairs, if required). If bonds/angles are not removed, then default sigmas will be used, since the assumptions for estimated sigmas will be violated. Specification to use estimated sigmas will by default cause bonds/angles to be removed. |
-main |
specifies that only restraints for main chain atoms should be generated; side chain atoms will be ignored. |
-side |
specifies that restraints for both main chain and side chain atoms should be generated. |
-restrain [pdb] [chain(s)] [resnum] [resnum] |
specify residue ranges, so that restraints are only generated for a portion of a structure (similar to the -align
keyword, except that the filtering is applied at the end of the process, after
structural alignment, as a filter on the final restraints lists). The -restrain keyword may be used multiple times, in order to specify multiple ranges. The resnum arguments must be integers (i.e. an insertion code cannot be specified - combine with the -restrain_rm keyword for more flexibility with insertion codes). The string ALL can be used for the pdb and/or chain arguments, so that the filtering is applied to all files/chains, as desired. Multiple chains can be specified using the chain(s) argument, e.g. the string ABC would cause that instance of -restrain to be applied to chains A, B and C, if they exist, and not to any other chains. Note that different instances of the -restrain keyword may be used in order to filter different regions from different chains.
|
-restrain_rm [pdb] [chain(s)] [resnum(ins)] |
specify individual residues to be removed (similar to the -align_rm
keyword, except that the filtering is applied at the end of the process, after
structural alignment - these residues are simply removed from the final
restraints lists). The -restrain_rm keyword may be used multiple times, in order to remove multiple residues. The resnum argument may optionally be followed by an insertion code. The string ALL can be used for the pdb and/or chain arguments, so that the filtering is applied to all files/chains, as desired. Multiple chains can be specified using the chain(s) argument, e.g. the string ABC would cause that instance of -restrain_rm to be applied to chains A, B and C, if they exist, and not to any other chains. Note that different instances of the -restrain_rm keyword may be used in order to remove residues from different chains.
|
-output_pdb_chain_restraints |
In addition to the chain-chain restraints files and the final pdb-all restraints files, also output the intermediate pdb-chain restraints files. This is disabled by default, to save disk space. |
-no_copy |
do not copy the final restraints files to the main output directory. |
-type [int] |
Specifies the REFMAC5 restraint type (as specified here). Possible values: 0 (bond type restraints - replace existing), 1 (bond type restraints - add to existing), and 2 (external restraints - default, recommended). |
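The selection rules described for -restrain and -restrain_rm can be illustrated with a short Python sketch. This is not ProSMART's implementation: the Residue record, the function names and the way the two filters are combined are hypothetical, introduced only for illustration. In particular, it is assumed here that a residue in a chain to which no -restrain instance applies is left unfiltered, and that -restrain_rm matches the residue number and insertion code exactly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Residue:
    """Hypothetical residue record (not a ProSMART data structure)."""
    pdb: str       # structure identifier
    chain: str     # single-character chain ID
    resnum: int    # residue number
    ins: str = ""  # insertion code, empty if none


def selects(pdb_arg, chains_arg, res):
    """True if the pdb and chain(s) arguments apply to this residue."""
    pdb_ok = pdb_arg == "ALL" or pdb_arg == res.pdb
    # A string such as "ABC" applies to chains A, B and C only.
    chain_ok = chains_arg == "ALL" or res.chain in set(chains_arg)
    return pdb_ok and chain_ok


def keep_residue(res, ranges, removals):
    """ranges:   (pdb, chains, first, last) tuples, one per -restrain.
    removals: (pdb, chains, resnum, ins) tuples, one per -restrain_rm."""
    # Assumption: only the ranges that apply to this pdb/chain filter it.
    applicable = [(first, last) for pdb, chains, first, last in ranges
                  if selects(pdb, chains, res)]
    in_range = not applicable or any(first <= res.resnum <= last
                                     for first, last in applicable)
    removed = any(selects(pdb, chains, res)
                  and (res.resnum, res.ins) == (num, ins)
                  for pdb, chains, num, ins in removals)
    return in_range and not removed


# Example: restrain residues 10-20 of chains A and B in all files,
# then remove residue 15 of chain A.
ranges = [("ALL", "AB", 10, 20)]
removals = [("ALL", "A", 15, "")]
residues = [Residue("target", "A", 12), Residue("target", "A", 15),
            Residue("target", "B", 25), Residue("target", "C", 25)]
kept = [r for r in residues if keep_residue(r, ranges, removals)]
```

Under these assumptions, kept contains residue 12 of chain A and residue 25 of chain C: the latter survives only because the range filter is not applied to chain C.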
If -h is specified, then general main-chain h-bond restraints, including those for helices and sheets, will be generated together. In contrast, if -h_helix and -h_sheet are both specified, then restraints for helices and sheets will be generated separately.
-p2, -c2, -lib, -helix, -strand.
-strict keyword is used.
-h (or -bond ) |
generate generic bond restraints. By default, these restraints represent h-bonds, and are generated for the whole main chain, including all helices, sheets, loops, etc. according to detected hydrogen bonding patterns. |
-h_helix |
generate generic bond restraints for helices. This includes all types of helices (not just alpha). Types of helix restraints can be specified using the keywords -3_10, -alpha or -pi, or alternatively by manually specifying the residue separation between restrained atom-pairs (see keywords: -min_sep, -max_sep, -allow_sep, -rm_sep, -bond_opt). |
-h_sheet |
generate generic bond restraints across beta-sheets. |
-strict |
require strict structural conservation to helix/strand conformations in order for generic restraints to be generated for those regions. Uses fragment library to determine which regions are sufficiently helical/strand-like. |
-3_10 |
generate restraints for potential 3_10-helices. Specifically, requires residue separation of 3 residues. |
-alpha |
generate restraints for potential alpha-helices. Specifically, requires residue separation of 4 residues. |
-pi |
generate restraints for potential pi-helices. Specifically, requires residue separation of 5 residues. |
-bond_dist [double] |
target value of the generic bond restraints. |
-bond_min [double] |
minimum interatomic distance for restrained atom-pairs. |
-bond_max [double] |
maximum interatomic distance for restrained atom-pairs. |
-min_sep [int] |
minimum number of residues between restrained atom-pairs. |
-max_sep [int] |
maximum number of residues between restrained atom-pairs. |
-allow_sep [int]... |
specify to allow only specific number(s) of residues between restrained atom-pairs. |
-rm_sep [int]... |
specify to disallow specific number(s) of residues between restrained atom-pairs. |
-bond_opt [int] |
controls how atom-pairs are selected, i.e. which atom-types can form bonds. Possible values: 1 (only O-N restraints - suitable for all helices), 2 (both O-N and N-O restraints - suitable for beta-sheets), 3 (do not filter - allow generic bond restraints between any main chain atoms). |
-bond_override [int] |
overrides the default number of allowed bonds per atom. Default is 1 for N, 2 for O. |
-troubleshoot_restraint_files |
|
-troubleshoot_hbond_restraints |
-renamechain [char] |
only for use in special cases where the target PDB file (specified with -p1 )
has only one chain, but does not have a chain_ID letter specified. This
option can be used to regenerate the PDB file with a chain_ID as
specified by the user. Other ProSMART functionalities will not be
executed if -renamechain is specified. ProSMART should then be re-executed with the newly created PDB file. |
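The generic bond keywords above (-3_10, -alpha, -pi, -allow_sep, -bond_min, -bond_max and -bond_opt) act together as a filter on candidate atom-pairs. The following Python sketch shows one way such a filter could be combined; it is an illustration under stated assumptions, not ProSMART's code. The AtomPair record and pair_allowed function are hypothetical, the residue separation is assumed to be the difference in residue numbers, "O-N" is assumed to mean the O of the earlier residue paired with the N of the later one, and the distance limits used in the example are arbitrary values rather than ProSMART defaults.

```python
from dataclasses import dataclass

# Residue separations stated for the helix keywords -3_10, -alpha and -pi.
HELIX_SEPARATION = {"3_10": 3, "alpha": 4, "pi": 5}


@dataclass(frozen=True)
class AtomPair:
    """Hypothetical candidate pair of main chain atoms."""
    res_i: int       # residue number of the first atom
    atom_i: str      # atom name, e.g. "O" or "N"
    res_j: int       # residue number of the second atom
    atom_j: str
    distance: float  # interatomic distance


def pair_allowed(pair, bond_opt, allowed_seps, bond_min, bond_max):
    """Apply separation, distance and atom-type filters to one pair."""
    # Assumption: separation is the difference in residue numbers.
    if abs(pair.res_j - pair.res_i) not in allowed_seps:
        return False
    if not (bond_min <= pair.distance <= bond_max):
        return False
    # Order the atom names so that the earlier residue comes first.
    if pair.res_i <= pair.res_j:
        first, second = pair.atom_i, pair.atom_j
    else:
        first, second = pair.atom_j, pair.atom_i
    if bond_opt == 1:   # only O-N (assumed: O earlier, N later)
        return (first, second) == ("O", "N")
    if bond_opt == 2:   # both O-N and N-O
        return {first, second} == {"O", "N"}
    if bond_opt == 3:   # no atom-type filtering
        return True
    raise ValueError("bond_opt must be 1, 2 or 3")


# Example: alpha-helix style selection with arbitrary distance limits.
alpha_seps = {HELIX_SEPARATION["alpha"]}
candidate = AtomPair(10, "O", 14, "N", 2.9)
accepted = pair_allowed(candidate, bond_opt=1, allowed_seps=alpha_seps,
                        bond_min=2.5, bond_max=3.5)
```

Passing a larger set for allowed_seps, for example the values for all three helix types, mimics restraining several helix types at once.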
If alternative residue conformations are detected, only the first conformation present is used for alignment and scoring.
Alignments produced by ProSMART are constrained to preserve sequence order.
If ProSMART is executed more than once with the same PDBs/chains,
then files generated during previous executions will be overwritten,
even if some of the command line arguments are different. To avoid
overwriting files, use a different output directory for each
execution (this can be achieved using the -o
argument).
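One way to follow this advice when scripting several runs is to give each run its own output directory. The Python sketch below only builds the argument lists and does not launch anything. The build_command helper and the file names are hypothetical, the executable name prosmart is an assumption, and -p2 is assumed here to name the second (reference) structure; adjust these to match your installation.

```python
from pathlib import Path


def build_command(target, reference, out_root, label, extra=()):
    """Return an argument list with a run-specific -o directory.

    Assumptions: the executable is called "prosmart" and -p2 names the
    reference structure. Only -p1, -o and -weight are taken directly
    from the keyword descriptions in this document.
    """
    out_dir = Path(out_root) / label
    return ["prosmart", "-p1", str(target), "-p2", str(reference),
            "-o", str(out_dir), *extra]


# Two runs on the same structures that differ only in -weight,
# each writing to its own directory so neither overwrites the other.
run_default = build_command("target.pdb", "reference.pdb",
                            "runs", "default")
run_weighted = build_command("target.pdb", "reference.pdb",
                             "runs", "weight_0.5",
                             extra=["-weight", "0.5"])
```

Each list can then be passed to a process launcher such as subprocess.run once the executable name has been confirmed.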
When more than one alignment is performed in a single ProSMART
execution, the pairwise alignments are by default executed in parallel
as multiple jobs, in order to utilise the multi-threading capabilities of
modern multi-core processors (preferences can be set using the -threads
argument). This makes much fuller use of the available system
resources and processing power. Consequently, running multiple
chain alignments concurrently is often much quicker than
performing single pairwise alignments consecutively (performance scales
approximately linearly with the number of physical cores in the CPU, and
with CPU frequency). Furthermore, if both alignment and restraint
generation are performed in the same execution, and atomic bonds are to
be generated, then the generation of atomic bonds (using REFMAC5) and
the alignment of structures will be performed in parallel in order to
increase efficiency.