ProSMART (version 0.8) Documentation

Pre-distribution version, for Linux, Mac OS X and Windows. Contact Rob Nicholls for more information.

This software is under development. If any aspect of the functionality or implementation seems undesirable, or if you encounter any bugs, strange behavior or unexpected results, please email the author. Any comments, questions and suggestions are very much appreciated.

Summary
Installation Instructions for Linux and Mac OS X
Installation Instructions for Windows
Installation Instructions for CCP4i task interfaces
Operational Instructions
Quick Basic Tutorial
Using External Restraints With REFMAC5
ProSMART Command Line Arguments
Further Notes On Methodology, Functionality And Implementation
References and Links

Summary:

ProSMART (Procrustes Structural Matching Alignment and Restraint Tool) is a software tool designed for the conformation-independent structural comparison of protein chains. At present, ProSMART has two components:

ProSMART ALIGN - for the alignment, superposition, and scoring of protein chains;
ProSMART RESTRAIN - for the generation of external restraints for use in the crystallographic refinement of protein structures.

ProSMART ALIGN allows the pairwise alignment of chain-pairs, and also allows the batch-processing of multiple pairwise alignments. Similarity is based on the conservation of local structure, and is consequently independent of global conformation. For each chain-pair, multiple superpositions may be provided: a global superposition based on the achieved alignment, and a superposition for each identified common rigid substructure (e.g. domain), if any are identified. Output includes residue-based, sidechain-based and global similarity scores, thus providing a multi-resolution view of chain similarity. Residue-based and sidechain-based scores may be viewed in color, using the graphics software PyMOL. Experimental ProSMART analysis features are under development and available in the latest versions of CCP4mg.

ProSMART RESTRAIN uses the results from ProSMART ALIGN in order to generate restraints for a target protein, based on one (or multiple) other homologous protein structures. Restraints may also be generated for individual fragments (e.g. secondary structure elements). The use of restraints from ProSMART is intended to improve refinement by REFMAC5. Note that parameter tweaking in REFMAC5 will almost certainly be required. If you have a well-refined higher-resolution structure, then you can attempt to use information from this structure to improve the reliability of refinement, provided this "external reference" structure is sufficiently similar to the "target" structure that you are trying to refine. Generally, external structures should be identical or close homologs, although in theory even structures with low sequence similarity might be used under the employed formalism; the user must decide what is sensible/appropriate.

Installation Instructions for Linux and Mac OS X:

A basic understanding of directory navigation in the unix command line is assumed.

How to unpack the ProSMART `.tar.gz` file:

ProSMART is provided as a prosmart_source_XXX.tar.gz file (with the X's replaced by the appropriate date). After downloading this file, navigate to the location where you downloaded this file, using the command line. It may be unpacked by typing (with the X's replaced, as appropriate):
tar zxf prosmart_source_XXX.tar.gz

A directory called ProSMART/ will have been created within the current working directory (this may be confirmed by typing ls to list the contents of the current working directory). In the rest of this installation guide, we shall refer to the ProSMART/ directory as the 'ProSMART directory'.

How to configure ProSMART for installation on your system (optional):

Installation of ProSMART is achieved using the makefile located in the ProSMART directory. On most systems, the default configuration settings will be suitable. However, on other systems it may be necessary or desirable to alter ProSMART's installation directories, particularly if you don't have sufficient permissions to install into the default directories (which may require superuser/administrator privileges).

To reconfigure ProSMART, open the makefile (located: ProSMART/makefile) using a text editor. Near the top of this file, you will see these lines (or similar, depending on the particular version):
BIN_DIR = /usr/local/bin/
LIB_DIR = /usr/local/share/

You may choose to change any of the locations specified by these variables. For example, to put binaries/library in CCP4 directories, set BIN_DIR = $(CBIN), and LIB_DIR = $(CLIBD) (sufficient privileges will be required to install). For more information, see the makefile (version 0.807 or later).

The BIN_DIR variable specifies the location that the ProSMART binaries will be copied to upon installation. It is important that this directory is specified by the PATH environment variable, so that ProSMART can be executed from any directory. Type echo $PATH to list the directories specified in the PATH.

The LIB_DIR variable specifies the location that the ProSMART library will be copied to upon installation. This can be any directory that has appropriate permissions.
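
As a hedged sketch of the reconfiguration described above, the following rewrites the two variables without opening a text editor. The function name set_install_dirs and the example paths are hypothetical, and the sketch assumes that the makefile defines BIN_DIR and LIB_DIR each on a line of its own, starting at the beginning of the line, as in the excerpt above.

```shell
# Hypothetical helper: point BIN_DIR and LIB_DIR in a ProSMART makefile
# at directories you can write to. Assumes each variable is defined on
# its own line, starting at the beginning of that line, and that the
# new paths contain no '|' or '&' characters.
set_install_dirs() {
    makefile=$1   # path to the makefile, e.g. ProSMART/makefile
    bin_dir=$2    # new location for the binaries
    lib_dir=$3    # new location for the library

    # Write to a temporary file first: in-place editing flags differ
    # between GNU sed (Linux) and BSD sed (Mac OS X).
    sed -e "s|^BIN_DIR[[:space:]]*=.*|BIN_DIR = ${bin_dir}|" \
        -e "s|^LIB_DIR[[:space:]]*=.*|LIB_DIR = ${lib_dir}|" \
        "$makefile" > "$makefile.tmp" && mv "$makefile.tmp" "$makefile"
}

# Example (paths are illustrative only):
# set_install_dirs ProSMART/makefile "$HOME/prosmart/bin/" "$HOME/prosmart/share/"
```

If you use a personal directory for BIN_DIR, remember that it still has to be listed in PATH.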

How to compile and install ProSMART:

Installation of ProSMART is done using the makefile located in the ProSMART directory.

In the unix terminal, navigate into the ProSMART directory (ProSMART/).

Type make and hit enter. This will compile ProSMART. Upon success, three binaries will be created in the ProSMART directory: prosmart, prosmart_align, and prosmart_restrain.
Type make install and hit enter. Upon success, this will copy the three binaries to BIN_DIR and copy the library to LIB_DIR (as specified in the makefile).
Optional: to clean up the ProSMART directory, type make clean and hit enter. This will remove all created object files and binaries from the current directory.

Upon successful installation, ProSMART can be executed by typing: prosmart. This will display a list of command line arguments, equivalent to typing prosmart -help.
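
A minimal post-installation check might look like the following sketch. The helper name dir_on_path is hypothetical; it only tests whether a directory appears as an entry in PATH, which is the condition required of BIN_DIR above.

```shell
# Report whether a directory is listed in PATH (exact match on one entry,
# with or without a trailing slash).
dir_on_path() {
    case ":$PATH:" in
        *":${1%/}:"* | *":${1%/}/:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Is the prosmart binary reachable from any directory?
if command -v prosmart >/dev/null 2>&1; then
    echo "prosmart found at: $(command -v prosmart)"
else
    echo "prosmart not found on PATH; check BIN_DIR in the makefile" >&2
fi

# Example: dir_on_path /usr/local/bin/ && echo "BIN_DIR is on PATH"
```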

If installation was unsuccessful, the most common cause is insufficient privileges to write to the desired directories. In this case, you may wish to ask your system administrator for help, or obtain appropriate permissions (e.g. typing sudo make install may work, provided you have superuser privileges). Otherwise, you could reconfigure the makefile, changing the ProSMART installation directories to directories for which you do have appropriate permissions (see above).

If any unknown errors occur, or if you would like assistance, please seek help from your system administrator or contact the author.

Installation Instructions for Windows:

The experimental Windows version of ProSMART has been tested on Windows 7, and assumes that the CCP4 suite is installed. For more information on installation without CCP4 (or on different versions of Windows), please contact the author. The Windows version of ProSMART does not support the simultaneous execution of multiple child processes (for better performance, try using the Linux/MacOSX version).

How to install ProSMART:

Unzip the prosmart_windows_XXX.zip file (in Windows 7, right click on the compressed folder and select the "Extract All..." option).
From within the unzipped folder, double-click the install.bat file (warning: do not attempt to run install.bat from within the compressed folder).

The install.bat batch file will copy the three ProSMART executables prosmart.exe, prosmart_align.exe, and prosmart_restrain.exe, and the directory Prosmart_Library into the CCP4 directory for binaries (specified by the CBIN environment variable, e.g. C:\CCP4-Packages\ccp4-6.2.0\bin). Upon successful installation, ProSMART should be runnable from the Command Prompt (by typing prosmart), and from within CCP4i, if the appropriate CCP4i task interfaces have been installed. To uninstall ProSMART, double-click the supplied uninstall.bat file, which removes ProSMART from CBIN.

If any unknown errors occur, or if you would like assistance, please seek help from your system administrator or contact the author.

Installation Instructions for CCP4i Task Interfaces:

After downloading the task interface .tar.gz file, open CCP4i.

In the "System Administration" drop-down menu on the right side of the screen, select "Install/uninstall Tasks", which should open a pop-up window.
Select "Run the Installation Manager to Install a new task", and "Perform automatic installation of tasks into user's local CCP4i area" (if wanting to install for all users, change "user's local CCP4i" to "main CCP4i").
In the "Task archive" field, browse to find the downloaded .tar.gz file (the name of the package and version number should automatically appear). Click "Apply".
Restart CCP4i.

Note: it may be advisable to uninstall any existing interface versions before attempting to install a new task interface.

Operational Instructions:

Run ProSMART. From the command line, run the executable prosmart with the appropriate arguments (see below). Arguments can be passed on the command line, via a configuration file (see the -f argument below), or both.

Note: never run the prosmart_align or prosmart_restrain binaries - these are called automatically by prosmart.
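
As an illustration of the configuration-file route, the sketch below writes a small argument file and passes it with -f. The file name prosmart_args.txt and the PDB file names are hypothetical; the layout relies only on the statement in the -f description that arguments may be space and/or newline separated.

```shell
# Write ProSMART arguments to a file; they may be separated by spaces
# and/or newlines. The file name is an arbitrary choice.
args_file=prosmart_args.txt
cat > "$args_file" <<'EOF'
-p1 mypdb1.pdb
-p2 mypdb2.pdb
-c1 A -c2 B
-a
EOF

# Run only if prosmart is installed and the input files are present.
if command -v prosmart >/dev/null 2>&1 && [ -f mypdb1.pdb ] && [ -f mypdb2.pdb ]; then
    prosmart -f "$args_file"
else
    echo "Skipping run: prosmart or the input PDB files are unavailable" >&2
fi
```

Keeping the arguments in a file makes it easier to record exactly which settings produced a given set of restraints.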

The residue alignment must be reasonable in order to successfully generate external restraints for use in refinement. Consequently, if external restraints are to be generated, it is advised to manually check that the ProSMART residue alignment is correct (or at least reasonable), both by inspecting the generated alignment file and by viewing the outputted PDB files in PyMOL with the outputted PyMOL color scripts. If it is not, then try different alignment parameters or seek help. Note also that the argument -id can be used to force the correct alignment of sequence-identical structures, which is particularly useful when generating restraints from identical structures, as well as for other applications.

Output is communicated through an HTML-format results page, called ProSMART_Results.html, which may be found in the output directory. The log files automatically generated by ProSMART (prosmart_align_logfile.txt, prosmart_restrain_logfile.txt) provide useful information, indicate which files have been created, and may help with troubleshooting.
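
The following sketch reports which of the files named above can be found beneath a given directory. The function name find_prosmart_outputs is hypothetical, and because the exact location of the log files is not stated here, the search simply walks the whole directory tree rather than assuming a fixed path.

```shell
# List which of the expected ProSMART files exist anywhere beneath a
# directory (default: the current directory).
find_prosmart_outputs() {
    start_dir=${1:-.}
    for name in ProSMART_Results.html \
                prosmart_align_logfile.txt \
                prosmart_restrain_logfile.txt; do
        path=$(find "$start_dir" -name "$name" -type f 2>/dev/null | head -n 1)
        if [ -n "$path" ]; then
            echo "found:   $path"
        else
            echo "missing: $name"
        fi
    done
}

# Example: find_prosmart_outputs .
```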

The strength and qualitative nature of the generated atomic bond restraints (and other parameters) can be adjusted within ProSMART (see below), and can also be adjusted within REFMAC5 (version 5.7.0005 or later). These parameters may need to be varied in order to achieve a successful refinement using the external restraints. Appropriate parameters will depend on your particular case: data quality, resolution, similarity of the external structure, etc.

Quick Basic Tutorial:

Suppose you are trying to refine a low-resolution structure mypdb1.pdb, and want to utilise information from a known higher-resolution structure mypdb2.pdb during refinement. The simplest example of aligning and generating restraints for all chains in mypdb1.pdb using all chains in mypdb2.pdb as external information is:
prosmart -p1 mypdb1.pdb -p2 mypdb2.pdb

For the alignment of mypdb1 chain A with mypdb2 chain B, the alignment and generation of external restraints from mypdb2_B for use in the refinement of mypdb1_A can be achieved as follows:
prosmart -p1 mypdb1.pdb -p2 mypdb2.pdb -c1 A -c2 B

To perform alignment, but not generate restraints, use the -a argument:
prosmart -p1 mypdb1.pdb -p2 mypdb2.pdb -a

To generate restraints, but not perform alignment, use the -r argument (note that a valid alignment file must already exist! This functionality is useful if you want to edit the alignment before generating restraints based on the alignment):
prosmart -p1 mypdb1.pdb -p2 mypdb2.pdb -r
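
The -a and -r arguments can be combined into a two-stage workflow, sketched below. The run helper is hypothetical: by default it only prints each command so that the sequence can be reviewed, and it executes the commands when DRY_RUN=0 is set.

```shell
# Print each command instead of running it, unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Step 1: alignment only.
run prosmart -p1 mypdb1.pdb -p2 mypdb2.pdb -a
# Step 2: inspect (and, if needed, edit) the alignment file by hand.
# Step 3: restraints only, reusing the existing alignment.
run prosmart -p1 mypdb1.pdb -p2 mypdb2.pdb -r
```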

Notes on protocol:

The chains do not need to be specified - to perform pairwise alignment of all chains from mypdb1.pdb with all chains in mypdb2.pdb, then generate restraints for all chains in mypdb1.pdb, the following command would be appropriate:
prosmart -p1 mypdb1.pdb -p2 mypdb2.pdb

If the second PDB file is not specified, then an all-on-all alignment will be performed. All chains within mypdb1.pdb will be pairwise aligned, and restraints generated for each other, e.g. this is valid:
prosmart -p1 mypdb1.pdb

Note that, in an all-on-all alignment, each chain will not be aligned/restrained to itself by default.

For fragment motif alignment, and generation of motif restraints, appropriate syntax may be:
prosmart -p1 mypdb1.pdb -helix

This command will identify sufficiently helical regions, and generate the corresponding restraints. Note that the default library consists of two entries: an ideal helix and a typical strand. These are located in the ProSMART library. This library (and alignment rules) may be edited and extended by the user. To run ProSMART using all fragments in the local library, use the -lib argument. Note also that use of the -lib or -helix arguments automatically overrides any secondary reference PDB files specified using the -p2 argument.

It is possible to specify multiple PDB files and chains in order to automatically perform multiple pairwise alignments, e.g. this is valid:
prosmart -p1 mypdb1.pdb mypdb2.pdb -p2 mypdb3.pdb mypdb4.pdb mypdb5.pdb

If more than one PDB file is specified and chains are specified then the PDB files must be repeated for each chain so that a one-to-one correspondence exists between the arguments of -p1 and -c1, and between -p2 and -c2. E.g. this is a valid command to perform an all-on-all alignment of mypdb1A, mypdb2A and mypdb2B:
prosmart -p1 mypdb1.pdb mypdb2.pdb mypdb2.pdb -c1 A A B -a

However, this is not:
prosmart -p1 mypdb1.pdb mypdb2.pdb -c1 A A B -a

If only one PDB file is specified then multiple chains may be selected, e.g. this is valid:
prosmart -p1 mypdb1.pdb -c1 A B C -a

These rules apply separately for the target (-p1 and -c1) and the secondary (-p2 and -c2) inputs.
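
The correspondence rule above can be checked before ProSMART is launched. The sketch below is an illustration only: the function names are hypothetical, the check is applied to one side of the input at a time, and file names containing spaces or wildcard characters are not handled.

```shell
# Count whitespace-separated words in a string.
count_words() {
    set -- $1    # deliberately unquoted: split on whitespace
    echo $#
}

# Check one side of the input (-p1 with -c1, or -p2 with -c2).
# Usage: check_pdb_chain_counts "<pdb files>" "<chains>"
check_pdb_chain_counts() {
    n_pdb=$(count_words "$1")
    n_chain=$(count_words "$2")
    # Valid if no chains are given, if there is a single PDB file,
    # or if files and chains correspond one-to-one.
    if [ "$n_chain" -eq 0 ] || [ "$n_pdb" -eq 1 ] || [ "$n_pdb" -eq "$n_chain" ]; then
        return 0
    fi
    echo "invalid: $n_pdb PDB file(s) but $n_chain chain(s)" >&2
    return 1
}

# Example: check_pdb_chain_counts "mypdb1.pdb mypdb2.pdb mypdb2.pdb" "A A B"
```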

Using External Restraints With REFMAC5:

Suppose you have two structures: the low-resolution structure that you want to refine (target.pdb) and a sequence-identical higher-resolution structure that you want to use as prior information (external.pdb). Then to generate restraints for all chains in target.pdb, using all chains from external.pdb, type:
prosmart -id -p1 target.pdb -p2 external.pdb

The "-id" keyword indicates that the structures are sequence-identical, so ProSMART will bypass the alignment stage and assume that residues with the same numbering are equivalent. This should be used for identical structures. However, do not use the "-id" keyword if the structures are non-identical in sequence, even if they are homologous in structure, as the generated restraints would be nonsensical and likely destructive during refinement.

Other options are available, such as generating restraints using only the best-scoring chains ("-restrain_best"). However, note that it may be reasonable to use the default all-on-all chain restraint generation in some cases, since REFMAC5 currently selects only the best restraint for each atom-pair when multiple restraints are generated.

An example of a possible execution of REFMAC5 with external restraints is:

refmac5 \
    XYZIN  pdb_in.pdb  \
    HKLIN  mtz_in.mtz  \
    XYZOUT pdb_out.pdb \
    HKLOUT mtz_out.mtz \
<<EOF
NCYC 20
EXTERNAL USE MAIN
EXTERNAL DMAX 4.2
EXTERNAL WEIGHT SCALE 10
EXTERNAL WEIGHT GMWT 0.15
@prosmart_restraints_file.txt
MONI DIST 1000000
END
EOF

Description of used keywords:

`NCYC 20`	Number of REFMAC5 refinement cycles. Note that external restraints will seem to have little effect if using only few refinement cycles (e.g. 5). Something like 20-40 cycles may be required, and even more if also using jelly-body restraints (see below).
`EXTERNAL USE MAIN`	Discards any side-chain restraints that may be present in the external restraints file (remove this keyword to keep side-chains restraints; the `EXTERNAL USE ALL` keyword can be used to ensure that all restraints will be used). This may or may not be appropriate, depending on the particular case. If used, side-chain restraints will generally have a larger effect on refinement (note that the optimal weight and Geman-McClure parameter values will be very different if using side-chain restraints). Recommendations: if at early/intermediate stages of refinement, and just trying to stabilise the conformation of a poorly-defined backbone in poor density, only use main-chain restraints at first (i.e. use `EXTERNAL USE MAIN`); otherwise, try using both main and side-chain restraints (i.e. remove the REFMAC5 keyword: `EXTERNAL USE MAIN`, and make sure the ProSMART keyword: `-side` is used).
`EXTERNAL DMAX 4.2`	Maximum restraint interatomic distance. Highly recommended value: 4.2.
`EXTERNAL WEIGHT SCALE 10`	External restraints weight. Increasing this weight increases the influence of external restraints during refinement. Optimal value: varies dramatically - note that the value 10 shown here is rather arbitrary.
`EXTERNAL WEIGHT GMWT 0.15`	Geman-McClure parameter, which controls robustness to outliers. Increasing this value reduces the influence of outliers (i.e. restraints that are very different from the current interatomic distance). However, increasing this value too high results in too many restraints being considered outliers - this means that only the restraints that are very similar to the current interatomic distances will have much effect. Optimal value: varies dramatically - note that the value 0.15 shown here is rather arbitrary. Note also that the optimal value is highly correlated with the external restraints weight parameter.
`@prosmart_restraints_file.txt`	Specifies the location of the external restraints file generated by ProSMART, e.g. called `prosmart_restraints_file.txt`. This file will be located in the ProSMART output directory (by default: `ProSMART_Output/`). Only the `EXTERNAL` keywords supplied before specifying this file will be applied to the external restraints in this file.
`MONI DIST 1000000`	Only monitor distances (i.e. identify individual restraints as potential outliers in the log file) greater than this value relative to the restraint sigma (i.e. 1000000*sigma in this example). Consequently, since the default value is 10, many unnecessary extra lines will be written to the REFMAC5 log file by default when using external restraints - this is avoided by setting the value of `MONI DIST` to be arbitrarily high. Note that outliers are expected when using external restraints (and the use of the Geman-McClure robust estimator function allows true outliers to be dealt with automatically).

Appropriate values of EXTERNAL WEIGHT SCALE and EXTERNAL WEIGHT GMWT will depend on many factors, including the existing geometry weight, the resolution and quality of the data and reference structure(s), the number of NCS-restrained chains, and the local similarity between the target and external structures. Different values will generally have to be tried in order to have any chance of successfully using external restraints. For more information about external restraints, including the external restraints weight and the Geman-McClure parameter, see Nicholls et al. (2012).
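
Since several combinations usually have to be tried, it can help to generate the keyword files systematically. The sketch below writes one keyword file per combination of the two parameters, using the keywords from the example above. The grid values, the refmac_scan directory and the file names are illustrative assumptions, not recommendations.

```shell
# Write one REFMAC5 keyword file per (weight, GMWT) pair so that several
# combinations can be tried. Grid values and names are illustrative only.
scan_dir=refmac_scan
restraints=prosmart_restraints_file.txt
mkdir -p "$scan_dir"

for scale in 5 10 20; do
    for gmwt in 0.05 0.15 0.30; do
        # The EXTERNAL keywords precede the restraints file so that they
        # are applied to the restraints it contains.
        cat > "$scan_dir/keywords_w${scale}_g${gmwt}.txt" <<EOF
NCYC 20
EXTERNAL USE MAIN
EXTERNAL DMAX 4.2
EXTERNAL WEIGHT SCALE ${scale}
EXTERNAL WEIGHT GMWT ${gmwt}
@${restraints}
MONI DIST 1000000
END
EOF
    done
done

# Each file can then be fed to REFMAC5, for example:
# refmac5 XYZIN pdb_in.pdb HKLIN mtz_in.mtz \
#         XYZOUT out_w10_g0.15.pdb HKLOUT out_w10_g0.15.mtz \
#         < refmac_scan/keywords_w10_g0.15.txt
```

Writing each run to separately named output files makes it straightforward to compare the resulting refinement statistics afterwards.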

Note that the "RIDGE DISTANCE SIGMA 0.1" keyword can be used to request jelly-body restraints (also called harmonic restraints) in regions where there are no external restraints. For more information, see Murshudov et al. (2011); for more information about the RIDGE keywords, see the REFMAC5 documentation.

ProSMART Command Line Arguments:

Program Options:

`-a`	run only ProSMART ALIGN, default alignment method. This uses all mainchain atoms for superposition and scoring.
`-a1`	run only ProSMART ALIGN, alternative method 1 (faster). This uses only C-alpha atoms for superposition and scoring.
`-a2`	run only ProSMART ALIGN, alternative method 2 (slower). This uses all mainchain atoms for superposition, then uses only C-alpha atoms for scoring.
`-r`	run only ProSMART RESTRAIN (assumes alignment exists in correct location).
`-o [dir path]`	directory for output files relative to current directory (default: `./ProSMART_Output/`). Directories will be created if they do not exist.
`-f [file path]`	external text file used as an alternative way to provide arguments to the program. Arguments can be space and/or newline separated.
`-threads [integer]`	max number of threads - max number of child processes to be executed simultaneously by ProSMART. For optimal performance, this value should be equal to the number of logical cores (threads) in the CPU. Alternatively, this value may be reduced in order to free CPU resources, increasing system performance at the expense of ProSMART.
`-xml`	output a log file in XML format, called `xmlout.xml`, into the current working directory.
`-xml [file path]`	output a log file in XML format, to the location specified.
`-refmac [exec. name]`	name of REFMAC binary executable (default: `refmac5`).
`-nmr`	specifies that input structure is an NMR/MD ensemble (this functionality is experimental). Only one target PDB file should be specified, and no external reference. The first state (chain) will be used as target, and all other states as secondary references.
`-merge`	allows oligomers/complexes to be considered as individual structural units by merging multiple chains within a PDB file. This is achieved by combining all (or desired) chains into a single chain; all residue numbers are renamed accordingly.
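
For the `-threads` argument, the number of logical cores can be looked up from the shell. The sketch below tries a few commonly available tools in turn; the function name `logical_cores` is hypothetical, and which tool is present depends on the system.

```shell
# Print the number of logical cores, trying a few common tools in turn.
logical_cores() {
    if command -v nproc >/dev/null 2>&1; then
        nproc                              # GNU coreutils (Linux)
    elif getconf _NPROCESSORS_ONLN >/dev/null 2>&1; then
        getconf _NPROCESSORS_ONLN          # Linux and Mac OS X
    elif sysctl -n hw.ncpu >/dev/null 2>&1; then
        sysctl -n hw.ncpu                  # Mac OS X / BSD
    else
        echo 1                             # conservative fallback
    fi
}

# Example:
# prosmart -p1 mypdb1.pdb -p2 mypdb2.pdb -threads "$(logical_cores)"
```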

Target and Secondary Chain Selection:

`-p1 [file path]`	location of target PDB file(s) (required).
`-p2 [file path]`	location of secondary external reference PDB file(s).
`-c1 [char]`	chains of interest in PDB file(s) 1 (if unspecified, all will be used).
`-c2 [char]`	chains of interest in PDB file(s) 2 (if unspecified, all will be used).
`-id`	specifies that the chains to be aligned are sequence-identical. More specifically, assumes that the residue numbering is the same between the two structures.
`-allonall`	if only target chains are specified (i.e. `-p2` is not specified) then by default all-on-all comparison of target chains will be performed, using only the upper triangular half-matrix of chain-pair combinations. Specifying the `-allonall` argument will result in the full matrix of combinations being used.
`-align [pdb] [chain(s)] [resnum] [resnum]`	specify residue ranges, so that only portion of PDB files may be used during alignment (applied during PDB file processing - simply removes residues outside this range). The `-align` keyword may be used multiple times, in order to specify multiple ranges. The `resnum` arguments must be integer (i.e. cannot specify an insertion code - combine with the `-align_rm` keyword for more flexibility with insertion codes). Can use the string `ALL` for the `pdb` and/or `chain` arguments, so that the filtering is applied to all files/chains, as desired. Multiple chains can be specified using the `chain(s)` argument, e.g. the string `ABC` would specify for the instance of `-align` to be applied to chains `A`, `B`, and `C`, if they exist, and not applied to any other chains. Note that different instances of the `-align` keyword may be used in order to filter different regions from different chains.
`-align_rm [pdb] [chain(s)] [resnum(ins)]`	specify individual residues to be removed (applied during PDB file processing - simply removes these residues from consideration prior to structural alignment). The `-align_rm` keyword may be used multiple times, in order to remove multiple residues. The `resnum` argument may optionally be appended by an insertion code. Can use the string `ALL` for the `pdb` and/or `chain` arguments, so that the filtering is applied to all files/chains, as desired. Multiple chains can be specified using the `chain(s)` argument, e.g. the string `ABC` would specify for the instance of `-align_rm` to be applied to chains `A`, `B`, and `C`, if they exist, and not applied to any other chains. Note that different instances of the `-align_rm` keyword may be used in order to filter different regions from different chains.
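
Because `-align` accepts only integer residue numbers while `-align_rm` also accepts an insertion code, a small check can catch mistakes before ProSMART is run. The helper name `is_integer_resnum` is hypothetical, and tolerating a leading minus sign is an assumption made here rather than documented ProSMART behavior.

```shell
# Succeed if the argument is an integer residue number.
# Assumption: a leading minus sign is tolerated; an insertion code
# (e.g. 75A) is rejected, since -align requires integers.
is_integer_resnum() {
    case $1 in
        '' | -) return 1 ;;
        -*) rest=${1#-} ;;
        *)  rest=$1 ;;
    esac
    case $rest in
        *[!0-9]*) return 1 ;;
        *) return 0 ;;
    esac
}

# Example: keep residues 10 to 150 of chain A in all files, then drop
# one residue carrying an insertion code (values are illustrative):
# prosmart -p1 mypdb1.pdb -p2 mypdb2.pdb -a \
#     -align ALL A 10 150 -align_rm ALL A 75A
```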

Fragment Library Options:

`-lib`	specifies to run alignment of all fragments in the ProSMART library instead of pairwise alignment of two chains (any secondary PDB file(s) specified using `-p2` will be ignored).
`-helix`	specifies to run fragment alignment of the helix entry in the ProSMART library instead of pairwise alignment (any secondary PDB file(s) specified using `-p2` will be ignored). This is a subset of `-lib`, which uses only the representative helix fragment instead of all fragments in the library.
`-strand`	specifies to run fragment alignment of the strand entry in the ProSMART library instead of pairwise alignment (any secondary PDB file(s) specified using `-p2` will be ignored). This is a subset of `-lib`, which uses only the representative strand fragment instead of all fragments in the library.
`-library_config [file]`	provide a library configuration file, to use instead of the default `config.txt` located in the `ProSMART_Library/` directory.
`-library [dir_path]`	provide the location of a library to be used instead of the default `ProSMART_Library/`.
`-lib_score [double]`	override all Procrustes score thresholds in the fragment library (warning - will be applied to all fragments).
`-lib_fraglen [int]`	override all fragment lengths in the fragment library (warning - will be applied to all fragments).

ProSMART ALIGN Alignment Options:

`-len [integer]`	length of fragment, in residues (default `9`, but is overridden by values in the ProSMART library `config.txt` file if `-lib` is specified). Must be odd.
`-score [double]`	Procrustes score dissimilarity threshold. Can be used to remove regions from the alignment that are not locally conserved. By default, no threshold is used.
`-superpose [double]`	only fragments with Procrustes scores less than this value will be used for producing the global superposition.
`-helixcutoff [double]`	helix-sharing is used to help align pairs of structures. This value is the cutoff for defining helix similarity at the chosen fragment length (note: this has nothing to do with helix alignment).
`-helixpenalty [double]`	helix-sharing is used to help align pairs of structures. This value is the dynamic alignment penalty for pairs of helix fragments (note: this has nothing to do with helix alignment).
`-no_reward_seq`	specifies to perform pure structure-based alignment, ensuring that sequence conservation has no influence.
`-reward_seq [double]`	fragment-pairs whose corresponding residues are all sequence-identical are assigned a fixed dissimilarity score (which may be negative) instead of the ordinary Procrustes score during alignment. This effectively forces the alignment to obey sequence conservation, for structure-pairs with very high sequence identity.
`-skip_refine`	specifies that fragment-based alignment refinement and residue-based alignment optimisation should not be performed, taking the raw alignment resulting from dynamic programming.
`-main_dist`	specifies for all main chain atoms (N, CA, C and O) to contribute to side chain scores. Only CA is included by default.
`-num_dist [double]`	interatomic distance threshold for the 'NumDist' score, which is the number of corresponding side chain atoms that deviate more than the threshold, after local backbone superposition.
`-no_fix_errors`	specifies to not account for potential side chain flips during scoring. By default, flips are applied in order to achieve the lowest net interatomic distance after local superposition.

ProSMART ALIGN Rigid Substructure Identification Options:

`-cluster_skip`	do not perform the rigid substructure identification functionality.
`-cluster_score [double]`	Procrustes dissimilarity score threshold; only fragments with scores below this value will be used for rigid substructure identification.
`-cluster_angle [double]`	intrafragment rotational dissimilarity score threshold (unit: angle in degrees); only fragments with scores below this value will be used for rigid substructure identification.
`-cluster_min [double]`	minimum number of fragments (size of cluster) when performing rigid substructure identification.
`-cluster_link [double]`	single linkage clustering threshold for rigid substructure identification (unit: cosine of angle).
`-cluster_rigid [double]`	controls final cluster rigidity of identified rigid substructures (unit: cosine of angle). Increasing this value will force the superposition to agree better with fewer fragments in the centre of the cluster, rather than being based on agreement with the whole substructure. Higher values are better for superposition, although may result in rigid substructures not being identified at all.
`-cluster_color [double]`	scales cluster colour resolution (i.e. dissimilarity threshold) in outputted PyMOL colour scripts.
`-output_dm`	output cluster distance matrices, for subsequent inspection using other software (e.g. R).

ProSMART ALIGN Output Options:

`-cosine`	display intrafragment rotation scores as a cosine distance ( = 1 - cos(theta) ), rather than as the default angle (degrees).
`-quick`	do not output superposed PDB files and PyMOL colour code scripts.
`-out_pdb`	output superposed PDB files.
`-out_pdb_full`	any output superposed PDB files will comprise all chains, rather than just the particular chain of interest (i.e. the one aligned and superposed).
`-out_color`	output PyMOL colour code scripts.
`-colour_score [double]`	adjusts the color resolution in the output PyMOL color files corresponding to main chain scores.
`-side_score [double]`	adjusts the color resolution in the output PyMOL color files corresponding to side chain scores.
`-col1 [d. d. d.]`	RGB colour code used for defining similarity.
`-col2 [d. d. d.]`	RGB colour code used for defining dissimilarity.
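
To relate the two representations offered by `-cosine`, the sketch below converts an angle in degrees into the cosine distance 1 - cos(theta) using awk. The function name `cosine_distance` is hypothetical, and the four-decimal output format is an arbitrary choice.

```shell
# Convert an angle in degrees to the cosine distance 1 - cos(theta).
cosine_distance() {
    awk -v deg="$1" 'BEGIN {
        pi = atan2(0, -1)
        printf "%.4f\n", 1 - cos(deg * pi / 180)
    }'
}

cosine_distance 0     # prints 0.0000
cosine_distance 60    # prints 0.5000
cosine_distance 90    # prints 1.0000
```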

ProSMART RESTRAIN Options:

`-self_restrain`	generates generic self-restraints for the primary structures, ignoring any secondary input structures. This feature is generalised and may be applied to any molecules, e.g. can be used for DNA/RNA.
`-restrain_all`	generates restraints for all target chains, using all secondary chains as external reference structures.
`-restrain_best`	only use the best external reference chain. For each of the target chains, only one of the external reference chains is used for restraint generation. Which chain is the best is determined by the global alignment score, which is based on net agreement of local structure, independent of global conformation.
`-restrain_to_self`	by default, self-restraints are not generated (i.e. restraints for pdb1.pdb chain A will not be generated using pdb1.pdb chain A as an external reference). This parameter allows self-restraints to be generated.
`-rmax [double]`	max restraint distance - size of sphere around atom in which restraints can exist.
`-rmin [double]`	min restraint distance - restraints of length below this value are not included.
`-sigma [double]`	default sigma used to weight atomic distance restraints in external refinement (only if sigma parameter estimation is not used, or fails to find a suitable solution).
`-min_sigma [double]`	minimum possible value of sigma, which overrides sigma parameter estimation where appropriate.
`-sigmatype [integer]`	possible values are `0`, `1` or `2`. Description: `0`: use default constant sigma. `1`: estimate constant sigma by maximum likelihood estimation. `2`: generate distance-dependent sigmas by maximum likelihood estimation.
`-cutoff [double]`	alignment score cutoff for restraints (default `10.0`, which effectively turns this feature off) - filters out restraints corresponding to residues with worse Procrustes scores.
`-side_cutoff [double]`	"side chain average" score cutoff for restraints (default `10.0`, which effectively turns this feature off) - filters out restraints corresponding to residues with worse scores.
`-multiplier [double]`	value (default `10.0`, which effectively turns this feature off) that is multiplied by the restraint sigma in order to determine a cutoff value with which to filter restraints that are too far from the original distance. This effectively removes extreme outliers.
`-weight [double]`	scales sigmas so that restraints have different weighting (default `1.0`). The lower the value, the higher the influence that the restraints have during refinement.
`-rm_bonds`	specifies that restraints on bonds/angles will be removed (REFMAC5 may be executed to generate list of bonded atom-pairs, if required). If bonds/angles are not removed, then default sigmas will be used, since the assumptions for estimated sigmas will be violated. Specification to use estimated sigmas will by default cause bonds/angles to be removed.
`-main`	specifies that only restraints for main chain atoms should be generated; side chain atoms will be ignored.
`-side`	specifies that restraints for both main chain and side chain atoms should be generated.
`-restrain [pdb] [chain(s)] [resnum] [resnum]`	specify residue ranges, so that restraints are only generated for a portion of a structure (similar to the `-align` keyword, only the filtering is applied at the end of the process after structural alignment - applies a filter to final restraints lists). The `-restrain` keyword may be used multiple times, in order to specify multiple ranges. The `resnum` arguments must be integer (i.e. cannot specify an insertion code - combine with the `-restrain_rm` keyword for more flexibility with insertion codes). Can use the string `ALL` for the `pdb` and/or `chain` arguments, so that the filtering is applied to all files/chains, as desired. Multiple chains can be specified using the `chain(s)` argument, e.g. the string `ABC` would specify for the instance of `-restrain` to be applied to chains `A`, `B`, and `C`, if they exist, and not applied to any other chains. Note that different instances of the `-restrain` keyword may be used in order to filter different regions from different chains.
`-restrain_rm [pdb] [chain(s)] [resnum(ins)]`	specify individual residues to be removed (similar to the `-align_rm` keyword, only the filtering is applied at the end of the process after structural alignment - simply removes these residues from the final restraints lists). The `-restrain_rm` keyword may be used multiple times, in order to remove multiple residues. The `resnum` argument may optionally be appended by an insertion code. Can use the string `ALL` for the `pdb` and/or `chain` arguments, so that the filtering is applied to all files/chains, as desired. Multiple chains can be specified using the `chain(s)` argument, e.g. the string `ABC` would specify for the instance of `-restrain_rm` to be applied to chains `A`, `B`, and `C`, if they exist, and not applied to any other chains. Note that different instances of the `-restrain_rm` keyword may be used in order to filter different regions from different chains.
`-output_pdb_chain_restraints`	Further to chain-chain restraints files and the final pdb-all restraints files, also output the intermediate pdb-chain restraints files. This is disabled by default, to save disk space.
`-no_copy`	don't copy final restraints files to main output dir.
`-type [int]`	Specifies the REFMAC5 restraint type (as specified here). Possible values: `0` (bond type restraints - replace existing), `1` (bond type restraints - add to existing), and `2` (external restraints - default, recommended).
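
The distance and sigma options above (`-rmin`, `-rmax`, `-multiplier`, `-weight`) can be pictured as successive filters on a list of atom-pair restraints. The following Python sketch is illustrative only and is not ProSMART's implementation: the `Restraint` record and the `filter_restraints` function are hypothetical names, and it assumes that the distance window is tested against the reference distance, that the outlier test compares target and reference distances, and that `-weight` multiplies sigma directly.

```python
from dataclasses import dataclass


@dataclass
class Restraint:
    """Hypothetical record for one atom-pair distance restraint."""
    target_dist: float     # interatomic distance in the target structure
    reference_dist: float  # corresponding distance in the reference chain
    sigma: float           # restraint sigma before weighting


def filter_restraints(restraints, rmin, rmax, multiplier, weight):
    """Apply the distance window, the outlier filter, and sigma scaling."""
    kept = []
    for r in restraints:
        # -rmin / -rmax: keep only restraints whose length lies in the window
        # (assumption: the reference distance is the one that is tested).
        if not (rmin <= r.reference_dist <= rmax):
            continue
        # -multiplier: cutoff = multiplier * sigma; drop restraints whose
        # target distance is further than the cutoff from the reference.
        if abs(r.target_dist - r.reference_dist) > multiplier * r.sigma:
            continue
        # -weight: scale sigma (assumption: simple multiplication).
        # A smaller sigma gives the restraint more influence in refinement.
        kept.append(Restraint(r.target_dist, r.reference_dist, r.sigma * weight))
    return kept


example = [
    Restraint(3.0, 3.1, 0.1),   # inside the window, small deviation
    Restraint(5.0, 8.0, 0.1),   # reference distance above rmax
    Restraint(2.0, 0.5, 0.1),   # reference distance below rmin
    Restraint(3.0, 4.0, 0.1),   # deviation exceeds multiplier * sigma
]
selected = filter_restraints(example, rmin=1.0, rmax=6.0, multiplier=2.0, weight=0.5)
```

With the default `-multiplier` of `10.0` the outlier test rarely removes anything, which is consistent with the note that the default effectively turns the feature off.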

ProSMART RESTRAIN Generic Bond Restraints (default: Secondary-Structure H-Bond Restraints):
NOTE: if `-h` is specified, then general main-chain h-bond restraints, including those for helices and sheets, will be generated together. In contrast, if `-h_helix` and `-h_sheet` are both specified, then restraints for helices and sheets will be generated separately.
NOTE: when generating generic bonds, the following keywords cannot also be used: `-p2`, `-c2`, `-lib`, `-helix`, `-strand`.
NOTE: fragment library alignment information will only be used if the `-strict` keyword is specified.

`-h` (or `-bond`)	generate generic bond restraints. By default, these restraints represent h-bonds, and are generated for the whole main chain, including all helices, sheets, loops, etc. according to detected hydrogen bonding patterns.
`-h_helix`	generate generic bond restraints for helices. This includes all types of helices (not just alpha). Types of helix restraints can be specified using keywords `-3_10`, `-alpha`, or `-pi`, or alternatively by manually specifying the residue separation between restrained atom-pairs (see keywords: `-min_sep`, `-max_sep`, `-allow_sep`, `-rm_sep`, `-bond_opt`).
`-h_sheet`	generate generic bond restraints across beta-sheets.
`-strict`	require strict structural conservation to helix/strand conformations in order for generic restraints to be generated for those regions. Uses fragment library to determine which regions are sufficiently helical/strand-like.
`-3_10`	generate restraints for potential 3_10-helices. Specifically, requires residue separation of 3 residues.
`-alpha`	generate restraints for potential alpha-helices. Specifically, requires residue separation of 4 residues.
`-pi`	generate restraints for potential pi-helices. Specifically, requires residue separation of 5 residues.
`-bond_dist [double]`	target value of the generic bond restraints.
`-bond_min [double]`	minimum interatomic distance for restrained atom-pairs.
`-bond_max [double]`	maximum interatomic distance for restrained atom-pairs.
`-min_sep [int]`	minimum number of residues between restrained atom-pairs.
`-max_sep [int]`	maximum number of residues between restrained atom-pairs.
`-allow_sep [int]...`	specify to allow only specific number(s) of residues between restrained atom-pairs.
`-rm_sep [int]...`	specify to disallow specific number(s) of residues between restrained atom-pairs.
`-bond_opt [int]`	controls how atom-pairs are selected, i.e. which atom-types can form bonds. Possible values: `1` (only O-N restraints - suitable for all helices), `2` (both O-N and N-O restraints - suitable for beta-sheets), `3` (do not filter - allow generic bond restraints between any main chain atoms).
`-bond_override [int]`	overrides the default number of allowed bonds per atom. Default is 1 for N, 2 for O.
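
The helix keywords and the separation keywords above both constrain how far apart in sequence two restrained residues may be. The Python sketch below illustrates that idea under stated assumptions: "separation" is taken to mean the difference in residue index (i to i+n), the four separation filters are assumed to combine as a logical AND, and the names `HELIX_SEPARATION`, `separation_allowed` and `candidate_pairs` are hypothetical. It does not reproduce ProSMART's hydrogen-bond detection.

```python
# Residue separations stated in the option descriptions above.
HELIX_SEPARATION = {"3_10": 3, "alpha": 4, "pi": 5}


def separation_allowed(sep, min_sep=None, max_sep=None, allow_sep=(), rm_sep=()):
    """Return True if a residue separation passes all separation filters."""
    if min_sep is not None and sep < min_sep:
        return False                      # below -min_sep
    if max_sep is not None and sep > max_sep:
        return False                      # above -max_sep
    if allow_sep and sep not in allow_sep:
        return False                      # not in the -allow_sep list
    if sep in rm_sep:
        return False                      # explicitly removed by -rm_sep
    return True


def candidate_pairs(n_residues, helix_types):
    """List (i, j) residue pairs whose separation matches the helix types."""
    separations = sorted(HELIX_SEPARATION[t] for t in helix_types)
    return [
        (i, i + s)
        for i in range(1, n_residues + 1)
        for s in separations
        if i + s <= n_residues
    ]


alpha_pairs = candidate_pairs(6, ["alpha"])
mixed_pairs = candidate_pairs(6, ["3_10", "pi"])
```

In this sketch, selecting a helix type is equivalent to listing its separation in `allow_sep`; whether ProSMART treats the two routes identically is not stated in this documentation.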

Troubleshooting Options: (display extra info in logfiles)

`-troubleshoot_restraint_files`
`-troubleshoot_hbond_restraints`

Other Options:

`-renamechain [char]`	only for use in special cases where the target PDB file (specified with `-p1`) has only one chain, but does not have a chain_ID letter specified. This option can be used to regenerate the PDB file with a chain_ID as specified by the user. Other ProSMART functionalities will not be executed if `-renamechain` is specified. ProSMART should then be re-executed with the newly created PDB file.

Further Notes On Methodology, Functionality And Implementation:

If alternative residue conformations are detected, only the first conformation present is used for alignment and scoring.
Alignments achieved using ProSMART are forced to maintain order of sequence.
If ProSMART is executed more than once with the same PDBs/chains, then files generated during previous executions will be overwritten, even if some of the command line arguments are different. To avoid overwriting files, use a different output directory for each execution (this can be achieved using the `-o` argument).
When performing more than one alignment in a single ProSMART execution, the pairwise alignments are by default executed in parallel as multiple jobs in order to utilise the multi-threading capabilities of modern multi-core processors (preferences can be set using the `-threads` argument). This makes much fuller use of the available system resources and processing power. Consequently, specifying multiple chain alignments to be executed concurrently is often much quicker than performing single pairwise alignments consecutively (performance scales approximately linearly with the number of physical cores in the CPU, and with CPU frequency). Furthermore, if both alignment and restraint generation are performed in the same execution, and atomic bonds are to be generated, then the generation of atomic bonds (using REFMAC5) and the alignment of structures will be performed in parallel in order to increase efficiency.
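
The parallel execution described in the last note relies on each pairwise alignment being an independent job. As a rough illustration of that job structure only, the Python sketch below distributes placeholder jobs over a thread pool. The functions `align_pair` and `align_all` are hypothetical stand-ins and do not call ProSMART; the `threads` parameter merely plays the role that the `-threads` argument plays in the note above.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def align_pair(pair):
    """Placeholder for one independent pairwise alignment (hypothetical)."""
    target, reference = pair
    return f"{target} vs {reference}"


def align_all(targets, references, threads=4):
    """Run every target/reference pair as a separate job in a thread pool."""
    pairs = list(product(targets, references))
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # map() returns results in the same order as the input pairs,
        # regardless of which job finishes first.
        return list(pool.map(align_pair, pairs))


results = align_all(["1abc_A", "1abc_B"], ["2xyz_A"], threads=2)
```

Because no job depends on the result of another, the total run time in such a scheme is governed by the number of workers rather than by the number of pairs, which is the behaviour the note describes.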

References and Links:

ProSMART:
- Home page: www2.mrc-lmb.cam.ac.uk/groups/murshudov/
- Use of external restraints generated by ProSMART:
  R.A. Nicholls, F. Long and G.N. Murshudov (2012) Low Resolution Refinement Tools in REFMAC5. Acta Cryst. D.
- Details of methods used by ProSMART:
  R.A. Nicholls (2011) Conformation-Independent Comparison of Protein Structures. University of York (Ph.D. thesis).

REFMAC5:
- Home page: www2.mrc-lmb.cam.ac.uk/groups/murshudov/ (old: www.ysbl.york.ac.uk/~garib/refmac/)
- G.N. Murshudov, P. Skubak, A.A. Lebedev, N.S. Pannu, R.A. Steiner, R.A. Nicholls, M.D. Winn, F. Long and A.A. Vagin (2011) REFMAC5 for the Refinement of Macromolecular Crystal Structures. Acta Cryst. D67, 355-367.
- G.N. Murshudov, A.A. Vagin and E.J. Dodson (1997) Refinement of Macromolecular Structures by the Maximum-Likelihood Method. Acta Cryst. D53, 240-255.

CCP4:
- CCP4 home page: www.ccp4.ac.uk/
- Collaborative Computational Project, Number 4 (1994) The CCP4 Suite: Programs for Protein Crystallography. Acta Cryst. D50, 760-763.

CCP4mg:
- CCP4mg home page: www.ccp4.ac.uk/MG/ (latest versions include experimental ProSMART analysis features)
- CCP4mg nightly builds: www.ysbl.york.ac.uk/~ccp4mg/nightly/
- S. McNicholas, E. Potterton, K.S. Wilson, and M.E.M. Noble (2011) Presenting your structures: the CCP4mg molecular-graphics software. Acta Cryst. D67, 386-394.

PyMOL:
- PyMOL home page: www.PyMOL.org/
- The PyMOL Molecular Graphics System, Version 1.3, Schrödinger, LLC.