cprodrg XYZIN foo_in.pdb LIBOUT foo_out.cif XYZOUT foo_out.pdb MOLOUT foo_out.mol
[Keyworded input]
Crystallographic refinement and automated model-building require detailed descriptions of the chemical entities being refined, including information about atom types and connectivity (topology) and chemical parameters (bond lengths, angles, etc.). The CCP4 monomer library includes such descriptions for standard amino acids, nucleic acids and a variety of common small molecules, but cannot cover all possible chemicals you might want to include in a structure. The main purpose of PRODRG is to automatically generate such topological/parameter information for arbitrary small molecules (with some LIMITATIONS) from user input.
In addition, PRODRG can generate reasonable three-dimensional coordinates if none are available, or optionally improve the input conformation if they are. The resulting coordinates are written out in PDB and/or MDL Molfile format. Please note that the PRODRG-generated topology must be used in conjunction with the PRODRG-generated PDB file rather than with the input file, even if that file is in PDB format and no modifications were requested. The main reason for this is that PRODRG may rename atoms in your input, making it incompatible with the generated topology.
Thirdly, PRODRG can be used to create 'flattened' two-dimensional coordinates of a molecule, useful for showing its chemical structure (see MINImise).
Finally, PRODRG allows the alteration of given compounds by adding functional groups to them or removing/replacing parts of the molecule. This is described in more detail under MODIFYING THE INPUT.
XYZIN
Description of the desired small molecule in one of the valid PRODRG input formats.
PRODRG will automatically recognise the format of the input file. Note that for SMILES input to be recognised, it must not contain any line breaks. In general, non-PDB input formats should be preferred if available, as they provide more detailed information about the requested molecule to PRODRG.
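As a minimal sketch of the single-line requirement for SMILES input, the shell fragment below writes a SMILES string with printf so that the file contains no line break at all. The SMILES string (CCO, ethanol) and the file names are illustrative assumptions; whether PRODRG tolerates a single trailing newline is not stated here, so the sketch simply avoids one. The cprodrg call only runs if the program is found on the PATH.

```sh
#!/bin/sh
# Write a SMILES string with no line break; printf '%s' adds no trailing newline.
# 'CCO' (ethanol) and the file names are illustrative assumptions.
printf '%s' 'CCO' > ethanol.smi

# wc -l counts newline characters, so 0 confirms the file is one unbroken line.
newlines=$(wc -l < ethanol.smi | tr -d ' ')
echo "newline characters in ethanol.smi: $newlines"

# Run PRODRG only if it is installed. MINI YES requests generated coordinates,
# since SMILES input carries no atom positions.
if command -v cprodrg >/dev/null 2>&1; then
cprodrg XYZIN ethanol.smi XYZOUT ethanol.pdb LIBOUT ethanol.cif <<EOF
MINI YES
END
EOF
else
  echo "cprodrg not found; skipping the run"
fi
```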
LIBOUT
CIF-formatted topology for the given small molecule generated by PRODRG.
XYZOUT
Coordinates for the given molecule in PDB format generated by PRODRG. Even if you provided your input to PRODRG as a PDB file and did not request energy minimisation, you should always use this file together with the generated topology, as PRODRG may have had to rename atoms in your input, which would render its topology incompatible with the original coordinate file.
MOLOUT
Coordinates for the given molecule in MDL Molfile format generated by PRODRG.
PRODRG's operation can be controlled by keywords given on standard input. There are no compulsory keywords; the default for each keyword is indicated below. In all cases only the first four characters of a keyword are significant. Recognised keywords are: COORds, PROTonate, MINImise, CHIRality, END.
COORds
Write output coordinates in PDB format only (COOR PDB, default), in MDL Molfile format only (COOR MOL) or in both formats (COOR BOTH).
PROTonate
By default, PRODRG will write out coordinate files including all hydrogen atoms (PROT ALL). Alternatively, coordinate files containing only polar hydrogens (PROT POLAR) or no hydrogens at all (PROT NONE) can be written. The CIF topology will always include all hydrogen atoms.
MINImise
This keyword controls PRODRG's coordinate generation feature. If it is disabled (MINI NO), output atom coordinates will be identical to the input coordinates, or random if the input does not contain information about atom positions (SMILES etc.). This option should be chosen when the position/conformation of the input molecule must be conserved, e.g. because it has already been manually placed into a model, as any other minimisation choice may alter the conformation and/or position of the given molecule. Enabling minimisation (MINI YES) will either attempt to improve the input conformation or, if there are no input coordinates, generate a reasonable conformation from scratch. MINI BUILD (the default) is equivalent to MINI NO, unless the input contains building commands (see MODIFYING THE INPUT), in which case minimisation is enabled. MINI FLAT enables minimisation, but instead of producing 3D coordinates, the output file will contain 'flattened' 2D coordinates.
CHIRality
Directs the use of chirality (= improper) restraints. The default is to restrain chiral centres (CHIR YES). CHIR NO disables all chirality restraints, which may be useful when building things like fullerenes starting from a non-coordinate description. CHIR INPUT applies chirality restraints only if the input file specifies stereochemistry (3D coordinates, wedged bonds, etc.), but not otherwise.
END
This terminates keyworded input. If an explicit END keyword is not given, PRODRG will stop reading keywords at the end of input.
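The keywords can also be kept in a file and redirected to standard input. The following is a sketch only: the file name prodrg.kw and the input file ligand.pdb are hypothetical, and the layout of one keyword per line is an assumption based on the usual CCP4 convention. The keyword values themselves are those described above.

```sh
#!/bin/sh
# Collect PRODRG keywords in a file (hypothetical name), one per line.
cat > prodrg.kw <<EOF
COOR BOTH
PROT POLAR
MINI NO
CHIR INPUT
END
EOF

# COOR BOTH writes PDB and Molfile output, so both XYZOUT and MOLOUT are given.
# The run is skipped unless cprodrg and the (hypothetical) ligand.pdb exist.
if command -v cprodrg >/dev/null 2>&1 && [ -f ligand.pdb ]; then
  cprodrg XYZIN ligand.pdb XYZOUT ligand_use.pdb MOLOUT ligand_use.mol \
          LIBOUT ligand_use.cif < prodrg.kw
else
  echo "cprodrg or ligand.pdb not found; keywords written to prodrg.kw"
fi
```

MINI NO is chosen here on the assumption that the ligand has already been placed in a model and must not move.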
LIMITATIONS
PRODRG is designed to be used on small molecules and as such comes with a default size limit of 600 atoms, including hydrogens after processing. The limit can be changed by editing the MA constant in $CSRC/prodrg/params.inc and recompiling, but be aware that memory usage increases quadratically with the number of atoms, which means that e.g. creating a topology for your entire protein will be impossible on current hardware (aside from being a really bad idea).
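A quick pre-check against the default limit can be sketched in shell by counting coordinate records in a PDB input file. The file tiny.pdb and its three records are a hand-written stand-in with placeholder coordinates, not real data. Because the limit applies to the atom count including hydrogens after processing, a count taken from the input is only a lower bound.

```sh
#!/bin/sh
# Hand-written stand-in input (placeholder coordinates, hypothetical atom names).
cat > tiny.pdb <<'EOF'
HETATM    1  C1  DRG     1       0.000   0.000   0.000  1.00 20.00
HETATM    2  C2  DRG     1       1.500   0.000   0.000  1.00 20.00
HETATM    3  O1  DRG     1       2.200   1.200   0.000  1.00 20.00
END
EOF

limit=600   # default size limit quoted in the text above

# Count ATOM and HETATM records; hydrogens added by PRODRG are not included.
atoms=$(grep -c -E '^(ATOM|HETATM)' tiny.pdb)

if [ "$atoms" -gt "$limit" ]; then
  echo "tiny.pdb has $atoms atoms, above the default limit of $limit"
else
  echo "tiny.pdb has $atoms atoms before hydrogens are added (limit $limit)"
fi
```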
Furthermore, PRODRG's support for atom types is limited by the underlying force field (a modified version of GROMOS96 43a1), which means, amongst other things, that there is no support for metal ions/atoms, either by themselves or as part of organic compounds.
PRODRG can accept input molecules as text-based 'drawings' of chemical structures, using chemical symbols to place atoms and separators to indicate bonds (- and | for single bonds, = and " for double bonds and # for triple bonds). A few simple examples of PRODRG text drawings are:
Formate
O-C=O
Acetonitrile
C-C#N
Benzene
C-C=C
"   |
C-C=C
Adenine
  N
  |
N=C-C---N
|   "   "
C=N-C-N-C
Writing an atom symbol in lowercase changes the chirality of that atom; for non-chiral centres there is no difference between uppercase and lowercase symbols.
All atoms must be separated by bonds, i.e. C-C describes two carbons connected by a single bond, while CC is invalid. Bonds can be of arbitrary length (C--C is the same as C-C) and, for single and double bonds, the choice between the two valid bond symbols is purely cosmetic (C|C or even C||-C look strange, but are identical to C-C), as long as different bond types are not mixed (e.g. C-=-C is nonsensical and invalid). Because horizontal and vertical bond symbols are interchangeable, separate bonds must be kept apart by at least one space (i.e. || is the same as -- and thus part of one bond, while | | shows parts of two separate bonds). Bonds can connect to atoms from above, below, left or right; diagonal connections are not allowed.
Based on this, another (needlessly complicated but valid) way to depict adenine could be:
N====C-C-N | | " " |--C N " C " " | N---C-N
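Because text drawings depend on their two-dimensional layout, it helps to write them to a file with a here-document, which keeps every space and line break as typed. The sketch below does this for benzene, laid out over three lines according to the bonding rules described above; the file names are hypothetical. The delimiter is quoted ('EOF') so that the shell leaves the drawing untouched.

```sh
#!/bin/sh
# Write a three-line benzene drawing; the quoted delimiter disables expansion.
# Top and bottom rows hold three carbons each; the left column is a double
# bond (") and the right column a single bond (|).
cat > benzene.draw <<'EOF'
C-C=C
"   |
C-C=C
EOF

# Show the drawing exactly as PRODRG would read it.
cat benzene.draw

# A drawing carries no coordinates, so MINI YES asks PRODRG to generate them.
if command -v cprodrg >/dev/null 2>&1; then
cprodrg XYZIN benzene.draw XYZOUT benzene.pdb LIBOUT benzene.cif <<EOF
MINI YES
END
EOF
else
  echo "cprodrg not found; drawing written to benzene.draw"
fi
```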
MODIFYING THE INPUT
PRODRG supports a number of commands that modify the input molecule. These commands must be added to the input file. Note that most of these commands refer to atoms by name, which can make them awkward to use with input formats that lack atom names (SMILES, text drawings, ...). In these cases, PRODRG should first be run without the modifying command(s) to see what names the program assigns to the atoms of interest; it can then be run again on the same input with the commands in place. The same procedure applies in cases where PRODRG changes atom names while processing a molecule.
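The look-up step of this two-pass procedure can be sketched in shell: after a first run without modifying commands, list the atom names found in the coordinate file PRODRG wrote. The file first_pass.pdb below is a hand-written stand-in, not real PRODRG output, and the names C1, C2 and O1 are hypothetical. The sketch relies on the PDB format convention that atom names occupy columns 13 to 16 of ATOM and HETATM records.

```sh
#!/bin/sh
# Stand-in for the XYZOUT file of a first PRODRG run (placeholder content).
cat > first_pass.pdb <<'EOF'
HETATM    1  C1  DRG     1       0.000   0.000   0.000  1.00 20.00
HETATM    2  C2  DRG     1       1.500   0.000   0.000  1.00 20.00
HETATM    3  O1  DRG     1       2.200   1.200   0.000  1.00 20.00
END
EOF

# Keep coordinate records only, cut out columns 13-16 and strip the padding.
grep -E '^(ATOM|HETATM)' first_pass.pdb | cut -c13-16 | tr -d ' ' > atom_names.txt

# These are the names to use in PATCH, BUILD, CHOP etc. on the second run.
cat atom_names.txt
```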
The PATCH command changes the hybridisation of an atom. While this is mostly meant as a means for the user to help PRODRG interpret low-quality input, it can also be used to introduce double bonds etc., as long as care is taken that the result of the patches applied makes chemical sense.
PATCH <atomname> 1
PATCH <atomname> 2
PATCH <atomname> 3
can be used to force the given atom to sp, sp2 or sp3 hybridisation, respectively. In the case of sp hybridisation, a further distinction should be made between sp-hybridised atoms that are part of triple bonds and those in allene-like systems. For improved results, PATCH <atomname> 10 should be used for the latter, and PATCH <atomname> 1 for the former only.
The chirality of any atom can be inverted with
PATCH <atomname> -1
To generate output in specific 'non-standard' protonation states, the two commands
INSHYD <atomname>
DELHYD <atomname>
can be used to add/remove hydrogens. Note that for both commands the specified atom is the one the hydrogen is attached to, i.e. not the hydrogen itself in the case of DELHYD.
The chemical identity of an input atom can be altered with
BUILD <atomname> @<type>
where <type> is the chemical symbol of the target type. As an example, a standard serine residue could be 'mutated' to a cysteine using
BUILD OG @S
Bonds in the input molecule can be cut with the command
CHOP <atomname1> <atomname2>
where the two given atoms are connected by the bond to be removed. If the cutting produces two separate molecules, the smaller part is deleted. You can use the additional command
KEPSML
to delete the larger part instead. Again using a standard serine residue as an example
CHOP CB OG
could be used to turn it into an alanine.
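As a sketch of how such a command reaches PRODRG, the fragment below appends CHOP CB OG to a working copy of a serine PDB file rather than to the original, so the unmodified input stays available. The environment variable SER_PDB and all file names are hypothetical; if no real file is supplied, a one-line stand-in is used and the PRODRG run is skipped.

```sh
#!/bin/sh
# SER_PDB (hypothetical) may point at a real serine PDB file.
src=${SER_PDB:-}

if [ -n "$src" ] && [ -f "$src" ]; then
  cp "$src" ser_mod.pdb          # work on a copy, keep the original intact
else
  printf 'REMARK stand-in for a real serine PDB file\n' > ser_mod.pdb
fi

# Modifying commands go into the input file itself.
echo 'CHOP CB OG' >> ser_mod.pdb

# Run only with real input and an installed cprodrg.
if [ -n "$src" ] && [ -f "$src" ] && command -v cprodrg >/dev/null 2>&1; then
cprodrg XYZIN ser_mod.pdb XYZOUT ala.pdb LIBOUT ala.cif <<EOF
END
EOF
else
  echo "no real input or cprodrg not found; command appended to ser_mod.pdb only"
fi
```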
PRODRG can also be used to add new atoms to existing molecules using the command
BUILD <atomname> <fragmentname>
which attaches a new 'fragment' to the specified atom. Examples of fragments are ME for a methyl group, OH for a hydroxyl group or PHI for a phenyl group. A complete list of fragments can be found here. If the addition of a new group creates a chiral centre at the attachment point, its chirality can be changed by prefixing the atom name with a tilde character (~).
Some default fragments have two attachment points, e.g. the 2-EPOXY fragment, used to introduce an epoxy bridge between two atoms, or the CONECT fragment, which does not actually add anything but simply connects two existing atoms. For these fragments, the second attachment point must be specified after the fragment name, i.e.
BUILD <atomname1> <fragmentname> <atomname2>
You can change the residue name with the command
CPNAME <newname>
This can be used to avoid the default 'DRG' when creating compounds from scratch or to update the name to something more appropriate during building.
A. W. Schuettelkopf and D. M. F. van Aalten (2004). PRODRG – a tool for high-throughput crystallography of protein-ligand complexes. Acta Crystallogr D60, 1355–1363.
Alexander W. Schuettelkopf and Daan M. F. van Aalten, Division of Molecular Microbiology, College of Life Sciences, University of Dundee
cprodrg XYZIN ligand.pdb XYZOUT ligand_use.pdb LIBOUT ligand_use.cif <<EOF
END
EOF

echo C-C-C-O > temp.draw
cprodrg XYZIN temp.draw XYZOUT ligand_use.pdb LIBOUT ligand_use.cif <<EOF
MINI YES
EOF
rm temp.draw

echo BUILD NZ ME >> lys.pdb
echo CPNAME MLY >> lys.pdb
cprodrg XYZIN lys.pdb XYZOUT mly.pdb LIBOUT mly.cif <<EOF
END
EOF

cprodrg XYZIN ligand.pdb XYZOUT plotme.mol LIBOUT /dev/null <<EOF
COOR MOL
PROT NONE
MINI FLAT
END
EOF