mon_lib - multi-purpose dictionary for macromolecules
This dictionary can be used for several purposes: refinement, graphics, validation. The extended mmCIF format makes the dictionary self-explanatory and easy to adapt and extend with new information.
The LIBCHECK program was developed to manage and check the information in the dictionary, and can also create new dictionary entries from different sources: PDB, CDS, CIF, SMILE.
MON_LIB defines:
All this information is constant, i.e. independent of the conformation of the molecule. A CIFile, describing a macromolecule, must contain the variable information (coordinates, occupancy, B-factors) and the list of modifications to and links between actual monomers.
The information about the names of chains and monomers, and the serial numbers of monomers for the links must be present in the CIFile or PDB file of coordinates.
The values for the amino acid bond lengths
and angles have been taken from Engh and Huber,
Acta Cryst. A47, 392-400 (1991).
The values for the purine and pyrimidine bond lengths
and angles have been taken from O. Kennard & R. Taylor (1982),
J. Am. Chem. Soc. vol. 104, pp. 3209-3212.
The values for the sugar-phosphate backbone bond lengths
and bond angles have been taken from W. Saenger's "Principles
of Nucleic Acid Structure" (1983), Springer-Verlag, pp. 70, 86.
Definition:
A monomer is a set of atoms connected by bonds, or a single atom.
For example, it may be an amino acid, a polypeptide chain, or a polypeptide chain and a substrate connected by hydrogen bonds.
It is useful for some programs (dynamics, graphics and so on) to define a tree-like structure of the monomers. Specifying a 'back atom' and a 'forward atom' for each atom of the monomer defines this tree-like structure. The 'back atom' of a given atom is its preceding atom in the tree. If an atom has some forward branches, their order is
The following categories describe the monomers:
Let the vector direction atom_1 to atom_2 be v1,
the vector direction atom_1 to atom_3 be v2,
the vector direction atom_1 to atom_4 be v3;
the chiral volume is the volume of the parallelepiped
formed by the three vectors: v1,v2,v3.
VOLUME = v1 . [ v2 x v3 ]
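The triple product above can be evaluated directly from atomic coordinates. The following is a minimal sketch, not code taken from any program described here; the function names and the plain (x, y, z) tuples are illustrative assumptions.

```python
def chiral_volume(atom_1, atom_2, atom_3, atom_4):
    """Return v1 . [v2 x v3], with all three vectors starting at atom_1."""
    v1 = [atom_2[i] - atom_1[i] for i in range(3)]
    v2 = [atom_3[i] - atom_1[i] for i in range(3)]
    v3 = [atom_4[i] - atom_1[i] for i in range(3)]
    # Cross product v2 x v3
    cross = (
        v2[1] * v3[2] - v2[2] * v3[1],
        v2[2] * v3[0] - v2[0] * v3[2],
        v2[0] * v3[1] - v2[1] * v3[0],
    )
    # Dot product with v1
    return sum(v1[i] * cross[i] for i in range(3))


def volume_sign(volume):
    """Return +1, -1 or 0 according to the sign of the chiral volume."""
    return (volume > 0) - (volume < 0)
```

Exchanging any two of atom_2, atom_3 and atom_4 reverses the sign of the result, which is why the order of the atoms in a chirality record matters.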
The type is the sign of the chiral volume.
Categories to describe the type of links between two atoms of different monomers. They describe only the type of the link, the atom names and the monomer flags. The information about the names of the chains and monomers, and the serial numbers of the monomers is given in the CIFile.
The monomer flag designates the first or the second monomer of the category _entity_link_, whose values are given in the CIFile.
_entity_link_id
_entity_link_entity_id_1
_entity_link_mon_id_1
_entity_link_set_num_1
_entity_link_entity_id_2
_entity_link_mon_id_2
_entity_link_set_num_2
SS Ach CYS 20 Cch CYS 40
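To illustrate how the monomer flags tie a link description to actual monomers, here is a small sketch. It combines the link record above with the disulphide bond row of the "SS" link from the dictionary listing (SS 1 SG 2 SG disulf 2.031 0.020). The data layout and the function name are hypothetical and only meant to show the lookup.

```python
# Link record from the CIFile: link id plus chain, monomer and serial number
# for the first (flag 1) and second (flag 2) monomer.
LINK_RECORD = {
    "id": "SS",
    "monomers": {1: ("Ach", "CYS", 20), 2: ("Cch", "CYS", 40)},
}

# Bond row from the dictionary:
# link id, flag, atom name, flag, atom name, bond type, distance, esd
LINK_BOND = ("SS", 1, "SG", 2, "SG", "disulf", 2.031, 0.020)


def resolve_link_bond(record, bond):
    """Replace the monomer flags of a link bond row by actual monomers."""
    link_id, flag_1, atom_1, flag_2, atom_2, bond_type, dist, esd = bond
    if link_id != record["id"]:
        raise ValueError("bond row belongs to a different link")
    return {
        "atom_1": record["monomers"][flag_1] + (atom_1,),
        "atom_2": record["monomers"][flag_2] + (atom_2,),
        "type": bond_type,
        "value_dist": dist,
        "value_dist_esd": esd,
    }


restraint = resolve_link_bond(LINK_RECORD, LINK_BOND)
```

Here `restraint["atom_1"]` is SG of CYS 20 in chain Ach and `restraint["atom_2"]` is SG of CYS 40 in chain Cch.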
The link "TRANS" is the default for polypeptide chains, "pd" is the default for DNA, and "pr" is the default for RNA.
Categories to describe the possible modifications of monomers: the 'function' (add, delete, change), and the atoms, bonds, angles, chirality, planarity that will be modified. The information about the names and serial numbers of monomers is described in the CIFile. For example:
_entity_mod_id
_entity_mod_entity_id
_entity_mod_mon_id
_entity_mod_set_num
COO Cchain CYS 30
The modifications "NH3" and "COO" are the default for
the polypeptide chain termini.
The modifications "d5*END" and "d3*END" are the default for
DNA termini.
The modifications "r5*END" and "r3*END" are the default for
RNA termini.
--- LIST OF MONOMERS ---
data_comp_list
loop_
_chem_comp.id
_chem_comp.three_letter_code
_chem_comp.name
_chem_comp.group
_chem_comp.number_atoms_all
_chem_comp.number_atoms_nh
. . . .
CYS CYS 'CYSTINE ' L-peptide 10 6
. . . .
--- DESCRIPTION OF MONOMERS ---
data_comp_CYS
loop_
_chem_comp_atom.comp_id
_chem_comp_atom.atom_id
_chem_comp_atom.type_symbol
_chem_comp_atom.type_energy
_chem_comp_atom.partial_charge
CYS N   N NH1  -0.204
CYS H   H HNH1  0.204
CYS CA  C CH1   0.058
CYS HA  H HCH1  0.046
CYS CB  C CH2  -0.096
CYS HB1 H HCH2  0.046
CYS HB2 H HCH2  0.058
CYS SG  S S     0.004
CYS C   C C     0.318
CYS O   O O    -0.422
loop_
_chem_comp_tree.comp_id
_chem_comp_tree.atom_id
_chem_comp_tree.atom_back
_chem_comp_tree.atom_forward
_chem_comp_tree.connect_type
CYS N   n/a CA START
CYS H   N   .  .
CYS CA  N   C  .
CYS HA  CA  .  .
CYS CB  CA  SG .
CYS HB1 CB  .  .
CYS HB2 CB  .  .
CYS SG  CB  .  .
CYS C   CA  .  END
CYS O   C   .  .
loop_
_chem_comp_bond.comp_id
_chem_comp_bond.atom_id_1
_chem_comp_bond.atom_id_2
_chem_comp_bond.type
_chem_comp_bond.value_dist
_chem_comp_bond.value_dist_esd
CYS N  H   coval 0.860 0.020
CYS N  CA  coval 1.458 0.019
CYS CA HA  coval 0.980 0.020
CYS CA CB  coval 1.530 0.020
CYS CB HB1 coval 0.970 0.020
CYS CB HB2 coval 0.970 0.020
CYS CB SG  coval 1.808 0.023
CYS CA C   coval 1.525 0.021
CYS C  O   coval 1.231 0.020
loop_
_chem_comp_angle.comp_id
_chem_comp_angle.atom_id_1
_chem_comp_angle.atom_id_2
_chem_comp_angle.atom_id_3
_chem_comp_angle.value_angle
_chem_comp_angle.value_angle_esd
CYS H   N  CA  114.000 3.000
CYS HA  CA CB  109.000 3.000
CYS CB  CA C   110.100 1.900
CYS HA  CA C   109.000 3.000
CYS N   CA HA  110.000 3.000
CYS N   CA CB  110.500 1.700
CYS HB1 CB HB2 110.000 3.000
CYS HB2 CB SG  108.000 3.000
CYS HB1 CB SG  108.000 3.000
CYS CA  CB HB1 109.000 3.000
CYS CA  CB HB2 109.000 3.000
CYS CA  CB SG  114.400 2.300
CYS N   CA C   111.200 2.800
CYS CA  C  O   120.800 1.700
loop_
_chem_comp_tor.comp_id
_chem_comp_tor.id
_chem_comp_tor.atom_id_1
_chem_comp_tor.atom_id_2
_chem_comp_tor.atom_id_3
_chem_comp_tor.atom_id_4
_chem_comp_tor.value_angle
_chem_comp_tor.value_angle_esd
_chem_comp_tor.period
CYS chi1 N CA CB SG 0.000 15.000 3
loop_
_chem_comp_chir.comp_id
_chem_comp_chir.id
_chem_comp_chir.atom_id_centre
_chem_comp_chir.atom_id_1
_chem_comp_chir.atom_id_2
_chem_comp_chir.atom_id_3
_chem_comp_chir.volume_sign
CYS chir_01 CA N CB C negativ
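The _chem_comp_tree rows for CYS above give a 'back atom' for every atom, which is enough to rebuild the tree. The sketch below is illustrative only: the (atom, back atom) pairs are copied from the listing, while the function names and the dictionary representation are assumptions. It treats any back atom that is not itself an atom of the monomer (here "n/a") as marking the root.

```python
# (atom_id, atom_back) pairs from the _chem_comp_tree loop of CYS
CYS_TREE = [
    ("N", "n/a"), ("H", "N"), ("CA", "N"), ("HA", "CA"), ("CB", "CA"),
    ("HB1", "CB"), ("HB2", "CB"), ("SG", "CB"), ("C", "CA"), ("O", "C"),
]


def build_tree(rows):
    """Return (root, children), where children maps an atom to its forward atoms."""
    children = {atom: [] for atom, _ in rows}
    root = None
    for atom, back in rows:
        if back in children:
            children[back].append(atom)
        else:
            # The back atom is not part of the monomer, so this atom starts the tree
            root = atom
    return root, children


def path_to_root(atom, rows):
    """Follow back atoms from the given atom up to the root of the tree."""
    back = dict(rows)
    path = [atom]
    while back[path[-1]] in back:
        path.append(back[path[-1]])
    return path


root, children = build_tree(CYS_TREE)
```

For CYS this gives N as the root, and CA has the forward atoms HA, CB and C in the order in which they are listed.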
--- LIST OF MODIFICATIONS ---
data_mod_list
loop_
_chem_mod.id
_chem_mod.name
_chem_mod.comp_id
_chem_mod.group_id
. . . .
COO COO-terminus . peptide
. . . .
--- DESCRIPTION OF MODIFICATIONS ---
data_mod_COO
loop_
_chem_mod_atom.mod_id
_chem_mod_atom.function
_chem_mod_atom.atom_id
_chem_mod_atom.new_atom_id
_chem_mod_atom.new_type_symbol
_chem_mod_atom.new_type_energy
_chem_mod_atom.new_partial_charge
COO change C C   . C   0.340
COO change O O   . OC -0.350
COO add    . OXT O OC -0.350
loop_
_chem_mod_tree.mod_id
_chem_mod_tree.function
_chem_mod_tree.atom_id
_chem_mod_tree.atom_back
_chem_mod_tree.atom_forward
_chem_mod_tree.connect_type
COO add    OXT C . END
COO change C   . OXT .
loop_
_chem_mod_bond.mod_id
_chem_mod_bond.function
_chem_mod_bond.atom_id_1
_chem_mod_bond.atom_id_2
_chem_mod_bond.new_type
_chem_mod_bond.new_value_dist
_chem_mod_bond.new_value_dist_esd
COO change C O   coval 1.231 0.020
COO add    C OXT coval 1.231 0.020
loop_
_chem_mod_angle.mod_id
_chem_mod_angle.function
_chem_mod_angle.atom_id_1
_chem_mod_angle.atom_id_2
_chem_mod_angle.atom_id_3
_chem_mod_angle.new_value_angle
_chem_mod_angle.new_value_angle_esd
COO change CA C O   121.000 3.000
COO add    CA C OXT 121.000 3.000
loop_
_chem_mod_tor.mod_id
_chem_mod_tor.function
_chem_mod_tor.id
_chem_mod_tor.atom_id_1
_chem_mod_tor.atom_id_2
_chem_mod_tor.atom_id_3
_chem_mod_tor.atom_id_4
_chem_mod_tor.new_value_angle
_chem_mod_tor.new_value_angle_esd
_chem_mod_tor.new_period
COO add psi N CA C OXT 160.00 30.0 2
loop_
_chem_mod_plane_atom.mod_id
_chem_mod_plane_atom.function
_chem_mod_plane_atom.plane_id
_chem_mod_plane_atom.atom_id
_chem_mod_plane_atom.new_dist_esd
COO add oxt C   0.020
COO add oxt CA  0.020
COO add oxt O   0.020
COO add oxt OXT 0.020
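As an illustration of how the 'function' column might be interpreted, the sketch below applies the three _chem_mod_atom rows of the COO modification to the C and O atoms of CYS. It is not the procedure of any particular program. In particular, reading "." in a 'change' row as "leave this property unchanged" is an assumption, and the function and variable names are hypothetical.

```python
# C and O of CYS, taken from the _chem_comp_atom loop
CYS_ATOMS = {
    "C": {"type_symbol": "C", "type_energy": "C", "partial_charge": 0.318},
    "O": {"type_symbol": "O", "type_energy": "O", "partial_charge": -0.422},
}

# _chem_mod_atom rows of COO:
# function, atom_id, new_atom_id, new_type_symbol, new_type_energy, new_partial_charge
# None stands for "." in the dictionary
COO_ATOM_MODS = [
    ("change", "C", "C", None, "C", 0.340),
    ("change", "O", "O", None, "OC", -0.350),
    ("add", None, "OXT", "O", "OC", -0.350),
]


def apply_atom_mods(atoms, mods):
    """Return a modified copy of the atom table; the input is left untouched."""
    result = {name: dict(props) for name, props in atoms.items()}
    for function, atom_id, new_id, symbol, energy, charge in mods:
        if function == "add":
            result[new_id] = {"type_symbol": symbol, "type_energy": energy,
                              "partial_charge": charge}
        elif function == "change":
            props = result.pop(atom_id)
            # Assumption: a missing value (".") keeps the existing property
            if symbol is not None:
                props["type_symbol"] = symbol
            if energy is not None:
                props["type_energy"] = energy
            if charge is not None:
                props["partial_charge"] = charge
            result[new_id] = props
        elif function == "delete":
            del result[atom_id]
        else:
            raise ValueError("unknown function: %s" % function)
    return result


modified = apply_atom_mods(CYS_ATOMS, COO_ATOM_MODS)
```

After the modification the carboxylate oxygens O and OXT both carry the energy type OC, while the unmodified description of CYS is left as it was.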
--- LIST OF LINKS ---
data_link_list
loop_
_chem_link.id
_chem_link.name
_chem_link.comp_id_1
_chem_link.mod_id_1
_chem_link.group_comp_1
_chem_link.comp_id_2
_chem_link.mod_id_2
_chem_link.group_comp_2
. . . .
SS    SS-bridge            CYS . .       CYS . .
TRANS default-peptide-link .   . peptide .   . peptide
. . . .
--- DESCRIPTION OF LINKS ---
data_link_SS
loop_
_chem_link_bond.link_id
_chem_link_bond.atom_1_comp_id
_chem_link_bond.atom_id_1
_chem_link_bond.atom_2_comp_id
_chem_link_bond.atom_id_2
_chem_link_bond.type
_chem_link_bond.value_dist
_chem_link_bond.value_dist_esd
SS 1 SG 2 SG disulf 2.031 0.020
loop_
_chem_link_angle.link_id
_chem_link_angle.atom_1_comp_id
_chem_link_angle.atom_id_1
_chem_link_angle.atom_2_comp_id
_chem_link_angle.atom_id_2
_chem_link_angle.atom_3_comp_id
_chem_link_angle.atom_id_3
_chem_link_angle.value_angle
_chem_link_angle.value_angle_esd
SS 1 CB 1 SG 2 SG 110.000 3.000
SS 1 SG 2 SG 2 CB 110.000 3.000
loop_
_chem_link_tor.link_id
_chem_link_tor.id
_chem_link_tor.atom_1_comp_id
_chem_link_tor.atom_id_1
_chem_link_tor.atom_2_comp_id
_chem_link_tor.atom_id_2
_chem_link_tor.atom_3_comp_id
_chem_link_tor.atom_id_3
_chem_link_tor.atom_4_comp_id
_chem_link_tor.atom_id_4
_chem_link_tor.value_angle
_chem_link_tor.value_angle_esd
_chem_link_tor.period
SS ss 1 CB 1 SG 2 SG 2 CB 90.00 10.0 2
data_link_TRANS
loop_
_chem_link_bond.link_id
_chem_link_bond.atom_1_comp_id
_chem_link_bond.atom_id_1
_chem_link_bond.atom_2_comp_id
_chem_link_bond.atom_id_2
_chem_link_bond.type
_chem_link_bond.value_dist
_chem_link_bond.value_dist_esd
TRANS 1 C 2 N coval 1.329 0.014
loop_
_chem_link_angle.link_id
_chem_link_angle.atom_1_comp_id
_chem_link_angle.atom_id_1
_chem_link_angle.atom_2_comp_id
_chem_link_angle.atom_id_2
_chem_link_angle.atom_3_comp_id
_chem_link_angle.atom_id_3
_chem_link_angle.value_angle
_chem_link_angle.value_angle_esd
TRANS 1 O  1 C 2 N  123.000 1.600
TRANS 1 CA 1 C 2 N  116.200 2.000
TRANS 1 C  2 N 2 H  124.300 3.000
TRANS 1 C  2 N 2 CA 121.700 1.800
loop_
_chem_link_tor.link_id
_chem_link_tor.id
_chem_link_tor.atom_1_comp_id
_chem_link_tor.atom_id_1
_chem_link_tor.atom_2_comp_id
_chem_link_tor.atom_id_2
_chem_link_tor.atom_3_comp_id
_chem_link_tor.atom_id_3
_chem_link_tor.atom_4_comp_id
_chem_link_tor.atom_id_4
_chem_link_tor.value_angle
_chem_link_tor.value_angle_esd
_chem_link_tor.period
TRANS psi   1 N  1 CA 1 C  2 N  160.00 30.0 2
TRANS omega 1 CA 1 C  2 N  2 CA 180.00 10.0 0
TRANS .     1 CA 1 C  2 N  2 H    0.00 10.0 0
TRANS phi   1 C  2 N  2 CA 2 C   60.00 20.0 3
loop_
_chem_link_plane.link_id
_chem_link_plane.plane_id
_chem_link_plane.atom_comp_id
_chem_link_plane.atom_id
_chem_link_plane.dist_esd
TRANS plane1 1 CA 0.02
TRANS plane1 1 C  0.02
TRANS plane1 1 O  0.02
TRANS plane1 2 N  0.02
TRANS plane2 1 C  0.02
TRANS plane2 2 N  0.02
TRANS plane2 2 CA 0.02
TRANS plane2 2 H  0.02
---------------------------------------------------
ener_lib.cif 4-APR-95
---------------------------------------------------
----------- Description of atom type --------
HEADER C Carbon PI
ATOMTYPE CSP = with triple bond
HEADER C Carbon SP2
ATOMTYPE C = without hydrogen ( carbonyl C )
ATOMTYPE C1 = connected to 1 hydrogen
ATOMTYPE C2 = connected to 2 hydrogens
ATOMTYPE CR1 = between two pyrrole units
ATOMTYPE CR1H = CR1 connected to 1 hydrogen ( CHA of HEME )
ATOMTYPE CR15 = connected to 1 hydrogen in 5 atoms ring ( CE1 of HIS)
ATOMTYPE CR16 = connected to 1 hydrogen in 6 atoms ring ( CE1 of PHE)
ATOMTYPE CR6 = without hydrogen in 6 atoms ring
ATOMTYPE CR5 = without hydrogen in 5 atoms ring
ATOMTYPE CR56 = between two atoms in 5-6 rings ( CD2 CE2 of TRP )
ATOMTYPE CR55 = between two atoms in 5-5 rings
ATOMTYPE CR66 = between two atoms in 6-6 rings
HEADER C Carbon SP3
ATOMTYPE CH1 = connected to 1 hydrogen ( CA of most amino acids )
ATOMTYPE CH2 = connected to 2 hydrogens ( CB of most amino acids )
ATOMTYPE CH3 = connected to 3 hydrogens ( CD1 CD2 of LEUCINE)
ATOMTYPE CT = without hydrogen
HEADER H Hydrogen
ATOMTYPE HCH = hydrogen of aliphatic group
ATOMTYPE HCR = hydrogen of aromatic group
ATOMTYPE HNC1 = hydrogen connected to NC1
ATOMTYPE HNC2 = hydrogen connected to NC2
ATOMTYPE HNC3 = hydrogen connected to NC3
ATOMTYPE HNH1 = hydrogen connected to NH1
ATOMTYPE HNH2 = hydrogen connected to NH2
ATOMTYPE HNR5 = hydrogen connected to NR15
ATOMTYPE HNR6 = hydrogen connected to NR16
ATOMTYPE HOH1 = hydrogen connected to OH1
ATOMTYPE HOH2 = hydrogen of water
ATOMTYPE HSH1 = hydrogen of sulphur
HEADER N Nitrogen PI
ATOMTYPE NS = without hydrogen ( triple bond )
ATOMTYPE NS1 = connected to 1 hydrogen
HEADER N Nitrogen SP2
ATOMTYPE N = without hydrogen ( N of PRO )
ATOMTYPE NC1 = connected to 1 hyd. in a charged group ( NE of ARG )
ATOMTYPE NC2 = connected to 2 hyd. in a charged group ( NH2 of ARG )
ATOMTYPE NH1 = connected to 1 hydrogen ( N of main chain )
ATOMTYPE NH2 = connected to 2 hydrogens ( NE2 of GLN )
ATOMTYPE NPA = without hydrogen ( NA and NC of HEME )
ATOMTYPE NPB = without hydrogen ( NB and ND of HEME )
ATOMTYPE NRD5 = without hydrogen but with electronic doublet in 5 atoms ring
ATOMTYPE NRD6 = without hydrogen but with electronic doublet in 6 atoms ring
ATOMTYPE NR15 = connected to 1 hyd. in 5 atoms ring ( ND1 of HIS )
ATOMTYPE NR16 = connected to 1 hyd. in 6 atoms ring
ATOMTYPE NR5 = connected to 3 atoms in 5 atoms ring ( N9 of ADE )
ATOMTYPE NR6 = connected to 3 atoms in 6 atoms ring ( N1 of CYT )
HEADER N Nitrogen SP3
ATOMTYPE NT = without hydrogen
ATOMTYPE NT1 = connected to 1 hydrogen
ATOMTYPE NT2 = connected to 2 hydrogens
ATOMTYPE NT3 = connected to 3 hydrogens
HEADER O Oxygen SP2
ATOMTYPE O = without NET charge ( O of main chain )
ATOMTYPE OC = with a NET charge ( OE1 OE2 of GLU )
ATOMTYPE OP = with a NET charge connected to P (O1P of phosphate group )
ATOMTYPE OS = with a NET charge connected to S ( O1 of sulphate group )
ATOMTYPE OB = with a NET charge connected to B
HEADER O Oxygen SP3
ATOMTYPE O2 = connected to 2 atoms ( O4' of ribose )
ATOMTYPE OC2 = with a NET charge connected to 2 ATOMS ( O3' of ribose )
ATOMTYPE OH1 = oxygen of alcohol groups ( OG1 of THR )
ATOMTYPE OH2 = oxygen of water
ATOMTYPE OHA = oxygen of water in MO6
ATOMTYPE OHB = oxygen of water in MO6
ATOMTYPE OHC = oxygen of water in MO6
HEADER S Sulphur
ATOMTYPE S = sulphur without hydrogen
ATOMTYPE SH1 = sulphur with a hydrogen ( SG of CYS )
HEADER Fe
ATOMTYPE FE = iron
HEADER P
ATOMTYPE P = phosphorus
HEADER Zn
ATOMTYPE ZN = zinc
END
--- ATOM ---
loop_
_lib_atom.type
_lib_atom.weight
_lib_atom.hb_type
_lib_atom.vdw_radius
_lib_atom.vdwh_radius
_lib_atom.ion_radius
_lib_atom.element
_lib_atom.valency
_type atomic chemical type
_weight atomic weight
_hb_type donor/acceptor type:
N=neither
D=donor
A=acceptor
B=both
H=hydrogen that is a candidate for hydrogen bonding
_vdw_radius Van der Waals radius
_vdwh_radius Van der Waals radius for atom+H
Ionic radii for most of the atoms without hydrogens are taken from:
WebElements
or
"Chemistry of the Elements" by Greenwood and Earnshaw
VDW radii of carbon atoms with hydrogen have been taken from
Li and Nussinov, Proteins, 32 111-127 (1998)
CSP 12.01150 N 1.700 1.700 . C 4
C 12.01150 N 1.700 1.750 . C 4
C1 12.01150 N 1.700 1.820 . C 4
C2 12.01150 N 1.700 1.800 . C 4
. . . . .
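Each data row of a loop_ lists its values in the same order as the field names. As a rough sketch of how such a row could be read, the code below pairs the _lib_atom field names with the whitespace-separated values of the first row above. The function name is hypothetical, and plain splitting on whitespace is a simplification that would not handle quoted CIF values containing spaces.

```python
LIB_ATOM_FIELDS = [
    "_lib_atom.type",
    "_lib_atom.weight",
    "_lib_atom.hb_type",
    "_lib_atom.vdw_radius",
    "_lib_atom.vdwh_radius",
    "_lib_atom.ion_radius",
    "_lib_atom.element",
    "_lib_atom.valency",
]


def parse_loop_row(line, fields=LIB_ATOM_FIELDS):
    """Pair the values of one data row with the loop_ field names."""
    tokens = line.split()
    if len(tokens) != len(fields):
        raise ValueError("expected %d values, got %d" % (len(fields), len(tokens)))
    # "." marks a missing value and is mapped to None
    return {field: (None if token == "." else token)
            for field, token in zip(fields, tokens)}


row = parse_loop_row("CSP 12.01150 N 1.700 1.700 . C 4")
```

For this row, `row["_lib_atom.ion_radius"]` is None because the ionic radius is given as "."; all other values are kept as strings and can be converted as needed.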
--- BONDS ---
loop_
_lib_bond.atom_type_1
_lib_bond.atom_type_2
_lib_bond.type
_lib_bond.const
_lib_bond.length
_atom_type atomic chemical type
_const constant KBOND
_length equilibrium length of this bond BOND0
BOND - actual bond length
ENERGY = KBOND * ( BOND - BOND0 )**2
Values for bond distances and sigmas are from (where it is possible):
International tables for crystallography
Volume C, 1992,
Edited by AJC Wilson
Published for IUCr by Kluwer Academic Publishers, Dordrecht/Boston/London
For carbon-carbon etc
Section: Typical Interatomic distances: Organic compounds
Authors: FH Allen, O Kennard, DG Watson, L Brammer, AG Orpen and R Taylor
pages: 685-706
For metal radii and distances.
Section: Typical interatomic distances: Organometallic Compounds and
Coordination complexes of the d- and f-block metals
Authors: AG Orpen, L Brammer, FH Allen, O Kennard, DG Watson and R Taylor
pages: 707-791
C C single 420.0 1.550 0.025
C C double 420.0 1.330 0.020
. . . . .
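The bond energy expression above can be written as a one-line function. This is only a sketch: it assumes that the fourth and fifth values of the "C C single" row are KBOND and BOND0 respectively, and it makes no statement about units, which are not given here. The angle energy in the next section has the same harmonic form.

```python
def bond_energy(kbond, bond, bond0):
    """ENERGY = KBOND * (BOND - BOND0)**2"""
    return kbond * (bond - bond0) ** 2


# Assumed reading of the row "C C single 420.0 1.550 0.025"
KBOND_CC_SINGLE = 420.0
BOND0_CC_SINGLE = 1.550

# Energy of a C-C single bond stretched by 0.1 from its equilibrium length
stretched = bond_energy(KBOND_CC_SINGLE, 1.650, BOND0_CC_SINGLE)
```

The energy is zero at the equilibrium length and grows quadratically with the deviation, so a deviation twice as large costs four times as much.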
--- ANGLES ---
loop_
_lib_angle.atom_type_1
_lib_angle.atom_type_2
_lib_angle.atom_type_3
_lib_angle.const
_lib_angle.value
_atom_type atomic chemical type
_const constant KTHETA
_value equilibrium value for the angle THETA0
THETA - actual angle
ENERGY = KTHETA * ( THETA - THETA0 )**2
NS CSP CH3 . 180.000
NS CSP SH1 . 180.000
. . . . . . .
--- TORSIONS ---
loop_
_lib_tors.atom_type_1
_lib_tors.atom_type_2
_lib_tors.atom_type_3
_lib_tors.atom_type_4
_lib_tors.label
_lib_tors.const
_lib_tors.angle
_lib_tors.period
_atom_type atomic chemical type
_const Constant KPHI
_period NPH number of minima in the function
_angle target angle DELTA
PHI - actual angle
ENERGY = KPHI * ( 1 - COS( NPH * (PHI - DELTA) ) )
. CH1 NH1 . . 0.000 0.000 3
. C CH1 . . 0.000 180.000 3
. . . . . . ..
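A minimal sketch of the torsion energy expression follows. It assumes that PHI and DELTA are given in degrees, as the tabulated target angles suggest, and converts to radians for the cosine. Since both rows shown above have a constant of 0.000, the example call uses a hypothetical KPHI of 1.0.

```python
import math


def torsion_energy(kphi, nph, phi_deg, delta_deg):
    """ENERGY = KPHI * (1 - COS(NPH * (PHI - DELTA))), angles in degrees."""
    return kphi * (1.0 - math.cos(nph * math.radians(phi_deg - delta_deg)))


# Hypothetical constant; period 3 and target angle 180 as in the second row
at_target = torsion_energy(1.0, 3, 180.0, 180.0)
at_barrier = torsion_energy(1.0, 3, 240.0, 180.0)
```

With a period of 3 the function has three minima per full turn, at DELTA, DELTA + 120 and DELTA + 240 degrees, and reaches its maximum of 2 * KPHI halfway between them.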
--- VDW contacts ---
loop_
_lib_vdw.atom_type_1
_lib_vdw.atom_type_2
_lib_vdw.energy_min
_lib_vdw.radius_min
_lib_vdw.H_flag
_atom_type atomic chemical type
_energy_min EPSij minimum of energy parameter
_radius_min Rmin radius of the minimum of energy parameter
Rij - actual distance
_H_flag "h" - the parameters for atoms with hydrogens
Lennard-Jones potential
ENERGY = EPSij * ( (Rmin/Rij)**12 - 2 * (Rmin/Rij)**6 )
C C -0.19686 3.600 .
C C -0.19686 3.600 h
. . . . .
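The Lennard-Jones expression above can be sketched as follows. The code evaluates the formula literally as written; how a given program treats the sign of the tabulated energy_min value is not specified here, so the example uses a hypothetical EPSij of 1.0 together with the Rmin of the "C C" row.

```python
def vdw_energy(eps_ij, r_min, r_ij):
    """ENERGY = EPSij * ((Rmin/Rij)**12 - 2 * (Rmin/Rij)**6)"""
    ratio6 = (r_min / r_ij) ** 6
    return eps_ij * (ratio6 * ratio6 - 2.0 * ratio6)


R_MIN_CC = 3.600  # radius_min of the "C C" row

# Hypothetical EPSij = 1.0: at Rij = Rmin the bracket equals 1 - 2 = -1
at_rmin = vdw_energy(1.0, R_MIN_CC, R_MIN_CC)
```

At Rij = Rmin the energy equals -EPSij, and the bracketed term is stationary there, which is what makes Rmin the position of the extremum.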
--- H-BONDS ---
loop_
_lib_hbond.atom_type_1
_lib_hbond.atom_type_2
_lib_hbond.min
_lib_hbond.dist
_atom_type atomic chemical type
_hbond_min EPSHij energy at minimum
_hbond_dist RHmin distance at minimum of energy
RHij - actual distance
ENERGY = EPSHij * ( (RHmin/RHij)**12 - 2 * (RHmin/RHij)**10 )
NRD5 NH1 -1.500 2.850
NRD5 NT3 -1.500 2.850
. . . . . .
- reads a library of monomers and gives information about a given monomer
- creates a PostScript file with a picture and information about bonds, angles, ...
- can read an additional library, combine two libraries and write the result to a new library file
- can create a description of a new monomer by reading coordinates from a PDB file or CIFile
Authors:
A.Vagin and
G.Murshudov, E.Dodson, K.Henrick,
J.Richelle, S.Wodak.