dmmulti HKLIN1 foo1.mtz [HKLIN2 ...] HKLOUT1 bar1.mtz [HKLOUT2 ...]
[SOLIN1 foosol1.msk [SOLIN2 ...]] [SOLOUT1 barsol1.msk [SOLOUT2 ...]]
[MSKIN1 foomsk1.msk [MSKIN2 ...]]
[Keyworded input]
K. Cowtan (1994), Joint CCP4 and ESF-EACBM Newsletter on Protein Crystallography, 31, p34-38.
'dmmulti' is a package which applies real space constraints based on known features of a protein electron density map in order to improve the approximate phasing obtained from experimental sources. The program may be applied to data from one or more crystal forms simultaneously. Several kinds of information can be applied, including solvent flattening, histogram matching, averaging and skeletonisation (see keyword MODE).
A discriminator analogous to the crystallographic free-R factor (3) plays an important part in the procedure, providing a good indication of the effectiveness of a particular density modification calculation, and also an accurate method for determining weights for new phases calculated by the procedure. This avoids the problems of over-consistency and overestimated weights which could arise in earlier density modification procedures. Note that the dm-free-R is not truly a free-R factor since it is impossible to completely isolate a set of reflections: all structure factor magnitudes are fundamentally interrelated through the density constraints in real space.
The program can either use a free-R set from the mtz file, or generate its own set internally. It is also possible to recycle the calculation, performing the density modification one or more times with different free-R sets, and then once with no free-R set but using the information obtained from the free-R cycles. This has been found to give a slight improvement in the overall results.
Calculation of scale and B-factor for the data is automatic. This is performed by comparison with an empirically derived database of map variance at different resolutions, and is more reliable than the conventional Wilson plot.
Non-crystallographic symmetry averaging can be performed for both proper and improper symmetries, and different NCS averaging operations can be applied to different parts of the protein (Thanks to Dave Schuller for his help with this). Spectral B-spline interpolation is used for fast calculation on a low resolution grid; this has been developed by Dr. Eric H. Grosse.
Multi-crystal averaging can be performed between phased and/or unphased forms. The calculation is performed efficiently and entirely in-core.
Skeletonisation is by the core-tracing algorithm of Swanson (7). This is faster than Greer's algorithm and allows adjustment of the skeletonisation parameters without recalculating the skeleton. As a result the skeletonisation calculation is rendered largely automatic.
Operation is by standard keyworded card input. Input masks may be on any grid and axis order; however, if the mask grid is too fine the program may run out of space to store it.
HKLINi: Input mtz file for i'th crystal form - This should contain the conventional (CCP4) asymmetric unit of data (see CAD).
SOLINi: Input solvent mask for i'th crystal form - This overrides the automatic Wang mask determination. The input mask can have any grid and axis ordering, and may have any extent from the protein region of a single asymmetric unit to the whole cell.
MSKINj: Input averaging masks for j'th domain - These are used with the AVER option. The input masks can have any grid or axis ordering, and should cover a single monomer or domain; however, correct results will still be obtained (more slowly) if the mask covers a proper symmetry related multimer. 'dmmulti' does not perform overlap removal.
HKLOUTi: Output mtz file for i'th crystal form.
SOLOUTi: Output solvent mask for i'th crystal form - This will be on the program grid with default axis order, and will cover the whole unit cell.
Input is keyworded. Available keywords are:
AVERAGE, GRID, LABIN, LABOUT, MODE, NCYCLE, RESOLUTION, SCALE, SCHEME, SKEL, SOLC, WANG, XTAL. (MODE and SOLC are compulsory.)
Select the <xtal>'th crystal form. Keywords following this card will apply to that crystal form. Any keywords before the first XTAL card apply to form 1.
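The scoping rule for XTAL can be illustrated with a short Python sketch. It is not part of dmmulti and does not reproduce the program's own parser: the function name group_by_xtal and the matching of keywords on their first four characters are assumptions made for illustration, and cards that apply to all forms (such as NCYCLE) are not treated specially.

```python
def group_by_xtal(lines):
    """Group keyword lines by the crystal form they apply to.

    Hypothetical helper, not dmmulti's parser. Lines before the first
    XTAL card are assigned to form 1, as the text describes.
    """
    forms = {}
    current = 1  # keywords before the first XTAL card apply to form 1
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        words = line.split()
        # Assumption: a card is recognised by its first four characters.
        if words[0].upper()[:4] == "XTAL":
            current = int(words[1])
            forms.setdefault(current, [])
            continue
        forms.setdefault(current, []).append(line)
    return forms


deck = [
    "SOLC 0.35",
    "MODE SOLV HIST",
    "XTAL 2",
    "SOLC 0.41",
    "MODE SOLV HIST",
]
grouped = group_by_xtal(deck)
print(grouped[1])  # ['SOLC 0.35', 'MODE SOLV HIST']
print(grouped[2])  # ['SOLC 0.41', 'MODE SOLV HIST']
```

In this deck the first two cards have no preceding XTAL card, so they land in form 1.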
Select the calculation to be performed:
Resolution range of reflections to be included in the calculation. By the end of the calculation all the reflections in this range will be included, however at the start only a subset are used, chosen on the basis of the SCHEME card (default is the whole range of the input mtz file).
Number of cycles of phase extension to perform (defaults <ncycle>=10 <ncross>=1).
The total time taken is proportional to the product of these two values. Use <ncross> = 1 for large structures where the time becomes a significant factor, otherwise use <ncross> = 2. Only use <ncross> > 2 for small structures where the statistics are particularly poor (< 5000 reflections).
In the case of a multi-crystal calculation only one NCYCLE card is allowed, which applies to all forms.
(default: AUTO)
Normally just the first four columns (FP,SIGFP,PHIO,FOMO) are input. However if you have Hendrickson-Lattman coefficients you may want to input these to the program as well (the difference is marginal except for SIR data). If you want to start from the end of a previous density modification calculation then the PHIDM, FOMDM columns are used.
For multi-crystal averaging, if a crystal form is unphased the PHIO and FOMO columns may be omitted. There should be some sort of phases for at least one form.
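As an illustration of the phased and unphased cases, the following Python sketch assembles a LABIN card from a mapping of program labels to file column names. The helper labin_card is hypothetical, and the requirement that FP and SIGFP are always present is an assumption of the sketch rather than a documented rule.

```python
def labin_card(columns):
    """Build a LABIN card from {program label: file column label}.

    Hypothetical helper. Assumes FP and SIGFP are always supplied;
    PHIO and FOMO are simply left out for an unphased crystal form.
    """
    missing = [name for name in ("FP", "SIGFP") if name not in columns]
    if missing:
        raise ValueError("missing columns: " + ", ".join(missing))
    pairs = [f"{prog}={col}" for prog, col in columns.items()]
    return "LABIN " + " ".join(pairs)


phased = labin_card({"FP": "FP", "SIGFP": "SIGFP", "PHIO": "PHIB", "FOMO": "FOM"})
unphased = labin_card({"FP": "FP", "SIGFP": "SIGFP"})
print(phased)    # LABIN FP=FP SIGFP=SIGFP PHIO=PHIB FOMO=FOM
print(unphased)  # LABIN FP=FP SIGFP=SIGFP
```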
Normally just the first two columns are output. Don't use the other two unless you are a very clever person.
Perform iterative skeletonisation on the map. Cycles of skeletonisation are interspersed with cycles of conventional density modification (defaults <joinlen>=6.0 <endlen>=6.0 <bfac>=45 <nskl>=3).
Set an NCS symmetry averaging operator. This card is followed by a rotation/translation matrix on subsequent lines in either CCP4 or O/RAVE format (defaults <dr>=0.5 A, <dphi>=2.5 degrees, <nref>=3).
These are the operations which map the density in the region covered by the input mask onto the appropriate regions in the current crystal form. The first operator must be the identity matrix. The mask is input in CCP4 mask (map mode 0) format on the input file label MSKIN1, and should cover just one monomer or averaging domain, NOT the whole unit cell. The mask grid need not agree with the program grid.
If you want to apply different ncs operations to different domains of the protein, then each AVER card should contain a DOMAIN card to indicate to which domain this operator applies.
The REF, STEP and EVERY cards will enable refinement of the ncs rotation matrices between averaging cycles. The REF card enables the refinement of a particular set of NCS parameters. Note that the STEP card allows different refinement step sizes to be used for different domains; however, all but one EVERY card will be ignored. The refined matrices will be written out at the end of the log file.
See also dm_ncs_averaging
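The role of an averaging operator can be sketched in Python as a rotation followed by a translation, x' = R x + t, mapping a point in the masked region onto its copy. This is only a sketch under stated assumptions: the function names are hypothetical, the matrix is taken as a row-major 3x3 list acting on orthogonal coordinates, the example operators are invented, and nothing here addresses how the CCP4 or O/RAVE formats order the numbers on the card.

```python
def apply_operator(rotation, translation, point):
    """Map a point from the masked region to its copy: R * x + t."""
    return [
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    ]


def is_identity(rotation, translation, tol=1e-6):
    """True if the operator is the identity within a tolerance."""
    for i in range(3):
        if abs(translation[i]) > tol:
            return False
        for j in range(3):
            expected = 1.0 if i == j else 0.0
            if abs(rotation[i][j] - expected) > tol:
                return False
    return True


# Hypothetical operators: the identity, then a two-fold about z plus a shift.
identity = ([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]], [0.0, 0.0, 0.0])
twofold = ([[-1.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, 1.0]], [10.0, 0.0, 0.0])
operators = [identity, twofold]

first_is_identity = is_identity(*operators[0])
copy_position = apply_operator(twofold[0], twofold[1], [1.0, 2.0, 3.0])
print(first_is_identity)  # True
print(copy_position)      # [9.0, -2.0, 3.0]
```

A check like is_identity on the first operator of each domain is a cheap way to catch a mis-ordered operator list before starting a long run.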
Set the grid for the calculation. You may want to do this if you want to include your own mask or dump a map or mask (defaults: minimum efficient factors above Nyquist spacing).
Set the averaging radius and mode for calculating the solvent mask (defaults: <radius>=8.0 <mode>=1 <rhomin>=0.32 <rhomax>=2.0 e/A^3).
Heavy atoms can bias the mask calculation procedure, resulting in a mask of spheres around the heavy atom sites. The LIMITS card can be used to set the values at which the electron density is truncated before smoothing. To truncate heavy atoms set <rhomax> to the maximum electron density due to non-heavy atoms at the appropriate resolution.
Override internal scaling and scale input data by F^2 = <scale> * exp (<bfac> * s / 2.0) * F^2. Scaling is critical to histogram mapping and Sayre's equation. In some cases you may want to override the B-factor, but run without this card first, and think long and hard before changing the scale.
Look at the free-R factor, but note that you will have to disentangle the output for the different crystal forms.
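The scaling expression can be evaluated directly, as in the Python sketch below. The function name is hypothetical, and because the definition of s is not given here, the sketch takes s as an input and uses an arbitrary value in the example.

```python
import math


def rescale_f_squared(f_squared, scale, bfac, s):
    """Apply F^2 -> scale * exp(bfac * s / 2.0) * F^2.

    Hypothetical helper; s is passed in as-is because its definition
    is not stated in the text.
    """
    return scale * math.exp(bfac * s / 2.0) * f_squared


# Arbitrary illustrative values, not taken from any real data set.
unchanged = rescale_f_squared(100.0, scale=1.0, bfac=0.0, s=0.25)
rescaled = rescale_f_squared(100.0, scale=2.0, bfac=8.0, s=0.25)
print(unchanged)
print(rescaled)
```

With a scale of 1 and a B-factor of 0 the data are returned unchanged. A positive B-factor increases the intensity by a factor that grows with s.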
The script 'multilog' can be used to roughly separate those portions of the output dealing with different crystal forms. Type:
> multilog name-of-your-dmmulti-logfile
The XTAL keyword for separating keywords for different forms is new.
The format of the AVER keyword is consistent with dm version 1.8 and later.
There are now multiple input and output reflection files and solvent masks for the various forms.
Only the last NCYC or EVERY keyword in the command file will have any effect, since cycles must be synchronised across the different forms. Only the last REF or STEP keyword in any crystal form will have an effect and will apply for all matrices in that form.
Refinement of averaging operators only works when the first operator given for each domain is the identity. This restriction does not apply when averaging without refining the operators.
Check the averaging correlation on the first cycle: this is a strong indication of whether the mask and matrices have been correctly determined.
Averaging operators must be FROM the masked region TO the copy in the unit cell. All averaging operators are defined in orthogonal coordinates using the conventional CCP4/Uppsala axis conventions.
Kevin D. Cowtan, Department of Chemistry, University of York
email: cowtan@ysbl.york.ac.uk
dmmulti \
hklin gmto.mtz \
hklout gmtodm.mtz \
histlib dm/hist.lib \
<< 'my-data'
SOLC 0.35
MODE SOLV HIST
NCYCLE 10
LABIN FP=FP SIGFP=SIGFP PHIO=PHIB FOMO=FOM
LABOUT PHIDM=PHI1 FOMDM=W1
'my-data'
dmmulti \
hklin gmto.mtz \
hklout gmtodm.mtz \
histlib dm/hist.lib \
<< 'my-data'
SOLC 0.35
MODE SOLV HIST
NCYCLE 10 FREE 2
SCHEME RES FROM 3.0
LABIN FP=FP SIGFP=SIGFP PHIO=PHIB FOMO=FOM FREE=FreeR_flag
LABOUT PHIDM=PHI1 FOMDM=W1
'my-data'
dmmulti \
hklin hpattj.mtz \
hklout dm1.mtz \
mskin1 cwnads.mask \
mskin2 cwglobs.mask \
histlib /usr/people/schuller/dm/hist.lib \
<< 'EOF-dm'
SOLC 0.57
MODE SOLV HIST AVER
NCYCLE 40
AVERAGE DOMAIN 1
OMAT
  1.0 0.0 0.0
  0.0 1.0 0.0
  0.0 0.0 1.0
  0.0 0.0 0.0
AVERAGE DOMAIN 1
OMAT
  -0.71389002 -0.69492584 0.08611962
  -0.69635397 0.69129372 -0.19136506
  0.07357326 -0.19652288 -0.97735721
  115.37364197 54.98566055 67.00005341
AVERAGE DOMAIN 2 REFINE
OMAT
  1.0 0.0 0.0
  0.0 1.0 0.0
  0.0 0.0 1.0
  0.0 0.0 0.0
AVERAGE DOMAIN 2 REFINE
OMAT
  0.75830859 0.65183645 0.00883542
  0.65189570 -0.75824565 -0.00975925
  0.00033828 0.01316060 -0.99991322
  17.30371666 -47.10081482 68.99727631
LABIN FP=FP SIGFP=SIGFP PHIO=PHIml FOMO=FOMml -
  HLA=HLA HLB=HLB HLC=HLC HLD=HLD
LABOUT PHIDM=PHIDM FOMDM=FOMDM
'EOF-dm'
dmmulti \
hklin1 hkl/gmtomir.mtz hklin2 hkl/gmtmmir.mtz \
hklout1 dmgmto.mtz hklout2 dmgmtm.mtz \
mskin1 gmto.msk <<+
NCYC 10
XTAL 1
SOLC 0.35
MODE SOLV HIST AVER
AVER REFI
ROTA POLAR 0 0 0
TRAN 0 0 0
LABIN FP=FP SIGFP=SIGFP PHIO=PHIB FOMO=FOM
XTAL 2
SOLC 0.41
MODE SOLV HIST AVER
AVER REFI
ROTA MATR 0.74198 0.34530 0.57466 0.52980 0.22324 -0.81821 -0.41082 0.91155 -0.01730
TRAN -27.92476 -10.49614 -11.78758
LABIN FP=FP SIGFP=SIGFP
END
+
dmmulti \
hklin1 ins6a.mtz hklout1 dmins1.mtz \
hklin2 ins_hagfish_tetr_T_dim.mtz hklout2 dmins2.mtz \
hklin3 ins_mi3_crosslinked_fred_p321.mtz hklout3 dmins3.mtz \
mskin1 insab.msk \
<< +
NCYC 500
XTAL 1
RESO 1000 2.0
SCHEME RES FROM 6.0
MODE SOLV HIST AVER
SOLC 0.30
AVER
ROTATION MATRIX: 1 0 0 0 1 0 0 0 1
TRANSLATION 0 0 0
AVER
ROTATION MATRIX -0.87108 -0.49050 0.02492 -
  -0.49025 0.87144 0.01588 -
  -0.02951 0.00162 -0.99956
TRANSLATION -0.18740 0.11924 -0.66475
LABI FP=FP SIGFP=SDFP PHIO=AISOB FOMO=FOM
XTAL 2
RESO 1000 2.0
SCHEME RES FROM 6.0
MODE SOLV HIST AVER
SOLC 0.50
AVER
ROTATION MATRIX 0.46802 0.82899 0.30616 -
  -0.81508 0.53880 -0.21293 -
  -0.34148 -0.14989 0.92786
TRANSLATION 3.90866 3.11148 1.14348
LABI FP=FP SIGFP=SIGFP
XTAL 3
RESO 1000 2.0
SCHEME RES FROM 6.0
MODE SOLV HIST AVER
SOLC 0.40
AVER
ROTATION MATRIX 0.71822 -0.69491 -0.03563 -
  -0.69556 -0.71840 -0.00954 -
  -0.01897 0.03164 -0.99932
TRANSLATION 0.24079 45.93060 9.55959
AVER
ROTATION MATRIX -0.26303 -0.96446 0.02510 -
  0.96442 -0.26356 -0.02065 -
  0.02653 0.01877 0.99947
TRANSLATION 0.60388 45.35286 10.53205
AVER
ROTATION MATRIX 0.68837 0.72534 0.00538 -
  -0.72535 0.68836 0.00420 -
  -0.00066 -0.00680 0.99998
TRANSLATION -0.45315 0.17668 0.38123
LABI FP=FMI3 SIGFP=SMI3
+