FFFEAR (CCP4: Supported Program)

NAME

fffear - Fast Fourier Feature Recognition for density fitting, release 1.9, 19/10/00

SYNOPSIS

fffear HKLIN foo.mtz [XYZIN foo.pdb] [MAPIN foo.map] [MAXIN foo.max] XYZOUT bar.pdb [ SOLIN foo.msk ]
[Keyworded input]

DESCRIPTION

`fffear' is a package which searches for molecular fragments in poor quality electron density maps. It was inspired by the Uppsala `ESSENS' software (Kleywegt+Jones, 1997), but achieves greater speed and sensitivity through the use of Fast Fourier transforms, maximum likelihood, and a mixed bag of mathematical and computational approaches (Cowtan, 1998). Currently, the main application is the detection of helices in poor electron density maps (5.0A or better), and the detection of beta strands in intermediate electron density maps (4.0A or better). It is also possible to use electron density as a search model, allowing the location of NCS elements. Approximate matches may be refined, and translation searches may be performed using a single orientation. The results are scored using an agreement function based on the mean squared difference between model and map over a masked region.

The program takes as input an mtz file containing the Fourier coefficients of the map to be searched, and a search model in the form of a pdb file, map, or maximum likelihood target. A `fragment mask' is generated to cover the fragment density, and orientations and translations are searched to find those transformations which give a good fit between the fragment density and map density within the fragment mask.
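
As an illustration of the agreement function only, the following Python sketch scores a fragment against a periodic one-dimensional toy "map" by the mean squared difference inside a mask, and reports the best translation. The function names, the toy data and the brute-force loop are assumptions made for this example; fffear itself evaluates the search with FFTs rather than an explicit loop, and also searches orientations.

```python
def masked_msd(map_density, fragment, mask, offset):
    """Mean squared difference between fragment and map over the masked points."""
    n = len(map_density)
    total = 0.0
    count = 0
    for i, (rho_f, m) in enumerate(zip(fragment, mask)):
        if m:  # only points inside the fragment mask contribute
            diff = map_density[(i + offset) % n] - rho_f
            total += diff * diff
            count += 1
    return total / count


def best_translation(map_density, fragment, mask):
    """Return (offset, score) with the lowest masked mean squared difference."""
    scores = [(masked_msd(map_density, fragment, mask, t), t)
              for t in range(len(map_density))]
    score, offset = min(scores)
    return offset, score


# Toy data: the fragment density reappears in the map at offset 3.
toy_map = [0.1, 0.0, 0.2, 1.0, 2.0, 1.0, 0.1, 0.0]
toy_fragment = [1.0, 2.0, 1.0, 0.0]
toy_mask = [1, 1, 1, 0]  # last point lies outside the mask and is ignored
print(best_translation(toy_map, toy_fragment, toy_mask))
```

As in the program output, the lowest score marks the best fit.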

The program has been highly optimised using reciprocal-space rotations, grid-doubling FFTs and crystallographic symmetry (Rossmann+Arnold, 1993), giving a 4-50 times speed improvement over the results published in 1998. The speed of the calculation is almost independent of the size of the model, so the program may also be used for molecular replacement calculations where weak phases are available.

A maximum likelihood search function is under consideration for future versions.

`fffear' Recipes

For a map with 3.5A phases try something like this:
fffear hklin ~/hkl/gmto-unique.mtz xyzin alpha-helix-10.pdb xyzout alpha10-rot.pdb << eof

SOLC 0.35
SEARCH STEP 10
RESO 1000.0 3.5
CENTRE ORTH   7.464  16.169  16.893
LABI FP=FP SIGFP=SIGFP PHIO=PHIB FOMO=FOM
END
eof
At 5.0A some of the fragment density is no longer localised. This can cause a mismatch between the fragment and protein density. One solution is to use the 'FILTER MAP' keyword to match the map and fragment densities. A better option is to use the 5.0A maximum likelihood search target. The search target is provided on MAXIN; a model is also provided on XYZIN, for visualisation purposes only:
fffear hklin ~/hkl/gmto-unique.mtz maxin ml-helix-9-5.0A.max xyzin ml-helix-9.pdb xyzout alpha10-rot.pdb << eof

SOLC 0.35
SEARCH STEP 15
RESO 1000.0 5.0
FILTER MAP RADIUS 6.0
CENTRE ORTH   7.464  16.169  16.893
LABI FP=FP SIGFP=SIGFP PHIO=PHIB FOMO=FOM
END
eof
When the search model is large (i.e. molecular replacement calculations or density fragment searches to find NCS or cross-crystal operators), the search can be foiled by long-range variations in either the map or fragment density. In this case filtering should be applied to both the map and the search model:
fffear hklin ~/hkl/rnase-unique.mtz mapin rnase-mol.map << eof

SOLC 0.35
SEARCH STEP 15
MASK RADI 2.5
RESO 1000.0 5.0
FILTER MAP MODEL RADIUS 6.0
CENTRE ORTH   7.464  16.169  16.893
LABI FP=FP SIGFP=SIGFP PHIO=PHIB FOMO=FOM
END
eof
In the case of molecular replacement calculations and NCS searches it is important that the search model and map are scaled correctly:
  1. To perform molecular replacement with a sharpened model (like the supplied helix fragments), no SCALE keyword is required. Ensure all the B-factors in the input pdb are set to 0.00.
  2. To perform molecular replacement with a natural-B model, specify SCALE NATURAL.
  3. To perform an NCS search with density cut from the same map, both maps will be on the same scale already, so you must specify SCALE 1.0 0.0.
  4. To refine the result of an MR or NCS search, use the MODEL ROTATE keyword with the approximate orientation, and change the SEARCH STEP and RANGE.
FILTER MAP is useful in the first three cases to match the mean of the map and the fragment.
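
The four situations listed above can be summarised as a small lookup that returns the extra keyword lines for each one. This is only a sketch: the case names and the function are hypothetical, and the refinement values (SEARCH STEP 2 RANGE 10) are the ones suggested in the SEARCH keyword description.

```python
def scaling_keywords(case, rotation=None):
    """Return the keyword lines suggested for one of the four situations.

    The case names are hypothetical labels invented for this sketch.
    """
    if case == "sharpened_model":
        return []  # no SCALE keyword; B-factors in the input pdb set to 0.00
    if case == "natural_b_model":
        return ["SCALE NATURAL"]
    if case == "ncs_same_map":
        return ["SCALE 1.0 0.0"]
    if case == "refine":
        alpha, beta, gamma = rotation  # approximate orientation from the earlier search
        return [
            "MODEL ROTATE %.1f %.1f %.1f" % (alpha, beta, gamma),
            "SEARCH STEP 2 RANGE 10",
        ]
    raise ValueError("unknown case: %r" % case)


print("\n".join(scaling_keywords("refine", (10.0, 20.0, 30.0))))
```

The returned lines would be added to the keyworded input alongside SOLC, LABIN and the other cards shown in the recipes.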

INPUT/OUTPUT FILES

The structure factors and estimated phases must be provided in HKLIN. The search model must be specified by a coordinate model in XYZIN, a map fragment in MAPIN, or a maximum likelihood target in MAXIN.

HKLIN

Input mtz file - This should contain the conventional (CCP4) asymmetric unit of data (see CAD).

The mtz file should contain all reflections to the limit of the measured diffraction pattern, since all the reflections are used to accurately scale the data. However, only those reflections with phases, to the resolution limit specified by the RESOLUTION keyword, will actually be used in the search procedure.

XYZIN

Input pdb file. This may contain an arbitrary crystal header, or none at all. The only restriction is that the atomic coordinates are given in Angstroms on arbitrary orthogonal axes. The B-factors of the input atoms should be set to an average value of (around) zero. Normally, all B-factors should be equal, unless some prior information about the B-factors of atoms in the desired fragment is available. (It is legitimate, for example, to make the B-factors of the C-beta or oxygen atoms higher.)

MAPIN

Input map file. This may be specified as an alternative to XYZIN to perform a search for NCS or cross-crystal operators. The map should contain the search density, placed in a cubic cell. Regions of the map outside the search mask should be set to zero. The search map will usually be generated using maprot in map cutting mode, e.g.

maprot wrkin rnase-mir.map mskin rnase-mol.msk cutout rnase-mol.map << eof
MODE TO
CELL UNIT 100 100 100 90 90 90
GRID UNIT 150 150 150
AVER 1
ROTA POLAR 0 0 0
TRAN 0 0 0
eof
If the input map is calculated for the same structure factors which are given to fffear, the scaling can be overridden using SCALE 1.0 0.0.

An input coordinate file may also be provided on XYZIN. This will not be used for the search, but will be rotated and output for visualisation purposes.

MAXIN

Input maximum likelihood search target. This is used in the same way as an input map, however it also contains density variance information. (Special software is used for the construction of ML targets for fffear). ML targets are resolution dependent, so the appropriate target should be used in conjunction with the RESOLUTION keyword. XYZIN may again be used for visualisation purposes.

XYZOUT

Output pdb file. This contains multiple copies of the input fragment, rotated and translated to the positions of the best matches between the fragment density and map density. The fragments are sorted in order of quality, with the best first. The B-factor is set to the value of the search function, with low values representing a good fit.

Good matches to major secondary structure features are usually obvious because several fragments link up or overlap in a sensible manner. At better than 4.0A resolution, the direction of the chain is commonly correct as well.

The output pdb file may be further analysed with `ffjoin'.
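
Because the B-factor field of XYZOUT carries the search-function value, the scores can be read back with a few lines of Python. The sketch below assumes the output follows the standard fixed-column PDB layout (B-factor in columns 61-66); the function names are invented for this example and are not part of fffear or ffjoin.

```python
def atom_scores(pdb_lines):
    """Yield (serial, atom_name, score) for each ATOM/HETATM record.

    The score is read from the B-factor field, columns 61-66 of the
    standard fixed-column PDB format (an assumption about XYZOUT).
    """
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")):
            serial = int(line[6:11])
            name = line[12:16].strip()
            score = float(line[60:66])
            yield serial, name, score


def best_score(pdb_lines):
    """Return the lowest (best) search-function value found in the file."""
    return min(score for _, _, score in atom_scores(pdb_lines))
```

For example, with the file name from the synopsis, `best_score(open("bar.pdb"))` would return the score of the best match.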

MAPOUT

A map of the best fragment fit at each position in the map. Values closest to zero represent the best fit.

SOLIN

Input mask - this is used as a filter for the results. Any rotation/translation solutions whose centre-of-mass falls in the solvent (zero) region of the mask will be excluded from the output. If no mask is given, the whole cell is allowed.

Generally there is no point providing a solvent mask, since the solvent density does not usually match atomic features. However, a mask may be useful when fitting a molecular replacement map from a very incomplete model, to exclude hits to the MR model.

KEYWORDS

Input is keyworded. Available keywords are: SOLC, LABIN, RESOLUTION, MODEL, MASK, FILTER, SEARCH, SCALE, CENTRE, GRID, FORM, TRUNCATE, STRUCFAC.

BASIC KEYWORDS

(SOLC and LABIN are compulsory. RESOLUTION is strongly recommended.)

SOLC <solc> [MEAN <solvval> <protval>]

<solc>
solvent content for scaling. Always input the correct solvent content here to ensure correct scaling. 0.0=all protein, 1.0=all solvent.
MEAN <solvval> <protval>
used to set mean density for solvent and protein regions. This affects scaling and density modification.
<solvval> = mean density in solvent region.
<protval> = mean density in protein region.
(defaults 0.32, 0.43 electrons per cubic angstrom)

LABIN FP=.. [SIGFP=..] PHIO=.. [FOMO=..]

Enough columns must be provided to allow calculation of a map. Common combinations include (calculated_magnitude + phase), (observed_magnitude + phase + weight), (weighted_magnitude + phase).

FP
= F magnitude
SIGFP
= standard deviation, 0 for unmeasured
PHIO
= best initial phase estimate
FOMO
= weight attached to PHIO

RESOLUTION <rmin> <rmax>

Resolution range of reflections to include in the translation search stage of the calculation. This should be set to cover the resolution range for which significant phase information is available. Good results are obtained with phases to 4.0A or better; for larger fragments (10 residues or more) information may be obtained at still lower resolutions.
(default is the whole range of the input mtz file)

MODEL [RADIUS <mdlrad>] [RESOLUTION <mrmax>] [BFACTOR <bfac>] [ROTATE <alpha> <beta> <gamma>]

Set the parameters for the model atoms.

<mdlrad>
is the radius over which the atomic density is calculated (as VDWR in sfall).
<mrmax>
is the resolution at which the atomic shape functions are calculated. At higher resolutions this should make little difference, however at lower resolutions a significant amount of the atomic density leaks out of the fragment mask, and so better results are obtained if the atomic shape function is corrected. If <mrmax> is set to zero, resolution truncation is disabled and the true shape function is used.
<bfac>
Temperature factor to add to all the atoms in the input fragment. By default the search is conducted in a map sharpened to Boverall=0. If a different type of map is used through use of the SCALE card, the B-factors of the model atoms should be adjusted accordingly.
<alpha> <beta> <gamma>
An initial rotation to apply to the model before starting the search. This rotation is included in the output rotation angles. Useful for doing a translation search at fixed orientation, or for refining an MR solution.
Defaults: <mdlrad>=2.5A, <mrmax>=<rmax>, <bfac>=0, <alpha>=<beta>=<gamma>=0.

MASK [RADIUS <mskrad>]

Set the radius of the fragment mask about the model atoms. This determines the volume over which the map and the model are compared. Default: <mskrad>=2.5A.

FILTER [MAP] [MODEL] [VARIANCE] [RADIUS <fltrad>]

Apply a filter to the map and/or model before starting the search. The filter may match either the local mean (default) within the filter radius, or it may match both the local mean and variance. This is useful at low resolutions or when performing MR or NCS searches.

MAP
apply filter to map.
MODEL
apply filter to search model.
VARIANCE
match both local mean and variance.
<fltrad>
radius of sphere for local mean/variance calculation.
Defaults: do not filter either map or model, do not match variance, <fltrad>=5.0A

SEARCH [STEP <step>] [RANGE <range>] [PEAKS <npeaks>] [GRID <sampling>]

Set the parameters for the search function.

<step>
is the orientation step angle in degrees. At 4.0A resolution or better, a search step of 10 degrees is sufficient for a 10 residue fragment. The search step should be smaller at higher resolutions or for larger models. A quick and dirty result may often be obtained with a much coarser search step, as large as 30 degrees.
<range>
Set the maximum range in degrees for the search. To perform a translation search on a single orientation, set <range> to 0.1. To refine the result of a previous search, use SEARCH STEP 2 RANGE 10 and give the approximate orientation with MODEL ROTATE.
<npeaks>
is the number of peaks in the resulting search function for which rotated fragment atoms will be output. Defaults: <step>=10 degrees, <npeaks>=100.
<sampling>
is the factor by which the search grid is oversampled. Higher values give finer search grids, and potentially better results at the cost of time. Default: <sampling>=1.333.

SCALE [NATURAL] [<scale> <bfac>]

Override internal scaling and scale input data by:
    F^2 = <scale> exp(<bfac> s/2) Fobs^2
Scaling is critical to correctly fitting the density with a model. The data scale will be accurately determined automatically if structure factor magnitudes are provided to better than 4.5A, otherwise it is a good idea to provide a SCALE card.

NATURAL
place the map on an absolute scale but do not adjust the B-factor.
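
The effect of an explicit <scale> and <bfac> can be checked numerically with the short Python sketch below, which applies the scaling expression given above to a single squared amplitude. The text does not define s, so the sketch assumes s = 1/d^2 with d the resolution in Angstroms; the function name is hypothetical.

```python
import math


def scale_f_squared(f_obs_sq, d_spacing, scale, bfac):
    """Apply F^2 = scale * exp(bfac * s / 2) * Fobs^2.

    Assumption: s = 1 / d^2 with d in Angstroms. The documentation does
    not define s, so check this convention before relying on the numbers.
    """
    s = 1.0 / (d_spacing * d_spacing)
    return scale * math.exp(bfac * s / 2.0) * f_obs_sq


# SCALE 1.0 0.0 leaves the data unchanged at any resolution.
print(scale_f_squared(250.0, 4.0, 1.0, 0.0))
```

This is why SCALE 1.0 0.0 is the appropriate choice when the search density was cut from the same map.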

CENTRE FRAC/ORTH <x> <y> <z>

Centre the output fragment positions in an asymmetric unit around <x> <y> <z>, given in fractional or orthogonal coordinates in accordance with the preceding keyword. Useful to put your matches in the same region as any model you are working on.

OTHER KEYWORDS

Don't use these unless you really know what you are doing. In which case you'd better have a better idea of what you are doing than I do.

GRID <nx> <ny> <nz>

Set the grid for the calculation. Ideally the grid spacing should be 1/5 of the resolution of the phases, so for 4.0A phases the grid spacing should be 0.8A. Spacings greater than 1/4 of the resolution will cause an error. Grid sampling must be a multiple of 4 and obey any other requirements imposed by the spacegroup.
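
The grid rules can be turned into a quick calculation for one cell edge. The Python sketch below picks the smallest multiple of 4 that gives a spacing of at most 1/5 of the resolution, and checks a candidate grid against the 1/4 limit. The function names are hypothetical, and spacegroup-specific requirements are not handled here.

```python
import math


def suggest_grid(cell_length, resolution, multiple=4):
    """Smallest grid count with spacing <= resolution/5, rounded up to a multiple of 4."""
    minimum = math.ceil(cell_length * 5.0 / resolution)
    return ((minimum + multiple - 1) // multiple) * multiple


def spacing_allowed(cell_length, n_points, resolution):
    """True if the grid spacing does not exceed 1/4 of the resolution."""
    return cell_length / n_points <= resolution / 4.0


# A 100 A cell edge with 4.0 A phases: a 0.8 A spacing needs 125 points,
# which rounds up to the next multiple of 4.
print(suggest_grid(100.0, 4.0))          # 128
print(spacing_allowed(100.0, 96, 4.0))   # False: spacing is above 1.0 A
```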

FORM <z> <a1> <b1> <a2> <b2>

Alternate 2-gaussian formfactor coefficients for atomic number <z>. f=<a1>exp(<b1>s)+<a2>exp(<b2>s). Formfactors are supplied for H, N, C, O and S; other atom types are scaled from these. Given that the model B-factors will generally be wrong, a crude approximation is sufficient for all common cases.
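
To see what a set of FORM coefficients implies, the two-gaussian expression can be evaluated directly. The Python sketch below follows the formula exactly as written above; the coefficients in the example are invented purely to exercise the function and are not the values built into the program.

```python
import math


def two_gaussian_form_factor(s, a1, b1, a2, b2):
    """Evaluate f = a1*exp(b1*s) + a2*exp(b2*s) as written for the FORM keyword."""
    return a1 * math.exp(b1 * s) + a2 * math.exp(b2 * s)


# Hypothetical coefficients, chosen only to exercise the function.
print(two_gaussian_form_factor(0.0, 4.0, -10.0, 2.0, -40.0))  # 6.0 at s = 0
```

At s = 0 the result is simply <a1> + <a2>, which gives a quick sanity check on any coefficients supplied.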

TRUNCATE <rmin> <rmax>

Resolution range of reflections to include in the data scaling stage. This keyword can be used to exclude part of the input data by resolution cutoffs. This is generally highly inadvisable.
(default is the whole range of the input mtz file)

STRUCFAC [REAL <rscale>] [RECIP <hscale>]

Use a (slow) direct Fourier transform to calculate the model and mask structure factors instead of the default FFT. The REAL and RECIP keywords may then be used to set the spacing of the real space grid used to calculate the fragment density and mask, and the reciprocal space sampling of the fragment and mask transforms.

<rscale>
The spacing in Angstroms of the grid on which the fragment model density and mask are calculated. Default 0.5 A.
<hscale>
The spacing in reciprocal Angstroms of the grid on which the fragment and mask transforms (structure factors) are calculated. This grid needs to be quite fine to ensure accurate interpolation when the fragment transform is rotated. The default value is sufficient for a model whose longest axis is 15A; for larger models decrease this number proportionally. Default 0.01 A-1.
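
The proportional rule for <hscale> amounts to a one-line calculation, sketched here in Python. The function name is hypothetical, and the rule is applied literally: the default of 0.01 is kept up to a longest axis of 15A and reduced in proportion beyond that.

```python
def suggest_hscale(longest_axis, default=0.01, reference_length=15.0):
    """Scale the RECIP sampling down in proportion to the model's longest axis."""
    if longest_axis <= reference_length:
        return default  # the default already suffices for models up to 15 A
    return default * reference_length / longest_axis


print(suggest_hscale(30.0))  # half the default for a model twice as long
```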

OUTPUT

The output PDB file (XYZOUT) contains up to 1000 copies of the input molecule in decreasing order of fit to the density. For the purposes of visualisation I find it useful to get the header and the first 250 C-alpha atoms from this file, as follows:
grep 'C[AR]' XYZOUT | head -250 > ca.pdb
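
A stricter variant of the same selection can be written in Python, matching on the atom-name columns rather than on any occurrence of the letters. This is a sketch which assumes the header of interest is the CRYST1 record and that XYZOUT uses the standard fixed-column PDB layout; the function name is hypothetical. Unlike the grep pipeline, the limit counts C-alpha atoms only, not header lines.

```python
def header_and_first_ca(pdb_lines, max_atoms=250):
    """Keep CRYST1 header lines and the first max_atoms C-alpha ATOM records."""
    kept = []
    n_ca = 0
    for line in pdb_lines:
        if line.startswith("CRYST1"):
            kept.append(line)
        elif line.startswith("ATOM") and line[12:16].strip() == "CA":
            if n_ca < max_atoms:
                kept.append(line)
                n_ca += 1
    return kept
```

The returned lines can be written to a file such as ca.pdb for display.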

The translation function map (which omits the orientation information) is also output on MAPOUT. This has peaks where the origins of the good orientations are found. If the input model has an alpha carbon at the origin, a rough backbone trace of map regions matching the fragment may be obtained.

AUTHOR

Kevin D. Cowtan, Department of Chemistry, University of York
email: cowtan@ysbl.york.ac.uk

REFERENCES

  1. Cowtan K. (1998) Acta Cryst. D54, 750-756. Modified phased translation functions and their application to molecular fragment location.
  2. Kleywegt G. J., Jones T. A. (1997) Acta Cryst. D53, 179-185. Template convolution to enhance or detect structural features in macromolecular electron-density maps.
  3. Rossmann M. G., Arnold E. (1993) International Tables for Crystallography Volume C, Section 2.3: Patterson and molecular replacement techniques (Kluwer Academic Publishers).

SEE ALSO

fffear fragment library, ffjoin, maprot, xloggraph

EXAMPLES

fffear