fhscal hklin foo_in.mtz hklout foo_out.mtz
[Keyworded input]
Derivative to native scale factors are calculated in equi-volume shells in reciprocal space using Kraut's formula (ref 1), generalised to use both centric and acentric data, and applied to the derivative data. This formula takes account of the degree of heavy-atom substitution, but does not require the presence of anomalous differences.
The program also computes a scale factor to put the isomorphous difference Patterson on the correct scale for the vector-space refinement program VECREF.
It is also possible to apply the scales to all "scaleable" columns in a dataset (i.e. to F+/- and to the structure intensities; see the LABIN keyword), and this is advisable to avoid mixtures of scaled and unscaled data for a single derivative. For input MTZ files with dataset information, FHSCAL will attempt to check for, and warn you about, any datasets that would be output with such a mixture. In these cases, specifying the AUTO keyword will cause the appropriate scale factor to be applied automatically to all such columns.
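As an illustration of the equi-volume shells mentioned above, the following Python sketch assigns a reflection to a shell. It is not FHSCAL's own code: the function name, the use of s = 1/d as the resolution variable and the example limits are all assumptions made for the example. The volume of a sphere in reciprocal space grows as s^3, so equal steps in s^3 give shells of equal volume.

```python
def shell_index(s, s_min, s_max, n_shells):
    """Return the 0-based equi-volume shell for a reflection with s = 1/d.

    Hypothetical helper: s_min and s_max are the resolution limits
    expressed as 1/d, and n_shells is the number of shells requested.
    """
    # Equal steps in s**3 correspond to equal reciprocal-space volumes.
    frac = (s**3 - s_min**3) / (s_max**3 - s_min**3)
    # Clamp so that a reflection exactly at s_max falls in the last shell.
    return min(int(frac * n_shells), n_shells - 1)


# Assumed example limits: 20 A to 2.5 A, split into 10 shells.
s_min, s_max = 1 / 20.0, 1 / 2.5
edges = [
    (s_min**3 + i * (s_max**3 - s_min**3) / 10) ** (1 / 3)
    for i in range(11)
]
```

The shell boundaries in `edges` crowd together towards high resolution, which is why the outer shells cover a narrower range of d-spacing than the inner ones.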
Free format using keywords. The following keywords may be used; only the leading 4 characters are significant and the order is immaterial:
AUTO, BIAS, CENT, END, LABIN, LIST, RESO, SHELLS, TITLE
The LABIN keyword is always required; the others are optional and assume default values if omitted. Use of BIAS 1 is recommended, provided the standard deviations produced by the data processing program (e.g. SCALA) are reliable. If in doubt, omit BIAS.
It is only necessary to specify FPHn for each dataset on the LABIN line (except in special cases, see below). Other labels can also be specified if desired. The program will then try to identify all "scaleable" columns in the dataset, automatically read them in and then apply the appropriate scale factor determined from FPHn.
This option is intended to prevent a mixture of scaled and unscaled columns within a dataset, e.g. FPHn is scaled but not FPHn(+) and FPHn(-). There are a couple of caveats:
Standard MTZ reflection files are used for input (HKLIN) and output (HKLOUT). The following column labels are used :
H, K, L    Standard meaning.
FP, SIGFP    Native amplitude and sigma.
If only 1 derivative is being scaled:
FPH, SIGFPH    Derivative amplitude and sigma.
DPH, SIGDPH    Derivative anomalous difference and sigma (optional).
FPH(+), SIGFPH(+), FPH(-), SIGFPH(-)    Derivative amplitudes and sigmas for Friedel pair (optional).
If more than 1 derivative is being scaled (up to 20 per run), the column labels are FPH1, SIGFPH1, [ DPH1, SIGDPH1, FPH1(+), SIGFPH1(+), FPH1(-), SIGFPH1(-), ] FPH2, SIGFPH2, [ DPH2 ... ] etc.
Scales are applied to FPH, SIGFPH and DPH, SIGDPH, FPH(+), SIGFPH(+), FPH(-), SIGFPH(-) if present. All other columns, including those for which no label assignments are given, are output unchanged.
WARNING : Reflections for which there is a derivative measurement but no native measurement, and which have a greater value of S than any reflection for which both are measured, will be rejected (because no valid scale can be applied). The rejected reflections must be re-incorporated later when higher resolution native data become available.
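The rejection rule in the warning above can be sketched in Python as follows. This is only an illustration of the rule as stated; the function name and the representation of the data as parallel lists are assumptions, and S is taken to be any quantity that increases with resolution.

```python
def rejected(s, has_native, has_deriv):
    """Flag derivative-only reflections beyond the highest-S common reflection.

    Hypothetical helper: s, has_native and has_deriv are parallel lists,
    one entry per reflection. Raises ValueError if no reflection has both
    a native and a derivative measurement.
    """
    # Highest S among reflections where both measurements are present.
    s_common_max = max(
        si for si, n, d in zip(s, has_native, has_deriv) if n and d
    )
    # Reject only derivative-only reflections lying beyond that limit.
    return [
        d and not n and si > s_common_max
        for si, n, d in zip(s, has_native, has_deriv)
    ]
```

A derivative-only reflection inside the common resolution range is kept, because a scale can still be obtained for it from the surrounding shells.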
In order to avoid losing reflections in the scaling procedure, it is worth considering using the dataset with the highest resolution limit as the reference (i.e. 'native') dataset in FHSCAL.
After echoing the input data, a table with the following columns is produced for each derivative:
Overall scale and temperature factors are determined from a Wilson plot and printed with their estimated standard deviations; however the scale factors actually applied to the derivative data are obtained by interpolating the shell scale factors.
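The interpolation scheme is not described here, so the following Python sketch simply assumes linear interpolation between shell centres, with the end values held constant outside the first and last centres. The function name and the choice of abscissa are hypothetical; FHSCAL's own smoothing and interpolation may differ.

```python
def interpolate_scale(x, shell_centres, shell_scales):
    """Linearly interpolate shell scale factors at position x.

    Hypothetical helper: shell_centres must be in increasing order and in
    the same units as x (for example s**3). Values outside the range of
    the centres are clamped to the nearest end shell.
    """
    if x <= shell_centres[0]:
        return shell_scales[0]
    if x >= shell_centres[-1]:
        return shell_scales[-1]
    for i in range(1, len(shell_centres)):
        if x <= shell_centres[i]:
            x0, x1 = shell_centres[i - 1], shell_centres[i]
            k0, k1 = shell_scales[i - 1], shell_scales[i]
            t = (x - x0) / (x1 - x0)
            return k0 + t * (k1 - k0)
```

Interpolating gives each reflection its own scale, so there are no steps in the applied scale at the shell boundaries.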
At the end, the V factor (pseudo-cell volume) for the FFT program for use in computing a correctly scaled isomorphous difference Patterson is given:
In addition to the usual MTZ file opening errors:
ERROR(S) IN DATA: syntax errors were found in the general equivalent positions. Check for spurious characters, missing commas, extra commas etc.
ERROR - NO REFLECTIONS: no common reflections were found. Check column assignments, check reflection listing.
NO REFLECTIONS IN SHELL n. Try using smaller number of shells. Reflections may be missing in a resolution range.
Kraut's formula can be derived by equating the Patterson origins :
K^2 . sum FPH^2 = sum FP^2 + sum FH^2 (1)
where FPH, FP and FH are derivative, native and heavy-atom amplitudes respectively, and K is the derivative scale to be determined.
For acentric reflections :
<FH^2>a ~= 2.<(K.FPH - FP)^2> (2)
Elimination of the unknown FH from (1) and (2) gives a quadratic equation for K, the solution of which is :
K = (2.sum FP.FPH - sqrt(4.(sum FP.FPH)^2 - 3.sum FP^2 . sum FPH^2)) / sum FPH^2 (3)
Note that in the original reference the leading factor given as 1/2 should be 2. This formula is valid only for acentric reflections. However it can easily be generalised to include centrics by noting that
<FH^2> ~= <M.(K.FPH - FP)^2> (4)
where M = 1 for centric and 2 for acentric, so using (4) instead of (2) :
K = (sum M.FP.FPH - sqrt((sum M.FP.FPH)^2 - sum (M+1).FP^2 . sum (M-1).FPH^2)) / sum (M-1).FPH^2 (5)
The numerator and denominator of (5) would both be zero if all reflections in a shell were centric (M = 1 throughout); this is unlikely, but to guard against it the equivalent formula, whose denominator does not vanish in that case, can be used instead :
K = sum (M+1).FP^2 / (sum M.FP.FPH + sqrt((sum M.FP.FPH)^2 - sum (M+1).FP^2 . sum (M-1).FPH^2)) (6)
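As a worked illustration, the sums in (5) and (6) for a single shell can be accumulated in Python as below. This is a minimal sketch rather than the program's actual code: the function name and the list-based inputs are assumptions, and no sigma (BIAS) correction is applied.

```python
import math


def kraut_scale(fp, fph, centric):
    """Derivative-to-native scale K for one shell, using form (6).

    Hypothetical helper: fp and fph are native and derivative amplitudes,
    centric is a list of booleans flagging centric reflections.
    """
    a = b = c = 0.0
    for p, ph, is_centric in zip(fp, fph, centric):
        m = 1 if is_centric else 2
        a += (m - 1) * ph * ph   # sum (M-1).FPH^2
        b += m * p * ph          # sum M.FP.FPH
        c += (m + 1) * p * p     # sum (M+1).FP^2
    disc = b * b - a * c
    if disc < 0:
        raise ValueError("negative discriminant: no real scale for this shell")
    # Form (6) stays finite even if every reflection is centric (a == 0).
    return c / (b + math.sqrt(disc))
```

For example, with fp = [10, 20, 30], fph = [5, 10, 15] and the second reflection centric, the sums are 250, 1200 and 3800, the square root is 700, and both (5) and (6) give K = 2, as expected when the derivative is simply the native on half the scale.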
This formula is modified slightly to take into account the bias introduced when averaging the squares of differences, i.e. the term <M.(K.FPH - FP)^2> is replaced by <M.(K.FPH - FP)^2 - M.((K.sigma(FPH))^2 + sigma(FP)^2)>, where the sigmas have been multiplied by the BIAS factor.
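One way to see the effect of this correction is to substitute the modified term into (1) and collect powers of K again. Under that assumption the sigma terms simply reduce two of the three sums, as in the Python sketch below. The function name and argument layout are hypothetical, and the program's actual handling of the correction may differ in detail.

```python
import math


def kraut_scale_debiased(fp, sigfp, fph, sigfph, centric, bias=1.0):
    """Scale K for one shell with the sigma (BIAS) correction included.

    Hypothetical helper. Solves a.K^2 - 2.b.K + c = 0 with
      a = sum ((M-1).FPH^2 - M.(bias.sigma(FPH))^2)
      b = sum M.FP.FPH
      c = sum ((M+1).FP^2 - M.(bias.sigma(FP))^2)
    and returns the root corresponding to form (6).
    """
    a = b = c = 0.0
    for p, sp, ph, sph, is_centric in zip(fp, sigfp, fph, sigfph, centric):
        m = 1 if is_centric else 2
        a += (m - 1) * ph * ph - m * (bias * sph) ** 2
        b += m * p * ph
        c += (m + 1) * p * p - m * (bias * sp) ** 2
    disc = b * b - a * c
    if disc < 0:
        raise ValueError("negative discriminant: no real scale for this shell")
    return c / (b + math.sqrt(disc))
```

With bias set to 0 the sigma terms vanish and the result reduces to the uncorrected formula (6).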
The program reads the input control data, then makes a first pass through the reflections to get the resolution limits (can be controlled by RESO card), then a second pass to flag reflections as centric or acentric and accumulate the sums in shells for the scale factors. The scale factors are calculated and smoothed, and applied in a third pass. The program also computes a scale factor to apply to the isomorphous difference Patterson for use in the program VECREF:
Kv = ((sum (FPH-FP)^2)c + 2.(sum (FPH-FP)^2)a) / ((sum (FPH-FP)^2)c + (sum (FPH-FP)^2)a)
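The ratio Kv above, where the subscripts c and a denote sums over centric and acentric reflections respectively, can be sketched in Python as follows. The function name is hypothetical, and the sketch assumes that the FPH values passed in are already on the native scale.

```python
def vecref_scale(fp, fph, centric):
    """Patterson scale factor Kv from centric and acentric difference sums.

    Hypothetical helper: fp and fph are native and (scaled) derivative
    amplitudes, centric is a list of booleans flagging centric reflections.
    """
    sum_c = sum_a = 0.0
    for p, ph, is_centric in zip(fp, fph, centric):
        d2 = (ph - p) ** 2
        if is_centric:
            sum_c += d2
        else:
            sum_a += d2
    # Acentric differences are weighted by 2, centric differences by 1.
    return (sum_c + 2.0 * sum_a) / (sum_c + sum_a)
```

Kv therefore lies between 1 (all differences centric) and 2 (all differences acentric).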
Ian Tickle, Birkbeck College, London