PHISTATS (CCP4: Supported Program)

NAME

phistats - Analysis of agreement between phase sets, and checking it against weighting factors.

SYNOPSIS

phistats HKLIN foo_in.mtz
[Keyworded input]

DESCRIPTION

PHISTATS analyses the differences between two sets of phases. The analysis can be binned against two types of weight, for example a figure of merit and the magnitude of Fobs. Map correlation using OVERLAPMAP is probably more informative.

KEYWORDED INPUT

The various data control lines are identified by keywords. Only the first 4 characters are significant. Those available are:
END, LABIN, RANGES, RESOLUTION, SHIFT, HAND, TITLE

LABIN <program label>=<file label> ...

Input column assignments. The program labels are: FP SIGFP PHIBP WP PHIB2 W2. For details of these, see INPUT AND OUTPUT FILES.

RANGES <nbin> [ <mon> ]

Set the number of resolution bins <nbin> and the reflection monitoring interval <mon>.

<nbin> is the number of resolution bins (equal width in [sin(theta)/(lambda)]**2) in which to divide partial structure data for normalization and sigmaA estimation. It is IMPORTANT that resolution ranges contain sufficient reflections. It is best to use as large a value of <nbin> as possible, as long as the estimates of sigmaA vary smoothly with resolution. If they do not, <nbin> should be reduced until sigmaA does vary smoothly. A good first guess is the number of reflections divided by 1000. If sigmaA refinement converges to zero in one or more of the ranges (which happens sometimes when the correct value is low), this can usually be circumvented by decreasing <nbin>.

Information about every <mon>-th reflection will be written to the log file.

Defaults: 20 1000; maximum <nbin> allowed: 50.
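
As an illustration of binning in equal steps of [sin(theta)/(lambda)]**2, the following Python sketch assigns a reflection to a bin from its d-spacing. It is not taken from the PHISTATS source: the function name resolution_bin, the zero-based bin index and the clamping of the highest-resolution reflection into the last bin are assumptions made for the example. It relies only on Bragg's law, [sin(theta)/(lambda)]**2 = 1/(4 d**2).

```python
def resolution_bin(d, d_low, d_high, nbin):
    """Zero-based bin index for a d-spacing in Angstroms (hypothetical helper)."""
    # Bragg's law: (sin(theta)/lambda)**2 = 1 / (4 d**2)
    s2 = 1.0 / (4.0 * d * d)
    s2_min = 1.0 / (4.0 * d_low * d_low)    # low-resolution edge
    s2_max = 1.0 / (4.0 * d_high * d_high)  # high-resolution edge
    frac = (s2 - s2_min) / (s2_max - s2_min)
    # Assumption: a reflection exactly at the high-resolution limit
    # is placed in the last bin rather than in a bin of its own.
    return min(int(frac * nbin), nbin - 1)


# Limits and bin count as used in the examples on this page (40 A to 2 A, 10 bins).
for d in (40.0, 4.0, 2.5, 2.0):
    print(d, resolution_bin(d, 40.0, 2.0, 10))
```

Because the bins are equal in [sin(theta)/(lambda)]**2 rather than in d, most of them cover the high-resolution end of the data.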

RESOLUTION [ <rmin> ] <rmax>

Low and high resolution limits, in either order. If only one value is given, it is taken as the upper (high-resolution) limit. Values are in Angstroms or, if both are <1.0, in units of 4(sin theta/lambda)**2. By default, all the data in the file are used.
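
The unit rule can be sketched in Python as follows. This is only an illustration of the rule as stated here, not the program's own parser: the function name limits_in_angstroms is hypothetical, and treating a value of zero as infinite d-spacing is an assumption. The conversion uses the identity 4(sin theta/lambda)**2 = 1/d**2.

```python
import math


def limits_in_angstroms(a, b):
    """Return (low_res, high_res) in Angstroms for two RESOLUTION values."""
    if a < 1.0 and b < 1.0:
        # Both values below 1.0: interpret them as 4(sin theta/lambda)**2,
        # which equals 1/d**2, so d = 1/sqrt(value).
        # Assumption: a value of 0 means infinite d-spacing.
        a, b = (math.inf if v == 0.0 else 1.0 / math.sqrt(v) for v in (a, b))
    # The limits may be given in either order.
    return max(a, b), min(a, b)


print(limits_in_angstroms(2.0, 40.0))
print(limits_in_angstroms(0.000625, 0.25))
```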

SHIFT <X_fracshift> <Y_fracshift> <Z_fracshift>

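The second phase set (PHIB2, written PHI2 here) is adjusted for a fractional origin shift. This is especially useful when the two phase sets refer to different crystal origins:

  PHI2_used = PHI2_input + 2PI(h X_fracshift + k Y_fracshift + l Z_fracshift)

Here 2PI denotes one full turn, that is, 360 degrees for phases held in degrees.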
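
A minimal Python sketch of this adjustment, assuming phases in degrees (so that 2PI becomes 360) and wrapping the result into the range 0 to 360. The function name shift_phase and the wrapping convention are assumptions for illustration, not part of PHISTATS.

```python
def shift_phase(phi_deg, hkl, frac_shift):
    """Phase in degrees after a fractional origin shift, wrapped to [0, 360)."""
    h, k, l = hkl
    x, y, z = frac_shift
    # 2*pi radians is 360 degrees, so the shift term is 360*(hx + ky + lz).
    return (phi_deg + 360.0 * (h * x + k * y + l * z)) % 360.0


# Shift of (0.5, 0.5, 0.0), as in the alternative-origin example on this page.
print(shift_phase(30.0, (1, 0, 0), (0.5, 0.5, 0.0)))  # h + k odd: phase moves by 180 degrees
print(shift_phase(30.0, (1, 1, 0), (0.5, 0.5, 0.0)))  # h + k even: phase is unchanged
```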

HAND

The second phase set (PHIB2, written PHI2 here) is adjusted to change hand:

  PHI2_used = -PHI2_shifted + 2PI(h CX + k CY + l CZ)

where PHI2_shifted is the PHI2 phase after any SHIFT adjustment and (CX,CY,CZ) is the centre of symmetry for the space group. (CX,CY,CZ) is (0,0,0) except for space groups I41, I4122, F4132 and I4132. See the reindexing notes.
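
The hand change can be sketched in Python in the same way, again assuming phases in degrees and a result wrapped into 0 to 360. The function name change_hand is hypothetical, and the non-zero centre used in the second call is an arbitrary illustrative value, not the centre for any particular space group.

```python
def change_hand(phi_deg, hkl, centre=(0.0, 0.0, 0.0)):
    """Phase in degrees after inverting the hand, wrapped to [0, 360)."""
    h, k, l = hkl
    cx, cy, cz = centre
    # Negate the phase, then add the term for the centre of symmetry
    # (2*pi radians is 360 degrees).
    return (-phi_deg + 360.0 * (h * cx + k * cy + l * cz)) % 360.0


# Centre at the origin: the phase is simply negated.
print(change_hand(30.0, (1, 2, 3)))
# Arbitrary illustrative centre (NOT a real space-group value).
print(change_hand(30.0, (1, 0, 0), centre=(0.5, 0.0, 0.0)))
```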

TITLE <title>

A title written to the log file.

END

End of input.

INPUT AND OUTPUT FILES

INPUT

This is an MTZ file assigned to logical name HKLIN. The following column assignments are required:

FP, SIGFP
  native amplitude and standard deviation;
PHIBP
  first phase (degrees), maybe an isomorphous phase;
WP
  first 'weight' to analyse against, for example, the figure of merit;
W2
  second 'weight' for analysis, for example, the native amplitude.

PHIB2 may optionally be assigned. This is the second phase (degrees). If it is not assigned the program gives the correlation between WP and W2.

OUTPUT

There is no output file from this program.

Normally the program compares two sets of phases. These can be any two phase sets, not just experimental phases against calculated model phases. Note that phases calculated from a model carry no experimental weight. The reflections are divided into centric and acentric classes, which are analysed separately.

Since the phase of a centric reflection is restricted to two possible values, PHISTATS simply records whether the two phases agree: if they are the same they agree, and if they differ they disagree. A fraction agreeing of unity therefore means that all the centric phases are the same in the two sets.
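
The agreement fraction can be illustrated with a short Python sketch. The function name centric_agreement and the 90-degree threshold are assumptions: since the two allowed values of a centric phase lie 180 degrees apart, any threshold between 0 and 180 separates agreement from disagreement. The phase values are made up for the example.

```python
def centric_agreement(phi1_deg, phi2_deg):
    """Fraction of centric reflections whose two phases agree."""
    agree = 0
    for p1, p2 in zip(phi1_deg, phi2_deg):
        # Wrap the difference into [-180, 180) before comparing.
        diff = abs((p1 - p2 + 180.0) % 360.0 - 180.0)
        # Assumed threshold: closer than 90 degrees counts as agreement.
        if diff < 90.0:
            agree += 1
    return agree / len(phi1_deg)


# Made-up centric phases: two of the four pairs agree.
phi1 = [0.0, 180.0, 90.0, 270.0]
phi2 = [0.0, 0.0, 90.0, 90.0]
print(centric_agreement(phi1, phi2))
```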

The correlation with the weights is the linear correlation coefficient between the phase difference and a weight. It is calculated twice, once for WP and once for W2. The coefficient ranges from -1.0 to 1.0. An ideal set of weights would give a correlation of -1.0, since the largest weights would then correspond to the smallest phase errors. The linear correlation coefficient between the weight and cos(phase_difference) is also calculated.
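
To make the sign convention concrete, here is a hedged Python sketch of both correlations using the standard Pearson formula. The helper names pearson and abs_phase_diff, the use of the absolute wrapped difference as the phase error, and the data values are all assumptions for illustration; the exact quantities PHISTATS accumulates may differ.

```python
import math


def pearson(xs, ys):
    """Linear (Pearson) correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def abs_phase_diff(p1, p2):
    """Absolute phase difference in degrees, wrapped into [0, 180]."""
    return abs((p1 - p2 + 180.0) % 360.0 - 180.0)


# Made-up data: large weights go with small phase differences.
weights = [0.9, 0.7, 0.5, 0.3, 0.1]
phi1 = [10.0, 50.0, 120.0, 200.0, 300.0]
phi2 = [15.0, 70.0, 170.0, 290.0, 120.0]

diffs = [abs_phase_diff(a, b) for a, b in zip(phi1, phi2)]
cosines = [math.cos(math.radians(d)) for d in diffs]

print(pearson(weights, diffs))    # negative for well-behaved weights
print(pearson(weights, cosines))  # positive for well-behaved weights
```

For good weights the two coefficients have opposite signs, because cos(phase_difference) falls as the phase difference grows.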

Similar calculations are made for acentric reflections, but in this case an actual phase error, or difference, is calculated. An estimated phase error is also calculated, based on the principles used in SIGMAA, where a quantity sigma_a is calculated. This quantity is calculated from the two sets of structure factor magnitudes and need not be relevant.
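
As a sketch of the acentric phase difference only (the sigma_a based estimate is not reproduced here), the mean absolute phase difference can be computed as below. The function name mean_phase_difference and the choice of an unweighted mean are assumptions, and the phase values are made up.

```python
def mean_phase_difference(phi1_deg, phi2_deg):
    """Unweighted mean absolute phase difference in degrees."""
    total = 0.0
    for p1, p2 in zip(phi1_deg, phi2_deg):
        # Wrap into [-180, 180) so that 350 vs 10 counts as 20, not 340.
        total += abs((p1 - p2 + 180.0) % 360.0 - 180.0)
    return total / len(phi1_deg)


# Made-up acentric phases with differences of 10, 20 and 180 degrees.
print(mean_phase_difference([10.0, 350.0, 180.0], [20.0, 10.0, 0.0]))
```

Two unrelated sets of acentric phases give a mean difference close to 90 degrees, so values well below 90 indicate real agreement.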

Tables are produced where these quantities are compared against resolution and the value of the weight.

SEE ALSO

overlapmap, sigmaa

AUTHORS

Eleanor Dodson, University of York

EXAMPLES

Phase analysis

#  Assign weight 1 to FOM, weight 2 to FC magnitude.
phistats hklin $CCP4_SCR/toxd_sf_mir << END
TITLE   Phase analysis
RESOLUTION  40. 2.
RANGES 10   500
LABIN FP=FTOXD3 SIGFP=SIGFTOXD3 PHIBP=PHI_mir WP=W_mir -
      PHIB2=PHICtoxd W2=FCtoxd
END

Phase analysis allowing for an alternative origin between the MIR phases and the calculated ones

#  Assign weight 1 to FOM, weight 2 to FC magnitude.
phistats hklin $CCP4_SCR/toxd_sf_mir << END
TITLE   Phase analysis
SHIFT 0.5 0.5 0.0
RESOLUTION  40. 2.
RANGES 10   500
LABIN FP=FTOXD3 SIGFP=SIGFTOXD3 PHIBP=PHI_mir WP=W_mir -
      PHIB2=PHICtoxd W2=FCtoxd
END

Phase analysis for other hand

#  Assign weight 1 to FOM, weight 2 to FC magnitude.
phistats hklin $CCP4_SCR/toxd_sf_mir << END
TITLE   Phase analysis
HAND 
RESOLUTION  40. 2.
RANGES 10   500
LABIN FP=FTOXD3 SIGFP=SIGFTOXD3 PHIBP=PHI_mir WP=W_mir -
      PHIB2=PHICtoxd W2=FCtoxd
END

Correlation

phistats hklin os_lu_shhg2_pt_pt4_khg_os2_nat.mtz << END
TITLE   Phase analysis chmi model vs MIR phases
RANGES 20 1000        ! Number of analysis bins, monitor interval
RESOLUTION  100.0 2.6 ! Resolution limits in Angstroms
LABIN FP=FP SIGFP=SIGFP WP=FOM  PHIB2=PHI W2=FP           
END