DIFFLAUE - PROCESS DIFFERENCE DATA
==================================

INTRODUCTION

The program DIFFLAUE is used to prepare an MTZ file containing
difference Laue data scaled to a reference (e.g. monochromatic) set of
data. It would typically be used for preparing input data for a
difference Fourier calculation for dynamic studies on protein
structures etc. In contrast with LAUEDIFF, which uses scaled film packs
of data as input, DIFFLAUE uses data from individual pairs of films and
only merges the measurements at the final stage of the processing.

The program was written by J.W. Campbell, Daresbury Laboratory.

List of sections:

   Data Control Cards
   Input and Output Files
   Running the Program
   Notes
   Printer Output
   Error Messages
   Program Function
   References
   Examples
   Outline flowcharts of LAUEDIFF and DIFFLAUE


DATA CONTROL CARDS

Data Card 1

   NFILMS INTYP ISPTYP IFILM(1)...IFILM(NFILMS)

   NFILMS   is the number of pairs of films to be used.

   INTYP    is the integration type of the data to be used
            = 1, Box integration
            = 2, Profile fitted

   ISPTYP   controls the handling of integrated reflections flagged
            as spatial overlaps
            = 0, Exclude spatial overlaps
            = 1, Include spatial overlaps
            = 2, Use only the spatial overlaps

   IFILM(1)...IFILM(NFILMS)
            is a list of NFILMS film numbers for the films to be
            included. (1=Film A, 2=Film B etc. up to 6)

Data Card 2

   NSPGRP NCYC CONV KANAL DLAM

   NSPGRP   is the space group number. The symmetry positions for this
            space group are read from the CCP4 symmetry operators file
            and the corresponding point group symmetry matrices are
            read from the CCP4 point groups file.

   NCYC     is the number of cycles of refinement of the scale factors
            between the two Laue data sets.

   CONV.    The scale factor refinement will be terminated if the
            parameter shifts are less than
            (CONV * the standard deviations of the shifts).

   KANAL    is a flag controlling the printing of analyses of the film
            to film scaling.
            = 1, Print analyses against indices and S with graphs.
            = 0, Do not.

   DLAM.    Pairs of measurements whose predicted wavelength difference
            exceeds DLAM are excluded from the processing.

Data Card 3

   FRMIN SDMIN FRCUT SDREJ1 SDREJ2 SDREJ3 NDIAG1 NDIAG2 NDTYP

   (In addition to the descriptions of the items below, see section 9.)

   FRMIN.   A reflection is omitted from the inter-pack scale factor
            calculation for a given film if its intensity is less than
            or equal to FRMIN times the average intensity for the film.

   SDMIN.   A reflection is omitted from the inter-pack scale factor
            calculation if its intensity IL is less than or equal to
            SDMIN*sig(IL).

   FRCUT.   If, for either measurement of an unscaled pair of
            measurements, the intensity IL is less than FRCUT times the
            average intensity for the film then the reflection pair
            will be excluded from the calculation of the merged
            difference data.

   SDREJ1.  For the scaled data, exclude a reflection pair from the
            calculation of the merged difference data if
               sig(IL1) > SDREJ1*sig(IL2)  or
               sig(IL2) > SDREJ1*sig(IL1)

   SDREJ2.  For the scaled data, exclude a reflection pair from the
            calculation of the merged difference data if
               sig(IL) > SDREJ2*(the lowest sig(IL) for the reflection)

   SDREJ3.  As long as there are more than two measurement pairs
            remaining for a given reflection, remove the furthest
            outlier if
               |IL2 - ILmean| > SDREJ3*(average sd for all the scaled
                                        data)

   NDIAG1   is the number of reflections for which details are to be
            printed as they are read in from the GE file. A value of
            zero may be given if this diagnostic output is not
            required. The details printed are
               h k l IL sig(IL) x y lambda.

   NDIAG2   is the number of reflections for which full diagnostic
            details are to be printed giving details of the individual
            measurement pairs contributing to a reflection and
            indicating rejected measurements together with the reason
            for their rejection. If NDTYP is greater than zero then
            only those measurements with a rejection flag number
            greater than or equal to NDTYP are listed. For further
            details see section 7.

   NDTYP    selects the classes of reflection to be printed if NDIAG2
            is greater than zero.
            =0  include all measurements
            >0  include only rejected reflections with a rejection type
                greater than or equal to NDTYP. (Type 1 corresponds to
                rejection via the SDREJ1 test, type 2 corresponds to
                rejection via the SDREJ2 test and type 3 corresponds to
                rejection via the SDREJ3 test.)

Data Card 4

   SDIFF SDIFF2

   SDIFF.   Exclude reflections from the output file for which:
               Abs(difference) < SDIFF * sig(difference)

   SDIFF2.  If SDIFF2 > 0.0, exclude reflections from the output file
            for which:
               Abs(difference) > SDIFF2 * sig(difference)

Data Card 5

   TITLE

   Title for the output MTZ file containing the difference data
   (max of 70 characters)

Data Card 6

   LABIN item1=name1 item2=name2 ...

   These are the MTZ assignments for the input MTZ reflection data file
   for the items H K L FP and optionally PHI and/or W, where FP is the
   reference 'F' value for normalisation and where PHI and W are
   optional phase and figure of merit columns which will be passed on
   to the output file if the assignments are given.

Data Card 7

   LABOUT item1=label1 item2=label2 ...

   These are optional label assignments for the output file for the
   items H K L FL1 SIGL1 FL2 SIGL2 and optionally PHI and/or W.
   (FL1, FL2 are the 'F' values for the data from Laue file 1 and 2 and
   SIGL1, SIGL2 are their sig(F) values)


INPUT AND OUTPUT FILES

The input files are:

   a) The control data file

   b) Two input Laue GE files.

   c) The input reflection file containing the standard reflection data
      to be used in the normalisation in standard MTZ format. This data
      file must be sorted on h, k and l and must contain the unique set
      of indices for the point group as produced by the standard data
      processing programs.

The output files are:

   a) A reflection data file in standard MTZ format containing the
      scaled, normalised and merged Laue data. The output data items
      are H K L FL1 SIGFL1 FL2 SIGFL2 and optionally PHI and/or W.
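
Which merged reflections reach the output file is governed by the
Data Card 4 cutoffs. The following is a minimal Python sketch of those
two tests only; it is not taken from the DIFFLAUE source. The function
name keep_for_output and its argument names are hypothetical, and
sig_diff is assumed to be the standard deviation of the difference
supplied by the caller.

```python
def keep_for_output(diff, sig_diff, sdiff, sdiff2=0.0):
    """Return True if a reflection passes the Data Card 4 cutoffs.

    diff     : difference between the two merged F values (assumed input)
    sig_diff : standard deviation of that difference (assumed input)
    sdiff    : SDIFF, lower cutoff on abs(difference)/sig(difference)
    sdiff2   : SDIFF2, upper cutoff; only applied when > 0.0
    """
    # SDIFF test: exclude if Abs(difference) < SDIFF * sig(difference)
    if abs(diff) < sdiff * sig_diff:
        return False
    # SDIFF2 test: only active for SDIFF2 > 0.0;
    # exclude if Abs(difference) > SDIFF2 * sig(difference)
    if sdiff2 > 0.0 and abs(diff) > sdiff2 * sig_diff:
        return False
    return True
```

With SDIFF = 2.0 and SDIFF2 = 0.0, for instance, a difference of 30 with
a standard deviation of 20 would be excluded (30 < 40), while a
difference of 50 with the same standard deviation would be kept.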


RUNNING THE PROGRAM

Use the command 'laue difflaue'

Parameters:

   DATA      The control data file.
   LAUEGE1   The first input Laue data file. (.ge1 file)
   LAUEGE2   The second input Laue data file. (.ge1 file)
   HKLIN     The input MTZ standard reflection data file.
   HKLOUT    The output reflection data file containing the scaled,
             merged and normalised reflection data.


NOTES

None.


PRINTER OUTPUT

The printer output starts with details of the input control data, the
symmetry matrices and the headers of the input and output MTZ files.
For each pair of films to be included, details are given of the number
of reflections read in and accepted/rejected. For diagnostic purposes
details of the first NDIAG1 reflections read from each GE file may be
listed if requested. Details of the film to film scaling process and
analyses (if requested) of the scaled data (see the user documentation
of the CCP4 program ANISOSC for further details) are then given.

Diagnostics, giving details of individual measurements and rejections
etc. for the first NDIAG2 reflections, are then printed. In the example
following, the table has been slightly re-formatted to reduce its width
(in the printer output there are two columns of individual measurements
per printed line instead of just one, and the columns giving SDOBS and
SDOUT have been omitted). The section of the listing given contains an
example of a reflection rejected for each of the three rejection types
(via SDREJ1, SDREJ2, SDREJ3).

  H  K  L    F1    F2 SDMRG NOBS NREJ (1 2 3)   FILM   FL1  SD1   FL2  SD2 IREJ

  1  1 11  1350  1418   104    5    0  0 0 0
                                                   1  1350   15  1482   14    0
                                                   2  1350   18  1541   16    0
                                                   3  1350   18  1344   16    0
                                                   4  1350   25  1306   23    0
                                                   5  1350   33  1211   30    0

  1  1 12   959   979    39    6    1  0 1 0
                                                   1   959   16  1022   14    0
                                                   2   959   17   985   15    0
                                                   3   959   19   950   17    0
                                                   4   959   22   985   21    0
                                                   5   959   28   886   25    0
                                                   6   959  177  1319   91    2

  1  1 13  1451  1413    97   12    1  1 0 0
                                                   1  1451   17  1351   16    0
                                                   1  1451   22  1433   22    0
                                                   2  1451   20  1633   18    0
                                                   2  1451   24  1426   24    0
                                                   3  1451   21  1342   20    0
                                                   3  1451   27  1360   25    0
                                                   4  1451   28  1417   26    0
                                                   4  1451   30  1404   28    0
                                                   5  1451   38  1325   37    0
                                                   5  1451   35  1341   32    0
                                                   6  1451   79  1409   70    0
                                                   6  1451  162  1710  423    1

  1  1 15  1807  1754    63    6    1  0 0 1
                                                   1  1807   26  1835   25    0
                                                   2  1807   24  1693   22    0
                                                   3  1807   26  1690   25    0
                                                   4  1807   29  1814   26    0
                                                   5  1807   33  1776   33    0
                                                   6  1807   65  1500  113    3

  1  1 17   615   599    41    4    0  0 0 0
                                                   1   615   62   521   58    0
                                                   1   615   17   602   15    0
                                                   2   615   23   636   19    0
                                                   3   615   41   487   44    0

Key:

   H, K, L   are the reflection indices.

   F1, F2    are the scaled and merged F values for the reflection from
             packs 1 and 2. F1 is equal to the reference value.

   SDMRG     is the standard deviation calculated for the merged F2
             value based on the observed agreement of multiple values
             (0 if only a single pair of measurements present). The
             SDOBS value (column not shown) is a standard deviation for
             F2 (or the difference) based on the standard deviations of
             the F1 and F2 measurements derived from the integration.
             The output standard deviation for F2 (or the difference)
             SDOUT (column not shown) is taken as the larger of SDMRG
             and SDOBS. It is the SDOUT value which is used in the
             analyses of the differences as a function of standard
             deviation or to exclude reflections via the SDIFF and
             SDIFF2 cutoffs. The standard deviation of the reference F
             value is taken for the standard deviation of F1.

   NOBS      is the number of scaled intensity measurements before the
             rejection criteria (1), (2) and (3) using SDREJ1, SDREJ2
             and SDREJ3 are applied.

   NREJ (1 2 3)
             indicates the total number of measurements rejected for
             the reflection and the number of rejections for each
             rejection type.

   FILM      is the number of the film within the pack (1 - 6).

   FL1, SD1  are the scaled Laue F value and its standard deviation for
             the pack 1 measurement (FL1 = the reference F value in
             each case).

   FL2, SD2  are the scaled Laue F value and its standard deviation for
             the pack 2 measurement.

   IREJ      is the rejection flag equal to the rejection type 1-3 if
             the measurement pair was rejected or zero if the
             measurement pair was accepted.

The number of Laue measurements, matched and scaled to the reference
data set and written to the output file, is printed. The numbers of
reflections rejected by the various criteria are also given. e.g.

   Number of Laue measurements matched and scaled=  11813

   Total number of measurement pairs rejected:        671
   Number rejected using SDREJ1 cutoff:                20
   Number rejected using SDREJ2 cutoff:               174
   Number rejected using SDREJ3 cutoff:               477

A reflection distribution analysis then follows showing how the numbers
of scaled and used reflection measurements are distributed across the
various pairs of films. e.g.

   REFLECTION DISTRIBUTION ANALYSES

                            Film1  Film2  Film3  Film4  Film5  Film6  Total

   No. of reflection pairs
   scaled:                   6389   3181   1454    511    202     76  11813
   (% of total)             54.08  26.93  12.31   4.33   1.71   0.64

   No. of reflection pairs
   used:                     6331   3062   1172    397    156     24  11142
   (% of total)             56.82  27.48  10.52   3.56   1.40   0.22
   (% of those scaled
    for the film)           99.09  96.26  80.61  77.69  77.23  31.58

   No. reflns with >= 1
   used measurements:        3456   1745    788    288    126     22   6425
   (% of total)             53.79  27.16  12.26   4.48   1.96   0.34

Two analysis tables are then given for differences as a function of
standard deviation. e.g.

   TABLE OF NUMBERS OF REFLECTIONS IN RANGES OF ABS(DIFF.)/SIG(DIFF.)

   (|FL1-FL2|)/SIG(FL2)
   [>=LOWER BOUND, <UPPER BOUND]:    0-1    1-2    2-3    3-4    4-5    >=5
   NO. REFLECTIONS:                  574    263    100     36     15     11

   TABLE OF MEAN FRACTIONAL DIFFERENCE FOR ALL REFLNS.
   FOR WHICH (|FL1-FL2|)/SIG(FL2) >= N FOR VARIOUS N

   N:                                  0      1      2      3      4      5
   MEAN FRACTIONAL DIFFERENCE:     0.099  0.171  0.232  0.292  0.360  0.474
   NUMBER OF REFLECTIONS:            999    425    162     62     26     11

The numbers of reflections excluded from the output file and the number
output are then listed.


ERROR MESSAGES

a) General syntax error in the control data

   **SYNTAX ERROR IN FIELD n ** text

b) Errors in the control data

   **NUMBER OF FILMS MUST BE 1 TO 6**
   **FILM NUMBERS MUST BE 1 TO 6**
   **INTEGRATION TYPE MUST BE 1 OR 2**
   **SPATIAL OVERLAP TYPE MUST BE 0,1 OR 2**

c) Errors in reading Laue data

   **NUMBER OF REFLECTIONS MAY NOT EXCEED n**
   **INDICES OUT OF RANGE -127 TO 127, HKL= h k l **
   **SIGMA OUT OF RANGE 0-32767 FOR REFLECTION HKL= h k l SIGMA = isig **
   **LAMBDA xxxx.x LESS THAN 0 OR GREATER THAN 327.6 ANGSTROMS**
   **NO LAUE REFLECTIONS ACCEPTED**

d) Errors when scaling films

   **TOO FEW REFLECTIONS FOR SCALING, FILM OMITTED**
   **MAXIMUM CAPACITY OF n REFLECTIONS EXCEEDED**

e) Too many measurements for a reflection

   **MAX OF n MEASUREMENTS EXCEEDED FOR REFLECTION h k l **

f) Other MTZ file handling errors

   Other error messages may be produced by the MTZ file handling
   routines.


PROGRAM FUNCTION

Two programs, LAUEDIFF and DIFFLAUE, have been developed for processing
the difference data from two film packs measured at different times
(e.g. at two stages of a reaction) but otherwise under identical
conditions and for the same crystal orientation. In both cases, the
data from the two film packs are scaled together using a scaling
function which is a scale factor (K) and an anisotropic temperature
factor:

   K.exp(-2(h**2.B11+k**2.B22+l**2.B33+2.h.k.B12+2.h.l.B13+2.k.l.B23))

The data are then scaled to the reference (e.g. monochromatic) set of
data and an output file of merged data is written.

In LAUEDIFF, the two input Laue data sets are output files from the
AFSCALE program in which the data from the individual films within each
pack have been scaled and combined.
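
The scaling function quoted above can be written out as a short Python
sketch. This only evaluates the expression as printed; it is not the
ANISOSC or DIFFLAUE code. The function name anisotropic_scale, the
argument order and the packing of the six Bij coefficients into one
tuple are assumptions made for illustration, and the document does not
state the units of the Bij or the quantity to which the factor is
applied.

```python
import math


def anisotropic_scale(h, k, l, scale_k, b):
    """Evaluate K.exp(-2(h**2.B11 + k**2.B22 + l**2.B33
                         + 2.h.k.B12 + 2.h.l.B13 + 2.k.l.B23)).

    scale_k : the overall scale factor K
    b       : (B11, B22, B33, B12, B13, B23), an assumed ordering
    """
    b11, b22, b33, b12, b13, b23 = b
    exponent = (h * h * b11 + k * k * b22 + l * l * b33
                + 2.0 * h * k * b12
                + 2.0 * h * l * b13
                + 2.0 * k * l * b23)
    return scale_k * math.exp(-2.0 * exponent)
```

With all six Bij set to zero the function reduces to the plain scale
factor K, which is a convenient check on any implementation.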

The alternative strategy adopted in DIFFLAUE is to scale the data from
each individual film in the pack to the corresponding film in the other
pack and only to combine the data from each film at the final merging
stage of the procedure. This strategy was designed to bypass possible
inaccuracies in the intensities from AFSCALE resulting from the
difficulty in handling the wavelength dependence of the inter film
scaling function. The second method was also developed with facilities
for selecting which measurements to include or reject during the
processing and with optional diagnostic facilities allowing detailed
examination of the individual contributions to a given reflection.

The scaling between the data from the two packs is done using slightly
modified subroutines from the derivative to native data set scaling
program ANISOSC from the CCP4 program suite. The scheme of calculation
employed in the two programs is outlined in the flow chart given in the
appendix at the end of this document.

In DIFFLAUE, the wavelength difference allowed for defining a matching
pair of reflections is set by the user. Only matching pairs of
reflections, for which both measurements fulfil the following
conditions, are included in calculating the scaling between the two
packs for a given film:

      IL > FRMIN * ILave
      IL > SDMIN * sig(IL)

   where: 'IL' is the integrated intensity for a Laue reflection and
               sig(IL) is its standard deviation.
          'ILave' is the average integrated intensity for the film.
          'FRMIN' and 'SDMIN' are user selected constants
               (e.g. 1.0, 3.0)

Measurements which are considered to be bad are not used in the
calculation of the merged difference data for a reflection. The
following rejection criteria are applied (in the order described):

On the unscaled data omit a reflection pair if, for either measurement,

      IL < FRCUT * ILave

   where: 'IL' is any Laue intensity measurement.
          'ILave' is the average Laue intensity for the film on which
               the reflection was measured.
          'FRCUT' is a user defined value e.g. 0.25

Then on the scaled data omit a reflection pair if

   1) sig(IL1) > SDREJ1 * sig(IL2)  or
      sig(IL2) > SDREJ1 * sig(IL1)

      where: 'sig(IL1)' and 'sig(IL2)' are the standard deviations of
                  Laue intensities from the first and second pack
                  respectively.
             'SDREJ1' is a user defined value e.g. 3.0.

   2) sig(IL) > SDREJ2 * (lowest sig(IL) value for the reflection)

      where: 'sig(IL)' is the standard deviation of a Laue intensity.
             'SDREJ2' is a user defined value e.g. 6.0

   3) As long as there are more than two measurement pairs remaining,
      remove the furthest outlier if

      |IL2 - ILmean| > SDREJ3 * sdave

      where: 'IL2' is a scaled Laue intensity from the second pack.
             'ILmean' is the weighted mean IL2 value for the
                  reflection.
             'sdave' is the average standard deviation for all the
                  scaled data.
             'SDREJ3' is a user defined value e.g. 3.0

For each reflection the data from the accepted measurements are merged
and a reflection record is written to the output MTZ file containing
the items h, k, l, FL1, sig(FL1), FL2, sig(FL2) and optionally a phase
and/or figure of merit. Reflections may be excluded from the output
file based on minimum and maximum cutoff values for
abs(difference)/sig(difference).

If a data set consists of several pairs of film packs then the data
from the output files from DIFFLAUE may be merged using the program
DIFFLMRG.


REFERENCES

1) Hajdu J., Machin P.A., Campbell J.W., Greenhough T.J., Clifton I.J.,
   Zurek S., Gover S., Johnson L.N. and Elder M. (1987)
   Nature 329, 178


EXAMPLES

Example of the control data for preparing a set of difference data with
phases.

   6 2 0 1 2 3 4 5 6
   96 4 0 1 0.020
   0.3 3.0 0.1 3.0 6.0 3.0 50 200 0
   DIFFLAUE ON 6&11, 2-8:JAN:86
   LABIN FP=FO SIGFP=SIGFO W=MCMB PHI=PHCAL
   LABOUT FL1=F6 SIGL1=SIGFO FL2=F11 SIGFL2=SIGL


OUTLINE FLOWCHARTS OF LAUEDIFF AND DIFFLAUE

         LAUEDIFF                            DIFFLAUE

Perform initialisations and         Perform initialisations and
read in control data.               read in control data.
            .                                   .
            .                       Loop through the (six) films
            .                       in a pack.
            .                     |
Read in Laue data (AFSCALE        | Read in Laue data for the
output) for the first film        | current film for the first
pack and store it (unmerged)      | film pack and store it
in the array IA1. [RDLAU]         | (unmerged) in the array IA1.
            .                     | [RDLAU]
            .                     |             .
Sort this data on h, k, l.        | Sort this data on h, k, l.
[REFSRT]                          | [REFSRT]
            .                     |             .
Read in Laue data (AFSCALE        | Read in Laue data for the
output) for the second film       | current film for the second
pack and store it (unmerged)      | film pack and store it
in the array IA2. [RDLAU]         | (unmerged) in the array IA2.
            .                     | [RDLAU]
            .                     |             .
Sort this data on h, k, l.        | Sort this data on h, k, l.
[REFSRT]                          | [REFSRT]
            .                     |             .
Calculate initial scale           | Calculate initial scale
factor between the two            | factor between the two
sets [SCINIT] and refine          | sets [SCINIT] and refine
the scale and anisotropic         | the scale and anisotropic
temperature factors. [SCAREF]     | temperature factors. Only
            .                     | the strong reflections, as
            .                     | defined by user selected
            .                     | criteria, are used in the
            .                     | scaling. [SCAREF]
            .                     |             .
Convert indices to unique         | Convert indices to unique
set and store reflections         | set and store reflections
with IL1 and scaled IL2           | with IL1 and scaled IL2
in the array IA3. [SCAPLY]        | in the array IA3 (adding
            .                     | to those from any previous
            .                     | films). [SCAPLY]
            .                     |
            .                       End of loop through the films.
            .                                   .
            .                                   .
Sort Laue data in array IA3         Sort Laue data in array IA3
on h, k, l. [REFSRT]                on h, k, l. [REFSRT]
            .                                   .
            .                                   .
Scale each pair of Laue data        Scale each pair of Laue data
to the corresponding reference      to the corresponding reference
intensity, merge the data for       intensity, merge the data for
each unique reflection using        each unique reflection using
all the available data and          only measurements passing a
write a reflection data file        series of selection criteria
containing the unique merged        and write a reflection data
difference data. [SCALOP]           file containing the unique
                                    merged difference data.
                                    [SCALOP]

Notes:

   IA1, IA2 and IA3 are three arrays used to hold reflection data in
   the program. Each uses 4 words per reflection.

   The names in square brackets are the names of the subroutines
   carrying out the functions described.

   IL1 and IL2 are the intensities of Laue reflection measurements from
   the first and second film packs respectively. These will be for
   combined measurements from all films of the pack for LAUEDIFF or
   from an individual film for DIFFLAUE.
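
As an illustration of the rejection tests 1) to 3) described under
PROGRAM FUNCTION, the following Python sketch flags the measurement
pairs of one reflection with a rejection type. It is a reading of the
text, not the DIFFLAUE source, and rests on several assumptions: the
names Pair and apply_rejections are hypothetical; test 2) is applied to
the second pack standard deviation only, against the lowest such value
among the pairs surviving test 1); 'ILmean' is taken as an inverse
variance weighted mean; all standard deviations are positive; and sdave
is supplied by the caller.

```python
from dataclasses import dataclass


@dataclass
class Pair:
    il1: float      # scaled intensity, first pack
    sig1: float     # its standard deviation
    il2: float      # scaled intensity, second pack
    sig2: float     # its standard deviation
    irej: int = 0   # 0 = accepted, 1-3 = rejection type


def weighted_mean(values, sigmas):
    # Assumption: weights of 1/sigma**2 (the text says only "weighted")
    weights = [1.0 / (s * s) for s in sigmas]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


def apply_rejections(pairs, sdrej1, sdrej2, sdrej3, sdave):
    """Set irej on each pair of one reflection; returns the same list."""
    # Test 1: the two standard deviations of a pair differ too much
    for p in pairs:
        if p.sig1 > sdrej1 * p.sig2 or p.sig2 > sdrej1 * p.sig1:
            p.irej = 1

    # Test 2: sigma much larger than the lowest sigma for the reflection
    kept = [p for p in pairs if p.irej == 0]
    if kept:
        lowest = min(p.sig2 for p in kept)
        for p in kept:
            if p.sig2 > sdrej2 * lowest:
                p.irej = 2

    # Test 3: while more than two pairs remain, drop the furthest outlier
    while True:
        kept = [p for p in pairs if p.irej == 0]
        if len(kept) <= 2:
            break
        mean = weighted_mean([p.il2 for p in kept], [p.sig2 for p in kept])
        worst = max(kept, key=lambda p: abs(p.il2 - mean))
        if abs(worst.il2 - mean) <= sdrej3 * sdave:
            break
        worst.irej = 3
    return pairs
```

The mean is recomputed after each type 3 rejection so that a single
gross outlier does not cause well behaved measurements to be removed
with it; whether DIFFLAUE itself recomputes the mean at each step is
not stated in this document.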