UNIQUE (CCP4: Supported Program)

NAME

unique - Generate a unique list of reflections

SYNOPSIS

unique hklout foo.mtz
[Keyworded input]

DESCRIPTION

UNIQUE creates a unique list of reflections for a given unit cell with a given symmetry up to a specified high resolution limit. The output file can be used to complete a dataset (i.e. to give an MTZ file with all allowed reflections present whether or not data have been measured for them), and to give completeness information on the measured dataset. The procedure is as follows:

  1. Produce a list of the unique reflections in an MTZ file, using UNIQUE with the appropriate cell parameters, symmetry and resolution range;
  2. Use FREERFLAG to add a column of free-R flags to the MTZ file, to be used later for cross-validation;
  3. Add this column of free-R flags to the dataset MTZ file using CAD. Unmeasured data, i.e. reflections that are present in the column of free-R flags but not in the original dataset, are represented in the output of CAD as Missing Number Flag (MNF) entries;
  4. Run MTZDUMP on the output of CAD to get various statistics, including the number of missing data entries (i.e. MNFs) for each column of data. This gives the completeness of the dataset for the specified resolution range.
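
The completeness in step 4 is simple arithmetic on the counts reported by MTZDUMP. A minimal sketch, assuming the total number of reflections and the number of MNF entries for the column of interest have been read off the MTZDUMP output by hand; the function name `completeness` is hypothetical and not part of CCP4:

```shell
#!/bin/sh
# completeness TOTAL MISSING   (hypothetical helper, not a CCP4 program)
# TOTAL   = number of reflections in the completed MTZ file
# MISSING = number of MNF entries reported for the column of interest
# Prints the percentage of reflections that carry a measured value.
completeness() {
    awk -v total="$1" -v missing="$2" 'BEGIN {
        if (total <= 0) {
            print "total must be positive" > "/dev/stderr"
            exit 1
        }
        printf "%.1f\n", 100 * (total - missing) / total
    }'
}

# Illustrative numbers only, not taken from a real dataset
completeness 20000 1500
```

With these made-up counts, 18500 of 20000 reflections are measured, so the function prints 92.5.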

A script to perform steps 1 - 3 is provided in $CETC/uniqueify (see below), and an example of its use is given in $CEXAM/unix/runnable/unique-free-R. Note that this script only gives a high resolution limit to UNIQUE (see RESOLUTION keyword), and so the dataset is extended to the lowest possible resolution. This is the recommended practice.

The 'uniqueify' script is also a part of the Convert to MTZ & Standardise task in the Reflection Data Utilities of the CCP4 Graphical User Interface (CCP4I).

If a column of free-R flags is already present in the incomplete dataset, then a modified procedure should be followed:

  1. Produce a list of the unique reflections using UNIQUE;
  2. Merge the output of UNIQUE with the dataset using CAD;
  3. Use the COMPLETE option of FREERFLAG to complete the free-R column;
  4. Remove the surplus columns originating from UNIQUE using MTZUTILS;
  5. Use MTZDUMP again to analyse the dataset.

This sequence is performed using the -f switch of the $CETC/uniqueify script.

The old procedure using the COMPLETE program is now obsolete (this is distinct from the COMPLETE option of FREERFLAG used above).

KEYWORDED INPUT

The various data control lines are identified by keywords. Only the first 4 letters of each keyword are necessary.

Compulsory
CELL, RESOLUTION, SYMMETRY.
Optional
DEFAULT, LABOUT, RUN/GO/END, TITLE.

CELL <a> <b> <c> [ <alpha> <beta> <gamma> ]

Specify the unit cell. At least 3 numbers must be entered. Alpha, beta and gamma default to 90.0.

RESOLUTION <Dmax>

High resolution limit, given either as 4(sin theta/lambda)**2 or as d in Angstroms. Unique reflections up to this limit are written to the MTZ file.
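
The two forms of the limit are related through Bragg's law: lambda = 2 d sin(theta) gives 4(sin theta/lambda)**2 = 1/d**2. A minimal sketch of the conversion, assuming a POSIX shell with awk; the function names `d_to_ssq` and `ssq_to_d` are hypothetical:

```shell
#!/bin/sh
# d_to_ssq D: convert a d-spacing in Angstroms to 4(sin theta/lambda)**2
d_to_ssq() {
    awk -v d="$1" 'BEGIN { printf "%.4f\n", 1 / (d * d) }'
}

# ssq_to_d S: convert 4(sin theta/lambda)**2 back to a d-spacing in Angstroms
ssq_to_d() {
    awk -v s="$1" 'BEGIN { printf "%.2f\n", sqrt(1 / s) }'
}

d_to_ssq 1.40    # the limit used in EXAMPLES; prints 0.5102
ssq_to_d 0.25    # prints 2.00
```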

SYMMETRY <Space group name or number>

Symmetry of the output file.

DEFAULT <default>

<default> is the value written to the F and SIGF columns: either a real number or the missing data value (NaN). It defaults to NaN.

TITLE <string>

Title written to the printer output and to the output MTZ file.

LABOUT <Proglab>=<Userlabel> ...

Specify output column labels.

The default column labels are H K L F SIGF, where F and SIGF have dummy values <default> (see DEFAULT keyword).

RUN | GO | END

Terminates keyworded input and runs the program.
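
The compulsory keywords can be assembled by a small helper before UNIQUE is run. A minimal sketch, assuming a POSIX shell; the function name `unique_keywords` is hypothetical, and it accepts either three cell lengths or all six cell parameters:

```shell
#!/bin/sh
# unique_keywords SYMM RESOL A B C [ALPHA BETA GAMMA]   (hypothetical helper)
# Writes keyworded input for UNIQUE to standard output.
unique_keywords() {
    if [ "$#" -ne 5 ] && [ "$#" -ne 8 ]; then
        echo "usage: unique_keywords SYMM RESOL A B C [ALPHA BETA GAMMA]" >&2
        return 1
    fi
    symm=$1
    resol=$2
    shift 2                     # the remaining arguments are the cell
    echo "SYMMETRY $symm"
    echo "RESOLUTION $resol"
    echo "CELL $*"
    echo "END"
}

# Illustrative values, matching those used in EXAMPLES
unique_keywords P212121 1.40 40.0 50.0 71.0

# To run the program itself (requires a CCP4 installation):
#   unique_keywords P212121 1.40 40.0 50.0 71.0 | unique HKLOUT x_unq.mtz
```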

INPUT AND OUTPUT FILES

The output file is a reflection data file in standard MTZ format (i.e. one record/reflection) containing 5 items per reflection (see the LABOUT keyword for labels used).

The F and SIGF columns all take the <default> value (see the DEFAULT keyword).

PRINTER OUTPUT

The printer output starts with details of the control data and the symmetry, followed by the limits of the Miller indices for the requested resolution range. It ends with details of the output MTZ file and the total numbers of reflections tested and written out.

PROGRAM FUNCTION

The program UNIQUE reads in control data and calculates a reciprocal cell. From this cell the range of Miller indices for the required resolution range is calculated. The program then loops through each potential reflection and tests whether it satisfies the limiting conditions for this Laue group and/or whether the reflection is a systematic absence before outputting to the MTZ file.
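
The loop described above can be illustrated for one simple case. This is a sketch under stated assumptions, not the code UNIQUE uses: it handles only an orthogonal cell, hard-codes space group P212121, takes h, k, l >= 0 as the asymmetric unit for Laue group mmm, and skips the (0,0,0) term. The function name `count_unique_p212121` is hypothetical.

```shell
#!/bin/sh
# count_unique_p212121 A B C DMIN   (hypothetical helper, orthogonal cells only)
# Loops over candidate indices, applies the resolution limit and the
# P212121 axial absences, and reports how many reflections were tested
# and how many were kept.
count_unique_p212121() {
    awk -v a="$1" -v b="$2" -v c="$3" -v dmin="$4" 'BEGIN {
        smax = 1 / (dmin * dmin)          # limit on 1/d**2
        # |h| cannot exceed a/dmin, and likewise for k and l
        hmax = int(a / dmin); kmax = int(b / dmin); lmax = int(c / dmin)
        tested = 0; kept = 0
        for (h = 0; h <= hmax; h++)
          for (k = 0; k <= kmax; k++)
            for (l = 0; l <= lmax; l++) {
                if (h == 0 && k == 0 && l == 0) continue   # skip (0,0,0)
                tested++
                # 1/d**2 for an orthogonal cell
                s = (h*h)/(a*a) + (k*k)/(b*b) + (l*l)/(c*c)
                if (s > smax) continue                     # beyond the limit
                # screw-axis absences: h00, 0k0, 00l with odd index
                if (k == 0 && l == 0 && h % 2) continue
                if (h == 0 && l == 0 && k % 2) continue
                if (h == 0 && k == 0 && l % 2) continue
                kept++
            }
        printf "tested %d kept %d\n", tested, kept
    }'
}

# Cell and resolution from EXAMPLES
count_unique_p212121 40.0 50.0 71.0 1.40
```

The index limits come from the fact that h is the projection of the scattering vector onto the a axis, so |h| is bounded by a/dmin; most candidates inside that box still fail the resolution test, which is why the numbers tested and written out differ.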

uniqueify SCRIPT

The full syntax of $CETC/uniqueify is:

uniqueify [-s] [-f <label> | -p <fraction>] <input file>[.mtz] [<output file>]

-s
Keep systematic absences in the output MTZ file.
-f <label>
If your dataset already contains a free-R column you must specify this switch and give the label of the free-R column as it appears in the input MTZ file. uniqueify will deduce the style and range of flags used and preserve them when completing the free-R column.
-p <fraction>
If your dataset does not already contain a free-R column then you may specify the fraction of reflections to be tagged with each free-R indicator. <fraction> (default 0.05) is passed as the argument to the FREERFRAC keyword of FREERFLAG (see FREERFLAG documentation).

A VMS version ($CETC/UNIQUEIFY.COM) is also provided.
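
The two modes of the script differ only in the switch that is passed. A minimal sketch of a dry-run helper that prints the command line instead of running it, assuming a POSIX shell; the function name `uniqueify_cmd` and its mode names `new` and `existing` are hypothetical, and the file names are illustrative:

```shell
#!/bin/sh
# uniqueify_cmd MODE VALUE INPUT [OUTPUT]   (hypothetical dry-run helper)
#   MODE = new      : VALUE is the free-R fraction, passed with -p
#   MODE = existing : VALUE is the free-R column label, passed with -f
# Prints the uniqueify command line; nothing is executed.
uniqueify_cmd() {
    if [ "$#" -lt 3 ]; then
        echo "usage: uniqueify_cmd new|existing VALUE INPUT [OUTPUT]" >&2
        return 1
    fi
    case "$1" in
        new)      opt="-p $2" ;;
        existing) opt="-f $2" ;;
        *) echo "mode must be 'new' or 'existing'" >&2; return 1 ;;
    esac
    shift 2                     # the remaining arguments are the file names
    echo "uniqueify $opt $*"
}

# Dataset without a free-R column: tag 5% of reflections
uniqueify_cmd new 0.05 p14_tru.mtz p14_tru_complete.mtz

# Dataset that already has a free-R column labelled FreeR_flag
uniqueify_cmd existing FreeR_flag p14_tru.mtz
```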

SEE ALSO

freerflag, cad, mtzdump

AUTHOR

A.G.W.Leslie

EXAMPLES

Producing a set of reflection data

     unique HKLOUT x_unq.mtz << EOF
     TITLE  Unique data for protease
     LABOUT  F=FP SIGF=SIGFP
     SYMM P212121
     RESOL 1.40
     CELL 40.0 50.0 71.0
     EOF

Statistics of completeness on a set of measured data

     #! first make the unique data
     #
     unique hklout x_unq.mtz <<eof-unique
     TITLE  Unique data for protease
     LABOUT  F=FP SIGF=SIGFP
     SYMM P212121
     RESOL 1.40
     CELL 40.0 50.0 71.0
     eof-unique
     #
     #! Now add free-R column
     #
     freerflag HKLIN x_unq.mtz HKLOUT x_unq2.mtz <<eof-freerflag
     END
     eof-freerflag
     #
     #! Now merge the free-R column of the unique file with the 
     #! measured data
     #
     cad HKLIN1 x_unq2.mtz HKLIN2 p14_tru.mtz \
         HKLOUT p14_tru_complete.mtz << eof-cad
     LABI FILE 1  E1=FreeR_flag
     LABI FILE 2  ALLIN
     END
     eof-cad
     #
     #! Now run the merged file through MTZDUMP
     #
     mtzdump HKLIN p14_tru_complete.mtz << eof-mtzdump 
     NREF 100  
     END
     eof-mtzdump