XDLDATAMAN (CCP4: Deprecated Program)
NAME
xdldataman
- X-windows tool; manipulation, analysis and reformatting of reflection files.
SYNOPSIS
xdldataman
[-font1 | -font2 | -font3 | -font4 | -font5]
[menu-driven command selection; interactive parameter input]
DESCRIPTION
xdldataman can be used to read, write, analyse and manipulate ASCII
reflection files from most biomacromolecular refinement
packages. No binary reflection file types are supported (i.e.:
MTZ files can *not* be read).
The program runs in interactive mode only. It uses the XDL_VIEW
toolkit of J.W. Campbell to provide an easy-to-use interface.
Commands are selected by clicking on the desired menu option;
some menu options give a pop-up sub-menu with further options
(indicated by "-->" in the menu name).
File formats are selected with pop-up menus; all other parameters
are set in pop-up dialogue boxes (cut-and-paste is supported).
In most cases, default values are given in [square brackets].
To accept these defaults, hit the RETURN key. If multiple numbers
are to be input (e.g., cell constants), and if only the first one
needs to be changed, for instance, typing the new value for this
first number followed by the RETURN key will preserve the values
for the other five numbers.
There is a command line option (-font?) which will determine the
size of the menu font. These fonts refer to the xdl fonts which are
defined from 1 to 5. This can be useful if the window size is too
large for the screen. The default font size is 2. The font definitions
can be changed in the .Xdefaults file or with xrdb; however, all xdl
programs will then be affected.
Output from the program is written to a separate area of the main
window. Output can be scrolled and cut and pasted into other
documents.
For lengthy operations a progress bar shows how much of the
operation has been completed.
HISTORY
xdldataman is a CCP4 special version of DATAMAN (part of the Uppsala
RAVE averaging package). This version is entirely interactive and
has a user-friendly interface. However, this version can only handle
one dataset at a time, and some of the functionality of the parent
program is absent.
DATAMAN was originally written as a simple format-exchange program,
to convert MTZDUMP files to X-PLOR reflection files. It has grown
quite a bit since then to include other formats and to carry out
several everyday manipulations on datasets. It also includes
several programs that were previously stand-alone jiffies (such as
the Gemini twin analysis command).
Some of the operations are implementations of Gerard Bricogne's
algorithms as described in Volume B of the International Tables.
MENU OPTIONS
Commands are issued by moving the pointer over the desired menu
option and clicking with the left mouse button.
READ A DATASET
Provide the filename and then select the appropriate file type
from the pop-up menu. See FORMATS.
LIST DATASET INFORMATION
This prints some information regarding the dataset that is currently
in memory.
STATISTICS OF THE DATASET
This lists the mean, standard deviation, minimum and maximum values
of all properties of the dataset that are known.
EFFECTIVE RESOLUTION
This calculates an estimate of the effective resolution of the
dataset, defined as "the resolution at which the actual number of
reflections in the current dataset would constitute a 100% complete
dataset" (B. Hazes, personal communication). This number is listed
for all reflections, and for all reflections with F > n * sigma(F),
with n=1,...,5.
The lattice type and the number of asymmetric units need to be
provided (they are used to estimate the volume of reciprocal space
covered by the data).
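The estimate can be sketched roughly as follows. This is an illustration
only, not the program's code: the function names are hypothetical, the
cell volume is assumed to be known from the cell constants, Friedel mates
are assumed to be merged, and "n_sym" is taken to mean the number of
symmetry operators excluding lattice centring (the program's own
definition of "number of asymmetric units" may differ).

```python
import math

# Lattice points per conventional cell for each centring type.
CENTRING = {"P": 1, "A": 2, "B": 2, "C": 2, "I": 2, "R": 3, "F": 4}


def redundancy(n_sym, lattice):
    # Friedel's law doubles the point-group redundancy; centring removes
    # a further fraction of the lattice points (systematic absences).
    return 2.0 * n_sym * CENTRING[lattice]


def unique_reflections(volume, d_min, n_sym, lattice="P"):
    """Approximate number of unique reflections out to d_min (in A)."""
    points_in_sphere = 4.0 / 3.0 * math.pi * volume / d_min ** 3
    return points_in_sphere / redundancy(n_sym, lattice)


def effective_resolution(volume, n_refl, n_sym, lattice="P"):
    """Resolution at which n_refl reflections would be 100% complete."""
    d_cubed = 4.0 / 3.0 * math.pi * volume / (redundancy(n_sym, lattice) * n_refl)
    return d_cubed ** (1.0 / 3.0)
```

Under these assumptions, a dataset that is only 50% complete to 2.0 A
has an effective resolution of about 2.5 A (2.0 times the cube root of 2).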
RSYM HKL/KHL REFLECTIONS
This calculates Rsym (on Fs and on Is, assuming that I=F*F) for
all reflection pairs HKL and KHL. If the spacegroup is, for instance,
P3x or I4x and this Rsym is low, there may be a spacegroup error
(e.g., the spacegroup is P41212 instead of P41).
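A minimal sketch of such a pairwise R-value is given below. The exact
formula used by the program is not documented here, so the definition
(sum of absolute differences divided by the sum of pair means) and the
function name are assumptions.

```python
def rsym_hkl_khl(data):
    """data maps (h, k, l) -> F. Returns (R on F, R on I, number of pairs).

    Assumed definition: R = sum|x1 - x2| / sum((x1 + x2) / 2) over all
    pairs HKL / KHL present in the dataset, with I taken as F*F.
    """
    num_f = den_f = num_i = den_i = 0.0
    pairs = 0
    for (h, k, l), f1 in data.items():
        if h >= k:
            # Visit each pair once; reflections with h == k pair with themselves.
            continue
        f2 = data.get((k, h, l))
        if f2 is None:
            continue
        i1, i2 = f1 * f1, f2 * f2
        num_f += abs(f1 - f2)
        den_f += 0.5 * (f1 + f2)
        num_i += abs(i1 - i2)
        den_i += 0.5 * (i1 + i2)
        pairs += 1
    if pairs == 0 or den_f == 0.0 or den_i == 0.0:
        return None
    return num_f / den_f, num_i / den_i, pairs
```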
CELL CONSTANTS
Enter the cell constants (needed to calculate the resolution of the
reflections).
SYMMETRY OPERATORS
Provide the name of a symmetry-operator file in O format.
CALCULATE/DEDUCE/CONVERT
This command has several sub-options:
- RESOLUTION
- Calculate the resolution of all reflections.
- CENTRICS/ACENTRICS
- Assign centric/acentric flag to each reflection.
- ORBITAL MULTIPLICITY
- Calculate the orbital multiplicity of each reflection.
- F -> I CONVERSION
- Convert Fs to Is by using: I ~ F*F.
- I -> F CONVERSION
- Convert Is to Fs by using: F ~ sqrt(I) (I>=0).
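The resolution calculation and the two conversions can be sketched as
follows. The resolution uses the standard triclinic formula for 1/d**2;
the function names are hypothetical, and returning None for a negative
intensity is an assumption (how the program treats I < 0 is not stated
here).

```python
import math


def resolution(hkl, cell):
    """Resolution d (in A) of reflection hkl for cell (a, b, c, alpha, beta, gamma)."""
    h, k, l = hkl
    a, b, c, alpha, beta, gamma = cell
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    # Squared cell volume.
    v2 = (a * b * c) ** 2 * (1 - ca * ca - cb * cb - cg * cg + 2 * ca * cb * cg)
    inv_d2 = (
        h * h * b * b * c * c * (1 - ca * ca)
        + k * k * a * a * c * c * (1 - cb * cb)
        + l * l * a * a * b * b * (1 - cg * cg)
        + 2 * h * k * a * b * c * c * (ca * cb - cg)
        + 2 * k * l * a * a * b * c * (cb * cg - ca)
        + 2 * h * l * a * b * b * c * (cg * ca - cb)
    ) / v2
    if inv_d2 <= 0.0:
        raise ValueError("resolution is undefined for reflection 0 0 0")
    return 1.0 / math.sqrt(inv_d2)


def f_to_i(f):
    """I ~ F*F."""
    return f * f


def i_to_f(i):
    """F ~ sqrt(I) for I >= 0; None otherwise (assumption)."""
    return math.sqrt(i) if i >= 0 else None
```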
TYPE SOME REFLECTIONS
Use this command to list some reflections. Provide the number of the
first and last reflection and the step size (e.g., step size 10 will
list every 10th reflection). Providing a negative number for the last
reflection is taken to mean the actual last reflection. Providing a
value of zero for the step size will print only the first and the last
reflection. Providing a negative step size means "show N reflections
equally spread between the first and the last", with N being -1 * the
step size.
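The rules for the three numbers can be sketched as below. The function
name is hypothetical, and the way the N reflections are "equally spread"
(simple rounding of evenly spaced positions) is an assumption; the
program may pick slightly different reflections.

```python
def reflections_to_list(first, last, step, n_total):
    """Return the 1-based reflection numbers that would be listed."""
    # A negative "last" means the actual last reflection; a value beyond
    # the end of the dataset is clamped as well (assumption).
    if last < 0 or last > n_total:
        last = n_total
    first = max(1, first)
    if first > last:
        return []
    if step > 0:
        # Every step-th reflection, starting with the first.
        return list(range(first, last + 1, step))
    if step == 0:
        # Only the first and the last reflection.
        return [first] if first == last else [first, last]
    # Negative step: N = -step reflections equally spread over the range.
    n = -step
    if n == 1:
        return [first]
    span = last - first
    return sorted({first + round(i * span / (n - 1)) for i in range(n)})
```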
SHOW SOME REFLECTIONS
This command has several sub-options:
- FOBS >
- Provide a number; all reflections with F greater than this number
will be listed.
- FOBS <
- Provide a number; all reflections with F smaller than this number
will be listed.
- SIGMA >
- Provide a number; all reflections with sigma(F) greater than this
number will be listed.
- SIGMA <
- Provide a number; all reflections with sigma(F) smaller than this
number will be listed.
- F/SIGMA >
- Provide a number; all reflections with F/sigma(F) greater than this
number will be listed.
- F/SIGMA <
- Provide a number; all reflections with F/sigma(F) smaller than this
number will be listed.
- RESOLUTION >
- Provide a number; all reflections with a resolution greater than this
number will be listed (note: greater means "lower resolution" !).
- RESOLUTION <
- Provide a number; all reflections with a resolution smaller than this
number will be listed (note: smaller means "higher resolution" !).
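All eight sub-options amount to one comparison of a per-reflection
property against a threshold. A minimal sketch follows; the dictionary
keys, the function names and the handling of sigma = 0 are assumptions
made for the illustration.

```python
import operator

COMPARE = {">": operator.gt, "<": operator.lt}


def value_of(refl, prop):
    """Return the property that a sub-option tests for one reflection."""
    if prop == "FOBS":
        return refl["f"]
    if prop == "SIGMA":
        return refl["sigma"]
    if prop == "F/SIGMA":
        # A zero sigma is treated as an infinitely significant F (assumption).
        return refl["f"] / refl["sigma"] if refl["sigma"] else float("inf")
    if prop == "RESOLUTION":
        # Resolution in A: a larger number means lower resolution.
        return refl["d"]
    raise ValueError("unknown property: " + prop)


def select(reflections, prop, comparison, threshold):
    """Return the reflections for which 'prop comparison threshold' holds."""
    compare = COMPARE[comparison]
    return [r for r in reflections if compare(value_of(r, prop), threshold)]
```

For instance, select(data, "RESOLUTION", ">", 4.0) would return the
low-resolution reflections with d greater than 4.0 A.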
SPECIAL REFLECTIONS
List all reflections of a certain type. Provide a template of the type
of reflections to be shown (containing the characters H, K, L and/or 0).
For example: HHH, 0K0, KK0, etc.
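One plausible reading of such a template is sketched below. The matching
rules are assumptions, not taken from the program: a "0" requires that
index to be zero, a repeated letter requires the corresponding indices to
be equal, and a letter on its own matches any value.

```python
def matches_template(hkl, template):
    """True if the reflection indices fit a template such as HHH or 0K0."""
    if len(template) != 3:
        raise ValueError("template must have three characters")
    seen = {}
    for index, symbol in zip(hkl, template.upper()):
        if symbol == "0":
            if index != 0:
                return False
        elif symbol in "HKL":
            # The same letter in two positions means equal indices.
            if seen.setdefault(symbol, index) != index:
                return False
        else:
            raise ValueError("template may only contain H, K, L and 0")
    return True
```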
ABSENCES LIST
List reflections that are systematically absent according to the
current spacegroup symmetry operators. This can sometimes be used
to make educated guesses concerning the nature of certain screw
axes (e.g., in P4x, if only 00L, with L=4N, are strong reflections,
x is probably 1 or 3).
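The test for a systematic absence can be sketched from the symmetry
operators as follows: a reflection is absent if some operator maps its
indices onto themselves while the product of the indices with the
translation is not an integer. The function name and the operator
representation (rotation matrix plus translation as fractions) are
assumptions; the operators shown are those of P41.

```python
from fractions import Fraction as Fr


def is_systematically_absent(hkl, symops):
    """symops is a list of (rotation 3x3, translation 3-vector) pairs."""
    for rot, trans in symops:
        # Indices transform as a row vector: h' = h R.
        mapped = tuple(sum(hkl[i] * rot[i][j] for i in range(3)) for j in range(3))
        if mapped != tuple(hkl):
            continue
        phase = sum(Fr(h) * Fr(t) for h, t in zip(hkl, trans))
        if phase.denominator != 1:
            return True
    return False


# P41: x,y,z; -x,-y,z+1/2; -y,x,z+1/4; y,-x,z+3/4
P41 = [
    (((1, 0, 0), (0, 1, 0), (0, 0, 1)), (0, 0, 0)),
    (((-1, 0, 0), (0, -1, 0), (0, 0, 1)), (0, 0, Fr(1, 2))),
    (((0, -1, 0), (1, 0, 0), (0, 0, 1)), (0, 0, Fr(1, 4))),
    (((0, 1, 0), (-1, 0, 0), (0, 0, 1)), (0, 0, Fr(3, 4))),
]
```

With these operators only the 00L reflections with L=4N survive, which
is the pattern mentioned above for a 41 or 43 screw axis.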
TWIN STATISTICS
Print some intensity statistics that may or may not be able to
provide information with respect to possible twinning.
GEMINI TWIN ANALYSIS
This implements intensity analysis options as described by Stanley
(1972) and Rees (1980), that may be of help in investigating
possible merohedral twinning. The output consists of statistics
and an estimate of the most likely twin fraction ("0.0" means no
merohedral twinning). In addition, two PostScript files are
produced showing 1N(z,alpha) as a function of z for centro- and
non-centrosymmetric reflections. See the original papers for more
information.
TEMPERATURE FACTOR APPLY
Apply a temperature factor to the Fs.
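As a sketch, assuming the usual isotropic form exp(-B sin^2(theta)/lambda^2)
with sin(theta)/lambda = 1/(2d), applying a temperature factor B to an F
at resolution d could look like this. The function name and the sign
convention (positive B weakens high-resolution data) are assumptions.

```python
import math


def apply_temperature_factor(f, d, b):
    """Scale F at resolution d (in A) by exp(-B / (4 d^2))."""
    return f * math.exp(-b / (4.0 * d * d))
```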
CHANGE INDEX
Re-index data. This may be necessary when the data-processing
program yields a cell with beta < 90 in a monoclinic spacegroup,
or when two datasets cannot be merged due to indexing along
equivalent, but different axes (e.g., in P3x, P4x, etc.).
Provide expressions for the new H, K and L.
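Linear expressions for the new indices are equivalent to a 3x3 integer
matrix. A minimal sketch, with hypothetical names, using the
transformation H'=K, K'=H, L'=-L as an example:

```python
def reindex(hkl, matrix):
    """Apply a re-indexing matrix: each row gives one new index."""
    h, k, l = hkl
    return tuple(row[0] * h + row[1] * k + row[2] * l for row in matrix)


def determinant(m):
    """Determinant of a 3x3 matrix; +1 means the hand of the axes is kept."""
    return (
        m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
        - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
        + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
    )


# H' = K, K' = H, L' = -L
SWAP_HK = ((0, 1, 0), (1, 0, 0), (0, 0, -1))
```

Checking that the determinant is +1 is a common sanity test before
re-indexing, since a value of -1 would invert the hand of the axes.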
PROD/PLUS
This command has several sub-options:
- FOBS
- Provide two numbers X and Y; all Fs will be replaced by X*F+Y.
- SIGMA(FOBS)
- Provide two numbers X and Y; all sigmas will be replaced by
X*sigma+Y.
- BOTH
- Provide two numbers X and Y; all Fs and sigmas will be replaced by
X*F+Y and X*sigma+Y, respectively.
LAUE GROUP APPLY
Move the reflections into the asymmetric unit appropriate for the
Laue group of the dataset. This is sometimes necessary when the
data-processing program outputs a non-standard asymmetric unit
(for instance, R-AXIS processing software in P4x gives an asymmetric
unit which is incompatible with the CCP4 standard).
The Laue group is selected from a pop-up menu.
Implemented Laue groups and their asymmetric units are:
1bar, hkl:h>=0 0kl:k>=0 00l:l>=0
1bar, hkl:k>=0 h0l:l>=0 h00:h>=0
1bar, hkl:l>=0 hk0:h>=0 0k0:k>=0
2/m, hkl:k>=0, l>=0 hk0:h>=0, k>=0
2/m, hkl:h>=0, l>=0 0kl:k>=0, l>=0 (2nd setting)
mmm, hkl:h>=0, k>=0, l>=0
4/m, hkl:h>=0, k>0, l>=0 with k>=0 for h=0
4/mmm, hkl:h>=0, h>=k>=0, l>=0
3bar, hkl:h>=0, k<0, l>=0 including 00l
3bar, hkl:h>=0, k>0 including 00l:l>0
3barm, hkl:h>=0, k>=0 with k<=h; if h=k l>=0
6/m, hkl:h>=0, k>0, l>=0 with k>=0 for h=0
6/mmm, hkl:h>=0, h>=k>=0, l>=0
m3, hkl:h>=0, k>=0, l>=0 with l>=h, k>=h for l=h, k>h if l>h
m3m, hkl:k>=l>=h>=0
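For the Laue groups in which all sign changes (and, where applicable,
index permutations) are symmetry-equivalent, the mapping into the
asymmetric units listed above is simple. The sketch below covers only
mmm, 4/mmm and m3m; the function name is hypothetical and the other Laue
groups need more careful handling.

```python
def to_asymmetric_unit(hkl, laue):
    """Map a reflection into the asymmetric unit for mmm, 4/mmm or m3m."""
    h, k, l = (abs(x) for x in hkl)
    if laue == "mmm":
        # h>=0, k>=0, l>=0
        return (h, k, l)
    if laue == "4/mmm":
        # h>=k>=0, l>=0: h and k may be exchanged.
        return (max(h, k), min(h, k), l)
    if laue == "m3m":
        # k>=l>=h>=0: all permutations are equivalent.
        low, mid, high = sorted((h, k, l))
        return (low, high, mid)
    raise ValueError("Laue group not covered by this sketch: " + laue)
```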
SORT REFLECTIONS
Sort the reflections by their indices H, K and L. The sort order
is determined by the user.
KILL SOME REFLECTIONS
This command has the same sub-options as SHOW SOME REFLECTIONS:
- FOBS >
- Provide a number; all reflections with F greater than this number
will be deleted.
- FOBS <
- Provide a number; all reflections with F smaller than this number
will be deleted.
- SIGMA >
- Provide a number; all reflections with sigma(F) greater than this
number will be deleted.
- SIGMA <
- Provide a number; all reflections with sigma(F) smaller than this
number will be deleted.
- F/SIGMA >
- Provide a number; all reflections with F/sigma(F) greater than this
number will be deleted.
- F/SIGMA <
- Provide a number; all reflections with F/sigma(F) smaller than this
number will be deleted.
- RESOLUTION >
- Provide a number; all reflections with a resolution greater than this
number will be deleted (note: greater means "lower resolution" !).
- RESOLUTION <
- Provide a number; all reflections with a resolution smaller than this
number will be deleted (note: smaller means "higher resolution" !).
ERASE OPTIONS
This command has several sub-options:
- ROGUES
- Delete "rogue" reflections simply by providing their HKL indices.
This can be used to remove individual reflections which are suspect
for some reason or other.
- ODD H/K/L
- Delete all reflections for which either H, K or L is odd.
- EVEN H/K/L
- Delete all reflections for which either H, K or L is even.
RFREE OPTIONS
This command has several sub-options:
- INITIALISE
- Set the seed for the random-number generator. Providing a negative
seed will use the current machine clock value as the seed; a positive
number will be used itself as the seed.
- LIST CURRENT STATUS
- This lists the current partitioning of the dataset in WORK and TEST
reflections. If there are very few or very many TEST reflections,
a warning message will be printed. In general, 10% of the data with
a minimum of ~500 and a maximum of ~2000 TEST reflections is
considered to be reasonable. Note that the error in Rfree is roughly
equal to 1/SQRT(number of TEST reflections), so that for 500 TEST
reflections the error is ~4.5% and for 2000 TEST reflections ~2.2%.
- RESET ALL RFREE FLAGS
- This sets all Rfree flags to zero, i.e. all reflections are flagged
as WORK reflections. See RFREE FLAGS.
- GENERATE RANDOM RFREE FLAGS
- Provide either the *percentage* or (roughly) the *number* of TEST
reflections. Randomly chosen reflections will be assigned as TEST
reflections (the same way X-PLOR does this). This is actually the
worst possible way to select TEST reflections, since (through the
G-function) every reflection will be related to its neighbours
(in reciprocal space) and, in the case of NCS, to its "NCS mates"
and their neighbours.
- SHELLS OF RFREE REFLECTIONS
- Provide the *percentage* or (roughly) the *number* of TEST reflections
and the number of resolution bins. The data will be sorted by
resolution and divided into bins. From every bin the appropriate
fraction of reflections from its centre will be flagged as being TEST
reflections. This is to counter couplings in the case of NCS.
- SPHERES OF RFREE REFLECTIONS
- Provide the *percentage* or (roughly) the *number* of TEST reflections
and the radius of reciprocal-space spheres. Reflections are picked
at random, and they and their neighbours inside a small sphere (in
reciprocal space) are all assigned as TEST reflections. This is to
counter couplings due to bulk solvent in the absence of NCS.
- GSHELDRICKS METHOD
- This simply assigns every N-th reflection to be a TEST reflection,
where the value of N is provided by the user (in SHELX, N=10).
- COMPLETE CROSS-VALIDATION SETS
- The number N of datasets to be generated is provided. Every reflection
will be assigned to be a TEST reflection in exactly one of the N
datasets. The N datasets are written in X-PLOR format (but can, of
course, be converted into other formats with this program).
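Some of these options can be sketched as follows. The function names are
hypothetical, flags follow the 0 = WORK, 1 = TEST convention, and
Python's random module stands in for the program's own random-number
generator, so the actual selections will differ.

```python
import math
import random


def rfree_error(n_test):
    """Rough relative error in Rfree: 1/sqrt(number of TEST reflections)."""
    return 1.0 / math.sqrt(n_test)


def random_flags(n_refl, fraction, seed):
    """Flag each reflection as TEST with the given probability."""
    rng = random.Random(seed)
    return [1 if rng.random() < fraction else 0 for _ in range(n_refl)]


def every_nth_flags(n_refl, n=10):
    """Flag every n-th reflection as TEST."""
    return [1 if (i + 1) % n == 0 else 0 for i in range(n_refl)]


def cross_validation_sets(n_refl, n_sets, seed):
    """Return n_sets flag lists; each reflection is TEST in exactly one."""
    rng = random.Random(seed)
    owner = [i % n_sets for i in range(n_refl)]
    rng.shuffle(owner)
    return [[1 if owner[i] == s else 0 for i in range(n_refl)]
            for s in range(n_sets)]
```

With these definitions, rfree_error(500) is about 0.045 and
rfree_error(2000) about 0.022, matching the figures quoted above.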
WRITE DATASET TO FILE
Provide the filename and select the desired file type from the pop-up
menu.
DELETE CURRENT DATASET
Remove the current dataset from memory.
HELP
This prints some brief information. Subsequently, click on *any*
menu command to get a brief explanation of what that command does.
QUIT
Stop working with the program.
FORMATS
Supported input formats :
- * (free format)
- MTZDUMP (user or free format)
- XPLOR (no format required)
- SHELXS (fixed format)
- TNT (free format)
- PROTEIN (user or free format)
- MKLCF (user or free format)
- HKLFS (user or free format)
- RFREE (user or free format)
- ELEANOR (user or free format)
Supported output formats :
- * (fixed format)
- RXPLOR (no format required)
- SHELXS (fixed format)
- TNT (fixed format)
- PROTEIN (user or fixed format)
- CIF (user or fixed format)
- MKLCF (user or fixed format)
- HKLFS (user or fixed format)
- RFREE (user or fixed format)
- ELEANOR (user or fixed format)
- XPLOR (no format required)
Notes on formats :
- */HKLFS - reads/writes HKL F Sigma
- RXPLOR - X-PLOR format with Rfree flags
- SHELXS - fixed format, no Rfree flags
- TNT - no FOMs, phases or Rfree flags
- PROTEIN - no Sigmas
- MKLCF - integer F and Sigma
- RFREE - HKL F Sigma and integer Rfree flags
- ELEANOR - ditto, but real (1.0-Rfree) flags
- MTZDUMP - reads unedited MTZDUMP log file
- CIF - output only; Rfree flags UNofficial
- Use the Calculate option to convert I<->F if needed.
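As an illustration of the simplest case, a free-format "HKL F Sigma"
file could be read along the following lines. The function name is
hypothetical, and skipping lines that do not parse is an assumption, not
necessarily what the program does.

```python
def read_hkl_f_sigma(lines):
    """Parse whitespace-separated 'h k l F sigma' records."""
    reflections = []
    for line in lines:
        fields = line.split()
        if len(fields) < 5:
            continue
        try:
            h, k, l = (int(x) for x in fields[:3])
            f, sigma = float(fields[3]), float(fields[4])
        except ValueError:
            # Not a reflection record (e.g. a header line): skip it.
            continue
        reflections.append((h, k, l, f, sigma))
    return reflections
```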
RFREE FLAGS
xdldataman uses the X-PLOR convention, i.e. the Rfree flag is an
integer number (0 or 1), and a value of "1" means that the reflection
belongs to the TEST set which is *not* used in refinement.
CCP4 has a different convention: reflections are divided into a number
of equal-sized sets, one of which (usually flagged "0") represents
the TEST set, see program FREERFLAG.
The CCP4 convention is supported (and converted) through the "ELEANOR" format.
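The relation between the flag conventions can be sketched as follows.
The function names are hypothetical; the ELEANOR value is taken to be
1.0 minus the X-PLOR flag, as stated in the notes on formats, and the
CCP4 TEST set is assumed to be set number 0 unless specified otherwise.

```python
def xplor_to_eleanor(flag):
    """X-PLOR flag (1 = TEST) to ELEANOR real flag (0.0 = TEST)."""
    return 1.0 - flag


def eleanor_to_xplor(value):
    """ELEANOR real flag back to the X-PLOR integer flag."""
    return 1 if value < 0.5 else 0


def ccp4_to_xplor(set_number, test_set=0):
    """CCP4 set number to X-PLOR flag; only test_set counts as TEST."""
    return 1 if set_number == test_set else 0
```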
REFERENCES
- XDLDATAMAN:
G.J. Kleywegt & T.A. Jones (1996), Acta Cryst. D52, 826-828.
- XDL_VIEW:
J.W. Campbell (1995). "XDL_VIEW, an X-windows-based toolkit for
crystallographic and other applications", J. Appl. Cryst. 28, 236-242.
- RAVE:
G.J. Kleywegt & T.A. Jones (1994). "Halloween ... Masks and Bones",
in "From First Map to Final Model" (S. Bailey, R. Hubbard & D. Waller,
Eds.), SERC Daresbury Laboratory, pp. 59-66.
- O:
T.A. Jones, J.Y. Zou, S.W. Cowan, & M. Kjeldgaard (1991). "Improved
methods for building protein models in electron density maps and the
location of errors in these models", Acta Cryst. A47, 110-119.
- GEMINI:
E. Stanley (1972). "The identification of twins from intensity
statistics", J. Appl. Cryst. 5, 191-194.
- GEMINI:
D.C. Rees (1980). "The influence of twinning by merohedry on
intensity statistics", Acta Cryst. A36, 578-581.
- RFREE:
A.T. Brunger (1992). "Free R value: a novel statistical quantity for
assessing the accuracy of crystal structures", Nature 355, 472-475.
- CCP4:
Collaborative Computational Project Number 4 (1994). "The CCP4 suite:
programs for protein crystallography", Acta Cryst. D50, 760-763.
KNOWN BUGS
None (at the time of writing).
If you improve the program, please notify GJK of your changes so that
they can be implemented in future versions and the entire community
may benefit from them (E-mail a brief description and the relevant
pieces of code to "gerard@xray.bmc.uu.se").
AUTHORS
Originators: G.J. Kleywegt & T.A. Jones, Uppsala
SEE ALSO
xdlmapman,
mtzdump