ARP_WATERS (CCP4: Deprecated Program)
NAME
ARP_WATERS (ARP/wARP v5.0)
- Automated Refinement Procedure for refining protein structures.
SYNOPSIS
arp_waters XYZIN foo_in.brk MAPIN1 foo_2fofc.map MAPIN2 foo_fofc.map XYZOUT foo_out.brk
[Keyworded input]
IDENTIFICATION
Automated Refinement Procedure
Version 5.0
User Guide
This CCP4 distribution is not the full distribution of the ARP/wARP suite, and
includes only the programs arp_waters (which is actually version 5.0 of the
arp_warp program), prepform, prepshel and
t_shift, and the script arp_waters_plots.sh (renamed from
arp_warp_plots.sh).
The complete ARP/wARP package contains additional automated scripts and alpha
versions of new programs (for automated building of protein structures in electron
density maps; see "Automated protein model building combined with iterative structure
refinement" Perrakis, A., Morris, R.J.H. and Lamzin, V.S., Nature Struct. Biol.
6 (1999) 458-463), and is freely available to academic users from the ARP/wARP
homepage, http://www.arp-warp.org.
Industrial users are asked to contact the
authors for a license agreement.
The version of ARP distributed by CCP4 also contains minor
changes which enable the writing of "summary tags" into the program output -
see the libhtml documentation for details of these
tags (and how to suppress them!). Please note that these changes do not in
any way affect the running of the program, and are purely cosmetic.
In addition this version of ARP is substantially older than the current
version distributed by EMBL, and is retained only for the purposes of adding
waters (hence the change of name). Details of the current ARP/wARP suite
(including how to get it) can be found at the ARP/wARP homepage,
http://www.arp-warp.org/.
Introduction
The Automated Refinement Procedure, ARP_WATERS, is a program package for
protein structure refinement.
It iteratively combines reciprocal space structure factor
refinement with updating of the model in real space. The latter attempts to
mimic and automate a typically time-consuming model rebuilding
session at the graphics.
The real space update is based on identifying and removing poorly defined
atoms and the addition of potential new sites.
This utilises some general shape properties of the electron density syntheses
as well as stereo-chemical criteria.
ARP_WATERS (actually ARP/wARP version 5.0) can be used in the following ways:
1. Refinement of MR solutions
2. Improvement of MAD and M(S)IR(AS) phases
3. Averaging of multiple refinements
4. Automatic tracing of the density map and model building (not available in the CCP4 version)
5. Building of the solvent structure
6. Ab initio structure determination for metalloproteins at very high resolution
For a more detailed description of the ARP see the references
given below.
The ARP/wARP procedure requires the use of reciprocal space refinement, density map
calculation and the
ARP/wARP software itself. The least-squares minimisation can be done with the CCP4
programs PROTIN / REFMAC with an
optional additional scaling (e.g. using RSTATS). Use of
other programs for least-squares minimisation, e.g. SHELXL, requires additional
conversion to the CCP4 format which is provided within the ARP_WATERS package. Density
map calculations are carried out with the CCP4 programs FFT
and MAPMASK.
Author information
Users are requested to report any bugs or suggested changes to the authors.
Victor S. Lamzin
EMBL Hamburg Outstation,
c/o DESY, Notkestrasse 85,
22603 Hamburg, Germany
Tel. +49-40-89902-121, Fax +49-40-89902-149,
E-mail victor@embl-hamburg.de
Anastassis Perrakis
EMBL Grenoble Outstation,
c/o ILL, Avenue des Martyrs, B.P. 156,
38042 Grenoble CEDEX 9, France
Tel. +33-476-207632, Fax +33-476-207199,
E-mail perrakis@embl-grenoble.fr
References
Any application of ARP_WATERS should actually refer to ARP/wARP version 5.0,
and should cite a relevant publication from the numbered list below:
- ARP93 The original paper describing ARP
- ARP97 Elaborated analysis of the power and limitations of ARP
- wARP97 The original paper describing wARP
- ApARP96 An application of ARP to crystal structure refinement
- wARP98 Elaborated analysis of wARP and its application
1. V. S. Lamzin and K. S. Wilson.
Automated refinement of protein models.
Acta Cryst., D49:129-149, 1993.
2. V. S. Lamzin and K. S. Wilson.
Automated refinement for protein crystallography.
Methods in Enzymology, 277:269-305, 1997.
3. A. Perrakis, T. K. Sixma, K. S. Wilson, and V. S. Lamzin.
wARP: improvement and extension of crystallographic phases by
weighted averaging of multiple refined dummy atomic models.
Acta Cryst., D53:448-455, 1997.
4. D. Pignol, C. Gaboriaud, J. C. Fontecilla-Camps, V. S. Lamzin, and K. S. Wilson.
How to escape from model bias with a high resolution native data set
- structure determination of the PcpA-S6 subunit III.
Acta Cryst., D52:345-355, 1996.
5. E. J. van Asselt, A. Perrakis, K. H. Kalk, and V. S. Lamzin.
Accelerated X-ray structure elucidation of a 36 kDa
muramidase/transglycosylase using wARP.
Acta Cryst., D54:58-73, 1998.
Acknowledgements
The authors are especially grateful to:
- Keith S. Wilson (York, UK), one of the originators of the software;
- Zbyszek Dauter (Brookhaven, USA) and Richard Morris (EMBL Hamburg, Germany) for
significant contributions to the software development;
- Eleanor Dodson (York, UK), Jozef Sevcik (Bratislava, SLO), Phil Evans
(Cambridge, UK), Susanna Butterworth (York, UK), Titia Sixma (NKI Amsterdam,
The Netherlands) and Erik van Asselt (Univ. Groningen, The Netherlands)
for valuable suggestions.
Using ARP_WATERS
Applications
The areas of application of ARP_WATERS (actually ARP/wARP Version 5.0) include:
1. Refinement of MR solutions
If the initial model (a Molecular Replacement solution) needs to be substantially
improved, then unrestrained xyzB reciprocal space refinement may be carried out with
ARP/wARP updating the whole model. Resolution of the data should be
2.0 Å or higher. The output is a set of ARP atoms (the ARP model).
The (3F_o-2F_c / 2mF_o-DF_c, alpha_c) map should be calculated from the ARP model
and analysed carefully (yes, it's graphics time). The initial model or the ARP model is then
rebuilt to fit this map. Very often, if the X-ray resolution is high enough and the
initial model is not completely wrong, the ARP atoms are located at approximately
the true protein atom positions even in the case of unrestrained refinement,
so they can quite happily be used as guides for rebuilding.
Please note that for difficult cases approaches such as that described for application
#4 might work better, even when starting from a molecular replacement solution.
2. Improvement of MIR(AS) phases
ARP/wARP can be used to build a protein-like model consisting of a set of
non-connected atoms (a free-atom model) into a MIR map.
This model is then refined as described above for #1.
3. Averaging of multiple refinements
ARP/wARP can be used to prepare models and command scripts for several independent
refinement runs as described for #1 and #2. The results are then processed in such
a way that each reflection is given a weighted average phase,
alphawARP, and a figure of merit, FOMwARP.
The results, especially at modest resolution, are better than those from a single
ARP/wARP refinement. The (F_o, alphawARP, FOMwARP)
map is then calculated and should be inspected. Resolution of the data should be
2.3 Å or higher.
4. Automatic tracing of the density map and model building
This is not available as part of the CCP4 distribution of ARP/wARP. Please
visit the ARP/wARP homepage at
http://www.arp-warp.org to obtain
the full distribution from the authors.
5. Building of the solvent structure
If the initial model is more or less correct, i.e. has an R factor of about 30% or less,
and essentially only the solvent needs to be improved, restrained (standard) reciprocal
space refinement is carried out with ARP/wARP performing automatic adjustment of the
solvent structure. Resolution of the data should be 2.5 Å or higher.
The output is the protein model with the solvent molecules transformed by symmetry
operations to lie close around the protein. The (3F_o-2F_c / 2mF_o-DF_c, alpha_c)
and (F_o-F_c / mF_o-DF_c, alpha_c) maps should be inspected.
6. Ab initio structure determination for metalloproteins
ARP/wARP was successfully applied to the small, 52 amino acid protein rubredoxin.
This structure could be solved ab initio.
The success was clearly due to the presence of the FeS4 cluster in the
protein.
The positions derived from the Patterson synthesis were used as a starting model.
This initial model gave an R factor of 53% at 0.92 Å resolution. The
resulting ARP model gave an R factor of 16% and a map correlation to the final model
map of 90%. Subsequently a successful solution was also obtained with
X-ray data truncated to 1.6 Å.
Model and Data Requirements
Quality of initial model
As the ARP/wARP real space update of the model is carried out on the basis of
electron density maps calculated with model phases, the starting model for the
refinement should be reasonable. The higher the resolution of the native
dataset the less reasonable the starting model can be: if you have 1
Å data for a metalloprotein, a reasonable model is the metal itself.
Quality of X-ray data
The data normally should be of high resolution. Unrestrained xyzB refinement
with ARP/wARP at lower resolution can potentially lead to a poorer quality density map.
The X-ray data should be complete, especially in the low resolution range
(5 Å and lower).
If the low resolution strong data are systematically incomplete (e.g. missing or
overloaded reflections), the density map, even in the case of a good model, is
usually discontinuous and is inconsistent with the model.
Because ARP/wARP involves updating on the basis of density maps, such discontinuity
can lead to incorrect interpretation of the density and as a result to slow
convergence or even non-interpretable maps.
In general, the number of X-ray reflections should be at least 6 times higher
than the number of atoms in the model.
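The 6:1 rule of thumb above is easy to check before starting a run. The following shell sketch is only an illustration: the function name data_to_atom_ratio_ok is hypothetical (it is not part of ARP/wARP or CCP4), and the two counts are assumed to have been obtained elsewhere, for example from the reflection file header and the coordinate file.

```shell
# Hypothetical helper (not part of ARP/wARP or CCP4): succeeds when the
# number of reflections is at least 6 times the number of atoms.
data_to_atom_ratio_ok() {
    nrefl=$1
    natoms=$2
    [ "$nrefl" -ge $((6 * natoms)) ]
}

# Illustrative numbers only
if data_to_atom_ratio_ok 15000 2000; then
    echo "ratio OK"
else
    echo "too few reflections for the model size"
fi
```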
Limitations
As ARP/wARP runs in conjunction with programs of the CCP4 suite all limitations
of the latter remain.
ARP/wARP itself is limited to:
1. The CCP4 conventions should be set up before running ARP/wARP
2. Density maps and reflection MTZ files in the CCP4 format
3. Maximum map section size is 400,000 points. The maximum number of map sections is
1,000. The maximum number of atoms in extended real space asymmetric unit is 250,000
4. Only acentric space groups (typical for proteins) and P1 are supported
5. ARP/wARP operates with coordinate files in the standard PDB format
Automated Scripts
The full distribution of ARP/wARP contains a number of automated
scripts which are designed to help avoid mistakes and generally improve
the user-friendliness of the programs. These scripts are not provided
with the CCP4 distribution of ARP/wARP (which is in any case substantially older
than the current release of ARP/wARP) and so if you want to use them
you will need to obtain the full distribution from the ARP/wARP homepage
at http://www.arp-warp.org/.
Supplementary Use of ARP_WATERS
After restrained refinement is complete and before using the graphics it is worth knowing
which parts of the model should be corrected.
ARP_WATERS can be used for this purpose.
arp_waters XYZIN input.BRK MAPIN1 3Fo-2Fc.MAP XYZOUT temp << eof
MODE UPDATE ALLATOMS
CELL number number number number number number
SYMMETRY number/string
RESOLUTION number number
REMOVE ATOMS 50 CUTSIGMA 1.0
END
eof
The output of this job will contain a list of the 50 worst atoms (from ARP/wARP's point of view), i.e. those which do not agree with the electron density. These atoms should be inspected first. The input MAPIN1 should be the (3F_o-2F_c / 2mF_o-DF_c, alpha_c) map.
Updating Old Command Files
If you have a working command file from a previous
release just change the ARP part to look like this:
arp_waters XYZIN input_coordinates MAPIN1 3Fo-2Fc_map_file \
MAPIN2 Fo-Fc_map_file_name XYZOUT output_coordinates << eof
MODE UPDATE ALLATOMS/WATERS
[CELL cell parameters]
[REFINE waters/allatoms]
SYMM spacegroup
RESOLUTION resmin resmax
FIND ATOMS number CHAIN string CUTSIGMA number/AUTO
REMOVE ATOMS number CUTSIGMA number [MERGE number] [KEEP ZEROOCC]
END
eof
Keyworded input to ARP_WATERS
The ARP_WATERS input is keyworded. For example to give the cell parameters to
the program we use the keyword CELL followed by the actual numbers, for
instance CELL 40.86 52.34 87.69 90 90 90
An input card may also be followed by a number of subkeywords (this should become
clear on further reading). The first keyword in a file MUST BE
MODE and the last one MUST BE END. Other keywords
may appear in any desired order. The order of the subkeywords has no restrictions.
Different ARP/wARP modes
require different input files and different keywords. Examples are given below.
The slash symbol (/) separates alternative subkeywords.
Only the first four characters of each keyword or subkeyword (except END)
are needed to actually identify it.
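The rules above (MODE first, END last, four-character abbreviations) can be checked before a job is submitted. The following is a minimal sketch, assuming the keyworded input holds one card per line; the function names are hypothetical and this is not the pre-processor built into arp_waters.

```shell
# Hypothetical helpers, not part of ARP/wARP.

# Reduce a keyword to the part that identifies it:
# upper case, first four characters (e.g. resolution -> RESO).
abbrev_keyword() {
    printf '%s\n' "$1" | tr '[:lower:]' '[:upper:]' | cut -c1-4
}

# Succeed if the first non-blank card starts with MODE and the last is END.
# Reads the named file, or standard input when no file is given.
check_card_order() {
    awk 'NF { if (!first) first = substr(toupper($1), 1, 4); last = toupper($1) }
         END { exit !(first == "MODE" && last == "END") }' "$@"
}

abbrev_keyword resolution    # prints RESO
printf 'MODE update waters\nSYMM 4\nRESOLUTION 20 1.5\nREMOVE atoms 10 cutsigma 1.0\nEND\n' |
    check_card_order && echo "card order OK"
```

Only the card order is checked here; the fields of each card are still validated by arp_waters itself.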
The available keywords are:
MODE, CELL,
SYMMETRY, RESOLUTION,
FIND, REMOVE,
REFINE, MIRBUILD,
SHAKEMODEL, LABIN,
LABOUT, END
The Keywords
MODE |
Must be the first keyword.
update allatoms/waters initialises the update
mode. allatoms indicates that both protein and water atoms from the model
will be considered for update. waters indicates that only water atoms
(residue name HOH or WAT) will be updated. Metals will be treated as non-water atoms.
The distance constraints for the addition of new atoms are: the shortest distance
between a new atom and any of the existing atoms is 1.0 Å (allatoms),
or between a new atom and any O or N of the existing atoms is 2.3 Å (waters); the
longest distance is 3.3 Å in both cases.
The distance constraint for removal is 3.5 Å or longer to any of the existing
atoms.
Partially occupied atoms will not be used for merging; their occupancy is
accounted for in removal. These atoms are nevertheless used as seeds (parent atoms) for the
new atom search.
mirbuild initialises the mirbuild mode. The pseudo protein set of
atoms will be placed into the input density map. The distance constraints are 1.1 to 1.8
Å between the atoms.
shakemodel light/allatoms initialises the shaking mode for
a shock-like modification of the current model. light indicates that only atoms
with atomic number 8 (oxygen) or lower will be treated. allatoms indicates
application to any atom in the model regardless of their type.
reflaver initialises the mode of weighted averaging
of structure factors obtained from multiple refinements of several slightly different
models. |
CELL |
Cell parameters a, b, c, alpha, beta, gamma in Å and degrees.
This keyword is optional for MODE update allatoms/waters and shakemodel
light/allatoms and is obligatory for MODE mirbuild.
|
SYMMetry |
The crystal symmetry. Can be given either as a space group name or number
(e.g. P212121 or 19). Obligatory for MODE update allatoms/waters,
mirbuild and shakemodel light/allatoms.
|
RESOlution |
Resolution of the X-ray data (Rmin, Rmax).
Obligatory for MODE update allatoms/waters, mirbuild and
reflaver.
|
FIND |
The addition of new sites in MODE update allatoms/waters.
After atoms you should give the number of atoms to add.
At the end of refinement (it may take 20 to 50 cycles) the model should contain all
atoms. The target number of atoms in the final
model can be estimated by multiplying the number of protein atoms by 1.2, the 20% extra
corresponds both to ordered water molecules and weaker, slightly disordered, ones which
are important for the pseudo
solvent continuum. The number of atoms allowed to be added in each cycle
depends on the resolution. A simple empirical guide is that the maximum number to add
is N × 0.08 / d^3,
where N is the current number of atoms and d is the highest resolution
in Å.
Thus at a resolution of 1.8 Å and a coordinate file of 2,000 atoms the
maximum number to be added is 27. New atoms will be automatically assigned a temperature
factor on the basis of the density height.
The string after chain is the chain identifier for new atoms.
All new atoms will have this chain identifier and be numbered sequentially.
The subkeyword after cutsigma can be either the number or auto. The number
is a MAPIN2 density cutoff. Atoms will be looked for in density above cutsigma
times r.m.s. density.
A value of 3 to 4 is typical. The statistically significant density threshold can be
defined automatically if auto is used. This can be used for MODE
update waters as
it prevents too many extra atoms being added. However it may not work satisfactorily
if the resolution is lower than 1.5 Å or the model is too far from being
finally refined.
|
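The two rules of thumb given for FIND can be evaluated with a few lines of shell. This is a sketch only: the function names are hypothetical, and rounding the per-cycle maximum down is an assumption chosen because it reproduces the worked example in the text (2,000 atoms at 1.8 Å).

```shell
# Hypothetical helpers illustrating the rules of thumb for FIND.

# Maximum number of atoms to add per cycle: N * 0.08 / d^3, rounded down.
# $1 = current number of atoms (N), $2 = highest resolution in Angstrom (d)
max_atoms_to_add() {
    awk -v n="$1" -v d="$2" 'BEGIN { printf "%d\n", int(n * 0.08 / (d * d * d)) }'
}

# Target number of atoms in the final model: 1.2 times the protein atoms.
# $1 = number of protein atoms
target_atom_count() {
    awk -v p="$1" 'BEGIN { printf "%d\n", int(p * 1.2 + 0.5) }'
}

max_atoms_to_add 2000 1.8    # prints 27
target_atom_count 2000       # prints 2400
```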
REMOve |
Removal of atoms in MODE update allatoms/waters. The removal
of atoms influences the success of refinement to a much greater extent than addition
of new atoms and should certainly be used.
The number after atoms is the maximum number of atoms to reject at each cycle.
A value of about 25 to 100% of the number of atoms to be added
is recommended. The actual number will be defined by the program.
The number after cutsigma gives the MAPIN1 density cutoff. Atoms will be
considered for rejection only if they are located in density below cutsigma
times r.m.s. density. A value around 1 is recommended.
The number following the merge keyword is the shortest distance
between two atoms if they are to be merged.
Partially occupied atoms are not used for merging.
The keyword is optional.
Any pair closer than this will be inspected.
In a case of a water-water pair the atom with the higher temperature factor
will be rejected and the second assigned to the weighted average xyz
and 1/B parameters. If any water appears to be at the merging distance
to a non-water (protein or metal) atom, it will be removed.
A merging distance of 0.6 Å is the default for MODE
update allatoms,
and a value of 2.2 Å is recommended for MODE update waters, where the
default is no merging.
keep zeroocc is an optional keyword.
Default is to remove atoms with zero occupancy from the PDB file.
|
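The recommendation that REMOVE atoms should be about 25 to 100% of the FIND atoms number can be turned into a concrete range. A small sketch, with a hypothetical function name; rounding the lower bound up is an assumption.

```shell
# Hypothetical helper: print the suggested range for REMOVE atoms
# (25% to 100% of the FIND atoms number), rounding the lower bound up.
# $1 = number given after FIND atoms
remove_atoms_range() {
    awk -v f="$1" 'BEGIN {
        lo = int(f * 0.25)
        if (lo < f * 0.25) lo++
        printf "%d %d\n", lo, f
    }'
}

remove_atoms_range 27    # prints "7 27"
```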
REFIne |
This initialises the sphericity based real space refinement of individual
atoms. The keyword is optional in MODE update allatoms/waters.
The subkeyword can be either allatoms (all atoms will be refined - not
recommended unless the resolution is about 1.0 Å) or waters (strongly
recommended for MODE update waters,
especially if the resolution is higher than 2.0 Å).
|
MIRBuild |
Obligatory keyword for MODE mirbuild.
The number after atoms indicates the approximate number of atoms to be placed
into the MIR(AS) MAPIN2 map. It should correspond to the total number of atoms expected
to be in the model. The number after models specifies how many different models
can be output. It may be 1, 2 or 3.
These different models are subsequently used for multiple refinement and weighted
averaging.
|
SHAKemodel |
Obligatory keyword for MODE shakemodel.
There are four optional subkeywords.
The number after bexcl is the highest temperature factor cutoff. Atoms with
higher temperature factors will be excluded from the PDB file.
The two numbers after breset define the low and high limits for truncation of
atomic temperature factors.
The number after randomise defines the r.m.s. uniform random shift in
Å to be applied to the coordinate set.
The three numbers after shift define the systematic shift in
Å along the crystallographic axes to be applied to the coordinate set.
|
LABIn |
Obligatory keyword for MODE reflaver. Input MTZ file
labels for structure factors from multiple refinements have to be given, e.g.
FP=FP SIGFP=SIGFP FC1=FC1 PHIC1=PHIC1 etc.
The maximum number of FCx/PHICx is 8. free is optional.
|
LABOut |
Obligatory keyword for MODE reflaver.
Output MTZ file labels for weighted average structure factors, phases and figures
of merit should be provided.
|
END |
Must be the last data card terminating input to ARP/wARP.
|
On-line help
The ARP/wARP input pre-processor gives warnings or error messages if something is wrong.
These should be carefully checked. It is also advisable to check ARP/wARP
input prior to submitting a long refinement job.
Here are a few examples of how the on-line commands can be used. To start just
type 'arp_waters' and then the keyword you are interested in.
arp_waters
END
Input must start with the keyword MODE
arp_waters
MODE
Keyword MODE must be followed by 1 field(s)
Expected format:
MODE update waters/allatoms
MODE mirbuild
MODE shakemodel light/allatoms
MODE reflaver
arp_waters
MODE UPDATE WATERS
Optional keywords:
CELL cell parameters
REFINE waters/allatoms
Required keywords:
SYMM spacegroup
RESOLUTION resmin resmax
FIND ATOMS number CHAIN string CUTSIGMA number/AUTO
and/or REMOVE ATOMS number CUTSIGMA number [MERGE number] [KEEP ZEROOCC]
END (must be the last keyword)
arp_waters
MODE UPDATE WATERS
CELL
An error message:
This Data Card in not understood
Keyword CELL must be followed by 6 field(s)
Expected format:
CELL a b c alpha beta gamma
arp_waters
MODE UPDATE WATERS
CELL 30 45 37 90 90 90 A
This Data Card in not understood
CELL 30 45 37 90 90 90 A
Cannot accept field shown by arrows:
CELL 30 45 37 90 90 90 ==>A<==
arp_waters
MODE UPDATE WATERS
CELL 30 45 37 90 90 90
SYMM 4
RESOLUTION 20 1.5
FIND ATOMS 10 CHAIN W CUTSIGMA 3.0
REMOVE ATOMS 10 CUTSIGMA 1.0
END
Asymmetric unit limits 1/1 1/2 1/1
Comments: Space group 4 P21
Comments: Cell parameters 30.000 45.000 37.000 90.000 90.000 90.000
Comments: Remove 10 old atoms if below 1.0 sigma in MAPIN1
Comments: Analyse waters only for removal
- WARNING - This is not a standard use of ARP
- use of MERGE data card is advisable
Comments: Look for 10 new atoms in MAPIN2
Above threshold of 3.0 sigma
- WARNING - This is not a standard use of ARP
- use of CUTSIGMA AUTO option is recommended
- assuming that MAPIN2 is Fo-Fc map
Comments: New atoms will not be put closer than 2.30 to existing atoms
Comments: New atoms will be selected if there is N or O exists within 3.30
Comments: New atoms will not be put closer than 2.30 to each other
Comments: New atoms will have B-factors assigned on the basis of MAPIN2
- density hight as expected for resolution range 1.50 20.00
- MAPIN2 is assumed to be Fo-Fc map in absolute scale
Comments: New atoms will have chain name W
- No real space refinement will be made
- WARNING - This is not a standard use of ARP
- real space refinement of waters is advisable
So ARP/wARP actually accepts the command file input and the program only gives
comments and warnings (if everything else is formally correct). It will also
make additional checks during the run.
Monitoring and Troubleshooting
Input Processing
ARP/wARP checks that the input cell parameters are identical to those in the coordinate and
map file headers. ARP/wARP does not check whether the cell parameters are meaningful
for the given space group, e.g. it will accept CELL 67.1 82.2 79.9 102.2 98.9 100.3 together with SYMM P212121.
ARP/wARP checks whether the orthogonalisation matrix derived from CELL is consistent with
the matrix written at the top of the coordinate file.
ARP/wARP will refuse to accept a negative value of the number of atoms to update but
does not check whether these numbers are too high, i.e. inconsistent with the formula given above.
ARP/wARP does not check whether the input MAPIN1 is indeed a
(3F_o-2F_c / 2mF_o-DF_c, alpha_c) map or if MAPIN2 is really a
(F_o-F_c / mF_o-DF_c, alpha_c) map.
ARP/wARP does not check the input coordinate file in terms of proper connectivity,
residue and atom names, etc.
Output
ARP/wARP outputs several useful quantities. These are:
- the number of atoms merged;
- the number of atoms removed;
- the sphericity function, indicating whether atoms are well shaped (a value of about 0.05 to 0.10 is reasonable; the lower the better);
- the improvement of the sphericity function, if sphere-based real space refinement is used;
- the statistically significant threshold in difference density for addition of new atoms (if FIND cutsigma auto is given);
- the number of atoms added.
The auto option provides an attempt to be objective in adding atoms.
The actual number of atoms removed depends on both the REMOVE cutsigma
value and the atoms number. If too few atoms are requested for removal while the
structure is being reshuffled, too few new atoms will be found.
If the requested number for removal is high (but still satisfies the formula
given above), more new atoms will be found.
A situation where each cycle ARP/wARP removes fewer than about 2-3 atoms (for a
typical structure of 1,000 to 3,000 atoms) and finds the same number of new ones, and
the R factor does not change, indicates that convergence has been achieved.
There is no reason to run millions of cycles. Usually refinement essentially
converges after 10 to 20 cycles. However if the density is still getting better the
number of cycles can be increased to 50 or even 100.
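The convergence rule of thumb above can be written as a simple test. This sketch assumes the per-cycle numbers have already been extracted from the log by hand or by some other means; the function name is hypothetical and the R factor tolerance of 0.1 (in percent) is an assumption, not a value taken from ARP/wARP.

```shell
# Hypothetical helper: succeed when a cycle looks converged according to the
# rule of thumb above (fewer than 3 atoms removed, the same number added,
# R factor essentially unchanged).
# Arguments: atoms removed, atoms added, previous R (%), current R (%).
# The tolerance of 0.1 on R is an assumption.
cycle_converged() {
    awk -v rem="$1" -v add="$2" -v r0="$3" -v r1="$4" 'BEGIN {
        dr = r1 - r0
        if (dr < 0) dr = -dr
        exit !(rem < 3 && add == rem && dr <= 0.1)
    }'
}

cycle_converged 2 2 18.4 18.4 && echo "converged"
```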
Viewing ARP_WATERS Log Files
It is important to monitor the ARP/wARP output. In general look at log files.
All ARP log files can be formatted for viewing all kinds of interesting graphs with
CCP4 program xloggraph by running 'arp_waters_plots.sh log_file_name'.
Checking Convergence
Several parameters can be used as convergence criteria. The first criterion is
map quality. A map with coefficients (3F_o-2F_c/2mF_o-DF_c, alpha_c)
is calculated from the last ARP model.
The crystallographic R factor is a reasonable quantity to monitor.
What to do if the R factor stays at the values around 30%:
(Check with something like grep 'all_R' logs/1_arp_1.log)
If for example after 5 or 10 cycles, R dropped to 28-34% and stayed there for the
next 10 cycles without any tendency to drop further, you may be in trouble.
Try to change from the Fast to the Slow protocol or vice versa, try to
introduce phase restraints, change advanced parameters, panic, cry, etc.! We are
working on more sensible suggestions all the time, so as a last resort contact us!
Your feedback is needed and appreciated!
Crashing Scripts
Usually CCP4 defines environment MANPATH as complementary to the existing MANPATH.
During execution of remote shells MANPATH does not exist, and this crashes remote
scripts! Copy the ccp4.setup file to your local directory, and simply remove the line
setenv MANPATH, and then set ccp4init to that file.
Please also check (and change if necessary) the line setenv CCP4_OPEN NEW to setenv CCP4_OPEN UNKNOWN.
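The two edits to the setup file described above can be made with sed on a private copy. This is a sketch only: the function name, the output file name and the exact layout of the setenv lines are assumptions, so inspect the result before pointing ccp4init at it.

```shell
# Sketch only: write an edited copy of the CCP4 setup file.
# $1 = original setup file, $2 = edited copy to create
fix_setup() {
    # Drop the "setenv MANPATH ..." line and switch CCP4_OPEN from NEW to UNKNOWN.
    sed -e '/^[[:space:]]*setenv[[:space:]][[:space:]]*MANPATH/d' \
        -e 's/\(setenv[[:space:]][[:space:]]*CCP4_OPEN[[:space:]][[:space:]]*\)NEW/\1UNKNOWN/' \
        "$1" > "$2"
}

# Example call (both paths are placeholders):
# fix_setup /path/to/ccp4.setup ./my_ccp4.setup
```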
Examples
A typical set of ARP/wARP commands for applications #1, 2, 5 and 6
(unrestrained or restrained refinement for MR, MIR, ab initio solutions or building of solvent structure)
could look something
like this:
arp_waters XYZIN input_coordinates MAPIN1 3Fo-2Fc_map_file \
MAPIN2 Fo-Fc_map_file XYZOUT output_coordinates << eof
MODE update allatoms/waters
[CELL cell parameters]
[REFINE waters/allatoms]
SYMM spacegroup
RESOLUTION resmin resmax
FIND atoms number chain string cutsigma number/auto
REMOVE atoms number cutsigma number [merge number] [keep zeroocc]
END
eof
Keywords FIND and REMOVE are half optional; by that we mean that
at least one of them must be given. Both MAPIN1
(3Fo-2Fc / 2mFo-DFc, alpha_c)
and MAPIN2 (Fo-Fc / mFo-DFc, alpha_c) maps must be provided.
Another typical set of ARP/wARP commands, this time for
application #2 (filling the MIR(AS) map with a set of
pseudo protein atoms for further unrestrained refinement or multiple refinements):
arp_waters MAPIN2 Fo-Fc_map_file XYZOUT1/2/3 output_coordinates << eof
MODE mirbuild
CELL cell parameters
SYMM spacegroup
RESOLUTION resmin resmax
MIRBUILD atoms number models number
END
eof
Input MAPIN2 is the available starting map. Several models for multiple refinements are
output to XYZOUT1/XYZOUT2/XYZOUT3.
Yet another typical set of ARP/wARP commands, now for application #3 (obtaining different
independent models for multiple refinement):
arp_waters XYZIN input_file XYZOUT output_file << eof
MODE shakemodel light/allatoms
[CELL cell parameters]
SYMM spacegroup
SHAKEMODEL [ bexcl n1 ] [ breset n1 n2 ] [ randomise x ] [ shift x y z ]
END
eof
And another typical set of ARP/wARP commands, again for application #3 (averaging of multiple
refinements of different independent models):
arp_waters HKLIN mul_ref_Fs \
HKLOUT nice_output << eof
MODE reflaver
RESOLUTION resmin resmax
LABIN input labels for FP SIGFP [FREE] FCx PHICx
LABOUT output labels for FCAVER PHAVER FOMAVER
END
eof
ARP_WATERS and SHELXL
SHELXL is part of the SHELX-97 program package and should
be obtained directly from the author, George M. Sheldrick (Göttingen University);
see the SHELX homepage.
The most common use of ARP/wARP with SHELXL (SHELX-97) is for restrained
refinement with individual atomic anisotropic displacement parameters
(as provided by SHELXL) combined with updating of the solvent
structure by ARP/wARP. This application is limited by the fact that
individual atomic anisotropic displacement parameters can be refined
only if the resolution of the X-ray data is higher than 1.5 Å,
ideally approaching atomic resolution (1.2 Å).
There are currently no automated scripts for this application. An old-style
command shell script is given in the $CEXAM/unix/non-runnable directory
(arp_waters_shelx.com). The script includes iterative runs of the following programs:
1. SHELXL (SHELX-97) for restrained anisotropic refinement
Some recommendations for the shelx.ins file:
CGLS 2. Use of more cycles within SHELXL
lowers the ARP_WATERS contribution
CELL, LATT/SYMM and SHEL should be consistent with
cell, symm and resol in the script
WPDB -1
LIST 3
ISOR and CONN should include O1 > last - as the number
of waters changes with each cycle
See the SHELX-97 Manual for further details.
2. PREPFORM (ARP/wARP Suite) for conversion of SHELXL files
3. F2MTZ (CCP4) for conversion to the CCP4 MTZ format
Column label assignments should be edited if necessary
4. CAD (CCP4) for sorting the MTZ file
Column label assignments should be edited if necessary
5. FFT (CCP4) for map calculation
One map is calculated with coefficients 3Fo-2Fc,
another with Fo-Fc
Column label assignments should be edited if necessary
6. EXTEND (CCP4) for map extension
7. ARP_WATERS (ARP/wARP Suite) for solvent update
The maximum number of atoms to add and to remove
should not exceed the value of 0.08 × N / dmax^3,
where N is the current number of atoms in the model and
dmax is the high resolution limit.
8. PREPSHEL (ARP/wARP Suite) for back conversion to SHELXL format
When writing a shell script take care to define the following variables
at the top of the file:
name (root file name), last (starting file number),
cycles (number of refinement cycles), count, title, resol
(resolution limits), cell (cell parameters), grid (grid for
map calculation), xyzlim (boundaries for real space
asymmetric unit for ARP_WATERS), symm (space group number) and
sfsg (space group for map calculation)
Simple toxd example script found in $CEXAM/unix/runnable/
arp_waters.exam
(Example of finding waters.)
Comprehensive example scripts found in $CEXAM/unix/non-runnable/
arp_waters_refmac.com
arp_waters_sfall.com
arp_waters_shelx.com
SEE ALSO
protin
refmac
fft
mapmask