rsps
[Keyworded input]
RSPS is a command-driven program intended to help protein crystallographers solve their heavy atom derivatives. The program can also be used as an interactive tool to examine the fit of potential heavy atom sites to the difference Patterson map. The program can handle all acentric spacegroups. RSPS version 4.0 may also be used to locate molecules with NCS. The goal of RSPS is not to generate a complete solution to the heavy atom difference Patterson, but rather to find enough sites to allow initial phases to be calculated for difference Fourier analysis. The program as such will not provide absolute answers and it is therefore very useful if you have at least a rudimentary understanding of the Patterson function, and of how to solve it, before you start using RSPS. See for example Stout & Jensen (1989), chapter 12; Blundell & Johnson (1976), chapter 11.
RSPS is a grid search program that provides search options as well as options for examining potential solutions. All options operate in real and vector space. Searches can be performed to locate either heavy atom positions, or, under certain conditions, to locate the position of molecules with internal symmetry. Searches are carried out by assigning trial positions on a grid covering the asymmetric unit of the crystal, and then computing a score for each trial position, based on the Patterson densities at the positions corresponding to the predicted vectors for each position. From the symmetry operators (crystallographic and/or non-crystallographic) all unique transformations that map a point in real (crystal) space to a point in vector (Patterson) space are generated. In other words, these transformations map a point in real space to the Patterson vectors associated with that point.
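The mapping from a real-space point to its Patterson vectors can be illustrated with a minimal Python sketch. It is not taken from the RSPS source: the function names are hypothetical, the operators are the standard general positions of P212121, and the folding step assumes Pmmm Patterson symmetry (each coordinate reduced to the range 0 to 1/2). For the position 0.6322 0.5519 0.0385 it gives vectors that agree, to within rounding of the printed coordinates, with the sample vlist listing later in this document.

```python
# Symmetry operators of P212121 as (rotation matrix, translation) pairs.
P212121 = [
    (((1, 0, 0), (0, 1, 0), (0, 0, 1)), (0.0, 0.0, 0.0)),    # x, y, z
    (((-1, 0, 0), (0, -1, 0), (0, 0, 1)), (0.5, 0.0, 0.5)),  # 1/2-x, -y, 1/2+z
    (((-1, 0, 0), (0, 1, 0), (0, 0, -1)), (0.0, 0.5, 0.5)),  # -x, 1/2+y, 1/2-z
    (((1, 0, 0), (0, -1, 0), (0, 0, -1)), (0.5, 0.5, 0.0)),  # 1/2+x, 1/2-y, -z
]


def apply_op(op, r):
    """Apply one symmetry operator to a fractional position r."""
    rot, trans = op
    return tuple(
        sum(rot[i][j] * r[j] for j in range(3)) + trans[i] for i in range(3)
    )


def fold_mmm(v):
    """Reduce a vector to 0..1/2 on each axis (assumes Pmmm Patterson symmetry)."""
    folded = []
    for c in v:
        u = c % 1.0
        folded.append(min(u, 1.0 - u))
    return tuple(folded)


def harker_vectors(r, ops):
    """Vectors between r and each of its symmetry mates (identity skipped)."""
    vectors = []
    for op in ops[1:]:
        mate = apply_op(op, r)
        vectors.append(fold_mmm(tuple(r[i] - mate[i] for i in range(3))))
    return vectors


for vec in harker_vectors((0.6322, 0.5519, 0.0385), P212121):
    print("%.4f %.4f %.4f" % vec)
```

Other spacegroups would need their own operator list and a folding rule appropriate to their Patterson symmetry.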
Search options are divided into six groups depending on the vector set to be used in the search. The available sets are of two main types: atom vector sets that are used to search for the position of heavy atoms, and molecule vector sets that are used to find the position of molecules with NCS when at least one NCS axis is closely parallel to a crystallographic symmetry axis. The two main types of vector sets each have three subcategories termed single, more, and translate. With single vector sets (selected by VECTORSET SINGLE), the position of one site at a time is determined by considering vectors between symmetry related positions. This type of search is referred to as a single site search. When only SGS (spacegroup symmetry) is used in a single site search, this corresponds to using only Harker vectors, i.e. vectors between SGS-related positions. When NCS is applied, cross vectors that may be termed pseudo-Harker vectors will be generated between NCS-related positions. The combination of SGS and NCS will in addition generate cross vectors between positions on different copies of the NCS protein molecule. Once at least one atom position (or NCS molecule) has been found, it may be fixed and used to search for further sites by considering cross vectors from trial sites to the fixed site(s) (using VECTORSET MORE). VECTORSET TRANSLATE is used to search simultaneously for two or more atoms provided that their inter-atomic vectors are known. Both SGS and NCS may be independently switched on and off, giving a very flexible means of controlling the type of vectors to be used in the search.
With the search options, a volume of the unit cell (usually one asymmetric unit) is scanned, and test points assigned on a grid within this volume. In single and more site searches, the scan parameter is the coordinates of a heavy atom position whereas in translate mode the scan parameter is the translation of the rigid fragment. For each test point, the values of the Patterson function at the predicted Patterson vector positions are collected and combined in some way (presently sum, product and harmonic mean functions are available). All or only the minimum N peaks may be used in the scoring function. The value of the combining function (the "score") is stored on a CCP4 format map (the "scoremap", defined with the SCORFILE command). A rejection level for peaks may be specified (REJECT command) together with a limit for the maximum number of peaks allowed to be less than the rejection level (LOW command). When this limit is passed, further work on that test position is aborted and it is given a score of zero. In this way computation for obviously wrong solutions may be aborted at an early stage which may considerably reduce the time needed for the search. The combined use of this rejection scheme and judicious use of the minimum function leads to a flexible set of scoring options well suited to accommodate the varying degree of "vectorness" of a protein difference Patterson function.
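The scoring and rejection scheme described above can be sketched in Python as follows. This is an illustration only: the function and argument names are hypothetical, the exact formulas and scaling used inside RSPS are not given in this document, and the harmonic mean is assumed to be defined only for strictly positive densities.

```python
import math


def combine(values, func="sum"):
    """Combine Patterson densities with one of the three scoring functions."""
    if func == "sum":
        return sum(values)
    if func == "product":
        return math.prod(values)
    if func == "harmonic":
        # Assumption: non-positive densities give a zero score.
        if any(v <= 0.0 for v in values):
            return 0.0
        return len(values) / sum(1.0 / v for v in values)
    raise ValueError("unknown scoring function: %s" % func)


def score_trial(densities, func="sum", n_min=None, reject=None, low=0):
    """Score one trial position from the densities at its predicted vectors.

    reject : densities below this level count as "low" peaks
    low    : maximum number of low peaks tolerated before aborting
    n_min  : if given, only the n_min smallest densities are combined
    """
    n_low = 0
    for d in densities:
        if reject is not None and d < reject:
            n_low += 1
            if n_low > low:
                return 0.0  # abort early: too many low peaks
    used = sorted(densities)
    if n_min is not None:
        used = used[:n_min]
    return combine(used, func)


print(score_trial([1.0, 2.0, 4.0], "sum"))
print(score_trial([1.0, 2.0, 4.0], "sum", reject=1.5, low=0))
```

The early return is what saves time in a real search: once a trial position has more low peaks than the limit allows, the remaining vectors need not be looked up at all.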
The map resulting from a search will contain peaks at the positions of possible heavy atom (or molecule) sites and at positions related to these by a shift in origin or by inversion. The scoremap may be picked using the PICK SCOREMAP command to generate a list of potential heavy atom sites. Coordinates of (potential) heavy atom sites may also be read from an external file using the READ command. Coordinates are stored in the main coordinate array (see below). Picking or reading new coordinates will overwrite any previously stored coordinates. Coordinates for up to 800 positions may be stored. Each position is given a position number, which is just a sequential numbering of the stored coordinates, and a site number that groups together positions that generate the same set of Harker vectors. The site number thus groups positions related by inversion and/or origin shift (or spacegroup symmetry), i.e. different representations of the same solution. The assignment of site numbers may be quite slow for high-symmetry spacegroups. Also note that, due to rounding errors, positions with the same site number will not necessarily have exactly the same score, although in most cases they will. The position and site numbers may be used to reference positions for use with the VLIST and FIXXYZ commands.
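The idea behind site numbers can be sketched in Python by grouping positions whose Harker vector sets coincide. The sketch below is an assumption-laden illustration, not the RSPS algorithm: it is hard-wired to P212121, folds vectors using Pmmm Patterson symmetry, and compares vectors after rounding to three decimals. The test positions are taken from the sample pick scoremap listing in this document, plus one unrelated position.

```python
# Non-identity general positions of P212121, written as functions.
P212121_OPS = [
    lambda x, y, z: (0.5 - x, -y, 0.5 + z),
    lambda x, y, z: (-x, 0.5 + y, 0.5 - z),
    lambda x, y, z: (0.5 + x, 0.5 - y, -z),
]


def fold(c):
    """Reduce one vector component to 0..1/2 (Pmmm Patterson symmetry)."""
    u = c % 1.0
    return min(u, 1.0 - u)


def harker_key(r, ndigits=3):
    """Rounded, sorted Harker vector set used to compare positions."""
    vectors = []
    for op in P212121_OPS:
        mate = op(*r)
        vectors.append(tuple(round(fold(a - b), ndigits) for a, b in zip(r, mate)))
    return tuple(sorted(vectors))


def assign_sites(positions):
    """Give the same site number to positions with identical Harker sets."""
    seen = {}
    numbers = []
    for r in positions:
        numbers.append(seen.setdefault(harker_key(r), len(seen) + 1))
    return numbers


positions = [
    (0.6322, 0.5519, 0.0385),
    (0.6322, 0.9481, 0.0385),
    (0.8678, 0.5519, 0.0385),
    (0.1322, 0.5519, 0.0385),
    (0.1567, 0.1008, 0.1698),  # a different site
]
print(assign_sites(positions))
```

Comparing rounded vectors is a shortcut; a tolerance-based comparison would be more robust for positions that fall close to a rounding boundary.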
Given a list of potential heavy atom positions, the GETSETS option may be used to search for sets of positions. This is done by looking at the cross vectors between all pairs of atoms (and their symmetry related equivalents) in the list. If a pair of positions pass the rejection criteria as specified by REJECT and LOW they are flagged as connected, otherwise they are flagged as unconnected. The program then finds all sets where all pairs of positions are flagged as connected. The output from GETSETS consists of, for each set, the coordinates of the positions in the set and a score table giving the score for the vectors generated by these positions. The TABLE command may be used to write out the score table for any currently stored set, or for a user defined set. The command LIST SETS will give a summary listing of all stored sets.
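Finding sets in which every pair of positions is connected is the clique problem on the connectivity graph. The sketch below uses the Bron-Kerbosch algorithm as a stand-in; the document does not say which algorithm GETSETS uses, and the sketch reports only maximal sets, which is an assumption. The function name and the input format (position indices plus a list of connected pairs) are hypothetical.

```python
def maximal_sets(n, connected_pairs):
    """Return all maximal sets of mutually connected positions.

    n               : number of positions, indexed 0..n-1
    connected_pairs : iterable of (i, j) pairs flagged as connected
    """
    adjacent = {i: set() for i in range(n)}
    for a, b in connected_pairs:
        adjacent[a].add(b)
        adjacent[b].add(a)

    found = []

    def expand(current, candidates, excluded):
        # A set is maximal when nothing can be added to it.
        if not candidates and not excluded:
            found.append(sorted(current))
            return
        for v in list(candidates):
            expand(current | {v}, candidates & adjacent[v], excluded & adjacent[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(set(), set(range(n)), set())
    return sorted(found)


# Positions 0, 1 and 2 are mutually connected; 3 is connected to 2 only.
print(maximal_sets(4, [(0, 1), (0, 2), (1, 2), (2, 3)]))
```

The number of candidate sets can grow quickly with the length of the position list, which is one reason to keep the rejection criteria reasonably strict before running this kind of search.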
A single site search in a polar spacegroup will result in a list of potential positions where one coordinate has not been determined (this will have been arbitrarily set to zero by the program). The POLARSCAN option may then be used to try to relate different solutions from the single site search to the same origin by fixing one position and translating the others, one at a time, along the polar axis. Scores based on all the cross vectors between the fixed and the translated position (and their symmetry-related equivalents) are computed as a function of displacement along the polar axis and stored. This type of search does not produce a scoremap; instead, the coordinates are put directly in the main positions storage area, where they will overwrite any previously stored coordinates.
The VLIST option is used to examine potential solutions. If the Patterson map has been picked (PICK PATTERSON) the list of stored peaks is searched to find peaks close to the predicted vectors. The largest peak within 2.5 grid divisions from a predicted vector is listed together with the distance (in Angstrom) from the predicted vector. Note that the Patterson peaks as such are never used in a search.
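The peak lookup can be sketched in Python as below. Several things here are assumptions rather than documented behaviour: the distance is taken as a Euclidean distance in grid units with periodic wrapping, the cell is treated as orthogonal, no symmetry reduction of the peaks is attempted, and the grid of 36 x 72 x 160 divisions is inferred from the sample pick patterson listing in this document. The peaks are numbers 4, 7 and 9 from that listing.

```python
import math


def wrapped_delta(a, b):
    """Shortest fractional difference a - b, allowing for lattice translations."""
    d = (a - b) % 1.0
    return d - 1.0 if d > 0.5 else d


def match_peak(vector, peaks, grid, cell, max_div=2.5):
    """Return (number, height, distance in Angstrom) of the highest stored
    peak within max_div grid divisions of a predicted vector, or None."""
    best = None
    for number, frac, height in peaks:
        delta = [wrapped_delta(p, q) for p, q in zip(frac, vector)]
        grid_dist = math.sqrt(sum((d * n) ** 2 for d, n in zip(delta, grid)))
        if grid_dist > max_div:
            continue
        angstrom = math.sqrt(sum((d * c) ** 2 for d, c in zip(delta, cell)))
        if best is None or height > best[1]:
            best = (number, height, angstrom)
    return best


peaks = [
    (4, (0.2882, 0.5000, 0.1675), 1.04),
    (7, (0.2632, 0.5000, 0.4225), 1.01),
    (9, (0.5000, 0.4175, 0.3705), 0.94),
]
grid = (36, 72, 160)            # inferred, see above
cell = (51.56, 99.60, 216.166)  # orthogonal cell from the sample PDB file

print(match_peak((0.2645, 0.5000, 0.4231), peaks, grid, cell))
```

With these inputs the predicted vector is matched to peak 7, as in the sample vlist listing.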
There are two areas for storing coordinates: the main coordinate array and the FIXXYZ coordinate array. Coordinates from picking the scoremap, as well as coordinates read from a file using the READ command are stored in the main coordinate array. When new positions are stored here these will overwrite any previously stored coordinates. Coordinates may be inserted in the FIXXYZ coordinate array by copying from the main coordinate array or by explicitly giving the fractional coordinates. Coordinates in the FIXXYZ area are stored until explicitly deleted (DELETE FIXXYZ). The number of positions that may be stored in the different coordinate storage areas are defined by the MXPICK and MXIPSN parameters. These parameters are set in the include file rspsdim4.inc. Values of the parameters in the distributed program are listed under "Space limitations" below.
Commands are listed below in bold face, arguments in italics, optional subcommands and arguments are given in parentheses (), and defaults in square brackets []. Angle brackets < > are used to enclose a list of alternative subcommands or arguments, separated by |. Curly brackets {} indicate that the enclosed subcommand may be repeated, a suffixed number sometimes indicating the maximum number of repetitions. Commands may be abbreviated, and several commands may be entered on a single line separated by semicolons (";"), with one exception: the macro command "@ filename" must be on a line by itself. A command line may be continued over several input lines by appending a hyphen ("-") or ampersand ("&") at the end. Arguments and subcommands should be separated by tabs or spaces. Commands may be given in upper or lower case. Entering a command without its argument(s) echoes back the current value of the argument(s). Anything following an exclamation mark ("!") or a hash sign ("#") is treated as a comment and ignored, as are blank lines.
The available commands are:
Alphabetic listing:
@ filename, add/subtract, bump, cell, delete, exit, fixxyz, getsets, help, list, low, ncsrot, ncsymm, patfile, pick, polarscan, print, quit, read, reject, rotate, scan, score, scorfile, sgsymm, spacegroup, status, table, tolerance, vectorset, vlist, weight, write
    n = 1   w(i) = 1.0
    n = 2   w(i) = mult(i), the multiplicity of peak i

The Patterson peaks are multiplied by w(i) before forming the score, so that using weight 2 with the sum function results in a correlation function between the observed and predicted (but unscaled) Patterson vectors.
    code = 1   orthogonal x y z along a, c*xa, c* (Brookhaven, default)
         = 2   b, a*xb, a*
         = 3   c, b*xc, b*
         = 4   a+b, c*x(a+b), c*
         = 5   a*, cxa*, c (Rollett)

To initialize the NCS arrays use delete ncs. The cell dimensions must be known prior to reading non-crystallographic symmetry. It is useful to put the NCS definition in a macro.
    n = 0   minimum output
    n = 1   a little more output (default)
    n = 2   prints out symmetry operations and score statistics for each section during a scan
    n = 3   prints out real to vector space transformations
The volume of the Patterson function needed is listed below for each spacegroup. In all cases this volume is the smallest possible box that includes at least one full asymmetric unit. The limits correspond to the default limits used in the CCP4 FFT programs. If a map covering a full unit cell is provided, the program will recognize this and skip the reduction of peaks to the asymmetric unit, thus saving computing time at the cost of space.
Patterson symmetry | Spacegroups | Limits along x | Limits along y | Limits along z |
---|---|---|---|---|
P-1 | P1 | 0 to 1 | 0 to 1/2 | 0 to 1 |
P2/m unique axis b | P2, P21 | 0 to 1/2 | 0 to 1/2 | 0 to 1 |
C2/m | C2 | 0 to 1/2 | 0 to 1/4 | 0 to 1 |
Pmmm | P222, P2221, P21212, P212121 | 0 to 1/2 | 0 to 1/2 | 0 to 1/2 |
Cmmm | C2221, C222 | 0 to 1/2 | 0 to 1/4 | 0 to 1/2 |
Fmmm | F222 | 0 to 1/4 | 0 to 1/4 | 0 to 1/2 |
Immm | I222, I212121 | 0 to 1/2 | 0 to 1/4 | 0 to 1/2 |
P4/m | P4, P41, P42, P43 | 0 to 1/2 | 0 to 1/2 | 0 to 1/2 |
I4/m | I4, I41 | 0 to 1/2 | 0 to 1/2 | 0 to 1/4 |
P4/mmm | P422, P4212, P4122, P41212, P4222, P42212, P4322, P43212 | 0 to 1/2 | 0 to 1/2 | 0 to 1/2 |
I4/mmm | I422, I4122 | 0 to 1/2 | 0 to 1/2 | 0 to 1/4 |
P-3 | P3, P31, P32 | 0 to 2/3 | 0 to 2/3 | 0 to 1/2 |
R-3 hexagonal axes | R3 hexagonal axes | 0 to 2/3 | 0 to 2/3 | 0 to 1/6 |
P-31m | P312, P3112, P3212 | 0 to 2/3 | 0 to 1/2 | 0 to 1/2 |
P-3m1 | P321, P3121, P3221 | 0 to 2/3 | 0 to 1/3 | 0 to 1 |
R-3m hexagonal axes | R32 hexagonal axes | 0 to 2/3 | 0 to 2/3 | 0 to 1/6 |
P6/m | P6, P61, P65, P62, P64, P63 | 0 to 2/3 | 0 to 1/2 | 0 to 1/2 |
P6/mmm | P622, P6122, P6522, P6222, P6422, P6322 | 0 to 2/3 | 0 to 1/3 | 0 to 1/2 |
Pm-3 | P23, P213 | 0 to 1/2 | 0 to 1/2 | 0 to 1/2 |
Fm-3 | F23 | 0 to 1/2 | 0 to 1/2 | 0 to 1/4 |
Im-3 | I23, I213 | 0 to 1/2 | 0 to 1/2 | 0 to 1/2 |
Pm-3m | P432, P4232, P4332, P4132 | 0 to 1/2 | 0 to 1/2 | 0 to 1/2 |
Fm-3m | F432, F4132 | 0 to 1/2 | 0 to 1/4 | 0 to 1/4 |
Im-3m | I432, I4132 | 0 to 1/2 | 0 to 1/2 | 0 to 1/4 |
The table below is a summary of the search options available using the scan command. In addition, the special search options polarscan and getsets are available.
VECTORSET | SGSYMM | NCSYMM | Description |
---|---|---|---|
SINGLE ATOMS | ON | ON/OFF | Single-site search using vectors between symmetry-related positions. When only SGS is used, this will be a search using Harker vectors only. NCS symmetry in addition generates pseudo-Harker cross vectors between NCS-related positions, and cross vectors between different NCS copies of the protein molecule. |
SINGLE ATOMS | OFF | ON using only rotational part | Locate positions related by NCS from the translation-independent cross vectors (pseudo-Harker vectors) between NCS-related positions. The positions will be displaced from their true position by a vector t which may be found in a TRANSLATE ATOMS scan. |
MORE ATOMS | ON/OFF | ON/OFF | Given one or more fixed positions, find additional sites by looking at cross-vectors to the fixed sites. Harker vectors for potential solutions may then be examined by using the VLIST command. |
TRANSLATE ATOMS | ON/OFF | ON/OFF | Translate two or more positions as a rigid body. These positions may come from |
SINGLE MOLECULES | ON | ON using only rotational part | Find location of symmetric molecule using the structure-invariant subset of Harker vectors |
MORE MOLECULES | ON | ON using only rotational part | Given the position of one or more molecules with NCS, find the position of additional molecules using the structure-invariant subset of cross vectors |
TRANSLATE MOLECULES | ON | ON using only rotational part | Translate two or more NCS molecules with a fixed separation as a rigid body |
1) Command procedure to run single atom search, then more atom search to top position from single atom search:
    # Command procedure file starts here
    #!/bin/csh -f
    #
    #
    #
    rsps04_2 << eof-rsps >& rsps_fre1.log
    # RSPS example script for flavin oxidase/reductase
    # Au anomalous (peak) data to 4.0 A
    spacegroup P212121
    patfile auanopatt.map reset origin 8 0
    scorfile /nfs/scr_slu5/stefan/rsps.map
    pick patterson 200
    # Single site scan of asymmetric unit
    # Only Harker vectors will be considered
    scan
    pick scoremap
    vlist site 1 4
    write positions fre_single.pdb
    # Fix top site and look for more atoms
    # Now only cross vectors will be considered
    fix site 1
    vectorset more atoms
    scan
    pick scoremap
    vlist site 1 20
    write positions fre_more.pdb
    # Check Harker vectors for sites found in more atoms scan
    vec si at
    vlist site 1 20
    exit
    eof-rsps
    #
    # The command procedure file ends here
2) One could alternatively make the following macros:
fre:

    # RSPS example script for flavin oxidase/reductase
    # Au anomalous (peak) data to 4.0 A
    # spacegroup and file definitions
    spacegroup P212121
    patfile auanopatt.map reset origin 8 0
    scorfile /nfs/scr_slu5/stefan/rsps.map
    pick patterson 200

sscan:

    # Single atom scan of asymmetric unit
    scan
    pick scoremap
    vlist site 1 4
    write positions fre_single.pdb

mscan:

    # Fix top site and look for more atoms
    # Now only cross vectors will be considered
    fix site 1
    vectorset more atoms
    scan
    pick scoremap
    vlist site 1 20
    write positions fre_more.pdb

hvlist:

    # List Harker vectors for top sites
    vectorset single atoms
    vlist site 1 20
and then run the command procedure
    # Command procedure file starts here
    # This procedure for flavin oxidase/reductase Au anomalous
    #
    #
    rsps04_2 << eof-rsps
    @ fre
    @ sscan
    @ mscan
    @ hvlist
    exit
    eof-rsps
    #
    #
    # The command procedure file ends here
In reality, one would run RSPS interactively rather than as a batch job, starting with a vectorset single scan as above, and then keep adding more and more atoms by repeatedly fixing sensible looking sites and carrying out more atoms scans. As more atoms are added to the fixxyz list, more and more vectors will be considered in the search, and rejection criteria may have to be relaxed in order to find additional sites. But remember that the goal is not to find all of the sites, just enough to start phasing.
3) Command procedure to run single atom and polarscan search in P21
    # Command procedure file starts here
    # This procedure for Mutase spacegroup P21
    #
    rsps04_2 << eof-rsps
    spacegroup P21
    patfile mmcm_hgac.pat reset origin 8 0
    scorfile /nfs/scr_slu5/stefan/rsps.map
    title "HgAc - M483 diff patt 15 - 6 A"
    # Run single site search - this will be over one section perpendicular
    # to the y axis for P21 second setting
    scan
    pick scoremap
    write positions harker.pdb
    # The top position is fixed and all positions translated along the
    # polar axis (b). The vectorset is automatically set to MORE ATOMS when
    # issuing the POLARSCAN command. The positions found are put
    # directly into the main coordinate storage area and so no
    # PICK SCOREMAP command is needed.
    fix pos 1 ; polarscan pos 1 50
    write positions polarscan.pdb
    exit
    eof-rsps
    #
    # The command procedure file ends here
4) Often the direction of a non-crystallographic symmetry (NCS) axis may be found from e.g. a rotation function whereas the position of the NCS axis is harder to find. The vectors between heavy atom positions related by NCS are independent of the position of the NCS axis, and thus these vectors may be used to find such heavy atom positions. To do this in RSPS, the spacegroup symmetry is switched off and a single atom scan carried out using only the NCS. The positions related by the NCS may then be located in the cell by translating them as a rigid fragment and considering the vectors to positions in SGS related fragments. The following command procedure is an example of such a search for spinach Rubisco in spacegroup C2221 with a 4-fold NCS axis almost parallel to the c axis.
    # Command procedure file starts here
    #
    rsps04_2 << eof-rsps
    CELL 157.20 157.20 201.30 90.00 90.00 90.00
    SPACEGROUP C2221
    PATFILE merc.pat RESET ORIGIN 8 0 RESET 0.5 0.0 0.5 6 0
    # Define non-crystallographic symmetry.
    # Translations set to 0,0,0
    #
    NCSROT POLAR -1.8 0.0 90.0 0 0 0
    NCSROT POLAR -1.8 0.0 180.0 0 0 0
    NCSROT POLAR -1.8 0.0 270.0 0 0 0
    #
    # Set scoring parameters
    # LOW is set very high to allow any number of low peaks
    # (alternatively REJECT could have been set at zero).
    #
    LOW 100
    WEIGHT 2
    #
    # First select single atom vectorset and switch spacegroup symmetry off
    # and non-crystallographic symmetry on; scan asymmetric unit.
    #
    VECTORSET SINGLE ATOMS
    SGSYMM OFF
    NCSYMM ON
    SCORFILE rsps_ncs.map ; SCAN AU
    PICK SCOREMAP
    #
    # Select top position and apply the four-fold rotation;
    # store the resulting four positions in the FIXXYZ area.
    #
    ROTATE POS 1 NCS 1 4 FIX 1
    #
    # Now select the translate atoms vectorset, and switch non-crystallographic
    # symmetry off and spacegroup symmetry on. The four positions in the FIXXYZ
    # area will be translated through the unit cell as a rigid body,
    # and vectors (Harker and cross) to positions in symmetry related
    # fragments used to form the score. High values in the resulting
    # scoremap will indicate possible translation vectors to be added to
    # the FIXXYZ positions to give the correct coordinates of the
    # four-fold related sites.
    #
    VECTORSET TRANSLATE ATOMS
    SGSYMM ON
    NCSYMM OFF
    SCORFILE rsps_tran.map ; SCAN LIMITS 0 1 0 1 0 1
    PICK SCOREMAP
    QUIT
    eof-rsps
    #
    #Command procedure file ends here
5) If a cross vector u can be identified in the Patterson function a "two-site" search may be carried out. One position is then set to x (e.g. 0,0,0) and another is placed at x + u. The two positions are stored in the fixxyz area, and a vectorset translate scan carried out to find the translation that correctly locates the two sites in the unit cell. Note that the search is over the entire unit cell since the atoms related by the selected cross vectors could be anywhere in the cell (by selecting a Patterson peak from the peak list we have arbitrarily chosen coordinates for the cross vector reduced to the asymmetric unit of the Patterson.)
    # Command procedure file starts here
    #
    rsps04_2 << eof-rsps
    CELL 157.20 157.20 201.30 90.00 90.00 90.00
    SPACEGROUP C2221
    PATFILE merc.pat RESET ORIGIN 8. 0. RESET 0.5 0 0.5 6. 0.
    #
    # A cross vector at 0.3041 0.0541 0.000 was identified in the
    # Patterson function. Thus one site is fixed at 0,0,0 and another at
    # 0.3041 0.0541 0.000 (alternatively if the Patterson had been picked
    # the second position could have been fixed using the FIXXYZ PEAK
    # command).
    #
    FIXXYZ 0. 0. 0.
    FIXXYZ 0.3041 0.0541 0.000
    #
    # The two sites are translated as a rigid fragment to
    # find their location in the unit cell
    #
    VECTORSET TRANSLATE ATOMS
    SGSYMM ON
    NCSYMM OFF
    SCORFILE rsps_tran.map ; SCAN LIMITS 0 1 0 1 0 1
    PICK SCOREMAP
    EXIT
    eof-rsps
    #
    #Command procedure file ends here
Output generated by pick patterson (flavin reductase Au anomalous Patterson, spacegroup P212121). Only the top 10 peaks are shown.
    PICK >> The 200 highest peaks above 0.0 are listed in descending order

    Peak    Fractional coordinates     Angstrom coordinates      Grid coordinates     Value     S/N
    ----   ------------------------  ------------------------  ----------------------  ---------  -------
      1     0.0000  0.1532  0.2009     0.00   15.26   43.43       0    11    32         1.41      6.0
      2 H   0.5000  0.4499  0.1256    25.78   44.81   27.16      18    32    20         1.33      5.6
      3     0.1704  0.0000  0.0203     8.78    0.00    4.39       6     0     3         1.13      4.8
      4 H   0.2882  0.5000  0.1675    14.86   49.80   36.22      10    36    27         1.04      4.4
      5     0.1395  0.0000  0.0303     7.19    0.00    6.56       5     0     5         1.02      4.3
      6     0.1922  0.0300  0.0000     9.91    2.99    0.00       7     2     0         1.02      4.3
      7 H   0.2632  0.5000  0.4225    13.57   49.80   91.33       9    36    68         1.01      4.3
      8     0.0548  0.0990  0.0000     2.82    9.86    0.00       2     7     0         0.99      4.2
      9 H   0.5000  0.4175  0.3705    25.78   41.58   80.09      18    30    59         0.94      4.0
     10 H   0.5000  0.3715  0.1043    25.78   37.00   22.55      18    27    17         0.90      3.8
An "H" after the peak number in the "Peak" column indicates that the peak is on a Harker section.
Output generated by pick scoremap on a single site scoremap:
    PICK >> 50 peaks found; these are listed in descending order

    PosnN   Fractional coordinates     Angstrom coordinates      Score    Site
    -----  ------------------------  ------------------------  ---------  ----
      1     0.6322  0.5519  0.0385    32.60   54.97    8.31       3.39      1
      2     0.6322  0.9481  0.0385    32.60   94.43    8.31       3.39      1
      3     0.6322  0.0519  0.0385    32.60    5.17    8.31       3.39      1
      4     0.6322  0.4481  0.0385    32.60   44.63    8.31       3.39      1
      5     0.8678  0.5519  0.0385    44.74   54.97    8.31       3.39      1
      6     0.8678  0.9481  0.0385    44.74   94.43    8.31       3.39      1
      7     0.8678  0.0519  0.0385    44.74    5.17    8.31       3.39      1
      8     0.8678  0.4481  0.0385    44.74   44.63    8.31       3.39      1
      9     0.1322  0.5519  0.0385     6.82   54.97    8.31       3.39      1
     10     0.1322  0.9481  0.0385     6.82   94.43    8.31       3.39      1
     11     0.1322  0.0519  0.0385     6.82    5.17    8.31       3.39      1
     12     0.1322  0.4481  0.0385     6.82   44.63    8.31       3.39      1
     13     0.3678  0.5519  0.0385    18.96   54.97    8.31       3.39      1
     14     0.3678  0.9481  0.0385    18.96   94.43    8.31       3.39      1
     15     0.3678  0.0519  0.0385    18.96    5.17    8.31       3.39      1
     16     0.3678  0.4481  0.0385    18.96   44.63    8.31       3.39      1
      :
      :
Only the 16 first positions (8 possible origins in P212121 x 2 hands), representing one unique site, are shown here.
Output generated by vlist site 1 with positions above stored:
    ****************************************************************************
    Harker vectors for a heavy atom position at 0.6322 0.5519 0.0385:

    Vec     U       V       W       Rho    Multiplicity  Peak  Distance
    ---  ------  ------  ------  -------  ------------  ----  --------
     1   0.2355  0.1038  0.5000    0.65        1          31     0.22
     2   0.2645  0.5000  0.4231    0.98        1           7     0.14
     3   0.5000  0.3962  0.0769    0.77        1          12     0.13

    Score = 3.39 with 0 low peaks
    Rmsd peak positions = 0.1694
    Rmsd peak heights   = 0.6132
    Matching index      = 0.8297
    ****************************************************************************
The fractional coordinates along the cell axes of the three Harker vectors are listed together with the value of
the Patterson function and the relative multiplicity of each vector. The "Peak" column shows the number
of the highest stored Patterson peak within 2.5 grid divisions from the calculated position and "Distance"
is the actual distance (in Angstrom). (If the Patterson map hasn't been picked these columns are absent). The
"matching index" is calculated as
M = ( 1 + ihit ) / ( 1 + rmspsw*rmspos )( 1 + rmshtw*rmshgt )( 1 + ntrans )
where
The idea for the matching index was stolen from G. Kleywegt's LSQMAN program. The matching index assumes values between 0 and 1, where "0" indicates a "perfect mis-match" and "1" a perfect match. Note that because the matching index is based on the match between predicted vectors, and peaks on the Patterson peak list, the value may depend on the number of peaks on the list.
PDB file generated by the write positions command:
    HEADER RSPS MORE ATOMS SCAN 7/12/99 > Real Space Patterson Search Map
    REMARK File written by RSPS on 7/12/99
    CRYST    51.560   99.600  216.166  90.00  90.00  90.00
    SCALE1      0.01939   0.00000   0.00000
    SCALE2      0.00000   0.01004   0.00000
    SCALE3      0.00000   0.00000   0.00463
    REMARK POSNN TYPE SITE      X        Y        Z     SCORE
    ATOM      1  HG  MTL    1    7.897   11.561   46.892   2.11  20.00
    ATOM      2  HG  MTL    2    5.729    9.683   33.776   1.62  20.00
    ATOM      3  HG  MTL    3   17.187   47.033    5.404   1.51  20.00
    ATOM      4  HG  MTL    4   22.916   70.550   18.915   1.42  20.00
    ATOM      5  HG  MTL    5    0.000   13.833    6.755   1.37  20.00
    ATOM      6  HG  MTL    5   51.560   13.833    6.755   1.37  20.00
    ATOM      7  HG  MTL    6   50.128   94.067   25.670   1.05  20.00
    END
Sample scoretable. A scoretable is generated as part of the getsets output, or may be explicitly generated using the table command, as in the example below.
    ********************************************************************************
    Set number 0; 3 members , overall score 2.35

    PosnN   Fractional coordinates     Angstrom coordinates
    -----  ------------------------  ------------------------
      1     0.6322  0.5519  0.0385    32.60   54.97    8.31
      2     0.1567  0.1008  0.1698     8.08   10.04   36.70
      3     0.1532  0.1111  0.2188     7.90   11.07   47.29

    Score table
    -----------
    PosnN      1       2       3    <Score>
      1      3.39    2.84    1.80     2.61
      2              2.92    2.18     2.62
      3                      1.37     1.82

    Number of vectors     =     33 (all)       9 (Harker)      24 (Cross)
    Number of low vectors =      0 (all)       0 (Harker)       0 (Cross)
    Score                 =   2.35 (all)    2.56 (Harker)    2.27 (Cross)
    Peak hit frequency    = 0.9697 (all)  0.8889 (Harker)  1.0000 (Cross)
    Rmsd peak positions   = 1.5657 (all)  1.3821 (Harker)  1.6223 (Cross)
    Rmsd peak heights     = 0.9805 (all)  1.0586 (Harker)  0.9496 (Cross)
    Matching index        = 0.3924
    ********************************************************************************
The score table gives the scores for the Harker (and pseudo-Harker in the case of NCS) vectors for each position along the diagonal; the off-diagonal entries are pairwise cross-vector scores. If Patterson peaks have been picked (as in this example), details of the fit between predicted and observed vectors are also given. This is often a useful guide to the correctness of a solution. In particular, correct solutions tend to have a rather high peak hit frequency, in contrast to incorrect solutions. The matching index, in the author's limited experience, is usually above 0.3 for correct solutions.
RSPS is written in Fortran 77 with a few commonly accepted extensions that are detailed below. The program has been implemented and successfully run on Digital/VAX systems as well as a host of Unix machines such as the Alliant FX 2800 and the SGI 4D series. RSPS is designed so that it can easily be run interactively, although, depending on the symmetry, the size of the cell and the computer, the response may be far from interactive. Thus, a vector search for heavy atom positions would normally be run as a batch job, whereas checking of results can in most cases be done interactively.
The structure of the RSPS program is highly modular to allow for flexibility in debugging and future development. At the lowest level are a number of library routines that handle matrix and vector algebraic operations as well as elementary operations on positions and peaks in direct and vector space respectively. At the heart of the program is the command interpreter which is based on the CCP4 parser and terminal i/o routines from the library package FORLIB ((C) Per Kraulis 1990). Higher level routines carry out the various search and checking options available in RSPS. Definitions of default values for parameters, dimensioning statements, and common block statements have been collected in a number of include files and may thus easily be modified.
The following include files are used in the RSPS program:
rspsctl4.inc | RSPS control variables |
rspsdef4.inc | RSPS definition of defaults |
rspsdim4.inc | RSPS dimensioning parameters |
rspsfil4.inc | RSPS file definitions |
rspsgrd4.inc | RSPS grid information |
rspsmap4.inc | RSPS map information |
rspspos4.inc | RSPS position information |
rspssym4.inc | RSPS symmetry information |
Dimensioning parameters are defined in the file 'rspsdim4.inc'. In the distributed version this gives the following space limitations:
Maximum number of spacegroup symmetry operations | 48 |
Maximum number of pseudo symmetry operations | 60 |
Maximum number of points in Patterson map | 800000 |
Maximum number of points along fast and slow axis in search map | 600 |
Maximum number of VLIST positions | 30 |
Maximum number of FIXXYZ positions | 30 |
Maximum number of input positions to GETSETS | 800 |
Maximum number of PICKable peaks | 800 |
    Lower case code
    Longwords
    IMPLICIT NONE statement
    INCLUDE statement
    END DO statement
    CARRIAGECONTROL = 'LIST' element in open statement control list (s/r rswpdb)

The $ format descriptor is used in the FORLIB library package.
None
The Patterson grid should be chosen as ca 1/3 of the resolution.
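As a worked example of this rule of thumb, the Python sketch below computes the smallest number of grid divisions along each axis that gives a spacing no coarser than one third of the resolution. The function name is hypothetical, and the sketch ignores any additional constraints that a map calculation program may place on the grid (for example, restrictions on allowed factors or symmetry-related divisibility), so the numbers are a lower bound to be rounded to something the program accepts.

```python
import math


def min_grid_divisions(cell_lengths, resolution, fraction=1.0 / 3.0):
    """Smallest number of divisions per axis with spacing <= fraction * resolution."""
    spacing = resolution * fraction
    return tuple(math.ceil(a / spacing) for a in cell_lengths)


# Cell edges from the sample PDB file in this document, data to 4.0 A.
print(min_grid_divisions((51.56, 99.60, 216.166), 4.0))
```

For this cell the target spacing is about 1.33 A, giving a lower bound of 39, 75 and 163 divisions along a, b and c.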
It is probably a good idea to initially run the program with rather strict rejection criteria (reject 1. ; low 0; this is the default) to see if anything shows up. If nothing is found, re-run the search with looser rejection criteria until a sufficient number of possible solutions is found. On the other hand, if too many solutions are found, re-run the search using a higher rejection level. It is worthwhile to use the vlist option to examine the Patterson peaks predicted by potential solutions. Note that by default special positions are not considered in a single atoms search, but may be included by specifying a negative bump argument.
It is recommended to always use a search scheme that maximizes the number of vectors used for each trial position. Thus, if non-crystallographic symmetry is present, use it. If a cross vector can be identified, do a two-site search rather than a single-site search.
Positions from a cross scan should be checked for Harker vectors using the vlist option before they are added to the list of fixed positions.
In polar spacegroups where one coordinate is indeterminate from the Harker section it is only necessary to perform the search scan over one section. This means that positions that only differ in the polar coordinate will be unresolved. As long as at least one position can be found this should not be a big problem however, since that position can then be used to find further sites by doing a cross-vector (vectorset more atoms) or polarscan search.
Tolerance > 0 gives an increased number of junk solutions.
Densities around the origin, and any NCS translational peaks, should always be reset or a lot of junk solutions will appear.
If vectors fall between grid points the nearest grid point will be used. If this is a serious problem the scoremap sectioning or the scan grid may be adjusted accordingly.
In vectorset translate scans different representations of the same solution may not necessarily have the same site number.
In a difference Patterson map between two heavy atom derivatives the cross-vectors between sites in different derivatives will appear as negative densities whereas the Harker peaks and the cross-vectors between sites within each derivative will be positive. This implies that a derivative could be solved by looking at cross-vectors between sites in this derivative and known sites in the second derivative. To do this apply a scale factor of -1 to the map and then perform a more atoms scan.
When running RSPS interactively use a wide screen (132 characters) to avoid scrambled output.
The most important change to the keyword input in RSPS 4.2 is that the old MODE keyword is no longer used; it has been replaced with the new vectorset command. The functionality of the old command can be recovered as follows:
Old command: | New command: |
---|---|
MODE HARKER | vectorset single atoms |
MODE CROSS | vectorset more atoms |
MODE TRANSLATE | vectorset translate atoms |
The old WRITEX command has also been changed slightly; it is now called write and has a different syntax (see main documentation).
The old SAVE command is obsolete.
Stefan Knight.