SURFACE (CCP4: Supported Program)

NAME

surface - surface accessibility program; also prepares input files for the program volume.

SYNOPSIS

surface xyzin1 foo_in1.pdb [xyzin2 foo_in2.pdb] xyzout foo_out3.rad
[Keyworded input]

DESCRIPTION

For each atom in a list, the accessibility program is designed to find the surface area in square Angstroms that is accessible to a probe sphere of a radius specified by the user. The program requires atom identification data and crystallographic coordinates from an input file. This will normally be in Brookhaven PDB format, but the program will also accept the output file from the Konnert-Hendrickson refinement program, or a file produced internally during operation of this program (see FORMAT). The program can be easily modified to accept other formats.

Atom types are identified through residue and atom names. Van der Waals radii are assigned to each atom on the basis of atom type. The values are listed in the subroutines RADASNCHC and RADASNRICH (see VDWR). The data statements in those subroutines can easily be changed by the user if a different set of standard values is wanted. Any atoms that cannot be identified by these subroutines are assigned a default radius of 1.80 Angstroms. The radius of the spherical probe may be assigned any value in the range 0.0 to 9.9 (see PROBE). The source program must be modified to accommodate values outside this range. A water molecule is commonly assumed to have a radius of 1.40 Angstroms. A flag system can be used to include or exclude atoms from the accessibility calculation.

The output of the program is a file containing all of the input data for each atom, the assigned Van der Waals radii, certain internal flags indicating the atoms included in the calculation, the accessible area, the contact area, and the fractional area (not yet implemented). The accessible area is the area, in square Angstroms, of the locus of the center of the probe. The contact area is the area, in square Angstroms, on the Van der Waals surface of an atom that can be contacted by a sphere of the given probe radius. The algorithm of Lee and Richards (1971) is used to calculate the accessible area.
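
For an isolated atom with no neighbors, the two definitions above reduce to the areas of two concentric spheres. The Python sketch below illustrates only that limiting case. The function name is hypothetical, and this is not the Lee and Richards slicing algorithm that the program itself uses.

```python
import math


def isolated_atom_areas(rvdw, probe=1.40):
    """Return (accessible, contact) areas in square Angstroms for a lone atom.

    Assumption: the atom has no neighbors, so no part of it is buried.
    """
    # Locus of the probe center: a sphere of radius rvdw + probe.
    accessible = 4.0 * math.pi * (rvdw + probe) ** 2
    # Every point of the Van der Waals sphere can be touched by the probe.
    contact = 4.0 * math.pi * rvdw ** 2
    return accessible, contact
```

With the default radius of 1.80 Angstroms and a 1.40 Angstrom probe, the accessible sphere has a radius of 3.20 Angstroms, so the accessible area is always the larger of the two.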

[NOTE: The initial part of this program may be used to prepare an output file to be used as input into the VOLUME program.]

INPUT AND OUTPUT FILES

Input files

XYZIN1, XYZIN2
Input files of atom and coordinate data. These would normally be in PDB format ('PDB'), although 'WAH', 'RAD' and 'CHA' are also possible (see FORMAT). Normally only XYZIN1 is used; use the NFILES keyword to specify two input files.

Output file

XYZOUT
This is a formatted ASCII file (1X,I2,1X,I5,1X,I2,1X,I2,1X,2A4,1X,I3,3F8.3,2F5.2,2F6.1,1X,F4.2) with the following columns:
     KEY(I)     = flag for accessibility calculation.
     I          = integer counter
     NUMCHN(I)  = chain number if more than one peptide.
     NUMFIL(I)  = file number if more than one protein is listed.
     ATM(I)     = atom designation. Up to 4 characters (uses PDB convention).
     RES3(I)    = residue designation (three letter code).
     SEQNUM(I)  = sequence number of residue.
     X(I)       = X coordinate of atom.
     Y(I)       = Y coordinate of atom.
     Z(I)       = Z coordinate of atom.
     RVDW(I)    = Van der Waals radius of atom.
     RCOV(I)    = covalent radius of atom.
     AAREA      = accessible area of atom.
     CAREA      = contact area of atom.
     FRCACC     = fractional accessibility of atom (not yet implemented).
This is a 'RAD' format file that can be used as input to this program. It can also be used as input to the VOLUME program.
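
The fixed-width layout above can be read outside the program by slicing each line at the column positions implied by the format statement. The following Python sketch is one way to do this. The names RadRecord and parse_rad_line are hypothetical, and it assumes that ATM and RES3 each occupy one A4 field and that every numeric field is populated.

```python
from collections import namedtuple

RadRecord = namedtuple(
    "RadRecord",
    "key serial numchn numfil atm res3 seqnum x y z rvdw rcov aarea carea frcacc",
)

# (start, end, converter) for each field, 0-based and end-exclusive, derived from
# (1X,I2,1X,I5,1X,I2,1X,I2,1X,2A4,1X,I3,3F8.3,2F5.2,2F6.1,1X,F4.2).
_FIELDS = [
    (1, 3, int),           # KEY
    (4, 9, int),           # I (integer counter)
    (10, 12, int),         # NUMCHN
    (13, 15, int),         # NUMFIL
    (16, 20, str.strip),   # ATM  (assumed to be the first A4)
    (20, 24, str.strip),   # RES3 (assumed to be the second A4)
    (25, 28, int),         # SEQNUM
    (28, 36, float),       # X
    (36, 44, float),       # Y
    (44, 52, float),       # Z
    (52, 57, float),       # RVDW
    (57, 62, float),       # RCOV
    (62, 68, float),       # AAREA
    (68, 74, float),       # CAREA
    (75, 79, float),       # FRCACC
]


def parse_rad_line(line):
    """Parse one 79-column RAD record into a RadRecord."""
    padded = line.rstrip("\n").ljust(79)
    return RadRecord(*(conv(padded[a:b]) for a, b, conv in _FIELDS))
```

A blank numeric field would raise ValueError in this sketch; a more tolerant reader would need to decide what value to substitute.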

KEYWORDED INPUT

Available keywords are:

ALLATM, ATOM, CALCULATE, CHAIN, DONE, FILE, FORMAT, INCLUDE, NEXT, NFILES, OMIT, PRESET, PROBE, RERUN, RESET, RESIDUE, RUN, SERIAL, SKIP, STOP, SUBSET, VDWR, ZONE, ZSTEP.
The order of the keywords is important. It is advisable to read the whole of this document first.

Except within the first set, keywords must be entered in the order described in this section. The first set of keywords is:

NFILES <nfiles>

Number of input files (default=1, maximum=2).

FORMAT <file> <format>

Specify format of input file number <file>. Possible values of <format> are PDB, WAH, CHA or RAD. The default is for all input files to have the Brookhaven PDB format, <format> = PDB.

PROBE <probe>

<probe> between 0.0 and 9.9 (default 1.40). The PROBE RADIUS is the radius of the sphere used to test the Van der Waals surface of each atom flagged 0. It is ordinarily assumed that you will be testing for accessibility to water. We use a standard radius of 1.40 Angstroms for water, so this is the default value. Change the probe radius here if you are testing accessibility to something other than water, or if you prefer a different value for water.

ZSTEP <zstep>

<zstep> between 0.1 and 1.0 (default 0.25). The ZSTEP value determines the accuracy of the accessibility calculation. The program takes a given atom for which the accessibility is to be calculated. It then finds all the neighboring atoms (rejecting or including them according to FLAG value), and sequentially slices through the effective spheres of this set of atoms along the z axis. In each slice, the circle of intersection of the atom of interest is analyzed to see what arc length of this circle is overlapped by the intersecting circles of neighboring atoms. The arc length that remains is considered accessible to the PROBE. The total accessibility is calculated by summing the arc lengths for all the slices through the atom of interest and multiplying by the distance between the slices. This distance is set by ZSTEP. The smaller the ZSTEP, the more slices and the greater the accuracy (and the more computer time). Since the smallest Van der Waals radii are of the order of 1.10 Angstroms and the probe will usually be 1.40 Angstroms, the diameter of the smallest effective sphere is 2.0 x (1.40 + 1.10) = 5.0 Angstroms. With a ZSTEP of 0.25 this gives 20 slices through the sphere, which is acceptable for most of the conditions under which this program will be used. We recommend a ZSTEP value no less than 0.10 (to limit run time) and no greater than 0.50 (to limit loss of accuracy).
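
The slice-count arithmetic above can be reproduced for other radii and step sizes. The Python sketch below is illustrative only; the function name is hypothetical and it simply divides the effective-sphere diameter by ZSTEP, as in the worked figures in the text.

```python
def slices_through_atom(rvdw, probe=1.40, zstep=0.25):
    """Return the number of ZSTEP-thick slices spanning an atom's effective sphere.

    The effective sphere has radius rvdw + probe (both in Angstroms).
    """
    diameter = 2.0 * (rvdw + probe)
    return diameter / zstep
```

For example, slices_through_atom(1.10) corresponds to the case described in the text: a 5.0 Angstrom diameter cut at 0.25 Angstrom intervals.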

VDWR CHC | RICH

Sets USECHC or USERIC (default USECHC = .TRUE. and USERIC = .FALSE.).
Assign van der Waals radii to atoms based on atom name and residue name. iflag(i,4) is set negative if the atom or the residue is not found. If the residue type is not found and the atom is a main chain atom ("C   ", "N   ", "O   ", "CA  "), the radius is taken from either the original code values (RICH) or the Cyrus Chothia values (CHC) (see reference [2]).

SKIP

Toggles DOCALC off (default DOCALC = .TRUE.). The output file has the same format either way, but with DOCALC = .FALSE. the area columns contain only dummy entries. Use this to skip the area calculation when areas are not needed. Unlike the area output file, which lists only those atoms for which the area was calculated, the output file written when DOCALC = .FALSE. contains all the atoms read from the input file(s). The flags indicating OMIT, INCLUDE or CALCULATE remain intact, so you can still specify subsets of atoms for the VOLUME calculation.

This first set of keywords should be terminated with one of the following three keywords:

RUN

Do a calculation. The program reads the input file(s), stores all the data into arrays and assigns the Van der Waals radii. The number of atoms that have been read into the arrays will be printed. If any atoms have not been found in the radius assignment subroutine, the data associated with that atom will be displayed with an annotation as to whether the RESIDUE NAME or ATOM NAME was not found. If you are concerned because atoms have been assigned default radii you should determine why the atom names or the residue names were not found and try to correct the problem. This may mean editing the subroutine responsible for assigning radii to include a new RESIDUE TYPE or ATOM NAME. It may also mean that the format of your input file was not aligned with the expected format.

RERUN

It is assumed that the input files have been read on a previous run, and van der Waals radii assigned. A new calculation on this data is started.

STOP

Stop and finish.

If everything is satisfactory, then the program moves to a second set of keywords.
To allow flexibility and to avoid unnecessary repetition of calculations, a flag system is used. Every atom is assigned an integer FLAG value of -1, 0, or 1 with the following meanings:

-1
the atom is completely ignored during the accessibility calculation. This would be the same as omitting the particular atom from the input coordinate file.
0
the area is to be calculated for this atom. The surroundings consist of all other atoms flagged either "0" or "+1".
1
the atom is considered part of the protein environment but no area calculations will be performed on this atom.
The program will loop through the atoms until an atom has a flag value 0. The program then finds all those atoms with a value of 0 or 1 that fall within the 'touching' distance of the atom for which the calculation is being performed. This distance will vary with the Van der Waals radii and probe radius chosen.
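
The flag-driven selection described above can be sketched in Python as follows. This is an illustration and not the program's own code: the function names and the dictionary layout are hypothetical, and the 'touching' distance is assumed here to be the sum of the two Van der Waals radii plus twice the probe radius (the point at which the two effective spheres meet).

```python
import math


def atoms_to_calculate(atoms):
    """Return indices of atoms flagged 0, i.e. those whose area is computed."""
    return [i for i, atom in enumerate(atoms) if atom["flag"] == 0]


def find_neighbors(atoms, i, probe=1.40):
    """Return indices of atoms flagged 0 or 1 within touching distance of atom i.

    Each atom is a dict with keys flag, x, y, z and rvdw (hypothetical layout).
    Atoms flagged -1 are ignored, as if absent from the coordinate file.
    """
    center = atoms[i]
    neighbors = []
    for j, other in enumerate(atoms):
        if j == i or other["flag"] not in (0, 1):
            continue
        dist = math.sqrt(
            (center["x"] - other["x"]) ** 2
            + (center["y"] - other["y"]) ** 2
            + (center["z"] - other["z"]) ** 2
        )
        # Assumed touching criterion: the two effective spheres overlap.
        if dist < center["rvdw"] + other["rvdw"] + 2.0 * probe:
            neighbors.append(j)
    return neighbors
```

Because the cutoff depends on both radii and on the probe, a larger probe enlarges the neighbor list for every atom.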

There are three keywords for assigning flags:

ALLATM

Calculate the accessible surface of all atoms read into the data arrays.

PRESET

Only available if the input file(s) were of 'RAD' format. This will take the flag values from the first column of this file and create the indicated subsets based on the standard flag values.

SUBSET

Define a subset of atoms. If you have an interest in a particular group of atoms, they can be specified rather than producing lengthy output files and taking up unnecessary program run time. FLAGS can be used for finding changes in the accessibility of the protein upon the removal of substrate(s) or upon deletion of a section of the protein. They may also be used if you only have an interest in the accessibility of certain RESIDUE types or ATOM types and do not wish to waste time doing the calculation for all atoms in the coordinate list. This option is designed to handle a few of the most logical flag assignments. If you have something that cannot be handled by the SUBSET flag setting subroutine, you can create your own file and set the flags as you wish. The file should then be in the 'RAD' format and you should use the PRESET keyword.

If you choose the SUBSET option, then the following keywords can be used.
First define the flag value to be assigned with one of:

OMIT Assign flag of -1

INCL Assign flag of 1

CALC Assign flag of 0

DONE No more flag assignments, proceed to next step in program.

If one of the first three options is entered you can choose one of six different ways to specify what atoms are to be assigned that particular flag:

FILE Assign flag if atom came from one of two input files. This is not available if only one input file was read.

CHAI(N) Assign flag to an atom if it has the chain number specified. Many times a molecule will consist of two separate chains that are identified in the coordinate list. Two monomers in an asymmetric unit or two subunits in a dimer are usually identified separately.

ZONE Assign flag based on a range of the sequence number. Any atom that has a sequence number greater than or equal to the starting value entered and less than or equal to the ending value specified will be assigned the designated flag. Repeat this operation for as many sequence pairs as required. Entering the same number twice will result in the flag assignment to the single residue specified.

RESI(DUE) Assign flag to an atom if the residue name is the same as the one specified. For instance in a protein you may only be interested in calculating the accessibility of histidine residues. If you enter the standard three letter notation "HIS" all histidines will be assigned the flag value.

ATOM Assign flag to an atom if the atom name is the same as the one specified. This is the same as the "RESI" option, except that atom types are identified.

SERI(AL) Assign flag based on a range of serial numbers of atoms in the coordinate list. An atom of serial number equal to or greater than the starting value entered and less than or equal to the ending value entered will be assigned the designated flag. Repeat through as many serial number pairs as required. Entering the same number twice will result in the flag assignment to the single atom specified.

If you have more than one input file you will be asked for the file number for which flags are to be set.

The ORDER OF OPERATION in setting FLAGS is very important. Any operation that is performed will overwrite the previous flag value assigned to an atom. For example, suppose a zone from 1 to 20 is assigned a FLAG of -1 by using the "OMIT" flag and the "ZONE" operation, and then the "CALC" flag is set for a ZONE from 5 to 10. The end result is that the ZONE from 1 to 4 is OMITTED, the ZONE from 5 to 10 is CALCULATED and the ZONE from 11 to 20 is OMITTED. If instead "CALC" is called first for the ZONE from 5 to 10 and "OMIT" is then called for the ZONE from 1 to 10, all the atoms associated with residues 1 to 10 are OMITTED.
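
The overwrite behavior in this example can be traced with a small Python sketch. It models only the bookkeeping and is not the program's flag-setting subroutine: the helper names are hypothetical, flags are keyed by residue sequence number for simplicity, and the starting flag value of 1 is an arbitrary assumption.

```python
OMIT, CALC, INCL = -1, 0, 1


def assign_zone(flags, start, end, value):
    """Overwrite the flag of every residue with start <= seqnum <= end."""
    for seqnum in flags:
        if start <= seqnum <= end:
            flags[seqnum] = value


def run_assignments(assignments, first=1, last=20):
    """Apply (start, end, value) assignments in order and return the flags."""
    flags = {seqnum: INCL for seqnum in range(first, last + 1)}
    for start, end, value in assignments:
        assign_zone(flags, start, end, value)
    return flags


# OMIT 1-20, then CALC 5-10: residues 5-10 end up flagged for calculation.
omit_then_calc = run_assignments([(1, 20, OMIT), (5, 10, CALC)])

# CALC 5-10, then OMIT 1-10: the later OMIT overwrites the earlier CALC.
calc_then_omit = run_assignments([(5, 10, CALC), (1, 10, OMIT)])
```

In both cases the last assignment that covers a residue decides its final flag.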

Having set the flags, one of the following two keywords should be entered:

RESET

Reset the flags with one of the above three keywords (for interactive use).

NEXT

Do calculation, and then return to the first set of keywords. From there, you can RERUN the calculation, or STOP.

EXAMPLES

Unix examples script found in $CEXAM/unix/runnable/

SEE ALSO

volume, areaimol

REFERENCES

  1. B.Lee and F.M.Richards, J.Mol.Biol., 55, 379 - 400 (1971)
  2. C.Chothia, "Structural Invariants in Protein Folding", Nature, 254, 304 - 308 (1975)

AUTHORS

Authors: Mark D. Handschumacher and F.M. Richards.
Original and earlier versions produced by: B. Lee, F.M. Richards, T.J. Richmond and J.B. Matthew.
This CCP4 version (partially keyworded) is from Kim Henrick.