KEYPARSE (CCP4: Library)

NAME

keyparse - high-level keyword parsing subroutines

DESCRIPTION

keyparse contains subroutines which taken together act as a simplified interface to those in parser.

These routines (actually all entry points to one routine) simplify the work of decoding keyworded input that is of a decently regular, simple form (as it should be). They are a veneer on the PARSER and associated routines, hiding many of the gory details.

Overview

The idea is to call a routine (MEMOPARSE) to read a keyworded line into a hidden internal buffer and then have a simple chain of calls to a set of routines which check the appropriate keyword, look for any extra arguments associated with it, and set values in their arguments as appropriate. No IF ... THEN ... ELSE (or, much worse, assigned GOTO) is necessary, nor maintaining the parser arrays - the relevant checking is done internally. At the end of the checks, call PARSEDIAGNOSE to print any appropriate messages and loop back to MEMOPARSE if PARSEDIAGNOSE's argument is true (otherwise continue and process the data read in). You don't need to check for `END' or end-of-file.

Escape hatch: use PARSEKEYARG to get all the tokens after an arbitrary keyword and call e.g. PARSE (from parser) to deal with them as necessary. This approach is deprecated, however - go for simply-structured input.

Example:

            real rval, cell(6)
            integer ival
            character*80 rest
            logical cont
               ...
         10 call memoparse (.true.)  ! setup and echo i/p
            call parseint ('IVAL', ival)  ! ival.eq.3 after `IVAL 3'
            call parsecell (cell)   ! assign cell from `CELL 20 30 40' etc.
            call parsekeyarg ('FRED', rest)  ! get the full horror in REST
            call parse (rest, ....)
             <firkle with the `parse'd arrays>
             <more call parse>... ...
            call parsediagnose (cont) ! check
            if (cont) goto 10
             <now do something useful with the results...>


List of Subroutines

The subroutines have been broken into four groups: core subroutines, subroutines for dealing with complex situations, subroutines for dealing with standard keywords, and a subroutine for atom selection commands.

1. Core subroutines

This group comprises the basic set of keyparser routines.

Routine and arguments
       Purpose

MemoParse(echo)
       Call PARSER and stash the returned values away for later testing when the other entry points are called. If logical echo is .true. then echo the input line to the standard output.

ParseFail()
       Internal routine?

ParseDiagnose(cont)
       Call at the end of the tests to print a possible 'Invalid keyword' diagnostic, or to abort if at EOF and there has been an error. Logical cont is returned .true. if processing should continue (no EOF).

ParseKey(key,flag)
       If character key is matched then logical flag is set to .true.

ParseKeyArg(key,rest)
       If key is matched then character rest contains the rest of the line.

ParseInt(key,ival)
       If key is matched then the integer value is returned in ival.

ParseReal(key,rval)
       If key is matched then the real value is returned in rval.

ParseNArgs(key,toks)
       If key is matched then the number of tokens is returned in integer toks.

ParseNInts(key,n,ivals)
       If key is matched then up to n integers will be returned in array ivals, and n will be returned as the number of integers actually read from the input line.

ParseNReals(key,n,rvals)
       If key is matched then up to n reals will be returned in array rvals, and n will be returned as the number of reals actually read from the input line.

2. Complex situations

This group includes routines to handle the more complex input structures used by many programs. By setting SUBKEY to a blank, mixed real, integer and character parameters may be read one-at-a-time from a line. By setting SUBKEY to some other value, subsidiary keywords and their arguments may be read. By setting FLAG false, keywords with a variable number or variable type arguments may be read.

Routine and arguments
       Purpose

ParseSubKey(key,subkey,flag)
       If key and subkey are matched then flag will be set to .true.

ParseSubKeyArg(key,subkey,nth,rest)
       If key and subkey are matched then the rest of the input line from the nth position after the subkey is returned in character rest.

ParseSubInt(key,subkey,nth,flag,ival)
       If key and subkey are matched then the nth integer after the subkey is returned in ival. If logical flag is set to .true. then an error is returned if the value at the nth position is empty or is not an integer.

ParseSubReal(key,subkey,nth,flag,rval)
       If key and subkey are matched then the nth real after the subkey is returned in rval. If logical flag is set to .true. then an error is returned if the value at the nth position is empty or is not a real.

ParseSubChar(key,subkey,nth,flag,rest)
       If key and subkey are matched then the nth word after the subkey is returned in rest. If logical flag is set to .true. then an error is returned if there is no character or word at this position.
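
As an illustration of the conventions described above, the following fragment reads a line such as `SCAL 1.5 BFAC 20.0 CYCL'. It is only a sketch: the keywords SCAL, BFAC and CYCL and the variable names are hypothetical, and it assumes the same loop structure as the Overview example.

            real scale, bfac
            integer ncyc
            logical cont
            scale = 1.0   ! defaults, kept when nothing is matched
            bfac = 0.0
            ncyc = 1
         10 call memoparse (.true.)
            ! blank subkey: first token after SCAL must be a real
            call parsesubreal ('SCAL', ' ', 1, .true., scale)
            ! if BFAC is present it must be followed by a real
            call parsesubreal ('SCAL', 'BFAC', 1, .true., bfac)
            ! FLAG false: CYCL may appear with or without an integer
            call parsesubint ('SCAL', 'CYCL', 1, .false., ncyc)
            call parsediagnose (cont)
            if (cont) goto 10

Because the output arguments are only updated when a match is found, the defaults are set before the loop.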

3. Standard keywords

This group of routines automatically handles a variety of standard keywords.

SYMMETRY <number> | <name> | <operators>
Specifies symmetry in terms of either
<number> spacegroup number e.g. 19;
<name> spacegroup name e.g. P212121;
<operators> explicit symmetry operators e.g.
X,Y,Z * 1/2-X,-Y,1/2+Z * 1/2+X,1/2-Y,-Z * -X,1/2+Y,1/2-Z

RESOLUTION <limit> [ <limit> ]
Specifies resolution limits. If only a single limit is given, it is an upper limit, otherwise the upper and lower limits can be in either order. They are in Å unless they are both < 1.0, in which case they are in units of 4 sin^2(theta)/lambda^2.

CELL a b c [ alpha beta gamma ]
Specifies cell dimensions (in Å) and optionally angles in degrees (which default to 90 degrees).

LABIN <program label>=<file label> ...
Associates the column labels that the program expects with column labels in the input MTZ file. If there is no ambiguity, the program and file labels may be given on either side of the =.

LABOUT <program label>=<file label> ...
Associates column labels in the output file with labels used by the program, similarly to LABIN.

Routine and arguments
       Purpose

ParseCell(cell)
       reads an input line of the form
              CELL a b c [ alpha beta gamma ]
       and returns the cell parameters in the array cell.

ParseSymm(spgnam,numsgp,pgname,nsym,nsymp,rsym)
       reads an input line of the form
              SYMM spacegroup_name | spacegroup_no | list_of_sym_ops
       and returns spacegroup name spgnam, spacegroup number numsgp, pointgroup name pgname, number of symmetry operations nsym, number of primitive symmetry operations nsymp, and the symmetry matrices rsym.

ParseReso(resmin,resmax,smin,smax)
       reads an input line of the form
              RESOLUTION resmin resmax

ParseLabin(mtznum,prglab,nprglab)
       reads an input LABIN line.

ParseLabout(mtznum,prglab,nprglab)
       reads an input LABOUT line.

4. Atom selection commands

At present keyparse only supports one atom selection command syntax, through a call to PARSEATOMSELECT. This will decode lines with the following type of atom selection commands:

<keyword> ATOM <inat0> [ [TO] <inat1> ] | RESIDUE [ ALL | ONS | CA ] [ CHAIN <chnam> ] <ires0> [ [TO] <ires1> ]

This is based on the syntax used in atom selection in DISTANG. For the purposes of decoding the selection commands the value of the preceding <keyword> is irrelevant.

The syntax described above is designed to allow selections such as:

... ATOM 1 TO 1000
... ATOM 7 9
... ATOM 10
... RESIDUE 11 TO 22
... RESIDUE 10 CHAIN A
... RESIDUE CHAIN S CA 12 19

The selection specifies a range either of atom numbers or of residue numbers. In the latter case it can also optionally be used to specify a chain identifier (one character) and/or an ``atom type'' selection keyword:

  ALL   all types of atoms in the selected range
  ONS   only oxygens and nitrogens in the selected range
  CA    only carbon atoms in the selected range

The ordering of the RESIDUE subarguments is flexible, so that RESIDUE 1 TO 9 CA CHAIN B is the same as RESIDUE CA CHAIN B 1 TO 9 and so on.

Routine and arguments
       Purpose

ParseAtomSelect(key, inat0, inat1, ires0, ires1, chnam, imode)
       reads an input line of the form
              <keyword> ATOM <inat0> [ [TO] <inat1> ] | RESIDUE [ ALL | ONS | CA ] [ CHAIN <chnam> ] <ires0> [ [TO] <ires1> ]

If key is matched then returns a range of atom numbers (inat0, inat1), or a range of residue numbers (ires0, ires1) plus optionally:

  • Mode value imode (1=ALL, 2=ONS, 3=CA)
  • chain id chnam
Unset atom/residue numbers are returned as -99, an unset chain as a blank ' ', and an unset mode as 1.
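
A sketch of how the returned values might be tested, using the -99 convention described above. The keyword SELE is hypothetical, and the fragment assumes that the outputs can be examined immediately after the call; what they hold when key is not matched is not stated here.

            integer inat0, inat1, ires0, ires1, imode
            character*1 chnam
            logical cont
         10 call memoparse (.true.)
            call parseatomselect ('SELE', inat0, inat1, ires0, ires1,
           +     chnam, imode)
            if (inat0 .ne. -99) then
               ! an ATOM range was given (inat1 may still be -99)
               write (*,*) 'atoms: ', inat0, inat1
            else if (ires0 .ne. -99) then
               ! a RESIDUE range was given (ires1 may still be -99)
               write (*,*) 'residues: ', ires0, ires1, ' mode ', imode
               if (chnam .ne. ' ') write (*,*) 'chain: ', chnam
            end if
            call parsediagnose (cont)
            if (cont) goto 10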

Descriptions of the Subroutines

MEMOPARSE

subroutine memoparse (ECHO)

       ECHO (input) LOGICAL
              Input will be echoed iff ECHO is .TRUE.
   
Call this at the head of a loop over potential keywords to read a keyworded record, lex it and stash the tokens.

PARSEDIAGNOSE

   subroutine  parsediagnose (CONT)

       CONT (output) LOGICAL
              Set to .TRUE. if neither end-of-file nor the END
              keyword has been reached, i.e. continue the loop
              over input records if CONT is .TRUE.
   
Call this at the end of the loop over possible keywords. It will provide an 'Invalid keyword' diagnostic if no keywords have been matched, or abort if CONT would be returned .FALSE. and there has been some input error (invalid keyword or bad argument).

PARSEKEY

subroutine parsekey (KEY, FLAG)

       KEY (input) CHARACTER*(*)
              Keyword  to  match against first four characters of
              the first token in the keyworded record.
       FLAG (output) LOGICAL
              Updated (to .TRUE.) iff KEY is matched.
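
Since FLAG is only ever set to .TRUE., a caller would normally initialise it before the input loop. A minimal sketch, in which the keyword VERB and the variable verbose are hypothetical:

            logical cont, verbose
            verbose = .false.   ! stays false unless VERB is read
         10 call memoparse (.false.)
            call parsekey ('VERB', verbose)
            call parsediagnose (cont)
            if (cont) goto 10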
   

PARSEINT

subroutine parseint (KEY, IVAL)

       KEY (input) CHARACTER*(*)
              Keyword to match against first four  characters  of
              the first token in the keyworded record.

       IVAL (output) INTEGER
              Updated to contain the integer value of the second
              token in the record iff KEY is matched, there are
              only two tokens in the record and the second
              represents an integer (according to PARSE).
              Otherwise if KEY is matched, an error message will
              be generated.
   

PARSEREAL

subroutine parsereal (KEY, RVAL)

       KEY (input) CHARACTER*(*)
              Keyword to match against first four  characters  of
              the first token in the keyworded record.

       RVAL (output) REAL
              Updated to contain the floating point value of the
              second token in the record iff KEY is matched,
              there are only two tokens in the record and the
              second represents a real number (according to
              PARSE). Otherwise if KEY is matched, an error
              message will be generated.
   

PARSENINTS

subroutine parsenints (KEY, N, IVALS)

       KEY (input) CHARACTER*(*)
              Keyword to match against first four  characters  of
              the first token in the keyworded record.

       N (input/output) INTEGER
              On  input: maximum number of integer values to read
              following KEY.  On output, updated to the number of
              elements of IVALS which were updated.

       IVALS(N) (output) INTEGER
              If KEY is matched and followed only by a number (n)
              of integer values between 1 and the input value  of
              N,  then  the first n elements of IVALS are updated
              with the values  of  the  corresponding  arguments.
              Otherwise an error message will be generated.
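
Because N is both the limit on input and the count on output, it has to be reset before every call inside the input loop. A minimal sketch, in which the keyword GRID and the array igrid are hypothetical:

            integer n, igrid(3)
            logical cont
         10 call memoparse (.true.)
            n = 3   ! reset every pass: N is overwritten on return
            call parsenints ('GRID', n, igrid)
            call parsediagnose (cont)
            if (cont) goto 10

What N holds on return when KEY is not matched is not stated here, so a caller that needs the count should bear this in mind. The same pattern applies to PARSENREALS.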
   

PARSENREALS

subroutine parsenreals (KEY, N, RVALS)

       KEY (input) CHARACTER*(*)
              Keyword  to  match against first four characters of
              the first token in the keyworded record.


       N (input/output) INTEGER
              On input: maximum number of floating  point  values
              to  read  following KEY.  On output, updated to the
              number of elements of RVALS which were updated.

       RVALS(N) (output) REAL
              If KEY is matched and followed only by a number (n)
              of floating point values between 1 and the input
              value of N, then the first n elements of RVALS are
              updated with the values of the corresponding
              arguments. Otherwise an error message will be
              generated.
   

PARSESUBKEY

subroutine parsesubkey (KEY, SUBKEY, FLAG)

       KEY (input) CHARACTER*(*)
              Keyword to match against first four  characters  of
              the first token in the keyworded record.

       SUBKEY (input) CHARACTER*(*)
              Subsidiary  keyword to match against the first four
              characters of any other token in  the  record.   If
              SUBKEY is blank, the initial keyword is matched.

       FLAG (output) LOGICAL
              FLAG  is  set .true. iff KEY and SUBKEY are matched
              or KEY is matched and SUBKEY is  blank.   Otherwise
              FLAG is unchanged.
   

PARSESUBINT

subroutine parsesubint (KEY, SUBKEY, NTH, FLAG, IVAL)

       KEY (input) CHARACTER*(*)
              Keyword  to  match against first four characters of
              the first token in the keyworded record.

       SUBKEY (input) CHARACTER*(*)
              Subsidiary keyword to match against the first  four
              characters  of  any  other token in the record.  If
              SUBKEY is blank, the initial keyword is matched.

       NTH (input) INTEGER
              The number of positions after the keyword (and sub-
              keyword  if  present) of the token to be read as an
              integer.

       FLAG (input) LOGICAL
              If the keyword and sub-keyword are matched and FLAG
              is true then the absence of an integer value at the
              given position will cause  an  error.  If  FLAG  is
              false,  the  absence of an integer value will cause
              no action to be taken.

       IVAL (output) INTEGER
              Updated to contain the integer value of the NTH
              token after the keyword iff KEY and SUBKEY are
              matched, and the NTH token after the sub-keyword
              exists and represents an integer (according to
              PARSE).
   

PARSESUBREAL

subroutine parsesubreal (KEY, SUBKEY, NTH, FLAG, RVAL)

       KEY (input) CHARACTER*(*)
              Keyword  to  match against first four characters of
              the first token in the keyworded record.

       SUBKEY (input) CHARACTER*(*)
              Subsidiary keyword to match against the first  four
              characters  of  any  other token in the record.  If
              SUBKEY is blank, the initial keyword is matched.

       NTH (input) INTEGER
              The number of positions after the keyword (and sub-
              keyword if present) of the token to be read as a
              real.

       FLAG (input) LOGICAL
              If the keyword and sub-keyword are matched and FLAG
              is  true then the absence of a numeric value at the
              given position will cause  an  error.  If  FLAG  is
              false, the absence of a numeric value will cause no
              action to be taken.

       RVAL (output) REAL
              Updated to contain the real value of the NTH token
              after the keyword iff KEY and SUBKEY are matched,
              and the NTH token after the sub-keyword exists and
              represents a number (according to PARSE).
   

PARSESUBCHAR

subroutine parsesubchar (KEY, SUBKEY, NTH, FLAG, CVAL)

       KEY (input) CHARACTER*(*)
              Keyword  to  match against first four characters of
              the first token in the keyworded record.

       SUBKEY (input) CHARACTER*(*)
              Subsidiary keyword to match against the first  four
              characters  of  any  other token in the record.  If
              SUBKEY is blank, the initial keyword is matched.

       NTH (input) INTEGER
              The number of positions after the keyword (and sub-
              keyword if present) of the token to be read as a
              character string.

       FLAG (input) LOGICAL
              If the keyword and sub-keyword are matched and FLAG
              is  true then the absence of a non-numeric value at
              the given position will cause an error. If FLAG  is
              false  and  there  is  a  numeric expression at the
              given position, the character representation of the
              numeric  expression  is returned.  If FLAG is false
              and there is no expression at the  given  position,
              no action is taken.

       CVAL (output) CHARACTER
              Updated to contain the character value of the NTH
              token after the keyword iff KEY and SUBKEY are
              matched and the NTH token after the sub-keyword
              exists.
   

PARSECELL

subroutine parsecell (CELL)

       CELL(6) (output) REAL
              Cell parameters.
   
Updates the elements of CELL using RDCELL iff the keyword `CELL' is matched.

PARSESYMM

subroutine parsesymm (SPGNAM, NUMSGP, PGNAME, NSYM, NSYMP, RSYM)

The arguments correspond to those of the same name in RDSYMM, which is called to update them iff the keyword `SYMM' is matched. NSYM, however, is not used as input: it is set to 0 before calling RDSYMM.

PARSERESO

subroutine parsereso (RESMIN, RESMAX, SMIN, SMAX)

The arguments correspond to those of the same name in RDRESO, which is called to update them iff the keyword `RESO' is matched.
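
For reference, the two units described under the RESOLUTION keyword are related by Bragg's law: 4 sin^2(theta)/lambda^2 = 1/d^2 for a spacing d in Å. The stand-alone program below illustrates putting two limits in order and converting them. It is an illustration only, not the RDRESO code, and all the variable names are hypothetical.

            program resdemo
            real lim1, lim2, dlow, dhigh, slow, shigh
            lim1 = 2.0    ! limits as typed, in either order
            lim2 = 20.0
            dlow = max (lim1, lim2)    ! low-resolution limit, Angstrom
            dhigh = min (lim1, lim2)   ! high-resolution limit, Angstrom
            slow = 1.0/dlow**2         ! the same limits as 1/d**2
            shigh = 1.0/dhigh**2
            write (*,*) dlow, dhigh, slow, shigh
            end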

PARSELABIN

subroutine parselabin(MINDX, LSPRGI, NLPRGI)

The arguments correspond to those of the same name in LKYIN, which is called to update them iff the keyword `LABI' is matched.

PARSELABOUT

subroutine parselabout(MINDX, LSPRGO, NLPRGO)

The arguments correspond to those of the same name in LKYOUT, which is called to update them iff the keyword `LABO' is matched.

PARSEATOMSELECT

subroutine parseatomselect(KEY, INAT0, INAT1, IRES0, IRES1, CHNAM, IMODE)

       KEY (input) CHARACTER*(*)
              Keyword  to  match against first four characters of
              the first token in the keyworded record.

       INAT0 (output) INTEGER
              Lower limit of atom range, or -99 if unset

       INAT1 (output) INTEGER
              Upper limit of atom range, or -99 if unset

       IRES0 (output) INTEGER
              Lower limit of residue range, or -99 if unset

       IRES1 (output) INTEGER
              Upper limit of residue range, or -99 if unset

       CHNAM (output) CHARACTER*1
              Chain identifier, or ' ' if not set

       IMODE (output) INTEGER
              Set atom type: 1 = ALL
                             2 = ONS
                             3 = CA

AUTHORS

Dave Love, Kevin Cowtan, Peter Briggs

SEE ALSO

parser