
Constraints

Six forms of constraints are available in CONGEN: harmonic atom constraints, harmonic dihedral constraints, fixed atom constraints, Nuclear Overhauser Enhancement (NOE) distance constraints from NMR spectroscopy, dihedral angle constraints based on J coupling constants determined by NMR, and fixed bond and angle constraints (SHAKE).

Harmonic Atom: Hold atoms in place
Harmonic Dihedrals: Hold dihedrals near selected values
Fixed Atoms: Fix atoms rigidly.
NMR Constraints: NOE distance and J coupling constraints.
Shake Command: Fix bond lengths or angles during dynamics.
Atom Selection: syntax rules for atom specifications.

Restraining Atomic Movements

Syntax of Harmonic Atom Constraints

CONStraint HARMonic FORCE real [PRINT] [REF coor-spec del]
           [MASS] atom-selection

coor-spec is a specification for the coordinates. See section Syntax of READ Command, for the syntax.

Syntactic ordering: HARMonic must follow CONStraint, and FORCE must follow HARMonic.

Function of Harmonic Atom Constraints

The potential energy has a harmonic constraint term which allows one to prevent large motions of individual atoms. The form for this potential is as follows for coordinates:

where refx is a reference set of coordinates. If MASS is specified in the command line, then k is multiplied by the mass of the atom resulting in a natural frequency of oscillation for the constraint of sqrt(k) in AKMA units. An atom constrained with MASS FORCE 1.0 will oscillate at 7.6 cycles/picosecond if free of other interactions.

CONGEN supports a number of operations on the coordinate constraints. The constraint for any atom can be set to any positive value (specified by the FORCE keyword followed by the desired value). The reference coordinates can be the current set at the point when constraints are specified (the default) or a set can be read from a coordinate file (specified by REF and a coor-spec). The force constants and reference coordinates can be read or written as a unit. The PRINT option prints a list of all the current harmonic constraints that are applied to the system after this command has been executed.

It is important to understand some aspects of how the constraints are set in order to get the most flexibility out of this command. When CONGEN is loaded, each atom has associated with it a harmonic force constant initially set to zero. Each call to the CONSTRAINT HARMONIC command changes the value of this constant for only those atoms specified.
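The bookkeeping described above can be pictured with a small Python sketch. It is only a toy model, not CONGEN code: the function name `apply_cons_harm` and the use of plain integer atom indices in place of an atom-selection are assumptions made for illustration.

```python
# Toy model of the per-atom harmonic force-constant table (hypothetical names).
def apply_cons_harm(force_constants, selected, force):
    """Return a new table with `force` assigned to the selected atoms only."""
    if force < 0:
        raise ValueError("force constant must not be negative")
    updated = list(force_constants)
    for i in selected:
        updated[i] = force          # atoms outside the selection are untouched
    return updated

k = [0.0] * 6                            # state at start-up: no constraints
k = apply_cons_harm(k, [0, 1, 2], 5.0)   # first CONS HARM call
k = apply_cons_harm(k, [2, 3], 1.0)      # second call overwrites atom 2 only
print(k)
```

Atoms 0 and 1 keep the value from the first call, which is the behavior to keep in mind when several CONS HARM commands are issued in sequence.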

Other Commands for Harmonic Atom Constraints

The harmonic constraints may be read and written to files. The file name to be specified in the READ and WRITE command is CONS. The files may be read or written only in binary. The PRINT command will also work for constraints. See section Input-Output Commands, for more details. In addition, one may look at the contributions to the energy in detail using the analysis facility, see section The Analysis Facility of CONGEN.

PRINT specifies that a listing of all the atoms currently constrained should be printed out. This is done by segments of constrained atoms, which is concise in most cases. Unfortunately, in the case of IUPAC specified constraints, it is quite verbose.

Holding Dihedrals Near Selected Values

Using this form of the CONS command, one may put constraints on the dihedral angles formed by sets of any four atoms. The constraints may be set to either a single angle, or to be bounded by two angles (a flat bottom well). The improper torsion potential is used to maintain said angles.

The command for setting the dihedral constraints is as follows:

Syntax of Dihedral Constraints

CONStraint { DIHEdral [BYNUM] dihe-spec FORCE real MINimum real MAXimum real }
           { CLDH                                                            }

Syntactic ordering: DIHEDRAL or CLDH must follow CONSTRAINT, and FORCE and MIN must follow DIHEDRAL.

If BYNUM is specified, then

dihe-spec ::= integer integer integer integer

If BYNUM is not specified, then

dihe-spec ::= atom-spec : atom-spec : atom-spec : atom-spec :

where:

atom-spec ::= segid resid iupac

Note that colons must be used as delimiters following each atom-spec.

Function of Dihedral Angle Constraints

DIHEDRAL adds a torsion angle to the list of constrained angles using the specified atoms, force constant, and minimum and maximum dihedral angles. CLDH clears the list of constrained dihedrals so that different angles or new constraint parameters can be specified.

When a range of angles is specified, it is important to keep in mind that torsion angles are periodic. Therefore, the MINIMUM and MAXIMUM bounds are taken literally. Reversing the order of values specified for these variables has the effect of complementing the range of the angular constraints. For example, a specification of MINIMUM 175 MAXIMUM -175 is just 10 degrees, whereas the specification of MINIMUM -175 MAXIMUM 175 is 350 degrees.
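The effect of swapping the bounds can be checked with a short Python sketch. It assumes the allowed range is swept from MINIMUM toward MAXIMUM in the direction of increasing angle, which is consistent with the two figures quoted above; the function names are hypothetical.

```python
def allowed_width(minimum, maximum):
    """Width in degrees of the range swept from minimum up to maximum."""
    return (maximum - minimum) % 360.0

def in_range(angle, minimum, maximum):
    """True if angle lies inside the range swept from minimum up to maximum."""
    return (angle - minimum) % 360.0 <= (maximum - minimum) % 360.0

print(allowed_width(175.0, -175.0))    # narrow range straddling 180 degrees
print(allowed_width(-175.0, 175.0))    # the complementary, wide range
print(in_range(180.0, 175.0, -175.0))  # 180 is inside the narrow range
print(in_range(0.0, 175.0, -175.0))    # 0 is not
```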

In order to simplify the specification of constraints, there are a number of defaulting rules used by this command. First, for each atom which follows the first, you can omit segid's, resid's, and iupac names if they match values from the previous atom. Be careful here if any of the identifiers are also used for higher order structures. For example, if a protein has a segid of 1 and a resid of 1, then specifying a single 1 will be interpreted as a segment identifier. In all cases, use the PRINT CONS command to check that you got what you expected.

Next, omitted FORCE values will be taken from the previous constraint if it exists. MINIMUM and MAXIMUM values will also default to previous values if neither option is specified. However, if either one is specified, but the other is missing, then they default to each other, so that the effect is to use a single point harmonic well.

Do not use a value of -9999 as the minimum or maximum dihedral angle, since the program uses this value to indicate that no angle was found on the command line.
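The defaulting rules for FORCE, MINIMUM, and MAXIMUM can be summarized in a short Python sketch. The function and dictionary layout are hypothetical, and raising an error when no previous constraint exists is an assumption; the text does not say what CONGEN does in that case.

```python
UNSET = -9999.0  # value the program reserves for "no angle on the command line"

def resolve_dihedral_options(previous, force=None, minimum=UNSET, maximum=UNSET):
    """Fill in omitted FORCE, MINIMUM, and MAXIMUM values.

    previous is the resolved dict of the preceding constraint, or None.
    """
    if force is None:
        if previous is None:
            raise ValueError("no previous constraint to take FORCE from")
        force = previous["force"]
    if minimum == UNSET and maximum == UNSET:
        if previous is None:
            raise ValueError("no previous constraint to take angles from")
        minimum, maximum = previous["minimum"], previous["maximum"]
    elif minimum == UNSET:
        minimum = maximum            # only MAXIMUM given: single-point well
    elif maximum == UNSET:
        maximum = minimum            # only MINIMUM given: single-point well
    return {"force": force, "minimum": minimum, "maximum": maximum}

first = resolve_dihedral_options(None, force=10.0, minimum=-60.0)
second = resolve_dihedral_options(first, minimum=170.0, maximum=-170.0)
third = resolve_dihedral_options(second)   # everything inherited
print(first, second, third)
```

Because -9999 doubles as the "not specified" marker, a constraint that really asked for that angle would be indistinguishable from one that omitted it.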

These constraints can be coupled to a conformational search. See the option, EPCONS, in section Miscellaneous Global Options.

Other Commands

The PRINT CONS command, see section PRINT -- Writes Information to Output File (Unit 6), will work for constraints. As of now (October 13, 1992), one cannot analyze the contributions of this term using the analysis facility nor can one read or write the description of this term out. Someday ...

Fixing Atoms in Place

Syntax for Fixing Atoms

CONS FIX atom-selection-spec [PURG] [BOND] [THET] [PHI] [IMPH]

Function of Fixing Atoms

This command fixes atoms in place by setting flags in an array (IMOVE) which tells the minimization and dynamics algorithms which atoms are free to move. If atoms are fixed, it is possible to save computer time by not calculating energy terms which involve only fixed atoms. The nonbond and hydrogen bond algorithms in CONGEN check IMOVE and delete pairs of atoms that are fixed in place from the nbond and hbond lists respectively. In addition the PURG or individual energy term options specified with the CONS FIX command allow all or some of the internal coordinate energies associated with fixed atoms to be deleted. Interactions between fixed and moving atoms are maintained.

NOTE: Because some energy terms are deleted from fixed systems, the total energy calculated with fixed atoms will be different from the total energy of the same system with all atoms free. The forces on the moveable atoms will however be identical.
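The point of the note can be demonstrated on a toy system in Python. This is a sketch under stated assumptions: the one-dimensional coordinates, the harmonic pair term, and all names are invented for illustration and have nothing to do with the actual CONGEN energy function.

```python
from itertools import combinations

def pair_energy(x, pairs, k=1.0, r0=1.5):
    """Sum of harmonic pair terms over the given atom pairs (toy potential)."""
    return sum(k * (abs(x[i] - x[j]) - r0) ** 2 for i, j in pairs)

def force_on(atom, x, pairs, h=1e-6):
    """Force on one atom by central finite difference of the energy."""
    plus, minus = list(x), list(x)
    plus[atom] += h
    minus[atom] -= h
    return -(pair_energy(plus, pairs) - pair_energy(minus, pairs)) / (2 * h)

x = [0.0, 1.0, 2.7, 4.1]          # 1-D coordinates of four atoms
fixed = {0, 1}                    # atoms 0 and 1 are fixed
all_pairs = list(combinations(range(len(x)), 2))
# Drop only the pairs in which BOTH atoms are fixed.
kept_pairs = [(i, j) for i, j in all_pairs if not (i in fixed and j in fixed)]

e_full = pair_energy(x, all_pairs)
e_pruned = pair_energy(x, kept_pairs)
forces_full = [force_on(a, x, all_pairs) for a in (2, 3)]
forces_pruned = [force_on(a, x, kept_pairs) for a in (2, 3)]
print(e_full - e_pruned, forces_full, forces_pruned)
```

The deleted term depends only on fixed coordinates, so it shifts the total energy by a constant and contributes nothing to the forces on atoms 2 and 3.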

The way CONGEN keeps track of fixed atoms is by the IMOVE array in the PSF. The IMOVE array is 0 if the atom is free to move, and has some other value if the atom is fixed. WARNING: the use of IMOVE is not yet universal in CONGEN. At present (November 15, 1990), it is supported for dynamics and for all forms of minimization except Newton-Raphson. The vibrational analysis does not yet support it.

NOTE: If you use SHAKE in conjunction with fixed atoms, you should specify the SHAKE command after you have specified the fixed atoms. This prevents SHAKE from compiling a list of bonds from which the CONS command would subsequently delete entries once it has the list of fixed atoms.

If PURGE is specified, every bond, bond angle, torsion angle, or improper torsion involving only fixed atoms is deleted. One can limit this elimination process to one type of interaction by specifying BOND, ANGL, PHI, or IMPH. By playing games with combinations of these commands, one can eliminate whole terms from the energy expression.

NMR Constraints

Constraints derived from NMR spectroscopy can be incorporated into the CONGEN energy function and used for minimization, simulated annealing using dynamics, or conformational search. The constraints from J-coupling constants can also be used to restrict atom construction in conformational search, see the option, EJCONS, in section Miscellaneous Global Options, for more information.

The user may add either an NOE-derived distance constraint term or a J-coupling constraint term to the usual potential function when calculating the energy and forces of a macromolecular system. There are two parts to specifying NMR constraints: the NMRC command and the constraints file. The NMRC command tells the program that NMR constraints are to be read in and sets the parameters associated with the constraint functions. The type of constraint function to add is specified in the constraint file, which is discussed in the following sections.

It is also possible to account for conformational flexibility by using ensemble averaging. This facility is still experimental, and you should talk to Bob Bruccoleri or Keith Constantine for more information before using it.

A detailed listing of the NMR constraints can be obtained using the PRINT NMR command, see section PRINT -- Writes Information to Output File (Unit 6).

Theory for NOE Constraints

The NOE constraint function is designed to improve upon a flat bottomed harmonic constraint function. The problem with the harmonic constraints is the high forces placed on atom pairs that are very far from their constraint distance. The potential in CONGEN uses a harmonic potential if the distance is close to correct, or if the distance is less than the lower bound of the constraint. However, at large distances, the constraint force is constant rather than harmonic. The connection between the harmonic section of the potential and the constant force section is made by an inverted harmonic piece. The main advantage of this functional form is that a user can input all constraints at once at the beginning of the run.

The functional form is as follows:

where

The form of this function is given in the following figure where the lower bound is 2 Angstroms, the upper bound is 4 Angstroms, K = 2, SLOPE = 0.5, and FMAX = 4.0. Values for the various distances were computed from these parameters.
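As a rough illustration of the shape described in this section, the following Python sketch assembles a piecewise function from the quoted parameters. It is a reconstruction under assumptions, not the CONGEN formula: the harmonic pieces are taken as K*(d - bound)^2 with no factor of 1/2, the inverted piece is given curvature -K, and the switch distances `d1` and `d2` are derived by requiring the force to reach FMAX and then fall to SLOPE. All names are hypothetical.

```python
def noe_energy(d, lower=2.0, upper=4.0, k=2.0, slope=0.5, fmax=4.0):
    """Piecewise NOE-style penalty (illustrative reconstruction only)."""
    if d < lower:
        return k * (d - lower) ** 2              # harmonic below the lower bound
    if d <= upper:
        return 0.0                               # flat bottom
    d1 = upper + fmax / (2.0 * k)                # force reaches FMAX here
    if d <= d1:
        return k * (d - upper) ** 2              # harmonic just above the bound
    e1 = k * (d1 - upper) ** 2
    d2 = d1 + (fmax - slope) / (2.0 * k)         # force has fallen to SLOPE here
    if d <= d2:
        t = d - d1
        return e1 + fmax * t - k * t * t         # inverted harmonic piece
    t2 = d2 - d1
    e2 = e1 + fmax * t2 - k * t2 * t2
    return e2 + slope * (d - d2)                 # constant-force outer region

for d in (1.0, 3.0, 4.5, 5.0, 5.875, 8.0):
    print(d, noe_energy(d))
```

With these assumptions the energy and its first derivative are continuous at every switch point, and the force never exceeds FMAX, which matches the qualitative description of the FMAX and SLOPE parameters.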

NOE constraints can be used in both standard and ensemble-averaged calculations. In a standard calculation, the distance used in the constraint is just the distance derived from the individual structure being refined. In the ensemble average approach, multiple different structures, currently stored as different segments separated in space, are refined simultaneously, with distances for NOE constraints being computed using the following formula:

Each member of the ensemble must have identical atoms and residues as determined by their IUPAC names and residue identifiers.

In cases where the NOE is due to the interaction of motionally averaged or prochiral protons, the interproton distance is calculated over all possible pairings of the two sets of protons. By default, the following expression is used:

where the sum is taken over all possible pairs of atoms involved in the constraint. The constraints are specified using two sets of atom selections, see section Atom Selection, so that any combination of atoms may be specified.

If NOE intensity scaling is done such that the calculated distances should reflect averages instead of sums, the AVERAGE option can be used when specifying the constraint. In this case, the following expression is used:

where N is the number of pairs used in the average. If any atom in the two sets of atoms in the constraint has undefined coordinates, then the interproton distance is omitted from other calculations. Likewise, if all the atoms in the two sets are fixed, see section Fixing Atoms in Place, the distance is ignored.
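The difference between summing and averaging can be illustrated in Python. This sketch assumes the conventional inverse sixth power combination of pair distances, which this text does not state explicitly; the function name and the `exponent` parameter are hypothetical.

```python
def effective_distance(pair_distances, average=False, exponent=6.0):
    """Combine pair distances into one effective distance.

    Assumes an r**-exponent combination; with average=True the sum is
    divided by the number of pairs N before the root is taken.
    """
    total = sum(r ** -exponent for r in pair_distances)
    if average:
        total /= len(pair_distances)
    return total ** (-1.0 / exponent)

pairs = [3.0, 3.0]                 # e.g. HA to each of two equivalent protons
print(effective_distance(pairs))                 # summed: shorter than 3.0
print(effective_distance(pairs, average=True))   # averaged: close to 3.0
```

Under this assumption, summing over equivalent protons always yields an effective distance no longer than the shortest pair distance, while averaging removes the dependence on the number of pairs.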

Theory for J Coupling Constraints

The J coupling constraint term allows one to incorporate constraints based on scalar J coupling measurements in two different ways, one where all atoms in the J coupling are known, and the other where two measurements are made each involving one proton in a pro-chiral pair, but for which the prochiral assignment is unknown. In this section, the form of the scalar J coupling equation will be discussed first followed by the two forms of the constraint equations.

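The J-coupling constraint is based on the Karplus equation.(11)(12)

$$J(\theta) = A \cos^2\theta + B \cos\theta + C$$

where $\theta$ is the torsion angle associated with the coupling, and $A$, $B$, and $C$ are the coefficients on the cos^2 term, the cos term, and the final term, respectively (COEF1, COEF2, and COEF3 in the constraint file).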

Ensemble averaging can be performed over a set of constraints as long as each constraint belongs to a different segment with the same structure, as described for the NOE ensemble averaging, see section Theory for NOE Constraints. In the case of averaging, J is computed as follows:

where the averages are computed over the members of the ensemble.

There are two ways to incorporate constraints from these scalar coupling constants into the energy function. When all the atoms involved in a scalar coupling constant are known, the following equation can be used:

where

When one of the atoms involved in the J coupling constant measurement is a prochiral atom and if coupling constants for both prochiral atoms are known but unassigned, then the following form of the equation can be used. In this equation the two J coupling constants are "joined" together, and the constraint function is computed based on relationships involving the sums and magnitudes of the differences of the two J couplings, which obviates the need for stereospecific assignments. In the constraint file, the JOIN command is used to link two measured J constraints together.

In this functional form, the sum term is just a harmonic restraint based on the sum. The difference term is more complex because the sign of the difference depends on the arbitrary choice of chirality on the prochiral center. The difference function is harmonic where the magnitude of the calculated difference is bigger than the experimental difference. If the magnitude of the calculated difference is less, then a piecewise harmonic function with a maximum is used. An additional factor is used for this term to smooth the overall function.

Syntax for NMR Constraint Command

The NMRC command is used to specify the NMR Constraints file along with some of the parameters of the NMR constraint functions.

NMRC [UNIT unit]        default: 5
     [FMAX real]        default: 1
     [SLOPe real]       default: 1
     [NOEWeight real]   default: 1
     [JWEIght real]     default: 1
     [KJDIff real]      default: 0.2
     [ECHO]
     [j-normalization-options]
     [noe-proton-averaging-options]
     [clear-options]
     [title-options]

j-normalization-options ::= [JNORmalize  ]
                            [NOJNormalize]

                                 [SUM      ]
noe-proton-averaging-options ::= [AVErage  ]
                                 [NOAVerage]

title-options ::= [TITLe  ]
                  [NOTItle]

clear-options ::= repeat( CLEAr {Jcoupling} )
                        (       {Noe      } )

Keywords for NMR Constraint Command

The purpose of the keywords is given in the following table:

UNIT: Unit number associated with the constraint file name (opened previously). If a UNIT number is not specified, it is assumed that the constraints are to be read from the main input file.
FMAX: Maximum force for NOE constraints. This specifies the point in the constraint function when the function turns flatter, and the force begins to be reduced.
SLOPE: The constant force which applies to the outer region of the constraint function. This value should not be larger than FMAX.
NOEWEIGHT: The energy weight assigned to all NOE constraints read in after this point. This weight can also be set in the NMR constraint file.
JWEIGHT: The energy weight assigned to all J coupling constraints read in after this point. This weight can also be set in the NMR constraint file.
KJDIFF: The KJDIFF option sets the value of in the equation for The default value gives only two minima for the torsion angle in the test case in the Constantine et al paper.
ECHO: If present, all lines read from the constraint file are echoed as they are read, and errors from searching atoms in the PSF will be printed.
JNORMALIZE
NOJNORMALIZE: This option controls the normalization of J constraint energy. The default state is normalization off. If it is turned on, then for single J's, CONGEN will calculate the value of over all values of the torsion angle for each J separately, and it will normalize the energy calculations for each of the J's. In the case of joined J's, there are two possibilities. If the first three atoms or the last three atoms of joined J's are the same and the remaining atom is different, the program will assume that the J's represent the calculation of a prochiral group around an center. CONGEN will calculate the maximum of over the range of the first torsion angle, with the second angle being set to 120 degrees plus the first angle. If the three atoms do not match, then the program will calculate the maximum of over all possible values of the torsion angles for both J's. The grid size used for all these calculation is 2 degrees.
SUM
AVERAGE
NOAVERAGE: When the NOE constraints are specified, you can control whether constraints involving equivalent, degenerate, or non-stereospecifically assigned groups of protons are summed or averaged. The SUM and NOAVERAGE keywords specify summing; the AVERAGE keyword specifies averaging. This specification will affect all succeeding constraints read in by the program.
CLEAR: Normally, the NMRC commands add constraints to the current lists. The CLEAR option will clear the named list before new ones are read in. If you specify NOE, then the NOE constraints are cleared, and if you specify JCOUPLING, then the J coupling constraints are cleared. Both of these keywords may be abbreviated to one letter, and you may specify multiple CLEAR options.
TITLE
NOTITLE: By default, a title, see section Glossary of Syntactic Terms, is read from any constraint file read from any unit other than the default command input. From unit 5, no title is required. These options can override that default behavior. TITLE specifies that a title will be read from the constraint file, and NOTITLE specifies that no title will be read.

NMR Constraint Files

The NMR Constraint file actually contains the constraints, both NOE and J coupling. Besides the specification of the constraints, the file also contains specifications for the NOE and J coupling weights as well as the segments involved with any ensemble averaging.

The constraint file is a free format file containing a series of commands. Depending on the setting of the TITLE option on the NMRC command, the first lines of the constraint file should contain title lines which end with a * on an otherwise blank line. The file is terminated by either an END command or the physical end of file. You can put the constraint file in line in the CONGEN input file by specifying UNIT 5 on the NMRC command.
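The overall layout of a constraint file (optional title ended by a lone `*`, then commands, then END or end of file) can be sketched with a small Python reader. This is an illustration only: the function name is hypothetical, keywords are matched in full rather than by CONGEN's abbreviation rules, and continuation lines are not handled.

```python
def split_constraint_file(text, has_title=True):
    """Split constraint-file text into title lines and tokenized commands."""
    lines = text.splitlines()
    title = []
    if has_title:
        while lines:
            line = lines.pop(0)
            if line.strip() == "*":      # a lone * ends the title
                break
            title.append(line.strip())
    commands = []
    for line in lines:
        words = line.split()             # free format: whitespace separated
        if not words:
            continue
        if words[0].upper() == "END":    # END stops reading; so does end of file
            break
        commands.append(words)
    return title, commands

sample = """\
  Example of NOE Constraints
  *
  WEIGHT 1.0
  TYPE NOE
  CONS  ATOM MOL1 2 HA : ATOM MOL1  4 HB* LOWER 2.0  UPPER 4.0
  END
  CONS  this line is never read
"""
title, commands = split_constraint_file(sample)
print(title)
print(commands)
```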

The commands are as follows:

TYPE command

The TYPE command specifies the type of constraint which follows in the file. For NOE constraints, it is also used to specify whether individual distances should be averaged or summed, see section Theory for NOE Constraints, for more information. For J coupling constraints, the coefficients of the Karplus equation can be specified.

Syntax

TYPE { noe-spec }
     { J-spec   }

                 [ AVErage   ]
noe-spec ::= NOE [ NOAVerage ]
                 [ SUM       ]


            {PHI}
J-spec ::=  {J  } [COEF1 real] [COEF2 real] [COEF3 real]

Function

The TYPE command specifies the type of constraint being read, either NOE or one of PHI, PSI, X1S, or X1R, for NOE distance constraints or J-coupling data, respectively. You must specify a constraint TYPE or the program will not continue execution. You can freely change the TYPE specification within a file.

When the NOE type is specified, you can control whether multiple proton pairs in a constraint are summed or averaged. The SUM and NOAVERAGE keywords specify summing; the AVERAGE keyword specifies averaging. This specification takes effect for all succeeding NOE constraint specifications.

The coefficients for the Karplus equation can be changed using the COEFn keyword value pairs. COEF1 is used to change the coefficient on the cos^2 term; COEF2 is used for the cos term, and COEF3 changes the final term. Any changes made to these coefficients will apply to all succeeding J coupling constraints within one invocation of the NMRC command. They will be reset to (6.4, -1.4, and 1.9) at the start of the NMRC command.
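The role of the three coefficients can be made concrete with a short Python sketch of the Karplus relation. The function name is hypothetical, and the torsion angle is used directly with no phase offset, which is an assumption; with that assumption the function reproduces the LOWER and UPPER values used in the chi 1 example in the Examples section for staggered angles of 60 and 180 degrees.

```python
import math

def karplus(theta_deg, coef1=6.4, coef2=-1.4, coef3=1.9):
    """J = COEF1*cos^2(theta) + COEF2*cos(theta) + COEF3, theta in degrees."""
    c = math.cos(math.radians(theta_deg))
    return coef1 * c * c + coef2 * c + coef3

# Coefficients from the first TYPE line of the chi 1 example.
for theta in (180.0, 60.0):
    print(theta, karplus(theta, coef1=9.5, coef2=-1.6, coef3=1.8))

# Default coefficients, as reset at the start of each NMRC command.
print(karplus(180.0))
```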

WEIGHT command

Syntax

WEIGHT [J] [Noe] [real]

Function

The WEIGHT command defines the weights used on succeeding constraints. If type keywords are specified, then only the weights of those constraint types are changed. If no type keyword is specified, then the weight of the current constraint type is changed.

A convenient way to change weights during the course of a simulated annealing protocol is to omit weights from the constraint file, and set them prior to reading the constraint file in.

ENSEMBLE Command

Syntax

ENSEmble [EXP real] repeat(segid)

Function

The ENSEMBLE command specifies the use of ensemble averaging. Each segment specified by a segment identifier (segid) will be treated as a conformer in the ensemble. Currently, CONGEN does not automatically ignore interactions between conformers in the ensemble, and therefore, you must take steps to separate the conformers and set the non-bonded cutoffs small enough to avoid interactions, see section Generation of Non-bonded Interactions. Every segment in the ensemble must have identical residue and atom identifiers in the same order for this command to work.

Once the ENSEMBLE command is specified, any NOE constraints involving the first segment specified in the ENSEMBLE command are automatically replicated for all the other segments in the ensemble. Further, you may not specify constraints involving the remaining segments.

The EXP keyword is used to set the exponent in the ensemble averaging equation, see section NMR Constraints. The default value is 6.

CONS Command

Syntax

For NOE constraints:

CONS {LOWEr real} {UPPEr real} atom-selection : atom-selection

For J-coupling constraints:

CONS atom-spec atom-spec atom-spec atom-spec {LOWER real} {UPPER real}

     [COEF1 real] [COEF2 real] [COEF3 real]

atom-spec::= segid resid iupac

Function of NOE Constraints

The CONS command signals that this line will be a constraint. The syntax of each constraint varies depending on the constraint type as set by the TYPE option, see section Theory for NOE Constraints. In the case of the NOE constraints, you must specify the upper and lower bounds for the distance using the UPPER and LOWER keywords, respectively, and two atom-selections (see section Atom Selection) to specify each half of the constraint pair. Each atom-selection defaults to not selecting any atoms. If each side of the constraint specifies just one atom, then just plain distances are computed. However, if multiple atoms are specified on either side, CONGEN will use summing or averaging to compute the distance. The average will be computed over all possible pairs of atoms between the first and second atom-selections given for each constraint.

The choice of atoms to include for each constraint will depend on your NOE data. If you can assign each proton or atom explicitly, then only single atoms should be used. If there is signal averaging occurring, then use the multiple atom specification to get a more realistic constraint. This choice is independent of whether or not ensemble averaging is subsequently used. Note that ensemble averaging affects how the constraints are interpreted. See the description of ensemble averaging above for more information.

Here are some examples: Suppose that one has an NOE between two protons which have been uniquely assigned. Then, the following command would specify a constraint between two atoms:

CONS  ATOM MOL1 2 HA : ATOM MOL1 4 HB1  LOWER 2.0  UPPER 4.0

In this example, there is a distance constraint between the HA of residue 2 in segment MOL1 and the HB1 of residue 4 in segment MOL1 which has a lower bound of 2.0 and an upper bound of 4.0. However, if one wanted to include the sum of all the equivalent beta hydrogens in residue 4, the command line would look as follows:

CONS  ATOM MOL1 2 HA : ATOM MOL1 4 HB*  LOWER 2.0  UPPER 4.0

Function of J Coupling Constraints

For a J-coupling constraint, the syntax is partially order dependent. The keyword argument pairs can be specified anywhere on the command line, but the four atom-spec's must be specified in the correct sequence. The four atoms must be the four atoms involved in the measured coupling constant. The LOWER option specifies the lower bound of the measured J coupling, and the UPPER option specifies the upper bound. Coefficients for the Karplus equation can be changed for this constraint, and any changes will be permanent for all succeeding J's.

JOIN command

Syntax

JOIN

Function

The JOIN command signals that the next two J coupling constraints are to be "joined" together, and that the second form of the J constraint is to be used, as described in section Theory for J Coupling Constraints. It is an error for there to be fewer than two J constraints in the NMR constraint file after the JOIN command is specified.

END command

Syntax

END

Function

The END command signals the end of the constraint file to be read in.

Examples


  Example of NOE Constraints
  *
  WEIGHT 1.0
  TYPE NOE
  CONS  ATOM MOL1 2 HA : ATOM MOL1  4 HB* LOWER 2.0  UPPER 4.0
  CONS  ATOM MOL1 3 HN : ATOM MOL1  5 HN  LOWER 3.0  UPPER 4.4
  WEIGHT 0.5
  CONS  ATOM MOL1 2 HN : ATOM MOL1 10 HN  LOWER 3.0  UPPER 4.0
  END


  Example of J-coupling Constraints around the chi 1 angle of leucine.
  *
  WEIGHT J 1.0
  TYPE PHI COEF1 9.5 COEF2 -1.6 COEF3 1.8
  JOIN
  CONS 1 1 HA  1 1 CA 1 1 CB 1 1 HB2 LOWER 12.9 UPPER 12.9
  CONS 1 1 HA  1 1 CA 1 1 CB 1 1 HB1 LOWER 3.375 UPPER 3.375
  TYPE PHI COEF1 7.2 COEF2 -2.0 COEF3 0.6
  JOIN
  CONS 1 1 C   1 1 CA 1 1 CB 1 1 HB2 LOWER 1.4 UPPER 1.4
  CONS 1 1 C   1 1 CA 1 1 CB 1 1 HB1 LOWER 9.8 UPPER 9.8
  TYPE PHI COEF1 -3.75 COEF2 0.26 COEF3 -0.54
  JOIN
  CONS 1 1 N  1 1 CA 1 1 CB 1 1 HB2 LOWER -1.3475 UPPER -1.3475
  CONS 1 1 N  1 1 CA 1 1 CB 1 1 HB1 LOWER -1.3475 UPPER -1.3475
  END

Further examples may be found in the `JTEST*' and `NOETEST*' test cases, see section CONGEN Test Cases.

`SHAKE` -- Fixing Bond Lengths Or Angles in Dynamics

SHAKE is a method of fixing bond lengths and, optionally, bond angles during dynamics. The method was brought to CHARMM by Wilfred Van Gunsteren, and is referenced in J. Comp. Phys. 23, 327 (1977). When hydrogens are present in a structure, it will allow a five fold increase in the step size if SHAKE is used on the bonds.

To use SHAKE, one specifies the SHAKE command before any dynamics are run. The SHAKE command has the following syntax:

SHAKE [ BONH [ BOND [ ANGH [ ANGL ]]]]

BONH specifies that all bonds involving hydrogens are to be fixed. BOND specifies all bonds. ANGH specifies that all angles involving hydrogen must be fixed. ANGL specifies that all angles must be shaken. BOND must be specified if angles are fixed.

When the SHAKE command is used, it will check that there are degrees of freedom available for all atoms to satisfy all their constraints. Angles cannot be fixed with SHAKE if one has explicit hydrogen arginines in the structure as the CZ carbon has too many constraints. This is a general problem for any structure which has too many branches close together.

SHAKE is not recommended for fixing angles. The algorithm converges very slowly in the case where one has three angles centered on a tetravalent atom and the constraints are satisfiable only using out of plane motions.

The use of SHAKE modifies the output of the dynamics command. The number appearing to the right of the step number is the number of iterations SHAKE required to satisfy all the constraints. This number should generally be small.
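To give a feel for what one "iteration" means, here is a minimal Python sketch of the SHAKE bond-length correction as published by Ryckaert, Ciccotti, and Berendsen. It is not the CONGEN implementation: the function signature, the tolerance, and the three-atom test system are assumptions, and only bond constraints are handled.

```python
def shake(old, new, masses, bonds, tol=1e-10, max_iter=500):
    """Correct `new` positions in place so each bond has its target length.

    old/new: lists of [x, y, z] before and after the unconstrained step.
    bonds: list of (i, j, length). Returns the number of sweeps performed,
    counting the final sweep in which every constraint passed the check.
    """
    for iteration in range(1, max_iter + 1):
        converged = True
        for i, j, length in bonds:
            s = [a - b for a, b in zip(new[i], new[j])]      # current bond vector
            diff = length * length - sum(c * c for c in s)
            if abs(diff) > tol * length * length:
                converged = False
                r = [a - b for a, b in zip(old[i], old[j])]  # bond vector before the step
                s_dot_r = sum(a * b for a, b in zip(s, r))
                g = diff / (2.0 * (1.0 / masses[i] + 1.0 / masses[j]) * s_dot_r)
                for axis in range(3):                        # move both atoms along r
                    new[i][axis] += g * r[axis] / masses[i]
                    new[j][axis] -= g * r[axis] / masses[j]
        if converged:
            return iteration
    raise RuntimeError("SHAKE did not converge")

old = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
new = [[0.02, 0.01, 0.0], [1.10, 0.05, 0.0], [-0.03, 1.08, 0.02]]
masses = [16.0, 1.0, 1.0]
bonds = [(0, 1, 1.0), (0, 2, 1.0)]
iterations = shake(old, new, masses, bonds)
print(iterations, new)
```

Constraints that share an atom disturb one another, so the sweep is repeated until all of them are satisfied at once; heavily coupled constraints, such as several angles on one center, need many more sweeps.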

If atoms are fixed rigidly in place, see section Fixing Atoms in Place, then the SHAKE command should follow the CONS command to prevent SHAKE from shaking deleted bonds. This will make it run more efficiently.

Each time the SHAKE command is executed, the list of constraints is initialized. Thus, if you wish to eliminate the use of SHAKE, specify a SHAKE command with no arguments.

Atom Selection

Many commands in CONGEN operate on a subset of the atoms in the system. The Atom Selection syntax described in this section is generally used to identify the subset.

Syntax of an Atom-Selection

atom-selection ::= repeat(token)

where token can be:

SHOW
ALL
*

[ ATOM ] segid* resid* atomname*
[ CELL ]

RESName resname*
RAMA    resname* atomname*
RANGe   segid1 resid1 atom1     segid2 resid2 atom2
RANGe BYNUm integer integer
BYNUm repeat(integer)
AROUnd  real

               {GE}
           {X} {GT}
COOR [ABS] {Y} {LE} real
           {Z} {LT}
               {EQ}
               {NE}

CONT probe cutoff
SURF probe cutoff
CLEAR
BYREs
ENTEr
OR
AND
NOT
EXCL
EXCH int

Interpretation of Atom Selection Tokens

The selection parser operates like a Hewlett-Packard calculator, performing its operations in Reverse Polish notation using an internal stack. Tokens are parsed and performed from left to right. Operations that select atoms based on the structure simply turn on the flags for the atoms selected; the stack manipulation operators can combine these flags in arbitrary ways. When the selection string is completed, the top of the stack is returned as the selection.

All of the tokens followed by a * are interpreted using wildcard characters as follows:

*: matches any string of characters (including none)
%: matches any single character
#: matches any string of digits (including none)
+: matches any single digit
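The four wildcard characters map naturally onto regular expressions, as the following Python sketch shows. It is not the EQSTWC routine: the function name is hypothetical, and case-sensitive matching of the whole name is an assumption.

```python
import re

# Translation of each wildcard character into a regular-expression fragment.
_WILDCARDS = {"*": ".*", "%": ".", "#": "[0-9]*", "+": "[0-9]"}

def wildcard_match(pattern, name):
    """True if the whole of `name` matches the wildcard `pattern`."""
    regex = "".join(_WILDCARDS.get(ch, re.escape(ch)) for ch in pattern)
    return re.fullmatch(regex, name) is not None

print(wildcard_match("H*", "HB1"))    # * matches any string
print(wildcard_match("HB+", "HB1"))   # + needs exactly one digit
print(wildcard_match("HB+", "HB"))
print(wildcard_match("HB#", "HB"))    # # accepts zero digits
print(wildcard_match("C%", "CA"))     # % matches any single character
```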

Selection Tokens

ALL and * include all the atoms.

ATOM selects atoms by name. Likewise, CELL selects tags by name where the tag substitutes for the atomname in the syntax.

RESNAME selects by residue names (eg, GLY, TRP, etc.)

RAMA selects atoms by IUPAC name and residue name.

RANGE allows for a selection over a range of atoms.

BYNUM allows for a selection by number.

AROUND will add to the selection all the atoms within the distance specified of the atoms currently selected.

COOR will select those atoms whose coordinate component satisfies the given relationship to the number specified.

CONTACT and SURFACE will compute the accessible contact area or surface, respectively, for each atom and will mark all atoms whose value exceeds or equals the cutoff. The probe gives the size of the probe. If zero is used for the probe, then the current default probe size will be used (typically 1.4 Angstrom).

BYRES will include all the atoms in every residue for which at least one atom has already been selected.

Operators on Selections

CLEAR removes (clears) all the atoms from the current selection.

ENTER pushes the current selection onto the stack, and initializes the top of the stack to the default value.

OR performs a logical OR operation between the top of the stack and next deepest selection, pops the stack twice, and pushes the result onto the stack.

AND does the same thing as OR except that it performs an AND operation.

NOT takes the inverse of the top of the stack.

EXCL is equivalent to NOT AND. In other words, it deletes all the selected atoms from the previous selection on the stack. This operation is provided for simplifying conversions from the previous form of the atom selection syntax.

EXCH i exchanges the top of the stack with the ith deepest element. The top of the stack is numbered 0.

SHOW will print the atoms that are currently included at the point in string parsing where SHOW is encountered. If you want the list that will be returned to the calling routine, make sure SHOW is the last entry in the string.
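The stack behavior of the operators can be mimicked with a few lines of Python. This is a sketch, not the SELCTA routine: structure-based tokens such as ATOM or RESNAME are represented by ready-made sets of atom indices, the default selection is assumed to be empty, and SHOW and BYRES are left out.

```python
def evaluate_selection(tokens, n_atoms, default=frozenset()):
    """Evaluate a token list in Reverse Polish order; return the top of stack.

    A token is an operator name, an integer argument to EXCH, or a set of
    atom indices standing in for a structure-based selection token.
    """
    everything = set(range(n_atoms))
    stack = [set(default)]                 # stack[-1] is the current selection
    tokens = list(tokens)
    while tokens:
        tok = tokens.pop(0)
        if isinstance(tok, (set, frozenset)):
            stack[-1] |= tok               # selecting tokens only turn flags on
        elif tok in ("ALL", "*"):
            stack[-1] = set(everything)
        elif tok == "CLEAR":
            stack[-1] = set()
        elif tok == "ENTER":
            stack.append(set(default))     # push, then start from the default
        elif tok in ("OR", "AND"):
            top, below = stack.pop(), stack.pop()
            stack.append(top | below if tok == "OR" else top & below)
        elif tok == "NOT":
            stack[-1] = everything - stack[-1]
        elif tok == "EXCL":                # NOT followed by AND
            top, below = stack.pop(), stack.pop()
            stack.append(below - top)
        elif tok == "EXCH":
            depth = int(tokens.pop(0))     # top of the stack is numbered 0
            stack[-1], stack[-1 - depth] = stack[-1 - depth], stack[-1]
        else:
            raise ValueError("unknown token: %r" % (tok,))
    return stack[-1]

hydrogens = {1, 3, 4}                      # stand-in for ATOM * * H*
heavy = evaluate_selection(["ALL", "ENTER", "CLEAR", hydrogens, "EXCL"], 6)
print(sorted(heavy))
```

Starting most selections with CLEAR, as the examples in this section do, makes the result independent of whatever the default selection happens to be.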

Examples of Atom Selections

CLEAR ATOM * * CA                will include all C alphas in the list.
ALL ENTER CLEAR ATOM * * H* EXCL will include all non-hydrogen atoms.
CLEAR RANGE BYNU 1 100           will include atoms number 1 to 100.

CLEAR RANGE MAIN 1 CA MAIN 10 CA ENTER -
CLEAR ATOM * * H ATOM * * N ATOM * * O EXCL
                        will include all the atoms from CA of residue
                        1 to CA of residue 10 in the segment MAIN
                        except atoms H, N, and O.

CLEAR BYNU 1 3 5 7 9 11 13 15 ATOM SOLV * *
                        will include atoms number 1, 3, 5, 7, 9, 11, 13,
                        and 15, and the SOLV segment.

CLEAR ATOM S1 10 * AROUND 8.0 ENTER -
CLEAR ATOM S2 * * ATOM S3 * * AND BYRES
                        selects all atoms which are in residues which
                        have atoms in segments S2 and S3 that are within
                        8.0 A of residue 10 in segment S1.

CLEAR COOR X GE 0 ENTER -
CLEAR COOR Y GE 0 AND ENTER -
CLEAR COOR Z GE 0 AND ENTER -
CLEAR COOR X LE 5 AND ENTER -
CLEAR COOR Y LE 6 AND ENTER -
CLEAR COOR Z LE 7 AND
                        selects those atoms within a rectangular box
                        between the origin and (5,6,7)

All of the atom selections are interpreted using the SELCTA routine in `SELCTA.FLX'. Wildcard interpretation is handled by the EQSTWC routine in `STRING.FLX'.
