Six forms of constraints are available in CONGEN: harmonic atom constraints, harmonic dihedral constraints, fixed atom constraints, Nuclear Overhauser Enhancement (NOE) distance constraints from NMR spectroscopy, dihedral angle constraints based on J coupling constants determined by NMR, and fixed bond and angle constraints (SHAKE).
CONStraint HARMonic FORCE real [PRINT] [REF coor-spec del] [MASS] atom-selection
coor-spec is a specification for the coordinates. See section Syntax of READ Command, for the syntax.
Syntactic ordering: HARMonic must follow CONStraint, and FORCE must follow HARMonic.
The potential energy has a harmonic constraint term which allows one to prevent large motions of individual atoms. The form for this potential is as follows for coordinates:
where refx is a reference set of coordinates. If MASS is specified in the command line, then k is multiplied by the mass of the atom resulting in a natural frequency of oscillation for the constraint of sqrt(k) in AKMA units. An atom constrained with MASS FORCE 1.0 will oscillate at 7.6 cycles/picosecond if free of other interactions.
CONGEN supports a number of operations on the coordinate constraints. The constraint for any atom can be set to any positive value (specified by the FORCE keyword followed by the desired value). The reference coordinates can be the current set at the point when constraints are specified (the default) or a set can be read from a coordinate file (specified by REF and a coor-spec). The force constants and reference coordinates can be read or written as a unit. The PRINT option prints a list of all the current harmonic constraints that are applied to the system after this command has been executed.
It is important to understand some aspects of how the constraints are set in order to get the most flexibility out of this command. When CONGEN is loaded, each atom has associated with it a harmonic force constant initially set to zero. Each call to the CONSTRAINT HARMONIC command changes the value of this constant for only those atoms specified.
The harmonic constraints may be read and written to files. The file name to be specified in the READ and WRITE command is CONS. The files may be read or written only in binary. The PRINT command will also work for constraints. See section Input-Output Commands, for more details. In addition, one may look at the contributions to the energy in detail using the analysis facility, see section The Analysis Facility of CONGEN.
PRINT specifies that a listing of all the atoms currently constrained should be printed out. This is done by segments of constrained atoms, which is concise in most cases. Unfortunately, in the case of IUPAC-specified constraints it is quite verbose.
Using this form of the CONS command, one may put constraints on the dihedral angles formed by sets of any four atoms. The constraints may be set to either a single angle, or to be bounded by two angles (a flat bottom well). The improper torsion potential is used to maintain said angles.
The command for setting the dihedral constraints is as follows:
CONStraint { DIHEdral [BYNUM] dihe-spec FORCE real MINimum real MAXimum real } { CLDH }
Syntactic ordering: DIHEDRAL or CLDH must follow CONSTRAINT, and FORCE and MIN must follow DIHEDRAL.
If BYNUM is specified, then
dihe-spec ::= integer integer integer integer
If BYNUM is not specified, then
dihe-spec ::= atom-spec : atom-spec : atom-spec : atom-spec :
where:
atom-spec ::= segid resid iupac
Note that colons must be used as delimiters following each atom-spec.
DIHEDRAL adds a torsion angle to the list of constrained angles using the specified atoms, force constant, and minimum and maximum dihedral angles. CLDH clears the list of constrained dihedrals so that different angles or new constraint parameters can be specified.
When a range of angles is specified, it is important to keep in mind that torsion angles are periodic. Therefore, the MINIMUM and MAXIMUM bounds are taken literally. Reversing the order of values specified for these variables has the effect of complementing the range of the angular constraints. For example, a specification of MINIMUM 175 MAXIMUM -175 is just 10 degrees, whereas the specification of MINIMUM -175 MAXIMUM 175 is 350 degrees.
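The periodic interpretation of MINIMUM and MAXIMUM can be checked with a short calculation. The following Python sketch is illustrative only and is not part of CONGEN; the function names `allowed_width` and `in_range` are hypothetical. It assumes the allowed range runs from MINIMUM upward through increasing angle until MAXIMUM is reached, wrapping at 360 degrees.

```python
def allowed_width(minimum, maximum):
    """Width in degrees of the allowed range, going upward from minimum to maximum."""
    return (maximum - minimum) % 360.0


def in_range(angle, minimum, maximum):
    """True if angle lies inside the range that starts at minimum and ends at maximum."""
    return (angle - minimum) % 360.0 <= allowed_width(minimum, maximum)


if __name__ == "__main__":
    print(allowed_width(175.0, -175.0))   # narrow range that straddles 180 degrees
    print(allowed_width(-175.0, 175.0))   # the complementary, wide range
    print(in_range(180.0, 175.0, -175.0))
    print(in_range(0.0, 175.0, -175.0))
```

Swapping the two bounds selects the complementary arc, which is why the order of the values matters.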
In order to simplify the specification of constraints, there are a number of defaulting rules used by this command. First, for each atom which follows the first, you can omit segid's, resid's, and iupac names if they match values from the previous atom. Be careful here if any of the identifiers are also used for higher order structures. For example, if a protein has a segid of 1 and a resid of 1, then specifying a single 1 will be interpreted as a segment identifier. In all cases, use the PRINT CONS command to check that you got what you expected.
Next, omitted FORCE values will be taken from the previous constraint if it exists. MINIMUM and MAXIMUM values will also default to previous values if neither option is specified. However, if either one is specified, but the other is missing, then they default to each other, so that the effect is to use a single point harmonic well.
Do not use a value of -9999 as the minimum or maximum dihedral angle, since the program uses this value to indicate that no angle was found on the command line.
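The defaulting rules for FORCE, MINIMUM, and MAXIMUM can be summarized in a few lines of Python. This is a sketch of the rules as described in this section, not CONGEN source code. The function name `resolve_dihedral_defaults`, the use of `None` for an omitted FORCE, and the error raised when there is no previous constraint to inherit from are all assumptions made for illustration; only the -9999 sentinel comes from the text.

```python
NOT_FOUND = -9999.0  # value the manual says marks "no angle found on the command line"


def resolve_dihedral_defaults(force=None, minimum=NOT_FOUND, maximum=NOT_FOUND,
                              previous=None):
    """Return (force, minimum, maximum) after applying the defaulting rules.

    previous is the (force, minimum, maximum) tuple of the preceding
    constraint, or None if there is none.
    """
    if force is None:
        if previous is None:
            raise ValueError("FORCE omitted and no previous constraint exists")
        force = previous[0]

    if minimum == NOT_FOUND and maximum == NOT_FOUND:
        # Neither bound given: inherit both from the previous constraint.
        if previous is None:
            raise ValueError("no angle given and no previous constraint exists")
        minimum, maximum = previous[1], previous[2]
    elif minimum == NOT_FOUND:
        minimum = maximum   # single-point harmonic well
    elif maximum == NOT_FOUND:
        maximum = minimum   # single-point harmonic well

    return force, minimum, maximum


if __name__ == "__main__":
    first = resolve_dihedral_defaults(force=10.0, minimum=60.0)
    second = resolve_dihedral_defaults(previous=first)
    print(first, second)
```

The sketch also shows why -9999 cannot be used as a real bound: it would be indistinguishable from an omitted value.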
These constraints can be coupled to a conformational search. See the option, EPCONS, in section Miscellaneous Global Options.
The PRINT CONS command, see section PRINT -- Writes Information to Output File (Unit 6), will work for constraints. As of now (October 13, 1992), one cannot analyze the contributions of this term using the analysis facility nor can one read or write the description of this term out. Someday ...
CONS FIX atom-selection-spec [PURG] [BOND] [THET] [PHI] [IMPH]
The CONS FIX command sets a flag for each atom (stored in the IMOVE array) which tells the minimization and dynamics algorithms which atoms are free to move. If atoms are fixed, it is possible to save computer time by not calculating energy terms which involve only fixed atoms. The nonbond and hydrogen bond algorithms in CONGEN check IMOVE and delete pairs of atoms that are fixed in place from the nbond and hbond lists respectively. In addition, the PURG or individual energy term options specified with the CONS FIX command allow all or some of the internal coordinate energies associated with fixed atoms to be deleted.
Interactions between fixed and moving atoms are maintained. NOTE: Because some energy terms are deleted from fixed systems, the total energy calculated with fixed atoms will be different from the total energy of the same system with all atoms free. The forces on the movable atoms will, however, be identical.
CONGEN keeps track of fixed atoms with the IMOVE array in the PSF. The IMOVE array is 0 if the atom is free to move, and has some other value if the atom is fixed. WARNING: the use of IMOVE is not yet universal in CONGEN. At present (November 15, 1990), it is supported for dynamics and for all forms of minimization except Newton-Raphson. The vibrational analysis does not yet support it.
NOTE: If you use SHAKE in conjunction with fixed atoms, you should specify the SHAKE command after you have specified the fixed atoms. This prevents SHAKE from compiling a list of bonds from which the CONS command would later delete entries once it has a list of fixed atoms.
If PURGE is specified, every bond, bond angle, torsion angle, or improper torsion involving only fixed atoms is deleted. One can limit this elimination process to one type of interaction by specifying BOND, ANGL, PHI, or IMPH. By playing games with combinations of these commands, one can eliminate whole terms from the energy expression.
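The purge rule (delete an internal coordinate term only when every atom in it is fixed) can be illustrated as follows. This Python sketch is not CONGEN code; `purge_fixed` and the sample data are hypothetical, and the only convention taken from the manual is that an IMOVE value of 0 means the atom is free to move.

```python
def purge_fixed(terms, imove):
    """Drop every term (tuple of atom indices) whose atoms are all fixed.

    imove[i] == 0 means atom i is free; any other value means it is fixed.
    Terms that mix fixed and moving atoms are kept.
    """
    return [term for term in terms if not all(imove[a] != 0 for a in term)]


if __name__ == "__main__":
    imove = [1, 1, 0, 0]                  # atoms 0 and 1 fixed, 2 and 3 free
    bonds = [(0, 1), (1, 2), (2, 3)]
    angles = [(0, 1, 2), (1, 2, 3)]
    print(purge_fixed(bonds, imove))      # the 0-1 bond involves only fixed atoms
    print(purge_fixed(angles, imove))     # both angles contain a free atom
```

Applying the same function to only one list (bonds, angles, torsions, or impropers) corresponds to restricting the purge to a single interaction type.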
Constraints derived from NMR spectroscopy can be incorporated into the CONGEN energy function and used for minimization, simulated annealing using dynamics, or conformational search. The constraints from J-coupling constants can also be used to restrict atom construction in conformational search, see the option, EJCONS, in section Miscellaneous Global Options, for more information.
The user may add either an NOE-derived distance constraint term or a J-coupling constraint term to the usual potential function when calculating the energy and forces of a macromolecular system. There are two parts to specifying NMR constraints: the NMRC command and the constraints file. The NMRC command signals the program that the NMR constraints are to be read in and supplies the parameters associated with the function. The type of constraint function to add is specified in the constraint file, which is discussed in the following sections.
It is also possible to account for conformational flexibility by using ensemble averaging. This facility is still experimental, and you should talk to Bob Bruccoleri or Keith Constantine for more information before using it.
A detailed listing of the NMR constraints can be obtained using the PRINT NMR command, see section PRINT -- Writes Information to Output File (Unit 6).
The NOE constraint function is designed to improve upon a flat-bottomed harmonic constraint function. The problem with harmonic constraints is the high forces placed on atom pairs that are very far from their constraint distance. The potential in CONGEN uses a harmonic potential if the distance is close to correct, or if the distance is less than the lower bound of the constraint. However, at large distance, the constraint force is constant rather than harmonic. The connection between the harmonic section of the potential and the constant force section is made by an inverted harmonic piece. The main advantage of this functional form is that a user can input all constraints at once at the beginning of the run.
The functional form is as follows:
where
The form of this function is given in the following figure where the lower bound is 2 Angstroms, the upper bound is 4 Angstroms, K = 2, SLOPE = 0.5, and FMAX = 4.0. Values for the various distances were computed from these parameters.
NOE constraints can be used in both standard and ensemble-averaged calculations. In a standard calculation, the distance used in the constraint is just the distance derived from the individual structure being refined. In the ensemble average approach, multiple different structures, currently stored as different segments separated in space, are refined simultaneously, with distances for NOE constraints being computed using the following formula:
Each member of the ensemble must have identical atoms and residues as determined by their IUPAC names and residue identifiers.
In cases where the NOE is due to the interaction of motionally averaged or prochiral protons, the interproton distance is calculated over all possible pairings of the two sets of protons. By default, the following expression is used:
where the sum is taken over all possible pairs of atoms involved in the constraint. The constraints are specified using two sets of atom selections, see section Atom Selection, so that any combination of atoms may be specified.
If NOE intensity scaling is done such that the calculated distances should reflect averages instead of sums, the AVERAGE option can be used when specifying the constraint. In this case, the following expression is used:
where N is the number of pairs used in the average. If any atom in the two sets of atoms in the constraint has undefined coordinates, then the interproton distance is omitted from other calculations. Likewise, if all the atoms in the two sets are fixed, see section Fixing Atoms in Place, the distance is ignored.
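The difference between the default (summed) and AVERAGE treatments of multiple proton pairs can be sketched numerically. The expressions themselves are not reproduced in this text, so the following Python fragment assumes the conventional inverse sixth power form used for NOE distances; that exponent, the function name `effective_distance`, and the sample distances are assumptions made for illustration, not values taken from CONGEN.

```python
def effective_distance(pair_distances, average=False, exponent=6.0):
    """Combine pairwise distances into one effective interproton distance.

    Assumed form: r_eff = (S) ** (-1/exponent), where S is the sum of
    r ** -exponent over all pairs, divided by the number of pairs when
    average is True.
    """
    total = sum(r ** -exponent for r in pair_distances)
    if average:
        total /= len(pair_distances)
    return total ** (-1.0 / exponent)


if __name__ == "__main__":
    pairs = [3.0, 3.0, 3.0]   # e.g. one proton against three equivalent protons
    print(effective_distance(pairs))                 # summed: shorter than 3.0
    print(effective_distance(pairs, average=True))   # averaged: equal to 3.0
```

Under this assumption, summing makes the effective distance shorter than any individual pair distance, whereas averaging returns the common value when all pairs are equidistant.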
The J coupling constraint term allows one to incorporate constraints based on scalar J coupling measurements in two different ways, one where all atoms in the J coupling are known, and the other where two measurements are made each involving one proton in a pro-chiral pair, but for which the prochiral assignment is unknown. In this section, the form of the scalar J coupling equation will be discussed first followed by the two forms of the constraint equations.
The J-coupling constraint is based on the Karplus equation.(11)(12)
where
Ensemble averaging can be performed over a set of constraints as long as each constraint belongs to a different segment with the same structure, as described for the NOE ensemble averaging, see section Theory for NOE Constraints. In the case of averaging, J is computed as follows:
where the averages are computed over the members of the ensemble.
There are two ways to incorporate constraints from these scalar coupling constants into the energy function. When all the atoms involved in a scalar coupling constant are known, the following equation can be used:
When one of the atoms involved in the J coupling constant measurement is a prochiral atom and if coupling constants for both prochiral atoms are known but unassigned, then the following form of the equation can be used. In this equation the two J coupling constants are "joined" together, and the constraint function is computed based on relationships involving the sums and magnitudes of the differences of the two J couplings, which obviates the need for stereospecific assignments. In the constraint file, the JOIN command is used to link two measured J constraints together.
In this functional form, the sum term is just a harmonic restraint based on the sum. The difference term is more complex because the sign of the difference depends on the arbitrary choice of chirality on the prochiral center. The difference function is harmonic where the magnitude of the calculated difference is bigger than the experimental difference. If the magnitude of the calculated difference is less, then a piecewise harmonic function with a maximum is used. An additional factor is used for this term to smooth the overall function.
The NMRC command is used to specify the NMR Constraints file along with some of the parameters of the NMR constraint functions.
NMRC [UNIT unit]                      default: 5
     [FMAX real]                      default: 1
     [SLOPe real]                     default: 1
     [NOEWeight real]                 default: 1
     [JWEIght real]                   default: 1
     [KJDIff real]                    default: 0.2
     [ECHO]
     [j-normalization-options]
     [noe-proton-averaging-options]
     [clear-options]
     [title-options]

j-normalization-options ::= [JNORmalize  ]
                            [NOJNormalize]

                                 [SUM      ]
noe-proton-averaging-options ::= [AVErage  ]
                                 [NOAVerage]

title-options ::= [TITLe  ]
                  [NOTItle]

clear-options ::= repeat( CLEAr {Jcoupling} )
                        (       {Noe      } )
The purpose of the keywords is given in the following table:
The NMR Constraint file actually contains the constraints, both NOE and J coupling. Besides the specification of the constraints, the file also contains specifications for the NOE and J coupling weights as well as the segments involved with any ensemble averaging.
The constraint file is a free format file containing a series of commands. Depending on the setting of the TITLE option on the NMRC command, the first lines of the constraint file should contain title lines which end with a * on an otherwise blank line. The file is terminated by either an END command or the physical end of file. You can put the constraint file in line in the CONGEN input file by specifying UNIT 5 on the NMRC command.
The commands are as follows:
TYPE { noe-spec }
     { J-spec   }

                 [ AVErage   ]
noe-spec ::= NOE [ NOAVerage ]
                 [ SUM       ]

           {PHI}
J-spec ::= {J  } [COEF1 real] [COEF2 real] [COEF3 real]
When the NOE type is specified, you can control whether multiple proton pairs in a constraint are summed or averaged. The SUM and NOAVERAGE keywords specify summing; the AVERAGE keyword specifies averaging. This specification takes effect with all succeeding NOE constraint specifications.
The coefficients for the Karplus equation can be changed using the COEFn keyword value pairs. COEF1 is used to change the coefficient on the cos^2 term; COEF2 is used for the cos term, and COEF3 changes the final term. Any changes made to these coefficients will apply to all succeeding J coupling constraints within one invocation of the NMRC command. They will be reset to (6.4, -1.4, and 1.9) at the start of the NMRC command.
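The role of the three coefficients can be illustrated with a direct evaluation of the Karplus relation. The Python sketch below follows the term assignment given above (COEF1 on the cos^2 term, COEF2 on the cos term, COEF3 as the constant) and uses the stated defaults of 6.4, -1.4, and 1.9. The function name `karplus_j` is hypothetical, and the sketch assumes the angle is the plain dihedral angle of the four atoms, in degrees, with no phase offset.

```python
import math


def karplus_j(angle_degrees, coef1=6.4, coef2=-1.4, coef3=1.9):
    """Scalar coupling from a dihedral angle: coef1*cos^2 + coef2*cos + coef3."""
    c = math.cos(math.radians(angle_degrees))
    return coef1 * c * c + coef2 * c + coef3


if __name__ == "__main__":
    for angle in (0.0, 60.0, 90.0, 180.0):
        print(angle, round(karplus_j(angle), 3))
    # Coefficients taken from the first TYPE PHI line of the J-coupling example file.
    print(round(karplus_j(180.0, coef1=9.5, coef2=-1.6, coef3=1.8), 3))
```

With the coefficients from the example file (9.5, -1.6, 1.8), a trans arrangement gives a coupling of 12.9, which matches the bound used for the HA to HB2 constraint in that example.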
WEIGHT [J] [Noe] [real]
The WEIGHT command defines the weights used on succeeding constraints. If type keywords are specified, then only the weights of those constraint types will be changed. If no type keyword is specified, then the weight of the current constraint type will be changed.
A convenient way to change weights during the course of a simulated annealing protocol is to omit weights from the constraint file, and set them prior to reading the constraint file in.
ENSEmble [EXP real] repeat(segid)
The ENSEMBLE command specifies the use of ensemble averaging. Each segment specified by a segment identifier (segid) will be treated as a conformer in the ensemble. Currently, CONGEN does not automatically ignore interactions between conformers in the ensemble, and therefore, you must take steps to separate the conformers and set the non-bonded cutoffs small enough to avoid interactions, see section Generation of Non-bonded Interactions. Every segment in the ensemble must have identical residue and atom identifiers in the same order for this command to work.
Once the ENSEMBLE command is specified, any NOE constraints involving the first segment specified in the ENSEMBLE command are automatically replicated for all the other segments in the ensemble. Further, you may not specify constraints involving the remaining segments.
The EXP keyword is used to set the exponent in the ensemble averaging equation, see section NMR Constraints. The default value is 6.
For NOE constraints:
CONS {LOWEr real} {UPPEr real} atom-selection : atom-selection
For J-coupling constraints:
CONS atom-spec atom-spec atom-spec atom-spec {LOWER real} {UPPER real} [COEF1 real] [COEF2 real] [COEF3 real] atom-spec::= segid resid iupac
The CONS command signals that this line will be a constraint. The syntax of each constraint varies depending on the constraint type as set by the TYPE option, see section Theory for NOE Constraints. In the case of the NOE constraints, you must specify the upper and lower bounds for the distance using the UPPER and LOWER keywords, respectively, and two atom-selections (see section Atom Selection) to specify each half of the constraint pair. Each atom-selection defaults to not selecting any atoms. If each side of the constraint specifies just one atom, then just plain distances are computed. However, if multiple atoms are specified on either side, CONGEN will use summing or averaging to compute the distance. The average will be computed over all possible pairs of atoms between the first and second atom-selections given for each constraint.
The choice of atoms to include for each constraint will depend on your NOE data. If you can assign each proton or atom explicitly, then only single atoms should be used. If there is signal averaging occurring, then use the multiple atom specification to get a more realistic constraint. This choice is independent of whether or not ensemble averaging is subsequently used. Note that ensemble averaging affects how the constraints are interpreted. See the description of ensemble averaging above for more information.
Here are some examples: Suppose that one has an NOE between two protons which have been uniquely assigned. Then, the following command would specify a constraint between two atoms:
CONS ATOM MOL1 2 HA : ATOM MOL1 4 HB1 LOWER 2.0 UPPER 4.0
In this example, there is a distance constraint between the HA of residue 2 in segment MOL1 and the HB1 of residue 4 in segment MOL1 which has a lower bound of 2.0 and an upper bound of 4.0. However, if one wanted to include the sum of all the equivalent beta hydrogens in residue 4, the command line would be as follows:
CONS ATOM MOL1 2 HA : ATOM MOL1 4 HB* LOWER 2.0 UPPER 4.0
For a J-coupling constraint, the syntax is partially order dependent. The keyword argument pairs can be specified anywhere on the command line, but the four atom-specs must be specified in the correct sequence. The four atoms must be the four atoms involved in the measured coupling constant. The LOWER option specifies the lower bound of the measured J coupling, and the UPPER option specifies the upper bound. Coefficients for the Karplus equation can be changed for this constraint, and any changes will be permanent for all succeeding J's.
JOIN
The JOIN command signals that the next two J coupling constraints are to be "joined" together and are to use the second form of the J constraint, as described in section Theory for J Coupling Constraints. It is an error for there to be fewer than two J constraints in the NMR constraint file after the JOIN command is specified.
END
The END command signals the end of the constraint file to be read in.
Example of NOE Constraints
*
WEIGHT 1.0
TYPE NOE
CONS ATOM MOL1 2 HA : ATOM MOL1 4 HB* LOWER 2.0 UPPER 4.0
CONS ATOM MOL1 3 HN : ATOM MOL1 5 HN LOWER 3.0 UPPER 4.4
WEIGHT 0.5
CONS ATOM MOL1 2 HN : ATOM MOL1 10 HN LOWER 3.0 UPPER 4.0
END
Example of J-coupling Constraints around the chi 1 angle of leucine.
*
WEIGHT J 1.0
TYPE PHI COEF1 9.5 COEF2 -1.6 COEF3 1.8
JOIN
CONS 1 1 HA 1 1 CA 1 1 CB 1 1 HB2 LOWER 12.9 UPPER 12.9
CONS 1 1 HA 1 1 CA 1 1 CB 1 1 HB1 LOWER 3.375 UPPER 3.375
TYPE PHI COEF1 7.2 COEF2 -2.0 COEF3 0.6
JOIN
CONS 1 1 C 1 1 CA 1 1 CB 1 1 HB2 LOWER 1.4 UPPER 1.4
CONS 1 1 C 1 1 CA 1 1 CB 1 1 HB1 LOWER 9.8 UPPER 9.8
TYPE PHI COEF1 -3.75 COEF2 0.26 COEF3 -0.54
JOIN
CONS 1 1 N 1 1 CA 1 1 CB 1 1 HB2 LOWER -1.3475 UPPER -1.3475
CONS 1 1 N 1 1 CA 1 1 CB 1 1 HB1 LOWER -1.3475 UPPER -1.3475
END
Further examples may be found in the `JTEST*' and `NOETEST*' test cases, see section CONGEN Test Cases.
SHAKE is a method of fixing bond lengths and, optionally, bond angles during dynamics. The method was brought to CHARMM by Wilfred Van Gunsteren, and is referenced in J. Comp. Phys. 23, 327 (1977). When hydrogens are present in a structure, using SHAKE on the bonds allows a fivefold increase in the step size.
To use SHAKE, one specifies the SHAKE command before any dynamics are run. The SHAKE command has the following syntax:
SHAKE [ BONH [ BOND [ ANGH [ ANGL ]]]]
BONH specifies that all bonds involving hydrogens are to be fixed. BOND specifies all bonds. ANGH specifies that all angles involving hydrogen must be fixed. ANGL specifies that all angles must be shaken. BOND must be specified if angles are fixed.
When the SHAKE command is used, it will check that there are degrees of freedom available for all atoms to satisfy all their constraints. Angles cannot be fixed with SHAKE if one has explicit hydrogen arginines in the structure, as the CZ carbon has too many constraints. This is a general problem for any structure which has too many branches close together.
SHAKE is not recommended for fixing angles. The algorithm converges very slowly in the case where one has three angles centered on a tetravalent atom and the constraints are satisfiable only using out of plane motions.
The use of SHAKE modifies the output of the dynamics command. The number appearing to the right of the step number is the number of iterations SHAKE required to satisfy all the constraints. This number should generally be small.
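To show what the iteration count measures, here is a minimal Python sketch of the SHAKE bond-length correction as it is usually described in the literature. It is not the CONGEN implementation: the function name `shake_bonds`, the tolerance, the sweep limit, and the data layout are all assumptions chosen for illustration. Each sweep visits every constrained bond and moves the two atoms along the bond vector from the previous step until all bond lengths are within tolerance.

```python
def shake_bonds(old, new, bonds, masses, tol=1e-8, max_sweeps=500):
    """Restore constrained bond lengths in `new` (modified in place).

    old, new : lists of [x, y, z] positions before and after an
               unconstrained dynamics step
    bonds    : list of (i, j, target_length) tuples
    masses   : list of atomic masses
    Returns the number of sweeps performed, counting the final sweep
    that found every constraint satisfied.
    """
    for sweep in range(1, max_sweeps + 1):
        converged = True
        for i, j, d in bonds:
            s = [new[i][k] - new[j][k] for k in range(3)]
            diff = d * d - sum(c * c for c in s)
            if abs(diff) <= tol * d * d:
                continue                      # this bond is already satisfied
            converged = False
            r = [old[i][k] - old[j][k] for k in range(3)]
            dot = sum(s[k] * r[k] for k in range(3))
            g = diff / (2.0 * (1.0 / masses[i] + 1.0 / masses[j]) * dot)
            for k in range(3):
                new[i][k] += g * r[k] / masses[i]
                new[j][k] -= g * r[k] / masses[j]
        if converged:
            return sweep
    raise RuntimeError("SHAKE did not converge")


if __name__ == "__main__":
    old = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
    new = [[0.0, 0.0, 0.0], [1.2, 0.0, 0.0]]   # bond stretched by the step
    sweeps = shake_bonds(old, new, [(0, 1, 1.0)], [1.0, 1.0])
    print(sweeps, new)
```

For an isolated bond the correction converges in a handful of sweeps. Coupled constraints, such as several angles sharing one central atom, need more sweeps because correcting one constraint disturbs its neighbours.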
If atoms are fixed rigidly in place, see section Fixing Atoms in Place, then the SHAKE command should follow the CONS command in order to prevent SHAKE from shaking deleted bonds. This will make it run more efficiently.
Each time the SHAKE command is executed, the list of constraints is initialized. Thus, if you wish to eliminate the use of SHAKE, specify a SHAKE command with no arguments.
Many commands in CONGEN operate on a subset of the atoms in the system. The Atom Selection syntax described in this section is generally used to identify the subset.
atom-selection ::= repeat(token)
where token can be:
SHOW
ALL
*
[ ATOM ]
          segid* resid* atomname*
[ CELL ]
RESName resname*
RAMA resname* atomname*
RANGe segid1 resid1 atom1 segid2 resid2 atom2
RANGe BYNUm integer integer
BYNUm repeat(integer)
AROUnd real
                {GE}
            {X} {GT}
COOR [ABS]  {Y} {LE}  real
            {Z} {LT}
                {EQ}
                {NE}
CONT probe cutoff
SURF probe cutoff
CLEAR
BYREs
ENTEr
OR
AND
NOT
EXCL
EXCH int
The selection parser operates like a Hewlett-Packard calculator, performing its operations in Reverse Polish notation with an internal stack. Tokens are parsed and performed from left to right. Operations that select atoms based on the structure simply turn on the flags for the atoms selected; the stack manipulation operators can combine these flags in arbitrary ways. When the selection string is completed, the top of the stack is returned as the selection.
All of the tokens followed by a * are interpreted using wildcard characters as follows:
ALL and * include all the atoms.
ATOM selects atoms by name. Likewise, CELL selects tags by name where the tag substitutes for the atomname in the syntax.
RESNAME selects by residue names (eg, GLY, TRP, etc.)
RAMA selects atoms by IUPAC name and residue name.
RANGE allows for a selection over a range of atoms.
BYNUM allows for a selection by number.
AROUND will add to the selection all the atoms within the distance specified of the atoms currently selected.
COOR will select those atoms whose coordinate component satisfies the given relationship to the number specified.
CONTACT and SURFACE will compute the accessible contact area or surface, respectively, for each atom and will mark all atoms whose value exceeds or equals the cutoff. The probe gives the size of the probe. If zero is used for the probe, then the current default probe size will be used (typically 1.4 Angstrom).
BYRES will include all the atoms in every residue for which at least one atom has already been selected.
CLEAR removes (clears) all the atoms from the current selection.
ENTER pushes the current selection onto the stack, and initializes the top of the stack to the default value.
OR performs a logical OR operation between the top of the stack and next deepest selection, pops the stack twice, and pushes the result onto the stack.
AND does the same thing as OR except that it performs an AND operation.
NOT takes the inverse of the top of the stack.
EXCL is equivalent to NOT AND. In other words, it deletes all the selected atoms from the previous selection on the stack. This operation is provided for simplifying conversions from the previous form of the atom selection syntax.
EXCH i exchanges the top of the stack with the ith deepest element. The top of the stack is numbered 0.
SHOW will print the atoms that are currently included at the point in string parsing where SHOW is encountered. If you want the list that will be returned to the calling routine, make sure SHOW is the last entry in the string.
CLEAR ATOM * * CA will include all C alphas in the list.
ALL ENTER CLEAR ATOM * * H* EXCL will include all non-hydrogen atoms.
CLEAR RANGE BYNU 1 100 will include atoms number 1 to 100.
CLEAR RANGE MAIN 1 CA MAIN 10 CA ENTER - CLEAR ATOM * * H ATOM * * N ATOM * * O EXCL will include all the atoms from CA of residue 1 to CA of residue 10 in the segment MAIN except atoms H, N, and O.
CLEAR BYNU 1 3 5 7 9 11 13 15 ATOM SOLV * * will include atoms number 1, 3, 5, 7, 9, 11, 13, and 15, and the SOLV segment.
CLEAR ATOM S1 10 * AROUND 8.0 ENTER - CLEAR ATOM S2 * * ATOM S3 * * AND BYRES selects all atoms which are in residues which have atoms in segments S2 and S3 that are within 8.0 A of residue 10 in segment S1.
CLEAR COOR X GE 0 ENTER - CLEAR COOR Y GE 0 AND ENTER - CLEAR COOR Z GE 0 AND ENTER - CLEAR COOR X LE 5 AND ENTER - CLEAR COOR Y LE 6 AND ENTER - CLEAR COOR Z LE 7 AND selects those atoms within a rectangular box between the origin and (5,6,7)
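The stack behaviour used in the examples above can be traced with a small evaluator. The Python sketch below is an illustration, not the SELCTA routine: it handles only ALL, CLEAR, ATOM, BYRES, ENTER, NOT, OR, AND, EXCL, and EXCH, it assumes that the default selection placed on the stack is empty, and it uses Python's `fnmatch` wildcards as a stand-in for CONGEN's own wildcard rules. The function name `select` and the sample atom list are hypothetical.

```python
from fnmatch import fnmatchcase


def select(atoms, tokens):
    """Evaluate a simplified selection; atoms is a list of (segid, resid, name)."""
    everything = set(range(len(atoms)))
    stack = [set()]                 # assumed default: nothing selected
    i = 0
    while i < len(tokens):
        tok = tokens[i].upper()
        i += 1
        if tok in ("ALL", "*"):
            stack[-1] |= everything
        elif tok == "CLEAR":
            stack[-1] = set()
        elif tok == "ATOM":
            seg, res, name = tokens[i:i + 3]
            i += 3
            stack[-1] |= {k for k, (s, r, n) in enumerate(atoms)
                          if fnmatchcase(s, seg) and fnmatchcase(str(r), res)
                          and fnmatchcase(n, name)}
        elif tok == "BYRES":
            residues = {atoms[k][:2] for k in stack[-1]}
            stack[-1] = {k for k, a in enumerate(atoms) if a[:2] in residues}
        elif tok == "ENTER":
            stack.append(set())     # keep the old selection one level deeper
        elif tok == "NOT":
            stack[-1] = everything - stack[-1]
        elif tok in ("OR", "AND", "EXCL"):
            top = stack.pop()
            below = stack.pop()
            if tok == "OR":
                stack.append(below | top)
            elif tok == "AND":
                stack.append(below & top)
            else:                   # EXCL: NOT followed by AND
                stack.append(below - top)
        elif tok == "EXCH":
            depth = int(tokens[i])
            i += 1
            stack[-1], stack[-1 - depth] = stack[-1 - depth], stack[-1]
        else:
            raise ValueError("unsupported token: " + tok)
    return sorted(stack[-1])


if __name__ == "__main__":
    atoms = [("MAIN", 1, "N"), ("MAIN", 1, "H"), ("MAIN", 1, "CA"),
             ("MAIN", 1, "HA"), ("MAIN", 2, "CA")]
    print(select(atoms, "ALL ENTER CLEAR ATOM * * H* EXCL".split()))
    print(select(atoms, "CLEAR ATOM * * CA".split()))
```

Tracing the first call: ALL selects every atom, ENTER saves that selection and starts a fresh one, ATOM * * H* marks the hydrogens, and EXCL removes them from the saved selection.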
All of the atom selections are interpreted using the SELCTA routine in `SELCTA.FLX'. Wildcard interpretation is handled by the EQSTWC routine in `STRING.FLX'.