
Structure Generation and Manipulation (The PSF)

The commands described in this node are used to construct and manipulate the PSF, the central data structure in CONGEN. The PSF consists of lists giving every bond, bond angle, torsion angle, and improper torsion angle, as well as the information needed to generate the hydrogen bonds and the non-bonded list. It is essential for the calculation of the energy of the system. A separate data structure deals with symmetric images of the atoms (see section Symmetry and Molecular Images).

There exists one other command for manipulating the PSF, the SPLICE command, see section SPLICE -- Change the Sequence of the PSF. The SPLICE command can change the sequence of the PSF and shuffle the coordinates so that the CONGEN command, see section Conformational Search, can be used to find conformations for the new amino acids.

Commands that generate and manipulate the PSF must be given in a particular order. First, segments in the PSF must be generated one at a time. Prior to generating any segments, one must first have read a residue topology file (RTF), see section Residue Topology Files. To generate one segment, one must first read in a sequence using the READ command, see section Specifying a Sequence of Residues for a Segment. Then, the GENERATE command must be given. RTFs may be changed as needed; however, the SPLICE command will work correctly only when a single RTF is used to build the entire structure.

Once a segment is generated, it may be manipulated. It may be edited using the EDIT command. Cystine bridges may be added using the DISULFIDE command. A histidine heme crosslink may be added using the PATCH HEME command.

The Generate Command - Construct a Segment of the PSF

Syntax

GENErate [segid] [NBXMod int] [rtf-type] [CYCLic]

         [[NO]ANGLes] [NOTORSions      ] [[NO]DONOrs] [[NO]ACCEptors]
                      [TORSions {ALL}]
                      [         {ONE}]

             { PROT }
             { HPRO }
             { ALLH }
rtf-type ::= { DNA  }
             { A94N }
             { A94P }
             { AM94 }

Function

Using the sequence of residues specified in the last READ SEQUENCE command and the information stored in the residue topology file, this command generates the next segment in the PSF. Each segment contains a list of all the bonds, angles, dihedral angles, and improper torsions needed to calculate the energy. It also assigns charges to all the atoms, sets up the nonbonded exclusions list, and specifies hydrogen bond donors and acceptors. If a special type of segment has been specified in the READ SEQUENCE command or by the rtf-type option, modifications for structural features not contained in the residue topology file, for example terminal group modifications and proline modifications, are made automatically. The CYCLIC option controls whether a cyclic structure is built. Cyclic structures are made by omitting any terminal residues, and wrapping references to atoms beyond each end of the segment back to the other end.

The processing of terminal groups varies depending on the rtf-type setting. If a CONGEN topology file is read (rtf-type equals PROT, HPRO, ALLH, or DNA), then extra residues, like NTER and CTER, are added to the sequence. This has the undesirable side effect of putting residues into your sequence that you did not specify, which confuses the residue numbering.

If the AMBER 94 potential is used, a different scheme is used. Here, the topology file, section AMBER94RTF, contains special terminal residues, which have different atoms and charges. The GENERATE command will translate the terminal residues to the names in the topology file, generate the segment, and then translate the names back. The following example table illustrates the naming conventions used for alanine.

ALA
Regular alanine.
NALA
N terminal alanine.
CALA
C terminal alanine.
DALA
D alanine.
MALA
N terminal D alanine.
BALA
C terminal D alanine.
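
The name translation described above can be sketched in Python as follows. This is only an illustration, not CONGEN's code: the function names are hypothetical, the table contains only the alanine names listed above, and leaving single-residue segments untouched is an assumption, since that case is not described here.

```python
# Terminal names for alanine, taken from the table above:
# base name -> (N terminal name, C terminal name).
TERMINAL_NAMES = {
    "ALA": ("NALA", "CALA"),
    "DALA": ("MALA", "BALA"),
}


def to_terminal_names(sequence):
    """Rename the first and last residues to their terminal variants."""
    seq = list(sequence)
    if len(seq) < 2:
        return seq  # assumption: single-residue segments are left alone
    if seq[0] in TERMINAL_NAMES:
        seq[0] = TERMINAL_NAMES[seq[0]][0]
    if seq[-1] in TERMINAL_NAMES:
        seq[-1] = TERMINAL_NAMES[seq[-1]][1]
    return seq


def from_terminal_names(sequence):
    """Translate terminal variants back to the names the user supplied."""
    back = {}
    for base, (nterm, cterm) in TERMINAL_NAMES.items():
        back[nterm] = base
        back[cterm] = base
    return [back.get(res, res) for res in sequence]


translated = to_terminal_names(["ALA", "ALA", "DALA"])
restored = from_terminal_names(translated)
```

Here `translated` holds the names looked up in the topology file, and `restored` is the sequence as it would appear in the PSF after the names are translated back.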

The A94P keyword specifies that an AMBER 94 protein sequence is to be generated, and the A94N keyword specifies a nucleic acid sequence is to be generated.

The GENERATE command is capable of automatically generating some of the information needed to compute the energies from other sources within the PSF. In the case of the AMBER potential, CONGEN sets these options on by default. In addition, the values of these switches are saved in the PSF (in the NICTOT array), so that they can be used by the SPLICE command, see section SPLICE -- Change the Sequence of the PSF. The automatic generation options have the following interpretations:

ANGLES
NOANGLES
Specifies that bond angles are to be generated from the bond list. Every pair of bonds joined by a common atom will be added to the angle list, and all duplicates removed. The keyword, NOANGLES, turns this option off.

TORSIONS
NOTORSIONS
Specifies that torsion angles are to be generated from the bond list. CONGEN loops over all bonds in the system and, for each bond, examines all atoms bound to the two atoms in that bond. If the keyword ONE is specified, the program will add a torsion for the first pair of bound atoms which are both non-hydrogen. If no pair of heavy atoms can be found, then the program will select the first pair of bound atoms. If the keyword ALL is specified, then all sets of four atoms connected by three bonds in tandem will be added as torsions. Note that this code detects three membered rings and will ignore them. In addition, you must specify either ONE or ALL if you specify TORSIONS. The keyword, NOTORSIONS, turns off this option.

DONORS
ACCEPTORS
Specifies that hydrogen bond donors and acceptors be added automatically. This depends critically on the atom type specifications in the residue topology file, see section RTF Mass Command. The keywords, NODONORS and NOACCEPTORS, turn off these options, respectively.

NBXMOD
The NBXMOD option controls the automatic generation of non-bonded exclusions, see section NBXMOD -- Automatic Generation of Non-bonded Exclusions, for more details.
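
The ANGLES and TORSIONS options described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not CONGEN's code: atoms are plain integers, bonds are listed once each, and `is_hydrogen` is a hypothetical predicate supplied by the caller.

```python
from itertools import combinations


def neighbors(bonds):
    """Map each atom to the list of atoms bonded to it, in bond-list order."""
    adj = {}
    for a, b in bonds:
        adj.setdefault(a, []).append(b)
        adj.setdefault(b, []).append(a)
    return adj


def generate_angles(bonds):
    """Each pair of bonds sharing atom j gives one angle (i, j, k)."""
    angles = set()
    for j, nbrs in neighbors(bonds).items():
        # Taking sorted pairs (i < k) removes duplicate angles.
        for i, k in combinations(sorted(set(nbrs)), 2):
            angles.add((i, j, k))
    return sorted(angles)


def generate_torsions(bonds, mode="ALL", is_hydrogen=lambda atom: False):
    """Loop over bonds j-k and pick outer atoms i (on j) and l (on k)."""
    if mode not in ("ONE", "ALL"):
        raise ValueError("TORSIONS requires ONE or ALL")
    adj = neighbors(bonds)
    torsions = []
    for j, k in bonds:
        # l == i would close a three membered ring, so such pairs are skipped.
        pairs = [(i, l) for i in adj[j] if i != k
                 for l in adj[k] if l != j and l != i]
        if not pairs:
            continue
        if mode == "ALL":
            torsions.extend((i, j, k, l) for i, l in pairs)
        else:
            heavy = [(i, l) for i, l in pairs
                     if not is_hydrogen(i) and not is_hydrogen(l)]
            i, l = (heavy or pairs)[0]  # fall back to the first pair
            torsions.append((i, j, k, l))
    return torsions
```

For a four-atom chain, `generate_torsions([(1, 2), (2, 3), (3, 4)])` yields the single torsion `(1, 2, 3, 4)`, while a three membered ring yields none.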

The actual generation process proceeds in four phases. First, all the atoms specified by the residues are added to the PSF. Next, all the terms are added, and all linkage references can be correctly handled, see section Linkage Atom Naming. Next, the automatic generation operations are performed. Finally, the patches are performed.

NBXMOD -- Automatic Generation of Non-bonded Exclusions

Some pairs of atoms are excluded from the non-bonded list because their interactions are described by other terms in the Hamiltonian. By default, directly bonded atoms and the 1-3 atoms of an angle are excluded from the non-bonded calculation. In addition, the diagonal interactions of the six membered rings in tyrosine and phenylalanine and ring atom interactions in tryptophan are excluded in the current topology files. Hydrogen bonds and dihedral 1-4 interactions are not excluded (note that other workers may differ from us on one or both of these points).

The list of non-bonded exclusions is generated in two steps. First, a preliminary list is made at generation by GENIC using any information that may be present in the topology file (for example, diagonal interactions in rings). The second step is an automatic compilation of all the bond and angle interactions, followed by a sorting of the list, performed in MAKINB. The list is stored in the linked list pair IBLO/INB, where IBLO(i) points to the last exclusion in INB for atom i. If the list is modified after MAKINB, then either MAKINB should be called again to re-sort the list, or care must be taken to see that the INB list is ascending, that all INB entries for atom i have higher atom numbers than i, and that all atoms have at least one INB entry.
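
The IBLO/INB layout described above can be sketched in Python as follows. This is an illustration only; the function names are hypothetical, and the use of 0 as the placeholder entry for an atom with no exclusions is an assumption (the text says only that every atom needs at least one INB entry).

```python
def build_iblo_inb(natom, exclusions):
    """Pack exclusion pairs into IBLO/INB-style arrays (1-based atoms)."""
    per_atom = {i: set() for i in range(1, natom + 1)}
    for a, b in exclusions:
        lo, hi = min(a, b), max(a, b)
        if lo != hi:
            per_atom[lo].add(hi)  # store each pair on the lower atom only
    iblo, inb = [], []
    for i in range(1, natom + 1):
        # Ascending entries; 0 is an ASSUMED placeholder for "no exclusions".
        entries = sorted(per_atom[i]) or [0]
        inb.extend(entries)
        iblo.append(len(inb))  # 1-based position of atom i's last entry
    return iblo, inb


def exclusions_of(i, iblo, inb):
    """Return the atoms excluded from atom i (all numbered above i)."""
    start = iblo[i - 2] if i > 1 else 0
    return [j for j in inb[start:iblo[i - 1]] if j != 0]


iblo, inb = build_iblo_inb(4, [(1, 2), (2, 3), (3, 1), (3, 4)])
```

The entries for atom i are the slice of INB that ends at IBLO(i) and begins just after IBLO(i-1), which is why the list must stay sorted and complete after any manual modification.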

MAKINB is called by default after any operation which changes internal coordinates such as generate, patch, edit, or splice.

The default list can be modified in three ways. First, interactions that are to be excluded can be placed in the topology file. Second, the NBXMOD option can be specified as a qualifier to any of the commands which change internal coordinates. Its values and actions are:

0
use only the exclusions in the topology file. This option is not recommended, because there is no check to ensure that the non-bonded exclusions for an atom are defined only above each atom. MAKINB will correct this automatically.
1 or -1
include bond interactions automatically.
2 or -2
also include 1-3 interactions automatically.
3 or -3
also include 1-4 interactions automatically.

Note that the 1-3 and 1-4 interactions are determined from examination of the bond list regardless of any torsions or improper torsions which are defined.

Negative values suppress the use of the information present in the topology file. Positive values add to the information that was in the topology file. If NBXMOD is not specified for a command, it defaults to 2.
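
The effect of the NBXMOD values can be sketched in Python as follows. This is a hedged illustration rather than the MAKINB algorithm: `make_exclusions` and its arguments are hypothetical names, atoms are plain integers, and the 1-3 and 1-4 pairs are found by walking the bond list only, as described above.

```python
def make_exclusions(bonds, rtf_exclusions=(), nbxmod=2):
    """Return the set of excluded atom pairs (i, j) with i < j."""
    level = abs(nbxmod)
    if level > 3:
        raise ValueError("NBXMOD must be between -3 and 3")

    def pair(i, j):
        return (min(i, j), max(i, j))

    adj = {}
    for a, b in bonds:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)

    excluded = set()
    if nbxmod >= 0:
        # Zero and positive values keep the topology file exclusions.
        excluded.update(pair(i, j) for i, j in rtf_exclusions)

    for start in adj:
        frontier = {start}
        # One step per level: bonds, then 1-3 pairs, then 1-4 pairs.
        for _ in range(level):
            frontier = {n for atom in frontier for n in adj[atom]}
            excluded.update(pair(start, n) for n in frontier if n != start)
    return excluded


chain = [(1, 2), (2, 3), (3, 4), (4, 5)]
default_list = make_exclusions(chain, rtf_exclusions=[(1, 5)])
```

With the default value of 2, `default_list` contains the bonded pairs, the 1-3 pairs, and the pair (1, 5) taken from the hypothetical topology file entry; with -2 the last pair would be dropped.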

The third way to change exclusions is the use of the EDIT command, see section EDIT -- Edits the PSF and Hydrogen Bonds.

DISULFIDE -- Creates Internal Coordinates for Disulfide Bridges

Syntax

DISUlfide [NBXMod int]

NCYST      (I5)

IRES,JRES  (2I5)  repeated NCYST times

This command requires formatted (not free field) input following the DISULFIDE line.

Function

The first line following the command gives the number of cystine cross bridges to be made; the lines following give the residue numbers (not residue identifiers) of the cysteines to be linked. The residue number of a residue is its position in the list of all residues in the structure, including any special termini which have been added. The linkage process involves adding bonds, bond angles, torsion angles, and non-bonded exclusions for the additional bond. The NBXMOD option controls the automatic generation of non-bonded exclusions, see section NBXMOD -- Automatic Generation of Non-bonded Exclusions, for more details.
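
Because the input following the DISULFIDE line is formatted, a small generator can help avoid column errors. The Python sketch below is an illustration only; `disulfide_input` is a hypothetical helper, the residue numbers are made up, and it assumes the command line itself is written as plain `DISULFIDE` with the optional NBXMOD qualifier.

```python
def disulfide_input(pairs, nbxmod=None):
    """Build the DISULFIDE command and its fixed-format data lines."""
    for number in (n for pair in pairs for n in pair):
        if not 0 < number <= 99999:
            raise ValueError("residue number does not fit in an I5 field")
    header = "DISULFIDE" if nbxmod is None else f"DISULFIDE NBXMOD {nbxmod}"
    lines = [header, f"{len(pairs):5d}"]              # NCYST      (I5)
    lines += [f"{i:5d}{j:5d}" for i, j in pairs]      # IRES,JRES  (2I5)
    return "\n".join(lines)


# Hypothetical residue numbers, for illustration only.
text = disulfide_input([(6, 127), (30, 115)])
```

Each number is right-justified in a five-column field, which is what the (I5) and (2I5) formats expect.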

An attempt has been made to ease the burden of going from residue identifiers to residue numbers which will be different for segments which have N-terminal residues added. Whenever the type of segment as specified in the READ SEQUENCE command is CHARMM explicit hydrogen or all hydrogen, the residue numbers will first be increased by one. If the two residues given are not cysteines, the residue numbers will be decremented by one, and the attempt repeated. If this fails, the command will die. Note that this does not help you if you have more than one segment in your structure.
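
The renumbering attempt described above can be sketched in Python as follows. This is an illustration under stated assumptions: `resolve_cystine_pair` is a hypothetical name, the residue name `CYS` is an assumed cysteine name, and checking the unshifted numbers for segments without added termini is also an assumption.

```python
def resolve_cystine_pair(resnames, ires, jres, shifted_segment,
                         cys_names=("CYS",)):
    """Return the residue numbers actually linked, or raise on failure.

    resnames        -- residue names in PSF order (position 1 first)
    shifted_segment -- True when an N-terminal residue was added
    cys_names       -- ASSUMED names that count as cysteine
    """
    def both_cysteine(i, j):
        return all(1 <= n <= len(resnames) and resnames[n - 1] in cys_names
                   for n in (i, j))

    if shifted_segment:
        # First attempt: numbers increased by one.
        if both_cysteine(ires + 1, jres + 1):
            return ires + 1, jres + 1
    # Second attempt: the numbers decremented back to what was given.
    if both_cysteine(ires, jres):
        return ires, jres
    raise ValueError("residues are not cysteines; the command fails")


psf = ["NTER", "GLY", "CYS", "ALA", "CYS", "CTER"]
from_identifiers = resolve_cystine_pair(psf, 2, 4, shifted_segment=True)
from_numbers = resolve_cystine_pair(psf, 3, 5, shifted_segment=True)
```

Both calls resolve to residues 3 and 5, showing how either the identifiers from the original sequence or the true residue numbers are accepted for a single segment.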

When using disulfides, it is important that the sequence reference a cysteine residue which is intended to be joined. In an all atom topology file, see section AMBER94RTF, there are two cysteine residues, one of which has a thiol and one of which has a sulfur.

It is not possible to bridge an atom in the primary space with that of a symmetric image using this command (see section Symmetry and Molecular Images).

PATCH -- Patches Special Structures

Syntax

PATCh HEME [NBXMod int]
histidine-heme-spec

         or

PATCh LIGA [NBXMod int]
carbonmonoxide-heme-spec

Function

PATCH HEME is used to patch the ligation of a histidine to a heme residue. The histidine-heme-spec is a pair of integers, IRES and JRES, read by a format of 2I5. They are the residue numbers of the histidine and heme, respectively. The bond is formed between the NE2 of the histidine and the FE of the heme. For each bond formed, additional bond angles, torsion angles, and non-bonded exclusions are added.

PATCH LIGAND is used to patch the ligation of a carbon monoxide to a heme. The carbonmonoxide-heme-spec consists of three integers, read by a format of 3I5; again, the input is not free format. The three numbers refer to the histidine, heme, and CO residue numbers, respectively. This patch takes care of the carbon monoxide heme bond. It should be called after PATCH HEME is called.

The NBXMOD option controls the automatic generation of non-bonded exclusions, see section NBXMOD -- Automatic Generation of Non-bonded Exclusions, for more details.
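
The two patch specifications can be generated with a short Python sketch. It is illustrative only; `patch_heme_input` is a hypothetical helper and the residue numbers are made up. It writes PATCH HEME first and PATCH LIGA second, following the order stated above.

```python
def patch_heme_input(his, heme, co=None):
    """Build PATCH HEME, and optionally PATCH LIGA, with fixed-format specs."""
    lines = ["PATCH HEME", f"{his:5d}{heme:5d}"]                   # (2I5)
    if co is not None:
        lines += ["PATCH LIGA", f"{his:5d}{heme:5d}{co:5d}"]       # (3I5)
    return "\n".join(lines)


# Hypothetical residue numbers, for illustration only.
text = patch_heme_input(his=93, heme=154, co=155)
```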

WARNING: This code has been tested only with the extended atom topology file. It may not work for current editions of the other amino acid topology files, and it will not work if there is a proton placed on the NE2 of the histidine.

It is not possible to patch any interaction involving atoms of a symmetric image using these commands (see section Symmetry and Molecular Images).

EDIT -- Edits the PSF and Hydrogen Bonds

Syntax

EDIT [NBXMod int]

edit-commands

edit-commands are described below. They must be terminated with an END command. The edit-commands are not free field.

Function

EDIT is used to edit the PSF and also the hydrogen bond list without explicit hydrogens. The following operations are possible: any bond, bond angle, torsion angle, improper torsion angle, hydrogen bond donor, hydrogen bond acceptor, non-bonded exclusion, or hydrogen bond may be deleted or added. In addition, the parameter type code, charge, and IUPAC name of any atom may be changed.

The operations with hydrogen bonds and with hydrogen bond donors and acceptors are obsolescent as one cannot add the proton to any new hydrogen bonds, nor can one add the various antecedents to the hydrogen bond donor. At some point, this will be fixed.

The edit-commands are all fixed format commands. Each command except the END command consists of three parts. First, one specifies an alphabetic command using words read with a 2(A4,6X) format. The following line consists of a single integer in (I5) format giving the number of changes. Finally, that number of lines follows, where each line specifies one change.

The END command consists of the word END in the first column of a line with nothing else on the line. This terminates the EDIT command.

The NBXMOD option controls the automatic generation of non-bonded exclusions, see section NBXMOD -- Automatic Generation of Non-bonded Exclusions, for more details.

ADD and DELETE -- Modify the Structure Arrays

Syntax

{ ADD    } { BOND     }     (2A10) Fixed format
{ DELEte } { THETa    }
           { PHI      }
           { IMPHi    }
           { HBONd    }
           { DONOr    }
           { ACCEptor }
           { NONBond  }

 NCHANG   (I5)

 NRESI,ATOMI,NRESJ,ATOMJ,NRESK,ATOMK,NRESL,ATOML  (4(I5,1X,A4))
    repeated NCHANG times

Function

Most of the keywords are self explanatory. In the case of NONBOND, the non-bonded exclusions list is changed, not the actual list of non-bonded interactions. NCHANG is the number of elements to be added or deleted. NRESI specifies the residue number of the first atom, and ATOMI specifies its name. For example, deleting a peptide bond between the fourth and fifth residue would be specified as

        4 C       5 N

If a number of different internal coordinates are to be changed, separate ADD or DELE commands must be used. They can appear in any order. One must specify only as many atoms as there are in the interaction. For hydrogen bonds, only two atoms may be specified even though the proton may be explicitly represented.
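
The fixed-format layout of an ADD or DELETE command can be sketched in Python as follows. This is an illustration only; `edit_change_block` and `edit_input` are hypothetical helpers, and the sketch does not check that the number of atoms matches the kind of interaction.

```python
def edit_change_block(action, kind, changes):
    """Return the lines for one ADD or DELETE command.

    changes -- list of changes, each a list of (residue number, atom name)
    """
    # Two words, each in a ten-column field; only four characters are read.
    lines = [f"{action[:4].upper():<10}{kind[:4].upper():<10}",
             f"{len(changes):5d}"]                          # NCHANG (I5)
    for atoms in changes:
        if not 1 <= len(atoms) <= 4:
            raise ValueError("a change names between one and four atoms")
        # Each atom occupies (I5,1X,A4): number, blank, left-justified name.
        lines.append("".join(f"{res:5d} {name:<4.4}" for res, name in atoms))
    return lines


def edit_input(blocks, nbxmod=None):
    """Wrap command blocks in EDIT ... END."""
    header = "EDIT" if nbxmod is None else f"EDIT NBXMOD {nbxmod}"
    body = [line for block in blocks for line in block]
    return "\n".join([header] + body + ["END"])


# Delete the peptide bond between residues 4 and 5, as in the example above.
block = edit_change_block("DELETE", "BOND", [[(4, "C"), (5, "N")]])
text = edit_input([block])
```

Padding the atom names to four columns keeps each following residue number aligned in its own I5 field.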

WARNING: This routine does not make the correct checks if you make a mistake. If you specify an atom incorrectly, or if the interaction you wish to delete is not there, the results are unpredictable. At some point, this may be fixed.

MODIFY -- Changes the Attributes of an Atom

Syntax

MODIfy          (A4)    Fixed format

NCHANG  (I5)

NREST,ATOMT,ICODE,ANAME,CHARG  (I5,1X,A4,I5,1X,A4,F10.3)
   repeated NCHANG times

Function

NCHANG is the number of atoms to be modified. A card with the changed values must be read in for each atom to be changed (in any order). NREST and ATOMT are the residue number and IUPAC name of the atom to be changed. ICODE, ANAME and CHARG are the new chemical type code (IAC array entry), new IUPAC name, and new atomic charge given to the atom. If a field is left blank, the old value is retained. Editing of this type could be used, for example, to change the carbonyl oxygen of a terminal residue to an atom with attributes corresponding to a carboxyl oxygen (in the standard execution of the program, modifications like this are done automatically if the proper type is specified with the sequence).
