Input-Output Commands

The commands described here are used for reading and writing data structures used in the main part of CONGEN. Some of the data structures used in the analysis facility may also be read and written, see section Reading and Writing Analysis Data.

READ -- Reads Data from External Sources

This command reads data into the data structures from external sources. The external sources can be either text files (card images for the ancient among us) or binary files. The Fortran unit number from which the information is read is specified with the unit-spec. Specifying UNIT 5 indicates that the data to be read follows the command in the input stream.

The precise format of all these files is described only in the source code as that serves as the only definitive, accurate, and up to date description of these formats. The description of the data structures provides pointers to the subroutines which should be consulted, see section Data Structures.

Syntax of READ Command

READ { {RTF [PRINT]         } { FILE unit-spec   } }
     { {PARAmeter parm-opts } { CARD [unit-spec] } }
     { {IC  [APPEnd]        } {       'UNIT 5'   } }
     {                                             }
     { {SEQUence} seq-spec                         }
     { {RESIdue }                                  }
     {                                             }
     { {HBONd     } [FILE] unit-spec               }
     { {PSF       }                                }
     { {CONStraint}                                }
     { {NBONd     }                                }
     {                                             }
     { { IMAGes     } [CARD] unit-spec             }
     {                                             }
     {                      [ MAIN ]               }
     { COORdinate coor-spec [ COMP ]               }
     {                      [ DIFF ]               }

unit-spec ::= UNIT unit-number

parm-opts ::= [NOFAIL] [VERSION int]

             [ST2  int                                              ]
seq-spec ::= [WATEr int                                             ]
             [[source-spec] [unit-spec] [rtf-type] [abbrev-spec] seq-opts]

             
seq-opts ::= [BYATom] [IDREad] [MODEl modelid]
            

                { CARDs                         }
source-spec ::= { brookhaven-name [CHAIN segid]}

                    { BROOkhaven }
brookhaven-name ::= { BRKHvn     }
                    { TAPE       }

             { PROT }
             { HPRO }
             { ALLH }
rtf-type ::= { DNA  }
             { A94N }
             { A94P }

                        {AA }
abbrev-spec ::= [ABBREV {DNA}]
                        {RNA}

coor-spec ::= { FILE unit-spec [IFILE int]            } coor-option
              { CARD [unit-spec] [OFFS  int ]         }
              { IGNO [unit-spec]                      }
              { KONN [unit-spec]                      }
              {                                       }
              { { BROOkhaven } unit-spec [  SEQUence] } brk-option
              { { BRKHvn     }           [NOSEQUence] } 

coor-option ::=  [APPEnd] [INITial] [EXPAnd] [abbrev-spec] atom-selection

brk-option ::=  [CHAIN segid] [IDREad] [MODEl modelid] [ALTErnate identifier]

Syntactic ordering: The second field must be specified as shown.

Specifying a Sequence of Residues for a Segment

The specification of SEQUence or RESIdue causes the program to accept a sequence of residue names to be used to generate the next segment in the molecule. There are four sources of sequence information. The first source is a CONGEN format sequence file which has the following syntax:

title
number-of-residues repeat(residue-names)

The form of the title is defined in the syntactic glossary, see section Glossary of Syntactic Terms. The number of residues is specified on the line following the title in free field format. If the number of residues you specify is less than zero, CONGEN will read residues until it encounters a blank line or end of file. If the number is greater than zero, it will also stop once it has read at least as many residues as you've specified. If the number you specify is zero, you will get a warning message as one common error is to forget the number entirely. In this case, the first residue name will be consumed as the number and converted to zero.

The residue names are specified as separate words, each no longer than 4 characters, on as many lines as are required for all the residues. This sequence may be placed immediately following the READ command if the unit number is 5 or may be placed in a separate file.
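The stopping rules for the residue count can be sketched in Python as follows. This is only an illustration of the rules described above, not CONGEN's reader: the function name `read_sequence` is hypothetical, the title is assumed to have been consumed already, and the behavior after a zero count (reading on as if the count were negative) is an assumption.

```python
import warnings

def read_sequence(lines):
    """Read the residue count and residue names that follow the title."""
    it = iter(lines)
    header = next(it).split()
    try:
        count = int(header[0]) if header else 0
    except ValueError:
        # A residue name in the count position is consumed and becomes zero.
        count = 0
    if count == 0:
        warnings.warn("residue count is zero; was the count forgotten?")
    residues = header[1:]
    for line in it:
        # A positive count stops reading once enough residues are in hand.
        if count > 0 and len(residues) >= count:
            break
        words = line.split()
        if not words:
            break  # a blank line ends the sequence
        residues.extend(words)
    return count, residues
```

Because whole lines are read, a positive count can yield more residues than requested, which is how the sketch interprets "at least as many".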

The second source of sequences is a CONGEN coordinate file in CARD format. Currently, the BYATom option reads all residues within the file for inclusion in the sequence.

The third source of sequence information is a Brookhaven Protein Data Bank file. The BROOKHAVEN, BRKHVN, and TAPE options allow the sequence to be read from the SEQRES records in a Brookhaven Protein Data Bank coordinate file. (TAPE is used because the Brookhaven Protein Data Bank used to come on a tape.) If the CHAIN option is specified, then only the sequence of the chain with the specified segid is read. Otherwise, the sequence of all the chains will be read together. Note that the Brookhaven format only allows single letter chain names, so your segid should only have one character.

Alternatively, the sequence may be read directly from the ATOM records by using the BYATOM option. Under the BYATOM option, if there are insertions or deletions within the list of residue identifiers, the IDREAD option will read the sequence identifiers, including insertion codes, directly from the Brookhaven file, rather than automatically generating a residue number based on sequential order. Note that the IDREAD option currently conflicts with the DISULPHIDE command, since that command assumes that the residue identifiers are those generated automatically. The MODEL option may also be used in conjunction with BYATOM to read the sequence from a particular model number in the file. If it is not specified, the first model in the Brookhaven file is used.

The final source of sequences is the pair of water options. The WATEr option allows a sequence of water molecules to be specified. The integer which follows the keyword gives the number of waters. Likewise, the ST2 option allows ST2 waters to be specified. In both cases, no sequence on separate lines need be given. For CONGEN topology files, a residue named OH2 (or ST2) must be present. For an AMBER94 topology file, a residue named HOH must be present. If these residues are missing, the GENErate command called afterwards will fail.

When reading is complete, CONGEN will list all the residues it has read.

The options PROT, HPRO, ALLH, and DNA specify what type of CHARMM potential file is being read. They are very important because they specify which patching operations are to take place on the segment once it is generated. The patching operations involve correcting the linkage of prolines, and correcting the charges and chemical types of the ends of the segment. PROT signifies that an extended atom residue topology file is the source of residues. HPRO signifies that an explicit hydrogen topology file is being used. ALLH signifies that an all hydrogen topology file is being used. DNA specifies that the DNA topology file is being used.

In addition, these options may cause additional residues to be added to the sequence. These additional residues serve to terminate the segment. However, if the segment is generated cyclically (see section The Generate Command - Construct a Segment of the PSF), then no termini will be added. In particular, PROT will add a CTER residue that has the C-terminal oxygen. HPRO and ALLH will add a CTER residue along with an NTER residue that holds two additional hydrogens for the N-terminus. DNA will add a 5TER to the beginning and a 3TER residue to the end of the segment.
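The terminating residues described above can be summarized in a short Python sketch. The function name `add_termini` and its arguments are hypothetical, and the sketch covers only the residue names, not the patching that CONGEN performs.

```python
def add_termini(residues, rtf_type, cyclic=False):
    """Return the residue list with the terminating residues for rtf_type."""
    residues = list(residues)
    if cyclic:
        return residues  # cyclic segments get no termini
    if rtf_type == "PROT":
        return residues + ["CTER"]
    if rtf_type in ("HPRO", "ALLH"):
        return ["NTER"] + residues + ["CTER"]
    if rtf_type == "DNA":
        return ["5TER"] + residues + ["3TER"]
    raise ValueError("rtf-type not covered by this sketch: " + rtf_type)
```

For example, `add_termini(["ALA", "GLY"], "HPRO")` returns `["NTER", "ALA", "GLY", "CTER"]`, which is why residue numbers differ by one between PROT and HPRO structures.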

If an AMBER94 topology file is being used, then the keyword A94P or A94N should be specified to indicate whether a protein or nucleic acid sequence is being read. Use of these keywords will then result in the correct terminal residues being used at the ends of the segment. See section The Generate Command - Construct a Segment of the PSF, for more information about this process.

The ABBREV option allows the specification of residues using one letter abbreviations. When the AA keyword is specified, one letter amino acid codes can be used. For RNA and DNA, one letter nucleotide names will be translated into the appropriate two letter AMBER94 residue names.

Reading coordinates

The reading of coordinates is done with the READ COOR command, and there are several options (which may change in future versions).

There are four possible file formats that can be used to read in coordinates. They are coordinate binary files, dynamics coordinate trajectories, coordinate card images, and Brookhaven Protein Data Bank files.

For all formats, a subset of the atoms in the PSF may be selected using the standard atom selection syntax. For binary files, this is a risky maneuver, and warning messages are given when this is attempted. Only coordinates of selected atoms may be modified. When reading binary files, or using the IGNORE keyword, coordinate values are mapped into the selected atoms sequentially (NO checking is done!).

The reading of the first two file formats is specified with the FILE option. The program reads the file header to tell which format it is dealing with. The coordinate binary files have a file header of COOR and contain only one set of coordinates. These are created with a WRIT COOR FILE command. The dynamics coordinate trajectories have a file header of CORD and have multiple coordinate sets. These files are created by the dynamics function of the program. The IFILE option specifies which coordinate set in the trajectory is to be read; its value is the position of the coordinate set within the file. The default value for this option will cause the first coordinate set to be read.

For binary files, the APPEnd option will 'deselect' all atoms up to the highest one with a known position. This is done in addition to the normal atom selection. This is useful for structures with several distinct segments where it is desirable to keep separate coordinate modules.

The CARD file format is the standard means in CONGEN for providing a human readable and writable coordinate file. The format is as follows:

         title
         NATOM (I5)
         ATOMNO RESNO   RES  TYPE  X     Y     Z  (repeated NATOM times)
           I5    I5  1X A4 1X A4 F10.5 F10.5 F10.5

The title is a title for the coordinates, see section Glossary of Syntactic Terms. Next comes the number of coordinates. If this number is zero or too large, the entire file will be read. Finally, there is one line for each coordinate. The coordinate lines, but not the initial lines, may be separated by blank lines for readability.

ATOMNO gives the number of the atom in the file. It is ignored on reading. RESNO gives the residue number of the atom. It must be specified relative to the first residue in the PSF. The OFFSet option should be specified if one wishes to read coordinates into other positions. The APPEnd option adds an additional offset which points to the residue just beyond the highest one with known positions. This option also 'deselects' all atoms below this residue (inclusive). For example, if one is reading in coordinates for the second segment of a two chain protein using two card files, and the APPEnd option is used, RESNO must start at 1 in both files for the file reading to work correctly.

It should also be remembered that for card images, residues are identified by residue number. This will change someday. What this implies is that if one wishes to read coordinates from an extended atom (PROT) RTF into a structure using an explicit hydrogen (HPRO) RTF, the OFFSet keyword MUST be used to shift the residue numbers by one (to make room for the NTER) so that the residues will line up. If the reverse process is required, an OFFSet value of -1 is called for.

RES gives the residue name of the atom. RES is checked against the residue name in the PSF for consistency. TYPE gives the IUPAC name of the atom. The coordinates of an atom within a residue need not be specified in any particular order. A search is made within each residue in the PSF for an atom whose IUPAC name is given in the coordinate file.
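As an illustration of the fixed-column layout, the following Python sketch splits one atom line of a CARD file. The function name `parse_card_atom` and the returned dictionary are hypothetical, and treating OFFSet as a simple addition to RESNO is an assumption based on the description above, not CONGEN's code.

```python
def parse_card_atom(line, offset=0):
    """Split one CARD atom line laid out as I5 I5 1X A4 1X A4 3F10.5."""
    resno = int(line[5:10])  # columns 1-5 (ATOMNO) are ignored on reading
    return {
        "resno": resno,
        "psf_resno": resno + offset,  # assumption: OFFSet is added to RESNO
        "res": line[11:15].strip(),
        "type": line[16:20].strip(),
        "xyz": tuple(float(line[k:k + 10]) for k in (20, 30, 40)),
    }

# A made-up atom line written with the same column widths.
example = "%5d%5d %-4s %-4s%10.5f%10.5f%10.5f" % (7, 3, "ALA", "CA", 1.0, -2.5, 3.25)
atom = parse_card_atom(example, offset=1)
```

Here `atom["psf_resno"]` is 4, the position that an OFFSet of 1 would select when an NTER residue precedes the sequence.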

The MAXERR option controls how many error messages are printed. Its default value is 10. Normally, the coordinate reader will scan the entire file, and it will list errors as it encounters them, up to the MAXERR limit. At the end of reading, it will terminate execution if any fatal errors were encountered.

The KONN option allows the reading of Konnert Hendrickson format files. The file consists of just atom records where each atom coordinate has the following format:

        Res Segid Resid Iupac X Y Z
     3X,A4,  A1,   A3,   A4,  3F10.5

The four alphabetic fields are left justified by the program so they can be placed anywhere within their columns. If the Segid is not specified, the program will attempt to place the atoms within a segment which is determined by the APPEnd option (above). If APPEND is not specified, then the first segment in the structure will be used. If APPEND is specified, then the first segment which has a residue with all undefined atoms will be used. Blank lines may be specified between coordinates.

Note that the Segid and Resid fields are too small to hold the maximum length values. Truncations will cause unavoidable problems. However, residue identifiers NTE and CTE are extended to NTER and CTER.
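The Konnert Hendrickson record layout can be sketched the same way. The function name `parse_konn_atom` is hypothetical, and the sketch only splits the columns and extends NTE and CTE; it does not place atoms into segments.

```python
def parse_konn_atom(line):
    """Split one atom record laid out as 3X,A4,A1,A3,A4,3F10.5."""
    resid = line[8:11].strip()
    # The three-character Resid field truncates NTER and CTER; extend them.
    resid = {"NTE": "NTER", "CTE": "CTER"}.get(resid, resid)
    return {
        "res": line[3:7].strip(),
        "segid": line[7:8].strip(),  # empty when no Segid is given
        "resid": resid,
        "iupac": line[11:15].strip(),
        "xyz": tuple(float(line[k:k + 10]) for k in (15, 25, 35)),
    }
```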

The BROOKHAVEN option (or its synonyms, TAPE or BRKHVN) specifies that the coordinate file is in the Brookhaven Data Bank format. CONGEN can read the ATOM records for coordinates. However, because the Brookhaven format uses slightly different naming conventions, there are a number of inconsistencies you should be aware of when using this option:

  1. Chain identifiers in Brookhaven are only one character long. In CONGEN, the corresponding segment identifiers are currently four characters. Thus, if you read a Brookhaven file with chain identifiers, you must generate your segments with one character identifiers (see section The Generate Command - Construct a Segment of the PSF). If no chain identifiers are present in the Brookhaven file, then CONGEN will search the coordinates for the first residue which has all undefined atoms. Then, it will add the value you specify for OFFSET to this number, and it will read coordinates into the segment which contains the offsetted residue. Be careful in the case where the terminal atom is undefined because in the protein and DNA cases, that atom is in a residue all by itself.

  2. The sequence number in the Brookhaven data bank is an amalgam of a residue number and an insertion code, whereas in CONGEN, it is a four letter identifier which is usually just the text representation of the sequence number (except for the terminating residues). There are two ways that you can handle this number in CONGEN. The SEQUENCE option, which is the default, causes CONGEN to assume that the atoms are provided in sequence order, and that every change of sequence number or insertion code in the file implies that the next residue is being specified. When insertions and deletions are present, the IDREad option is used to read both the residue number and the insertion code. Note that this option must be used in conjunction with reading the sequence with the IDREad option. If NOSEQUENCE is specified, then the residue number is used, but the insertion code is ignored.

  3. Hydrogen atoms have different specifications. In some cases the final digit of the hydrogen name is placed before the 'H'. In others it is placed after. CONGEN will move any digit found before the 'H' after the other characters in the name.

  4. Sometimes, the hydrogen atom attached to the peptide nitrogen is labeled 'HN'. If so, it is renamed to 'H'.

  5. Atoms at the termini, such as 'NT' and 'OT1', are renamed.

  6. Atoms can be defined at different positions in the Brookhaven Data Bank. CONGEN will use the last value found in the file.

  7. If you wish to select a particular alternate location identifier for a set of coordinates, use the ALTERNATE option along with the identifier to be selected.

  8. Multiple models may be handled by using the MODEL option. In these cases, it is necessary to specify the model number you wish to use.
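
The hydrogen renaming in items 3 and 4 can be sketched as follows. The function name `normalize_atom_name` is hypothetical, the order in which the two rules are applied is an assumption, and the renaming of terminal atoms in item 5 is left out because the target names are not given here.

```python
def normalize_atom_name(name):
    """Apply the hydrogen renaming rules to one Brookhaven atom name."""
    name = name.strip()
    # Item 3: a digit placed before the 'H' moves to the end of the name.
    if len(name) > 1 and name[0].isdigit() and name[1] == "H":
        name = name[1:] + name[0]
    # Item 4: the peptide hydrogen labeled 'HN' becomes 'H'.
    if name == "HN":
        name = "H"
    return name
```

Under these rules, `normalize_atom_name("1HB")` returns `"HB1"`, while a name such as `"HB2"` is left unchanged.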

Reading the Brookhaven file format is not straightforward, so check the coordinates after they are read to see if they are correct. Energy evaluations (see section Minimization and Dynamics) followed by analysis of the geometric terms (see section The Analysis Facility of CONGEN) are a useful way to do this. Also, the brkchm command (see section brkchm -- Converting Brookhaven to Congen Format) is an alternate way of converting Brookhaven files into a form that can be edited.

The IGNORE option allows one to read in a card coordinate file while bypassing the normal tests of the residue name, number, and atom name. When IGNORE is specified in place of CARD, the identifying information is ignored completely. Starting from the first selected atom, the coordinates are copied sequentially from the file.

Normally, the coordinates are not reinitialized before new values are read. If this is desired, the INITIALIZE keyword will cause the coordinate values for all selected atoms to be initialized. Note that only atoms that have been selected will be initialized. The COOR INIT command provides a more general way to initialize coordinates.

The EXPAnd option should be specified if the following conditions apply:

  1. An explicit hydrogen topology file is being used, and the coordinates we are reading do not have hydrogens in them.
  2. The coordinates were read using the IGNORE option or were read from a binary file.

In this case, the coordinates will be shuffled in order to leave room for the hydrogens. The hydrogen bond generation routine, section HBUILD Command, or the builder routines, section The Internal Coordinate Commands, must be called to construct the positions of these hydrogens.

It is also possible to read coordinates into the comparison (or reference) set using the COMP keyword. The DIFF keyword will read coordinates into the coordinate differences (also referred to as the normal mode arrays). It is expected that these "coordinates" are really displacements that will be processed by the vibrational analysis command, see section Vibrational Analysis.

Currently, CONGEN will perform a limited set of name translations on any formatted coordinate reading operation. The translations listed below represent common differences in nomenclature. The isoleucine translations are not needed for the AMBER 94 topology file, see section AMBER94RTF.

Residue   Input => Final

ILE       CD1  => CD
ILE       HD11 => HD1
ILE       HD12 => HD2
ILE       HD13 => HD3
SER       HG1  => HG
OH2       O    => OH2
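
The translation table amounts to a lookup keyed on residue and atom name, which the following Python sketch illustrates. The names `TRANSLATIONS` and `translate_atom` are hypothetical; only the table entries come from this section.

```python
# (residue, input atom name) -> final atom name, as listed in the table.
TRANSLATIONS = {
    ("ILE", "CD1"): "CD",
    ("ILE", "HD11"): "HD1",
    ("ILE", "HD12"): "HD2",
    ("ILE", "HD13"): "HD3",
    ("SER", "HG1"): "HG",
    ("OH2", "O"): "OH2",
}

def translate_atom(residue, atom):
    """Return the translated atom name, or the input name if none applies."""
    return TRANSLATIONS.get((residue, atom), atom)
```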

The ABBREV option allows the specification of residue names using one letter abbreviations. When the AA keyword is specified, one letter amino acid codes can be used. For RNA and DNA, one letter nucleotide names will be translated into the appropriate two letter AMBER94 residue names.

Finally, the reading of coordinates is always a tricky business. Although standards exist for naming conventions, there are enough minor variations to make the situation difficult. Always check the structure after reading coordinates to ensure that the geometries and energies are reasonable.

The Format of Parameter Files

In 1995, the parameter file format was completely redone to accommodate the addition of the AMBER potential, see section AMBERPARM. The new format is free field and allows patterns to be supplied for parameters in order to reduce the size of the file and to allow for default parameters to handle molecules which have not been seen before.

For the bond length, bond angle, torsion angle, and improper torsion parameters, CONGEN stores patterns to match the atom types along with the relevant force field parameters. When the program needs to calculate the energy of any internal coordinate, it goes sequentially through the patterns and uses the parameters of the first pattern that matches the atom types of the internal coordinate with the highest "specificity". In this context, "specificity" means the sum of the specificities of each atom type pattern. The specificity of an atom type pattern is 0 if it is a complete wildcard, "*"; 0.5 if any wildcards are present; and 1.0 if there are no wildcards at all. This scheme allows parameters to be specified at different levels of generality, with specific parameters taking precedence over general ones.
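
The specificity rule can be sketched in Python as follows. This is a minimal illustration, not CONGEN's implementation: the function names are hypothetical, and Python's `fnmatchcase` with `*` and `?` stands in for CONGEN's own wildcard syntax, which is described in section Interpretation of Atom Selection Tokens.

```python
from fnmatch import fnmatchcase

WILDCARDS = set("*?")  # assumption: stand-in for CONGEN's wildcard characters

def atom_specificity(pattern):
    """0 for a complete wildcard, 0.5 for a partial one, 1.0 for a plain string."""
    if pattern == "*":
        return 0.0
    if any(ch in WILDCARDS for ch in pattern):
        return 0.5
    return 1.0

def find_parameters(atom_types, table):
    """Return the parameters of the first matching entry with the highest score."""
    best, best_score = None, -1.0
    for patterns, params in table:
        if len(patterns) != len(atom_types):
            continue
        if all(fnmatchcase(a, p) for a, p in zip(atom_types, patterns)):
            score = sum(atom_specificity(p) for p in patterns)
            if score > best_score:  # strict: earlier entries win ties
                best, best_score = params, score
    return best
```

With a table holding `("*", "*")`, `("C*", "N")`, and `("CT", "N")`, a CT-N bond scores 0, 1.5, and 2.0 against the three entries, so the last entry supplies its parameters.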

In the case of hydrogen bond and non-bonded interactions, all the possible combinations of atom types are computed, and tables for the parameters are constructed. Patterns are used to match against the atom types when the tables are computed.

Parameters are stored as strings with the first character being either "S" or "P", which means string or pattern, respectively. If a pattern is a string, then the program does a simple string comparison; otherwise, a wildcard match is used, see section Interpretation of Atom Selection Tokens. CONGEN checks the patterns you specify to see whether a pattern or string has been specified.

Parameter files can be read in either text or binary format. For text files, the version can be set using the VERSION keyword on the READ PARAMETER command. The default value of 2 specifies the old format. The new version is specified by using a value of 3. For binary files, the header record indicates which version of parameter file is being read.

Parameter File Format

The text format for the parameter file begins with a title, see section Glossary of Syntactic Terms, followed by a set of free field commands, and terminating with the end of the file or an END statement. The purpose of the commands is to fill the various parameter arrays. The commands are described below:

BOND command

Syntax

BOND repeat(word word) FORCe real DISTance real

Function

The BOND command adds bond parameters. The bond energy term for one bond is given by

The force constant is given by the FORCE keyword and the equilibrium bond length is given by the DISTANCE keyword. Each pair of words is treated as a separate entry in the bond parameter arrays, so it is possible to specify the same parameters for many bonds.

ANGLE command

Syntax

{ANGLE} repeat(word word word) FORCe real ANGLE real
{THETA}

Function

The ANGLE command adds angle parameters. The angle energy term is given by

Bond angles are defined over triplets of atoms. The force constant is given by the FORCE keyword and the equilibrium angle is given by the ANGLE keyword. Each triplet of words is treated as a separate entry in the angle parameter arrays, so it is possible to specify the same parameters for many angles.

TORSION command

Syntax

{TORSION} repeat(word word word word) repeat(torsion-term)
{PHI    }

torsion-term ::= TERM FORCe real PHASe real PERIod real MULTiplicity int END

Function

The TORSION command adds torsion angle parameters. The torsion angle term has the following form

Torsion angles are defined over quadruplets of atoms, and there can be multiple terms per torsion angle so that complex torsions can be established. Each term is specified by strings beginning with TERM and ending with END. The force constant for each term is given by the FORCE keyword. The phase is given by the PHASE keyword. The periodicity is given by the PERIOD keyword, and is limited to values of 1, 2, 3, 4, and 6. The multiplicity is given by the MULTIPLICITY keyword, and is most useful with the AMBER force field, see section AMBERPARM. At least one term must be specified for a torsion angle. Each quadruplet of words is treated as a separate entry in the torsion parameter arrays, so it is possible to specify the same parameters for many torsions.

IMPROPER command

Syntax

{IMPROPER} repeat(word word word word) FORCe real {PHASe real PERIod real}
{IMPHI   }                                        {MIN real              }

Function

The IMPROPER command adds improper torsion parameters. If the dihedral form of improper torsion is selected, the improper torsion term uses the torsion angle term given above. If the harmonic form of the improper torsion is selected, then the improper torsion energy term is given by

Improper torsions are defined over quadruplets of atoms. The force constant is given by the FORCE keyword. If the dihedral form of the energy is used, then the phase and period are given by the PHASE and PERIOD keywords, respectively. The multiplicity is set to 1. If the harmonic form is used, then the equilibrium improper torsion angle is given by the MIN keyword. Each quadruplet of words is treated as a separate entry in the improper torsion parameter arrays, so it is possible to specify the same parameters for many improper torsions.

HBOND command

Syntax

HBOND repeat(word word) {EMIN real RMIN real             }
                        {CREPulsive real CATTractive real}

Function

The HBOND command adds hydrogen bond parameters. The form of the hydrogen bond term is given by

There are two different ways to calculate hydrogen bond energies. The form in the old CHARMM potential uses the distance between the heavy atom attached to the donor hydrogen and the acceptor, and an angular term based on the heavy atom donor, donor hydrogen, acceptor angle. The form used by the AMBER potential uses the distance between the hydrogen and the acceptor, and no angle term. The DEFAULT command described below allows you to switch from one form to the other.

There are two ways to specify the two coefficients. They may be specified directly using CREPULSIVE to specify the first coefficient, and CATTRACTIVE for the second. The second way is to specify the minimum energy, keyword EMIN, and minimum energy distance, keyword RMIN, and CONGEN will compute the coefficients for you.

The pairs of words in each command specify pairs of atom type patterns to be used for setting the coefficients. The first pattern in the pair gives the atom types for the donor, which is either the heavy atom or the hydrogen. The second pattern gives the acceptor.

The actual process of setting hydrogen bond parameters is complicated by the requirement for constructing a table of hydrogen bond codes so that hydrogen bond codes can be looked up rapidly. Pseudocode for the operation is as follows:

For Ih = 1 to Number of Hydrogen bond patterns
   For I = 1 to Number of Atom Types (NATC)
      If atom_type(I) matches pattern(1,Ih)
         For J = 1 to NATC
            If atom_type(J) matches pattern(2,Ih)
               HBCODE = I*NATC+J-1
               if (HBCODE is not in current list of HB codes)
                  add new HBCODE and coefficients.
               fi
            fi
         done
      fi
   done
done
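
The same loop can be written as a runnable Python sketch. The function name `build_hb_table` and the data layout are hypothetical, `fnmatchcase` stands in for CONGEN's pattern matcher, and the indices are kept 1-based so that the code formula matches the pseudocode.

```python
from fnmatch import fnmatchcase

def build_hb_table(atom_types, hb_patterns):
    """Map each hydrogen bond code to the coefficients of the first match.

    hb_patterns is a list of (donor_pattern, acceptor_pattern, coefficients).
    """
    natc = len(atom_types)
    table = {}
    for donor_pat, acceptor_pat, coeffs in hb_patterns:
        for i, donor in enumerate(atom_types, start=1):
            if not fnmatchcase(donor, donor_pat):
                continue
            for j, acceptor in enumerate(atom_types, start=1):
                if fnmatchcase(acceptor, acceptor_pat):
                    code = i * natc + j - 1
                    # Codes already present keep their earlier coefficients.
                    table.setdefault(code, coeffs)
    return table
```

Since an existing code is never overwritten, patterns that appear earlier in the parameter file take precedence in this sketch.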

NBOND command

Syntax

{NBOND    } repeat(word) [EMIN real       ]
{NONBONDED}              [RADIUS real     ]
                         [ALPHa real      ]
                         [NEFF real       ]
                         [CREPulsive real ]
                         [CATTractive real]

Function

The NBOND command adds non-bonded energy parameters. The nonbonded energy function is

Non-bonded energy parameters are specified only by atom types, and mixed parameters are determined using the combination rules in the CHARMM paper, see section Introduction to Congen, for the reference.

In each NBOND command, the words are atom type codes. The options have the following meanings:

EMIN
Minimum van der Waals energy (kcal/mole).
RADIUS
Van der Waals radius (Angstroms).
ALPHA
Atomic polarizability (cubic Angstroms).
NEFF
Number of effective electrons (dimensionless).
CREPulsive
Coefficient A above.
CATTractive
Coefficient B above.

The parameters can be specified in three different ways: by the 6-12 coefficients (CREPULSIVE and CATTRACTIVE), by minimum energy (EMIN) and radius (RADIUS), or by radius (RADIUS), number of effective electrons (NEFF), and polarizability (ALPHA). The program does not check if you overspecify options, so pick one method and use it consistently.

DEFAULT command

Syntax

DEFAULT [IMPRoper [COSIne  ] [NOSYmmetry] END]
                  [HARMonic] [SYMMetry  ]

        [HBOND [H-A] END]
               [D-A]

        [NBOND [VDW14 real] [EL14 real] [HBEXclude] END]
                                        [HBINclude]

Function

The DEFAULT command is used to set defaults which pertain to how some of the energy terms are calculated. These defaults are set in the parameter file because the parameters are developed as an integrated whole. Settings in the DEFAULT command are an integral part of any parameter file.

From the syntax, it can be seen that there are three different energy terms to which these defaults can apply. The IMPROPER options control the following aspects of the improper torsion energy:

COSINE
The improper torsion term is calculated using the trigonometric function used for the torsion angle term. This form was developed for the implementation of the AMBER potential.
HARMONIC
The improper torsion term is calculated using the harmonic version of the potential. This form was developed for the CHARMM potential. This is the default.
SYMMETRY
NOSYMMETRY
These keywords control the matching of improper torsions against the parameters. If SYMMETRY is selected, then matching of the four atoms in an improper torsion is attempted in the original order, with the first and fourth atoms swapped, with the second and third atoms swapped, and with all atoms reversed in position. SYMMETRY is used for the CHARMM potential. If NOSYMMETRY is selected, no reorderings are done. NOSYMMETRY is used for the AMBER potential. SYMMETRY is the default.
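
The four orderings tried under SYMMETRY can be listed with a short Python sketch. The function name `improper_orderings` is hypothetical, and the sketch only enumerates the orderings; matching each one against the parameter patterns is a separate step.

```python
def improper_orderings(atoms, symmetry=True):
    """Return the atom orderings tried when matching an improper torsion."""
    a, b, c, d = atoms
    if not symmetry:
        return [(a, b, c, d)]  # NOSYMMETRY: original order only
    return [
        (a, b, c, d),  # original order
        (d, b, c, a),  # first and fourth atoms swapped
        (a, c, b, d),  # second and third atoms swapped
        (d, c, b, a),  # all atoms reversed
    ]
```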

The HBOND default options control which distance is used in the hydrogen bond energy. If D-A is specified, the distance is calculated between the heavy atom donor and the acceptor, and the angular term is included. In addition, the parameterization is done based on the heavy atom donor and acceptor. This is the CHARMM form. If H-A is specified, the distance is calculated between the hydrogen and the acceptor, and no angular term is included. The parameterization is done based on the hydrogen and the acceptor. The default is D-A.

The NBOND default options control scaling for 1-4 interactions and the inclusion of van der Waals energies for hydrogen bond pairs. 1-4 interactions are non-bonded interactions of atoms connected by three bonds (see section NBXMOD -- Automatic Generation of Non-bonded Exclusions, for more information). The VDW14 keyword sets the scale factor for the van der Waals energy of 1-4 interactions. The EL14 keyword sets the scale factor for 1-4 electrostatic interactions. The default is 1.0 for both of these scale factors. In the AMBER potential, they are set to 0.5. The HBINCLUDE keyword specifies that van der Waals interactions will be calculated for atoms involved in hydrogen bonds. This is the default. The HBEXCLUDE keyword specifies that van der Waals interactions will be turned off for all atom pairs specified as possible hydrogen bonds. This is the default for the AMBER potential. Warning: you must ensure that the hydrogen bond distance cutoff is positive when this option is in use. Otherwise, it is possible to generate infinite energies if a charged hydrogen and its acceptor get too close together.

PRINT command

Syntax

PRINT [ON ]
      [OFF]

Function

The PRINT command turns on the echoing of commands in the parameter file and the display of all non-bonded parameters. It is useful for debugging. It is off by default.

END command

Syntax

END

Function

The END statement terminates the parameter file.

Old Parameter File Format

In the text parameter file format, the data is divided into sections, each beginning with a keyword line and followed by data lines. The sections may be arranged in any order, and may be divided up as well; just prefix each set of data with the appropriate keyword. The format for each data section follows, along with the necessary keywords. Please look at the parameter input files in the `CGDATA' directory for examples.

Keyword   Format

BOND  -   atom atom force_constant distance
          (2(A4,1X),2F10.0)

THETA -   atom atom atom force_constant theta_min
          (3(A4,1X),2F10.0)

PHI   -   atom atom atom atom force_constant periodicity phi_max
          (4(A4,1X),3F10.0)

IMPHI -   atom atom atom atom force_constant i_phi_min
          (4(A4,1X),2F10.0)

NBOND -   atom polarizability n_effective_electrons vdW_radius
          (A4,1X,3F10.0)

HBOND -   atom atom well_depth distance
          (2(A4,1X),2F10.0)

Note that the data lines are NOT free field. However, you can add comments using the exclamation point, see section Controlling a CONGEN Run. Sections end with the occurrence of another keyword or a line with the word END in columns 1-3, the latter terminating parameter reading. Blank lines are allowed in all the sections.
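
Because the data lines are column-sensitive, it can help to see how the Fortran formats above map onto character positions. The following Python sketch is not CONGEN code; `parse_fixed` is a hypothetical helper, the numeric values in the sample line are made up, and an all-blank numeric field is treated as zero on the assumption that this mirrors how Fortran reads such a field.

```python
def parse_fixed(line, n_atoms, n_reals):
    """Split a data line laid out as n_atoms(A4,1X) followed by n_reals F10.0."""
    line = line.split("!")[0]                  # drop a trailing comment
    atoms = []
    for i in range(n_atoms):
        start = 5 * i                          # A4 plus the 1X separator
        atoms.append(line[start:start + 4].strip())
    reals = []
    base = 5 * n_atoms                         # first column of the numbers
    for i in range(n_reals):
        field = line[base + 10 * i: base + 10 * (i + 1)].strip()
        reals.append(float(field) if field else 0.0)
    return atoms, reals


# Hypothetical BOND data line, built so that the columns are exact.
bond_line = f"{'C':<4} {'CH1E':<4} {405.0:10.1f}{1.52:10.3f}"
print(parse_fixed(bond_line, n_atoms=2, n_reals=2))
```

The same helper covers the other sections by changing the counts, for example `n_atoms=1, n_reals=3` for an NBOND line.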

Some errors in the input file will result in warning messages but not termination of the run.

CONGEN will check for duplicate parameters. If all the corresponding values for a duplicate parameter are the same, then only a warning message is issued. Otherwise, an error message will be issued.
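
The duplicate rule can be summarised as a small decision function. This Python sketch is only an illustration of the rule as stated: the name `check_duplicate`, the use of a dictionary, and the sample values are assumptions, and how CONGEN actually keys its parameter tables is not described here.

```python
def check_duplicate(table, key, values):
    """Store a parameter; classify a repeated key as a warning or an error."""
    if key not in table:
        table[key] = values
        return "ok"
    # Same atoms seen before: identical values only merit a warning.
    return "warning" if table[key] == values else "error"


bonds = {}
print(check_duplicate(bonds, ("C", "CH1E"), (405.0, 1.52)))   # first occurrence
print(check_duplicate(bonds, ("C", "CH1E"), (405.0, 1.52)))   # identical repeat
print(check_duplicate(bonds, ("C", "CH1E"), (400.0, 1.52)))   # conflicting repeat
```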

Any errors detected in the reading of the formatted parameter file will result in termination of the run, unless NOFAIL is specified on the READ command.

phi_max is either 0.0 or 180.0 for dihedrals with the minimum staggered or eclipsed respectively.

If successive torsion angle or improper torsion angle parameters are specified with all four atoms and have the same atoms, this is a flag that the energy is to be computed as a sum of these multiple terms. For this special processing to be done, the PSF (or topology file used to generate the PSF) must have successively equal torsion or improper torsion angles which correspond to the parameters. In order to use this option, you must specify NOFAIL on the command line.

NBOND parameters must be present for all of the atom types. The program attempts to check this when reading either card image or binary parameter files.

The Format of a Residue Topology File

Here is a description of what is in residue topology files (as they are stored in text files). You may use this format if you specify the CARD option in the READ command. The format of binary files depends on the current implementation of the RTF data structure. See the file `RTF.FCM' in the CONGEN source directory for more details. These files are read by RTFRDR, a subroutine in RTFIO which should be consulted for formats and the final word on what is actually done with these files.

The purpose of residue topology files is to store the information for generating a representation of a macromolecule from its sequence. For each residue, CONGEN requires a description of all bonds, bond angles, dihedral angles, improper torsion angles, partial charges, chemical types, and hydrogen bond donors and acceptors. By linking residues in the sequence together, segments in the molecule are constructed.

The linkage between successive residues is determined when the segments are generated (see section The Generate Command - Construct a Segment of the PSF). It is specified in the residue by using special prefixes on the atom names which refer to residues either ahead of or behind the current residue. In the case of cyclic segments, the program will wrap references around the cycle.

The residue topology files begin with `rtop'. There are two forms, binary module (`.mod') and card format (usually `.inp'). The card format files are used only for creating binary modules and therefore are structured as input files for CONGEN, beginning with a run title and the command READ RTF CARD, followed by the actual topology file.

The first section of the topology files is a title section in the usual format of up to ten lines delimited by a line containing only a * in column 1.

The next line is a set of up to 20 numbers of which the first number gives the topology file format version number. This number must be set to 200 for CONGEN to read the remainder of the file correctly in free field format. If some other number is present or the number is missing, the program will attempt to read the topology file in the current format.

The remaining information is read in free field format as commands to define the RTF. The ordering of the commands is important in that some information is needed to define others (i.e. the atoms of a residue should be defined before the bonds between them). The recommended structure of this file is:

Initial setup:
        TYPE declaration
        MASS specification for each atom type (also hydrogen
             bond donor and acceptor classifications)
        DECLarations of out of residue definitions
        ORDEr specification for atom order.
        SET  command for charge patching
For each residue:
        RESIdue name and total charge specification
        ATTRibute option to specify residue attributes
        ATOM definitions within this residue
        BOND specifications
        ANGLe specifications
        DIHEdral angle specifications
        IMPRoper dihedral angle specifications
        DONOr specifications
        ACCEptor specifications
        BUILD information
        GROUPing definitions
        GENErate options
        COPY option to copy information from other residues
Closing:
        END statement
Display control:
        PRINT option

The format above is not rigid.

There exists the facility to automatically generate the bond angles, torsion angles, hydrogen bond donors and acceptors, and the BUILD information. It is also possible to delete terms that are generated automatically, and therefore, it is possible to correct any deficiencies in the automatic schemes.

Linkage Atom Naming

It is important to understand how to make references to adjacent residues when constructing a topology file. The following table lists the possible prefixes which may be used. Note that the actual atoms referenced are not determined until the generation of segments, see section The Generate Command - Construct a Segment of the PSF.

+
The next residue in the sequence.

-
The previous residue in the sequence.

#
Two residues forward in the sequence.

=
Two residues backward in the sequence.

+n
n residues forward in the sequence. Note that the + sign is mandatory.

-n
n residues backward in the sequence.

No reference will be made beyond the end of a segment. In the case of a cyclic peptide, CONGEN will wrap around the sequence in the appropriate direction.
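
The prefix table can be read as a mapping from an atom name to a residue offset. The following Python sketch is a hypothetical illustration, not the CONGEN implementation: `split_linkage` and `target_residue` are invented names, residues are indexed from zero, and the sketch assumes that a digit immediately after a + or - sign is always a residue count rather than part of the atom name.

```python
import re

# Single-character prefixes and the residue offset each one implies.
_SYMBOL_OFFSETS = {"+": 1, "-": -1, "#": 2, "=": -2}


def split_linkage(name):
    """Return (offset, atom_name) for an atom name with an optional prefix."""
    m = re.match(r"([+-])(\d+)(.+)$", name)
    if m:                                        # the +n / -n form
        sign = 1 if m.group(1) == "+" else -1
        return sign * int(m.group(2)), m.group(3)
    if len(name) > 1 and name[0] in _SYMBOL_OFFSETS:
        return _SYMBOL_OFFSETS[name[0]], name[1:]
    return 0, name                               # atom in the current residue


def target_residue(index, offset, n_residues, cyclic=False):
    """Resolve an offset to a residue index, or None past a segment end."""
    target = index + offset
    if cyclic:
        return target % n_residues               # wrap around the cycle
    return target if 0 <= target < n_residues else None


print(split_linkage("+N"), split_linkage("=CA"), split_linkage("-3CA"))
print(target_residue(0, -1, 5), target_residue(0, -1, 5, cyclic=True))
```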

RTF Type Command

Syntax

             { PROT }
             { HPRO }
TYPE         { ALLH }
             { DNA  }
             { UNKN }

Function

This option sets the type of the residue topology file. When a segment is generated, the program will check to see that the sequence type matches the RTF type, provided that a sequence type is specified.

RTF Mass Command

Syntax

MASS int word real [ACCEptor]
                   [DONOr   ]

Function

The MASS command specifies the chemical types of atoms, their names, their masses, and optionally whether they are hydrogen bond donors or acceptors. This command is one of the most important in the topology file because it specifies all the permissible atoms in any system.

The int is the numerical chemical type code as used in the parameter file, see section The Format of Parameter Files. Its value may not exceed the parameter MAXATC, which is currently 100. The word is the chemical type name, and this symbol is used in the parameter file. It can also be referenced when analysis tables are built, see section Syntax of the BUILD Command. The real number specifies the mass of each atom type in Atomic Mass Units. Finally, the optional keywords, ACCEPTOR and DONOR, indicate when the atom can participate in a hydrogen bond as an acceptor or donor, respectively. These final keywords are used only by the RTF GENERATE command, see section RTF Generate Command.
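
As a rough illustration of the operands, the following Python sketch splits a MASS line into its parts and applies the MAXATC limit quoted above. It is not the RTFRDR logic: `parse_mass` is a hypothetical name, and the assumption that keywords may be abbreviated to their first four letters is inferred from the capitalisation in the syntax diagram.

```python
MAXATC = 100  # limit on the chemical type code quoted in the text


def parse_mass(line):
    """Split a MASS line into type code, type name, mass and optional role."""
    fields = line.split("!")[0].split()          # free field; drop comments
    if len(fields) < 4 or fields[0].upper() != "MASS":
        raise ValueError("expected: MASS int word real [ACCEptor|DONOr]")
    code = int(fields[1])
    if code > MAXATC:
        raise ValueError(f"type code {code} exceeds MAXATC ({MAXATC})")
    role = None
    if len(fields) > 4:
        keyword = fields[4].upper()
        if len(keyword) >= 4 and "ACCEPTOR".startswith(keyword):
            role = "acceptor"
        elif len(keyword) >= 4 and "DONOR".startswith(keyword):
            role = "donor"
        else:
            raise ValueError(f"unknown keyword {fields[4]!r}")
    return {"code": code, "name": fields[2], "mass": float(fields[3]), "role": role}


print(parse_mass("MASS    38 NH1   14.00670"))
print(parse_mass("MASS    51 O     15.99940 ACCE"))
```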

RTF Declare Command

Syntax

DECLARE word

Function

When a formatted RTF file is read, CONGEN will check to see that all components of a residue refer to atoms within that residue, and will issue an informational message if they are not. However, since all polymeric structures will have linkages between residues, there will be atoms which refer outside of each residue.

The DECLARE command informs the program that any atom whose name is word is a linkage atom, and that CONGEN should not issue a message for such atoms. Aside from these messages, the DECLARE command is optional.

RTF Set Command

Syntax

SET word real

Function

When residues in the topology file are patched when a segment is made from them, some atomic charges must be adjusted. Currently, the program calculates the correct charges for the explicit hydrogen and extended atom topology files, see section Residue Topology Files. However, with other topology files, it does not.

The SET command provides a mechanism for assigning these charges in the topology file. The user specifies a variable name as the first operand in the command, and a charge value or adjustment in the second. These variable name, charge value pairs are stored with the topology file. If no variable name is available, then a default value is used which will give the correct values for the explicit hydrogen topology files.

The following table gives the currently used variables, default values, and their meaning.

Variable       Default  Meaning

CG_C3_PRIME     0.084   Charge increment for the 3' carbon in DNA.
CG_C5_PRIME     0.092   Charge increment for the 5' carbon in DNA.
CG_DISU_CB      0.022   Charge for the beta carbon in a disulphide linkage.
CG_DISU_SG     -0.032   Charge for the gamma sulfur in a disulphide linkage.
CG_FIRSTCA      0.020   Charge increment for the amino terminal alpha carbon.
CG_FIRSTN       0.710   Charge increment for the amino terminal nitrogen
                        in a protein.
CG_LASTC        0.030   Charge increment for the carboxy terminal carbonyl
                        carbon.
CG_LASTO       -0.200   Charge increment for the carboxy terminal carbonyl
                        oxygen.
CG_O3_PRIME     0.163   Charge increment for the 3' oxygen in DNA.
CG_O5_PRIME     0.147   Charge increment for the 5' oxygen in DNA.
CG_PRO_FIRSTN   0.085   Charge of the amino nitrogen in an amino-terminal
                        proline. NOTE: the presence of this variable
                        signifies that prolines get special treatment.
                        If you omit it, then none of the CG_PRO variables
                        will have any effect.
CG_PRO_FIRSTCD  0.079   Charge of the delta carbon in an amino-terminal
                        proline.
CG_PRO_FIRSTCA  0.095   Charge of the alpha carbon in an amino-terminal
                        proline.
CG_PRO_IHT1     0.225   Charge of the first amino terminal peptide hydrogen
                        in an amino-terminal proline.
CG_PRO_IHT2     0.225   Charge of the second amino terminal peptide hydrogen
                        in an amino-terminal proline.

This particular mechanism used for charges should be viewed as a temporary measure. There are plans to replace it with a more robust patching scheme.

RTF Order Command

Syntax

ORDER repeat(word)

Function

Certain operations within CONGEN expect the atoms in a residue to appear in a given order. Examples of such commands are the RANGE options in an atom selection, see section Atom Selection, or the reading of binary coordinate files, see section Reading coordinates. The ORDER command permits you to specify the atom order for all current and succeeding residues in the topology file.

Each word in the ORDER command is interpreted as a wild card atom selection token, see section Atom Selection. When each residue is completed, all the atoms in the residue are matched against the words in the ORDER command. An exact match takes precedence over a wildcard match, and the last match takes precedence over earlier matches. The order of the atoms in the residue is then rearranged based on the matches into these words. If a set of atoms match the same word in the ORDER command, then these will be ordered according to their original order in the topology file.

As an example, the command

ORDER N H CA * C O

will put the amino nitrogen and hydrogen and the alpha carbon of an amino acid first, the sidechain atoms in the middle, and the carbonyl carbon and oxygen at the end of each residue.

The ORDER command takes effect starting with the next residue completed. Thus, at the beginning of a file, it affects the entire file. If specified after a RESIDUE command, it will affect that residue.
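
The matching rules for ORDER can be sketched as a stable sort. The Python below is only an approximation under stated assumptions: `order_atoms` is a hypothetical name, Python's `fnmatch` wildcards stand in for CONGEN's wild card atom selection syntax (which may differ), and atoms that match no word are placed last, a case the text does not address.

```python
from fnmatch import fnmatchcase


def order_atoms(atoms, words):
    """Reorder atom names according to the words of an ORDER command."""
    def slot(name):
        exact = [i for i, w in enumerate(words) if w == name]
        if exact:
            return exact[-1]                 # exact match beats any wildcard
        wild = [i for i, w in enumerate(words) if fnmatchcase(name, w)]
        if wild:
            return wild[-1]                  # the last match takes precedence
        return len(words)                    # assumption: unmatched atoms go last
    # sorted() is stable, so atoms sharing a word keep their original order.
    return sorted(atoms, key=slot)


words = "N H CA * C O".split()
print(order_atoms(["CB", "O", "C", "CA", "H", "N"], words))
```

Run on the scrambled alanine atom list, this puts N, H and CA first, the sidechain atom CB in the middle, and C and O last, which is the behaviour described for the example command.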

RTF Residue Command

Syntax

RESIDUE word [real]

Function

The RESIDUE command is used to start a new residue. If another residue has already been specified, then it is completed. The word in the command specifies the residue name, and the optional real number specifies the total electrostatic charge of the residue. If no total charge is specified, then 0.0 is assumed.

RTF Copy Command

Syntax

COPY word [INVERT]

Function

The COPY command is used to copy the information stored for a previous residue into the current residue. The name of the previous residue is given by the word following the COPY verb. It can only be used at the beginning of a residue specification. The INVERT option causes the program to invert all the torsion angles specified in the BILD commands, see section RTF Build Command.

The COPY command is most useful when two residues are nearly identical, and you do not wish to keep multiple copies of identical information. For example, it is used in the specification of D amino acids.

RTF Attribute Command

Syntax

ATTRIBUTE [DELETE] repeat(word)

Function

The ATTRIBUTE command is used to specify residue attributes. Each word following the ATTRIBUTE command is added to the residue attribute list, unless the DELETE option is specified. In that case, the words are deleted from the residue attribute list.

At present, the only residue attribute that has any significance is the D attribute, which informs the conformational search code, see section Conformational Search, that the residue is a D amino acid, and should be processed accordingly.

RTF Atom Command

Syntax

ATOM iupac word real repeat(iupac)

Function

The ATOM command specifies the atoms in a residue. The first word in the ATOM command specifies the IUPAC name of the atom. The second word gives the chemical type code as specified in the second operand of the MASS command, see section RTF Mass Command. The third operand specifies the partial charge of the atom as a real number.

Finally, the remaining words specify the names of atoms which are to be excluded from non-bonded interactions. Such exclusions are made because they are directly bonded or separated by only two covalent bonds. Note that 1-2 and 1-3 non-bonded exclusions can be constructed automatically at segment generation time, see section NBXMOD -- Automatic Generation of Non-bonded Exclusions.

RTF Bond Command

Syntax

BOND [DELETE] repeat(iupac iupac)

Function

The BOND command is used for specifying the bonds in a residue. Each pair of iupac atom names specifies a bond. Atoms outside the current residue can be specified using the scheme described in section Linkage Atom Naming.

The DELETE option causes CONGEN to delete the named bonds from the current residue. This option is useful when the COPY command is used.

RTF Angle Command

Syntax

{ ANGLE } [DELETE] repeat(iupac iupac iupac)
{ THETA }

Function

The ANGLE command, and its synonym, THETA, are used to specify bond angles in a residue. Each triple of iupac names specifies a bond angle. The RTF Generate Command, see section RTF Generate Command, can be used to generate the bond angles within a residue automatically, but bond angles involving atoms outside the current residue must always be specified "by hand".

The DELETE option causes CONGEN to delete the named angles from the current residue. This option is useful when the COPY command or the automatic generation of angles is used.

RTF Torsion Command

Syntax

{ TORSION  } [DELETE] repeat(iupac iupac iupac iupac)
{ DIHEDRAL }

Function

The TORSION command, and its synonym, DIHEDRAL, are used to specify torsion angles in a residue. Each quadruple of iupac names specifies a torsion angle with the middle pair of atoms defining the bond being rotated (and used to choose parameters). When the parameter file contains dihedral angles specified by all four atoms, every dihedral angle is first checked to see if it matches any of this type. If so, then the four atom parameter values are used. If a particular four atom dihedral is specified twice in adjacent positions, then it is assumed that the corresponding parameter file specifies two separate parameter values for this four atom dihedral, and both will be used.

The RTF Generate Command, see section RTF Generate Command, can be used to generate the torsion angles within a residue automatically, but torsion angles involving atoms outside the current residue must always be specified "by hand".

The DELETE option causes CONGEN to delete the named torsion angles from the current residue. This option is useful when the COPY command or the automatic generation of torsion angles is used.

RTF Improper Command

Syntax

{ IMPROPER } [DELETE] repeat(iupac iupac iupac iupac)
{  IMPHI   }

Function

The IMPROPER command, and its synonym, IMPHI, are used to specify improper dihedrals in a residue. Each quadruple of iupac names specifies an improper dihedral, where the first atom is the atom being kept planar or chiral. As with the proper dihedral, four atom specifications may be used and when an improper dihedral is repeated, multiple parameter values will be sought.

There does not exist any automatic mechanism for constructing improper dihedrals. It is entirely a function of the parameters used to build the residues.

The DELETE option causes CONGEN to delete the named improper dihedral angles from the current residue. This option is useful when the COPY command is used.

RTF Donor Command

Syntax

DONOR [DELETE] [ iupac ] iupac [ iupac iupac ]

Function

The DONOR command specifies hydrogen bond donors. The only required operand is the heavy atom donor (such as an amide nitrogen). If a hydrogen is present, then it must be specified before the heavy atom donor. In the case of proteins, two antecedent atoms must also be specified. These antecedents are used by the hydrogen construction routine, HBUILD, to construct the hydrogens automatically.(2)

Note that if the first atom is outside of the current residue, CONGEN will assume it is a hydrogen. Also note that only one hydrogen donor can be specified at a time. This is different than the previous commands.

Hydrogen bond donors can be automatically constructed, but only within the current residue, and without the construction of antecedents. The DELETE option may be used to remove hydrogen bond donors introduced by either the automatic generation scheme, or by the COPY command.

RTF Acceptor Command

Syntax

ACCEPTOR [DELETE] iupac [iupac [iupac] ]

Function

The ACCEPTOR command specifies the acceptors of hydrogen bonds. The first IUPAC name specifies the acceptor. Antecedents to this acceptor are stored, but they are not used anywhere.

The DELETE option may be used to remove an automatically generated acceptor or one copied using the COPY command.

RTF Build Command

Syntax

{ BILD  }
{ BUILD } [DELETE] iupac iupac iupac iupac real real real real real

Function

The BILD command is used to specify the rules for constructing atoms. The first four operands are atom names, and they specify a set of four atoms which are linked together, which are referred to as I, J, K, and L. If the third atom is prefixed by an asterisk, then the rule is an improper torsion, where atom K is in the middle, and atoms I, J, and L are bound to it. Otherwise, the rule is a proper torsion, and the four atoms are bound together in a chain.

In the case of a proper torsion, the five real numbers specify the bond length between I and J, the bond angle between I, J and K, the torsion angle between I, J, K, and L, the bond angle between J, K, and L, and the bond length between K and L. In the case of an improper torsion, the five real numbers specify the bond length between I and K, the bond angle between I, K, and J, the improper torsion between I, J, K, and L, the bond angle between J, K and L, and the bond length between K and L. Units for bonds are in Angstroms; units for angles are in degrees.

In either case, the rules can be used to construct the positions of atoms I or L depending on the positions of the remaining three atoms. If either bond length or either bond angle is specified as 0.0, then the program will search the parameter files for the values to use. The torsion angles are used as specified, unless they are edited by an internal coordinate editing command, see section The Internal Coordinate Commands.
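
To make the proper torsion rule concrete, the following Python sketch places atom L from the positions of I, J and K using the K-L bond length, the J-K-L bond angle and the I-J-K-L torsion. This is a generic internal-to-Cartesian construction, not the CONGEN routine: the function name `place_atom` is hypothetical, the torsion sign is assumed to follow the usual IUPAC convention, and the improper torsion case is not covered.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _unit(a):
    n = math.sqrt(sum(x * x for x in a))
    return (a[0] / n, a[1] / n, a[2] / n)


def place_atom(i, j, k, bond_kl, angle_jkl, torsion_ijkl):
    """Return coordinates of L given I, J, K (Angstroms) and angles in degrees."""
    theta = math.radians(angle_jkl)
    phi = math.radians(torsion_ijkl)
    jk = _unit(_sub(k, j))                      # axis along the J-K bond
    n = _unit(_cross(_sub(j, i), jk))           # normal to the I-J-K plane
    m = _cross(n, jk)                           # in-plane axis, on the side of I
    # Components of the K->L vector in the local (jk, m, n) frame.
    d = (-bond_kl * math.cos(theta),
         bond_kl * math.sin(theta) * math.cos(phi),
         bond_kl * math.sin(theta) * math.sin(phi))
    return tuple(k[x] + d[0] * jk[x] + d[1] * m[x] + d[2] * n[x] for x in range(3))


# Trans (180 degree) torsion: L ends up on the opposite side of J-K from I.
print(place_atom((0.0, 1.0, 0.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), 1.0, 90.0, 180.0))
```

Atom I can be built from J, K and L by calling the same function with the atoms in reverse order and the first bond length and bond angle of the rule.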

The DELETE option may be used to remove a BILD rule either automatically generated, or copied using a COPY command.

RTF Group Command

Syntax

GROUP name first-atom-iupac last-atom-iupac

Function

The GROUP command was incorporated to provide electrostatic groups. It is not used.

RTF Generate Command

Syntax

                 { ANGLES               }
                 { THETAS               }
                 { { TORSIONS } [ ALL ] }
GENERATE repeat( { { PHIS     } [ ONE ] } )
                 { DONORS               }
                 { ACCEPTORS            }
                 { BILDS                }
                 { BUILDS               }
                 { ALL                  }

Function

The GENERATE command may be used to automatically generate parts of the topology file. Angles, torsions, donors, acceptors, and IC constructors may be generated either singly or in any combination. ALL specifies that all of these entities be constructed. Automatic generation is only performed with the atoms defined within a residue. Atoms involved with linkages to other atoms are not used in the automatic generation process, and must be generated "by hand".

The bond angle construction works by examining the current list of bonds for the current residue, and generating angles for all pairs of bonds. The torsion angle construction can work in either of two ways. If the ALL suboption is specified, then for each bond, a torsion angle is constructed for all combinations of atoms attached to this torsion. If the ONE suboption is specified, the program will first look for a pair of heavy atoms attached to the central bond, and failing that, it will repeat the search looking at all atoms including hydrogens. The generation of donors and acceptors depends on the user specifying which atom types are donors or acceptors in the MASS command above.
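
The angle generation and the ALL form of torsion generation can be sketched directly from a bond list. The Python below is an illustration under assumptions, not the CONGEN algorithm: the function names are hypothetical, only pairs of bonds that share an atom produce an angle, the ONE suboption is not shown, and the order of the generated terms is not claimed to match CONGEN's.

```python
from itertools import combinations


def neighbors(bonds):
    """Build an adjacency table from a list of (atom, atom) bonds."""
    adj = {}
    for a, b in bonds:
        adj.setdefault(a, []).append(b)
        adj.setdefault(b, []).append(a)
    return adj


def generate_angles(bonds):
    """One angle for every pair of bonds meeting at a common atom."""
    adj = neighbors(bonds)
    return [(a, centre, c)
            for centre, nbrs in adj.items()
            for a, c in combinations(nbrs, 2)]


def generate_torsions_all(bonds):
    """For each central bond J-K, every combination of attached atoms I and L."""
    adj = neighbors(bonds)
    torsions = []
    for j, k in bonds:
        for i in adj[j]:
            if i == k:
                continue
            for l in adj[k]:
                if l == j or l == i:
                    continue
                torsions.append((i, j, k, l))
    return torsions


print(generate_angles([("OH2", "H1"), ("OH2", "H2")]))
print(generate_torsions_all([("N", "CA"), ("CA", "C"), ("C", "O")]))
```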

The generation of IC constructors is the most difficult generation task, and it is done primarily to provide a set of constructors which can be edited for better results. The program will first find three atoms connected together. Next, it will look at the central atom and see if an adjacent atom needs to be constructed. If so, it will generate an improper torsion constructor. It will then recursively try the new atom in conjunction with the previous two. Once these constructions complete, the program will attempt to add atoms on each end of the original three atoms, and will recurse on any successful constructions.

Because not all of these automatic generation commands work perfectly, there exist two mechanisms to edit the results. First, the RTF may be written out in free field format using a WRITE RTF CARD command and the result can be edited. Second, the DELETE options in other commands may be used to delete particular entries after they are generated. Also, additional entries for a residue may be made after the GENERATE command is given.

Note that non-bonded exclusions are generated automatically by default. See section NBXMOD -- Automatic Generation of Non-bonded Exclusions, for more information.

RTF Print Command

Syntax

PRINT { ON  }
      { OFF }

Function

The PRINT command may be used to control the display of lines as they are read by the RTF reader. The initial setting for printing is controlled by the READ command itself. If PRINT is specified, then printing will initially be enabled; otherwise, the commands will not be echoed. PRINT ON turns on echoing of RTF specifications; PRINT OFF turns them off. This command is useful for debugging an addition to a previously tested topology file.

An Example RTF File

This is a small RTF example.

*  title for documentation example
*
  200    1
TYPE HPRO
MASS     1 H      1.00800
MASS    11 C     12.01100
MASS    12 CH1E  13.01900
MASS    13 CH2E  14.02700
MASS    14 CH3E  15.03500
MASS    31 N     14.00670
MASS    38 NH1   14.00670
MASS    51 O     15.99940
MASS    56 OH2   15.99940

DECL -C
DECL -O
DECL +N
DECL +H
DECL +CA

RESI ALA     0.00000
ATOM N    NH1    -0.20000     H    CA   CB   C
ATOM H    H       0.12000     CA
ATOM CA   CH1E    0.07500     CB   C    O    +N
ATOM CB   CH3E    0.02000     C
ATOM C    C       0.35000     O    +N   +H   +CA
ATOM O    O      -0.36500     +N
BOND N    CA        CA   C         C    +N        C    O         N    H
BOND CA   CB
THET -C   N    CA             N    CA   C              CA   C    +N
THET CA   C    O              O    C    +N             -C   N    H
THET H    N    CA             N    CA   CB             C    CA   CB
DIHE -C   N    CA   C         N    CA   C    +N        CA   C    +N   +CA
IMPH N    -C   CA   H         C    CA   +N   O         CA   N    C    CB
DONO H    N    -C   -O
ACCE O
BILD -C   CA   *N   H      0.0000    0.00  180.00    0.00   0.0000
BILD -C   N    CA   C      0.0000    0.00  180.00    0.00   0.0000
BILD N    CA   C    +N     0.0000    0.00  180.00    0.00   0.0000
BILD +N   CA   *C   O      0.0000    0.00  180.00    0.00   0.0000
BILD CA   C    +N   +CA    0.0000    0.00  180.00    0.00   0.0000
BILD N    C    *CA  CB     0.0000    0.00  120.00    0.00   0.0000

RESI OH2     0.00000
ATOM OH2  OH2    -0.40000     H1   H2
ATOM H1   H       0.20000     H2
ATOM H2   H       0.20000
BOND OH2  H1        OH2  H2
THET H1   OH2  H2
DONO H1   OH2
DONO H2   OH2
ACCE OH2

END

Reading Other Files

The parameter files (PARAMETER) and internal coordinate files (IC) can be read as card images or binary files. Specifying CARD signifies card image input; specifying FILE signifies binary file input. Please note that the topology file must be read in before the parameters can be read. More information about the Internal Coordinate files can be found in their I/O routines, READIC and WRITIC, in the file `intcor.flx'.

Hydrogen bond files (HBOND), protein structure files (PSF), harmonic constraints (CONSTRAINT), and non-bonded lists (NBOND) can only be read as binary files.

The Image file (IMAGES) containing transformation information can only be read in card image format. This is not to be confused with the Images data structure (see section Symmetry and Molecular Images).

WRITE -- Writes Data Structures to External Files

Syntax

WRITe { { PSF        } [FILE]                 } UNIT unit-number
      { { HBONd      }                        }
      { { PARAmeter  }                        }
      { { NBONd      }                        }
      { { CONStraint }                        }
      {                                       }
      { { RTF        } [FILE]                 }
      {                [CARD]                 }
      {                                       }
      { { IC         } [CARD]                 }
      {                [FILE]                 }
      {                                       }
      { { COORdinate } [CARD      ] coor-spec }
      {                [FILE      ]           }
      {                [KONNert   ]           }
      {                [BROOkhaven]           }
      {                [BRKHvn    ]           }
      {                                       }
      { IMAGes         [CARD]                 }

title

coor-spec::= { [MAIN] }  [ OFFS int ] [ HNUSe ] [ WRAP ] atom-selection
             {  COMP  }               [NOHNUSe] [NOWRAP]
             {  DIFF  }

Function

The primary purpose of this command is to save some of CONGEN's data structures in a file in unformatted form. In addition, the coordinate and internal coordinate data structures can be written in formatted form so that they can be edited independently of CONGEN using GNU Emacs or a similar text editor. The option, FILE, specifies that a file is to be written in unformatted form (binary). The option, CARD, specifies that a file is to be written in formatted form. For the coordinate and internal coordinate files, CARD is the default.

A set of title lines must follow the WRITE command. This title will be written at the start of the file and serves to document the file. For your protection, one should always make good use of this title, as it may be the only documentation for the file.

The UNIT keyword specifies what Fortran unit the output should be written to. It cannot be omitted.

Additional options are available for writing coordinates in text format. The option, KONN, will write the coordinates in Konnert-Hendrickson format. The synonymous options, BROOKHAVEN and BRKHVN, will write the coordinates in Brookhaven Protein Data Bank format. The option pair, HNUSE and NOHNUSE, controls whether the hydrogen on the peptide nitrogen is written with a name of `HN' or `H'. The default is NOHNUSE which uses `H'. The option pair, WRAP and NOWRAP, controls whether hydrogens which have a terminating digit are written with the terminating digit first. For example, the arginine atom, HH12, is written as `2HH1' if WRAP is enabled, and written as `HH12' if NOWRAP is enabled. The default is NOWRAP.
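
The WRAP renaming amounts to moving the last character of the name to the front. The following Python sketch reproduces the HH12 example; `wrap_name` is a hypothetical helper, and the test used here for "a hydrogen with a terminating digit" (a name starting with H and ending in a digit) is an assumption, since the exact rule CONGEN applies is not stated.

```python
def wrap_name(name, wrap=False):
    """Move a hydrogen's terminating digit to the front when WRAP is enabled."""
    is_hydrogen = name.startswith("H")           # assumption: judged by name only
    if wrap and is_hydrogen and len(name) > 1 and name[-1].isdigit():
        return name[-1] + name[:-1]
    return name                                  # NOWRAP, or nothing to move


print(wrap_name("HH12", wrap=True))    # WRAP form
print(wrap_name("HH12", wrap=False))   # NOWRAP form, the default
```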

PRINT -- Writes Information to Output File (Unit 6)

Syntax

PRINt { PSF                  }
      { RTF                  }
      { CONStraint           }
      { PARAmeter            }
      { RESIdue              }
      { COORdinate coor-spec }
      { IC                   }
      { HBONd       [ ANAL ] }
      { IMAGes               }
      { NMR nmr-options      }
      { FROM unit-number     }

coor-spec::= { [MAIN] } [ OFFS int ] atom-selection
             {  COMP  }
             {  DIFF  }

nmr-options ::= [ALL ]
                [NONE]

                [[NO]NOE] [[NO]JCOUPLING] [[NO]TABLE] [[NO]SORT] [TOP int]

                [INDIVIDUAL int] [[NO]ENERGY] [[NO]FORCE] [[NO]VIOLATION]

                [MIN real] [MAX real] [ROWS int] [COLUMNS int]

Syntactic ordering: All commands must be typed in the order shown except for the nmr-options which can be in any order after the NMR option.

Function

This command is used to list information contained in data structures used by the program or to list a formatted file. The information must already have been created through use of a READ, GENERATE, HBONDS, etc., command. The printable output is sent to unit 6.

If the FROM option is used, the PRINT command will print a formatted file onto unit 6. The file will be rewound after printing so it may be used again.

For hydrogen bonds, ANAL gives a geometrical and energy analysis of the hydrogen bonds. Representing the hydrogen bond as A2-A1-X-H....Y-, the distances X-Y, H-Y, the angle (180 - <(X-H-Y) ), the dihedral angle A2-A1-X-H and the hydrogen bond energy contribution are listed.
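
The distance and angle terms listed by ANAL are simple functions of the X, H and Y positions. The Python sketch below computes the X-Y distance, the H-Y distance, and the deviation from linearity, 180 minus the X-H-Y angle. It is illustrative only: `hbond_geometry` is a hypothetical name, the coordinates are invented, and the A2-A1-X-H dihedral and the energy contribution are omitted.

```python
import math


def hbond_geometry(x, h, y):
    """Return (X-Y distance, H-Y distance, 180 - angle X-H-Y in degrees)."""
    d_xy = math.dist(x, y)
    d_hy = math.dist(h, y)
    hx = [a - b for a, b in zip(x, h)]           # vector from H to X
    hy = [a - b for a, b in zip(y, h)]           # vector from H to Y
    cos_a = sum(p * q for p, q in zip(hx, hy)) / (math.dist(x, h) * d_hy)
    cos_a = max(-1.0, min(1.0, cos_a))           # guard against rounding
    angle = math.degrees(math.acos(cos_a))
    return d_xy, d_hy, 180.0 - angle


# A perfectly linear X-H...Y arrangement gives a deviation of zero.
print(hbond_geometry((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (3.0, 0.0, 0.0)))
```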

The PRINT NMR command invokes an analysis of the NMR constraints. There are a number of components in the analysis, and the various nmr-options control which components appear. If no options are specified, then all components will be displayed. However, if any options are given, then only those specified by the user will be displayed. The keyword ALL may be used to turn on the display of all components, after which the user may modify the display with additional operands.
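
The selection rule in the preceding paragraph can be sketched in Python as follows. This is only one reading of that paragraph, not CONGEN's parser: it assumes keywords are processed from left to right, it treats SORT as an on/off flag like the others, and it ignores keyword abbreviations and the options that take values (TOP, INDIVIDUAL, MIN, MAX, ROWS, COLUMNS). The function name is hypothetical.

```python
FLAGS = ("NOE", "JCOUPLING", "TABLE", "SORT", "ENERGY", "FORCE", "VIOLATION")

def resolve_nmr_flags(keywords):
    """Return a dict mapping each flag to True (displayed) or False."""
    # No options at all: every component is displayed.
    if not keywords:
        return {flag: True for flag in FLAGS}
    # Otherwise start with everything off; only what the user names is shown.
    state = {flag: False for flag in FLAGS}
    for raw in keywords:
        kw = raw.upper()
        if kw == "ALL":
            state = {flag: True for flag in FLAGS}
        elif kw == "NONE":
            state = {flag: False for flag in FLAGS}
        elif kw in FLAGS:                      # checked first so NOE is not read as NO+E
            state[kw] = True
        elif kw.startswith("NO") and kw[2:] in FLAGS:
            state[kw[2:]] = False
        else:
            raise ValueError(f"unrecognized keyword: {raw}")
    return state

# ALL followed by a modifier: everything except the force histogram.
print(resolve_nmr_flags(["ALL", "NOFORCE"]))
```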

The keywords are interpreted as follows:

ALL
This enables the display of all analysis components.

NONE
This disables the display of all analysis components.

NOE
NONOE
The NOE option enables the display of all components which pertain to NOE's. NONOE disables this display.

JCOUPLING
NOJCOUPLING
The JCOUPLING option enables the display of all components which pertain to J coupling constraints. NOJCOUPLING disables this display.

TABLE
NOTABLE
The TABLE option enables the display of a table of either NOE's or J coupling constraints, possibly sorted by decreasing energy. NOTABLE disables this display.

SORT
NOSORT
The SORT keyword controls whether the above tables are sorted.

TOP
The TOP option specifies how many entries are to be printed in the tables of constraints. The default is 20.

INDIVIDUAL
In the analysis of NOE constraints, CONGEN can print many details about the NOE's. The INDIVIDUAL option controls approximately how many are output. The program converts the value of this option into a frequency of output starting with the first NOE. In addition, a brief summary of NOE satisfaction is listed. If this option is specified as zero or negative, then no individual NOE output or summary is made.

ENERGY
NOENERGY
The ENERGY and NOENERGY options control the display of a histogram of non-zero energies for the NOE or J coupling constraints. The form of the histogram can be controlled by other options below.

FORCE
NOFORCE
The FORCE and NOFORCE options control the display of a histogram of non-zero forces for the NOE constraints. No histogram for J coupling forces is currently available. The form of the histogram can be controlled by other options below.

VIOLATION
NOVIOLATION
The VIOLATION and NOVIOLATION options control the display of a histogram of violations for the NOE or J coupling constraints. The form of the histogram can be controlled by other options below.

MIN
The MIN option specifies the minimum in the data used by any histogram. If omitted, then the histograms will use the minimum of whatever data is being plotted.

MAX
The MAX option specifies the maximum in the data used by any histogram. If omitted, then the histograms will use the maximum of whatever data is being plotted.

ROWS
The ROWS option specifies the number of rows (buckets) in the histogram. The default is 18.

COLUMNS
The COLUMNS option specifies the number of columns in the histogram. The default is 80 characters.
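
To show how MIN, MAX, ROWS, and COLUMNS could interact, here is a minimal Python sketch of a text histogram. The function name and the line layout (lower bucket edge, count, bar of asterisks) are assumptions made for illustration and are not CONGEN's actual output format; the handling of values outside the MIN-MAX range (skipped here) is also an assumption. Only the defaults of 18 rows and 80 columns, and the fallback to the data minimum and maximum, come from the descriptions above.

```python
def text_histogram(values, vmin=None, vmax=None, rows=18, columns=80):
    """Return (counts, lines) for a text histogram of the given values."""
    if not values:
        return [], []
    # MIN / MAX default to the extremes of the data being plotted.
    lo = min(values) if vmin is None else vmin
    hi = max(values) if vmax is None else vmax
    width = (hi - lo) / rows or 1.0      # avoid a zero bucket width
    counts = [0] * rows
    for v in values:
        if lo <= v <= hi:                # assumption: out-of-range data is skipped
            counts[min(int((v - lo) / width), rows - 1)] += 1
    peak = max(counts) or 1
    lines = []
    for i, count in enumerate(counts):
        label = f"{lo + i * width:10.3f} {count:6d} |"
        room = max(columns - len(label), 0)   # space left for the bar
        lines.append(label + "*" * (count * room // peak))
    return counts, lines

counts, lines = text_histogram([0.5, 1.5, 1.5, 2.5], vmin=0, vmax=3,
                               rows=3, columns=40)
print("\n".join(lines))
```

In this sketch the tallest bucket always fills the remaining line width, so no line exceeds the COLUMNS value.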

Open File Command -- OPEN

Syntax

OPEN UNIT integer NAME filename [WRITE] [UNFORMatted]
                                [READ]  [FILE]
                                        [FORMatted]
                                        [CARD]

Function

The OPEN command connects a logical unit to a file named in the input file, rather than relying on logical name assignments made prior to the run. This is useful for setting up test cases and for interactive use of the program. OPEN can also redirect the output that appears on unit 6 to a different file by opening unit 6 in the middle of a run. However, on some machines it may not be possible to restore unit 6 afterwards, so use this with care.

Close File Command -- CLOSE

Syntax

CLOSe  UNIT integer  [SAVE  ]
                     [KEEP  ]
                     [DELETE]
                     [PRINT ]

Function

The CLOSE command closes a logical unit. This frees the associated file and logical unit so that they can be used for other purposes. The default disposition of the file is SAVE or KEEP.

REWIND Command

Syntax

REWInd  UNIT integer

Function

The REWIND command causes the requested logical unit to be rewound. When used with the STREAM command, a particular sequence can be used more than once.

STREAM Command

Syntax

STREam  UNIT integer

Function

The STREAM command switches command input to another file. This is useful when parts of an input file are to be used many times or by many different calculations. The only argument is the unit number of the file to switch to.

RETURN (Restore Previous Command Stream) Command

Syntax

RETUrn

Function

The RETURN command returns command input to the stream that called the current stream. Streams may be nested up to 20 calls deep. This command takes no parameters.
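
The calling behavior of STREAM and RETURN resembles a stack of unit numbers, which the following Python sketch models. The class and method names are hypothetical. Treating unit 5 as the initial input stream, raising an error on a 21st nested call, and raising an error for a RETURN with no calling stream are all assumptions, since the manual does not say how CONGEN handles those cases.

```python
MAX_STREAM_DEPTH = 20  # nesting limit stated in the text


class CommandStreams:
    """Toy model of which unit commands are currently being read from."""

    def __init__(self, main_unit=5):
        self.current = main_unit   # assumption: input starts on unit 5
        self.callers = []          # units waiting to be resumed by RETURN

    def stream(self, unit):
        """STREAM UNIT n: remember the caller and switch to unit n."""
        if len(self.callers) >= MAX_STREAM_DEPTH:
            raise RuntimeError("streams nested more than 20 calls deep")
        self.callers.append(self.current)
        self.current = unit

    def return_(self):
        """RETURN: resume reading from the stream that called this one."""
        if not self.callers:
            raise RuntimeError("RETURN with no calling stream")
        self.current = self.callers.pop()


s = CommandStreams()
s.stream(10)       # main input calls the file on unit 10
s.stream(11)       # which in turn calls the file on unit 11
s.return_()        # back to unit 10
print(s.current)   # 10
```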
