This facility provides most of CONGEN's capability for analyzing structures and calculations done on them. If you've never used the analysis facility, see the introduction below. If there are terms you do not understand, consult the glossaries (see section Glossary of Syntactic Terms and section General Glossary). The command notation uses the same meta-syntax as all other free field CONGEN commands; see section Rules for Describing the Syntax (The Meta-Syntax).
CONGEN provides a facility for analyzing the results of any calculation made as well as comparing one's results to any other calculation. This facility is general in the sense that it will work with arbitrary residues, any set of parameters, and will permit a broad range of comparisons. It provides all the features of Bruce Gelin's ANC program plus many others.
Two aspects of the design of this facility are important to its users. First, the facility provides a small number of simple commands which can be combined to do a variety of tasks. Second, the program is well adapted to the job of comparing results, regardless of whether the results were obtained on the same system or on a homologous one. This will permit previously impossible comparative studies to be performed, such as comparing the dynamics of hemoglobin and myoglobin in homologous regions or comparing the results obtained from the explicit hydrogen and extended atom models.
These two design considerations dictate a great deal about the analysis facility's operation. The first consideration, being able to combine commands, requires that the facility store the results of one operation so that they can be used in another. There are two data structures (see section General Glossary) that the facility uses to store such results, and they are important to understand.
The major data structure is the table. In analyzing the large amounts of data inherent in a macromolecule, we need a method for organizing it. Consider, for example, the 631 bond angles in bovine pancreatic trypsin inhibitor. Without a good ordering of these angles, it would be difficult for a person to see any relationships among them. However, since a structure in CONGEN consists of a number of segments which, in turn, consist of a number of residues which contain a number of atoms or internal coordinates, we can organize the data along these lines.
Therefore, a table contains a list of segments which are identified by their segment identifiers as specified in the GENERATE command. Each segment contains a list of residues. The residues are named (GLY, ALA, etc.) and have identifiers as well. The identifier is the character form of the sequence number of the residue. Each residue in the table contains a list of data arrays where every array is "tagged". "Tagged" means that each element of the array has associated with it a character string which serves to identify it. The tags are easily constructed. For example, the tag for a bond is the IUPAC name for each atom in the bond separated by a dash. Each element of the array contains a property of the atom or of the internal coordinate. For example, the minimum energy and average length of bonds during a dynamics run are properties of bonds. The table also contains a title which identifies the entire table and is printed along with it.
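The layout described above can be pictured with a minimal Python sketch. This is only an illustration of the nesting (table, segments, residues, tagged arrays); it is not CONGEN's internal representation, and every class name, field name, segment identifier, and numeric value below is a hypothetical chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class TaggedArray:
    """One property (e.g. average bond length) with one tag per element."""
    name: str
    tags: List[str] = field(default_factory=list)
    values: List[float] = field(default_factory=list)

    def append(self, tag: str, value: float) -> None:
        # Tags and values are kept in parallel, so element i is values[i]
        # identified by tags[i].
        self.tags.append(tag)
        self.values.append(value)


@dataclass
class Residue:
    name: str          # residue name, e.g. "GLY"
    identifier: str    # character form of the sequence number, e.g. "12"
    arrays: Dict[str, TaggedArray] = field(default_factory=dict)


@dataclass
class Segment:
    identifier: str    # segment identifier, as given to GENERATE
    residues: List[Residue] = field(default_factory=list)


@dataclass
class Table:
    title: str         # identifies the whole table and is printed with it
    segments: List[Segment] = field(default_factory=list)


def bond_tag(atom_a: str, atom_b: str) -> str:
    """Tag for a bond: the two atom names separated by a dash."""
    return f"{atom_a}-{atom_b}"


# Made-up values, purely for illustration.
lengths = TaggedArray(name="average bond length")
lengths.append(bond_tag("N", "CA"), 1.47)
lengths.append(bond_tag("CA", "C"), 1.53)

gly = Residue(name="GLY", identifier="12", arrays={lengths.name: lengths})
table = Table(title="Example table", segments=[Segment("SEG1", [gly])])
```

Keeping the tags alongside the values is what lets a single number be traced back to the bond, residue, and segment it came from.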
Many operations can be performed on these tables. First, the BUILD command will generate a table. Currently, there are dozens of different tables which can be generated. Tables can be printed in several different ways using the PRINT command. Simple statistical information can be added to them using the ADD command. The DELETE command may be used to delete data from them so that one can more easily study a subset of the data. Finally, the SELECT command may be used to select data from a table and record the results in the second major data structure, the selection.
The selection is a collection of data which is less organized than the table. It consists of an array of numbers where every number has associated with it its position in the table as well as the residue to which it belonged. Two selections are provided in the analysis facility, and the following operations are supported: First, data may be selected from the table using the SELECT command. Second, a histogram of the selected data may be made using the HISTO command. Third, using the PLOT command, the data in the selection may be plotted against its position in the table or against the residue number of the residue to which each data point belongs. Finally, the two selections may be plotted together using the 2DPLOT command to yield a scatter plot. We can therefore make a scatter plot of any two sets of numbers; we are not limited to phi-psi plots.
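As a rough illustration of the idea, the following Python sketch models a selection as a flat list of numbers that each remember their table position and residue, and bins the selected values into a histogram. It does not reproduce the SELECT or HISTO commands themselves; the function names, the row format, and the binning scheme are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class SelectedPoint:
    value: float
    position: int      # position of the value in the (flattened) table
    residue: str       # identifier of the residue it belonged to


def select(rows: List[Tuple[str, float]],
           keep: Callable[[float], bool]) -> List[SelectedPoint]:
    """Copy the values that pass `keep` into a selection.

    `rows` is a hypothetical flattened table: (residue identifier, value).
    """
    return [SelectedPoint(value, pos, res)
            for pos, (res, value) in enumerate(rows)
            if keep(value)]


def histogram(selection: List[SelectedPoint],
              lo: float, width: float, nbins: int) -> List[int]:
    """Count selected values in `nbins` equal-width bins starting at `lo`."""
    counts = [0] * nbins
    for point in selection:
        index = int((point.value - lo) // width)
        if 0 <= index < nbins:          # values outside the range are ignored
            counts[index] += 1
    return counts


# Made-up data, purely for illustration.
rows = [("1", 1.0), ("1", 2.5), ("2", 3.5), ("3", 0.5)]
selection = select(rows, keep=lambda v: v >= 1.0)
counts = histogram(selection, lo=0.0, width=2.0, nbins=2)
```

Because each point carries its position and residue, the same list can be plotted against either one, which mirrors the two modes described for PLOT.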
In addition to analyzing the static properties of the structure which are maintained in CONGEN, the analysis facility can analyze the results from a dynamics calculation. Properties of the internal coordinates and atoms which are fairly easy to calculate can be built into a table. The ACCUM and COMBINE commands are used for preparing the data for inclusion into a table. Correlation functions may also be calculated using the CORREL command.
Complementing the above commands, there are commands which perform more isolated functions. The analysis facility has READ and WRITE commands for reading and writing data structures which are peculiar to it. There is a close contact search command, SEARCH, which searches for close contacts of atoms to other atoms or to spatial positions. There is a DRAW command which prepares input for the PLT2 plotting program and for MOLD, a molecule drawing program. The SET command may be used to change I/O units and the size of the page. Finally, there is a command, DELIM, which changes the command delimiter.
To call the analysis facility of CONGEN, one places ANALYSIS as the first and only word on a command line (it may be abbreviated to four characters). Prior to calling the analysis facility, the user must have taken the following steps: A PSF must have been read or constructed. A parameter set and a complete coordinate set are also required. If hydrogen coordinates are missing from the coordinate set or if hydrogen bonds are to be analyzed, an HBONDS command must have been specified before the ANALYSIS command (see section Generation of Hydrogen Bonds). Likewise, if non-bonded interactions are to be analyzed, an NBONDS command must be invoked prior to analysis (see section Generation of Non-bonded Interactions). If the total energy per atom is to be analyzed, then both NBONDS and HBONDS must be specified.
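The conditional prerequisites above can be summarized as a small decision function. The Python sketch below is only a restatement of those rules, not part of CONGEN; the function and parameter names are hypothetical, and the order of the returned list carries no meaning, since the text does not say whether HBONDS or NBONDS must come first.

```python
from typing import List


def commands_before_analysis(hydrogens_missing: bool = False,
                             analyze_hbonds: bool = False,
                             analyze_nonbonded: bool = False,
                             analyze_total_energy_per_atom: bool = False
                             ) -> List[str]:
    """Return the list-generation commands needed before ANALYSIS.

    A PSF, a parameter set, and a complete coordinate set are always
    required and are not modeled here.
    """
    needed: List[str] = []
    # HBONDS: missing hydrogen coordinates, hydrogen bond analysis,
    # or total energy per atom.
    if hydrogens_missing or analyze_hbonds or analyze_total_energy_per_atom:
        needed.append("HBONDS")
    # NBONDS: non-bonded interaction analysis or total energy per atom.
    if analyze_nonbonded or analyze_total_energy_per_atom:
        needed.append("NBONDS")
    return needed
```

For instance, calling `commands_before_analysis(analyze_total_energy_per_atom=True)` reports that both HBONDS and NBONDS are needed.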
Although the analysis facility has a command interpreter which is different from that in the main program, commands are specified in free field format, as in the main part of CONGEN (see section Controlling a CONGEN Run). To exit the analysis facility, use the END command, which is just the word END on a command line by itself.
The analysis facility can generate many messages, warnings, and errors. The messages are self-explanatory. The warnings generally tend to be self-explanatory, but there are some warnings that have to do with the program logic which are obscure. Likewise, errors that relate to bad sets of user commands will give understandable messages; internal logic errors give incomprehensible messages. If you get an error message you do not understand, please see Bob Bruccoleri or mail to `bruc@bms.com'. Chances are that you have found a bug, and I should always be told about that.