There are a number of programs and procedures available which assist in the use of CONGEN and which aid its development. We can divide these programs into three categories: programs which aid in the use of CONGEN, tools used in developing the program, and general programs for the manipulation of data. The tools used for program development are described in section The Implementation of CONGEN, and the other programs are described below. All of these programs are stored in the support directory, `CGP'. In the normal CONGEN setup, any user may execute these commands by simply typing their names. See section Installation of CONGEN on UNIX, and section Installation of CONGEN on VMS, for more details about the setup.
A number of utilities exist to support the use of CONGEN as well as to create data files used by the program. There are also several programs for the display of space filling pictures of molecules. The interactive display program, peer, is described below, and the static display programs are described in section Displaying Pictures.
The cmploop command provides a simple method to analyze a CONGEN conformation file. The syntax is
cmploop [/b] [input-file [output-file]]
If no input-file is specified, the program defaults to `congen.crd', and if no output-file is specified, then standard output on Unix machines or `SYS$OUTPUT' on VMS machines is used.
The program reads the CONGEN conformation file specified as the input-file and computes the RMS difference between each conformation in the file and the reference coordinate set stored at the beginning of the file; see section Conformations File, for more information. The RMS difference is computed only over those atoms which are defined in both coordinate sets and whose coordinates differ. If there are no such atoms, as might be the case when the reference coordinate set is empty, then the RMS is reported as 9.999. The /b switch directs the program to use just the backbone atoms: N, C, and CA.
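The atom selection rule described above can be illustrated with a short Python sketch. This is not the cmploop source; the function name `loop_rms` and the use of `None` to mark an undefined atom are assumptions made for the example. Only the rule itself (atoms defined in both sets and differing, with 9.999 reported when none qualify) comes from the text.

```python
import math

def loop_rms(reference, conformation):
    """RMS difference over atoms defined in both sets whose coordinates differ.

    Each argument is a list of (x, y, z) tuples, with None marking an
    undefined atom (an assumption of this sketch, not the CONGEN convention).
    """
    total = 0.0
    count = 0
    for ref, conf in zip(reference, conformation):
        # Skip atoms that are undefined in either coordinate set.
        if ref is None or conf is None:
            continue
        # Skip atoms whose coordinates are identical in both sets.
        if ref == conf:
            continue
        total += sum((a - b) ** 2 for a, b in zip(ref, conf))
        count += 1
    if count == 0:
        return 9.999  # value reported when no atoms qualify
    return math.sqrt(total / count)
```

Because identical atoms are left out of the average, the divisor can be smaller than the number of atoms compared, which is why the result can differ from an RMS taken over a fixed atom list.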
cmploop also reports the energy of each conformation as evaluated by the last EVL degree of freedom used in the search, see section Evaluate Degree of Freedom. It will also display the torsion angles for all backbone degrees of freedom used.
Frequently, one is interested in the conformations with the lowest energy or RMS deviations. See section sortn -- Sort a Text File by Numbers (Real or Integer), for a command that will sort the output of cmploop by these values.
Since cmploop computes RMS deviations over only those atoms which have different coordinates, it can report different RMS deviations than a COOR RMS command using the same atoms, see section The Coordinate Manipulation Commands.
The comparecg command provides a mechanism for comparing two CONGEN conformation files which may have different orders of conformations. It is very useful when comparing results from two CONGEN runs when parallel processing has been enabled. The command takes the two conformation files, generates output from cmploop, see section cmploop -- Preliminary Analysis of CGEN Files, removes the ordinal numbers and summaries from the listing, sorts the files by the energy values, and runs a difference program on the results. This program is available only on Unix machines.
The syntax is
comparecg [-g] [-f font] [-b|-i|-w|-W|-D] [-N name] cg-file1 cg-file2
The -g option is used to specify which difference program to use. When missing, diff is used. Otherwise, gdiff is used. All the other options are passed to the difference program you select.
The comparecmp command is identical to the comparecg command, see section comparecg -- Compare Two Conformation Files, except that the input files are the outputs of the cmploop program.
The program, brkchm, can be used to convert a Brookhaven data bank file to Congen format. The program will also generate a Congen input file to construct the PSF including disulphides. The program is interactive and reasonably self-explanatory. However, the program is not very robust, and it will often be necessary to edit the files it produces in order to get the structure built.
The program, homology, is a simple sequence homology program which uses the homology finding code in CONGEN, see section Matching of the Comparison Data Structures. To use the program, you must prepare two sequence files as described in section Specifying a Sequence of Residues for a Segment; the names of these files are the first things the program requests. Then, the program will ask for conserved residue sets, and you must respond with a set of lines giving residues which are to be considered equivalent. A blank line terminates this section. The program will then compute and report the homology and repeat the questions. You can terminate the program by specifying a blank file name for the first sequence file.
peer is a simple graphics program for "peering" into a set of molecules. It can also be used to manipulate the molecules' orientation with respect to one another in a very crude way.
The fundamental operation of the program is to display the molecule using either vector or space filling representations. Transparent spheres are also supported. You can move or rotate the view at will using the dials or the keyboard. The program's input file specifies the molecules by their atoms and bonding connections. Up to 32 molecules can be read in. These molecules can be selected using the buttons, and when they are selected, they can be moved relative to the non-selected molecules.
Not all of the features of this program can be used on Personal Irises. In particular, anti-aliased vectors, depth-cued vectors, and transparent spheres are not supported on PI's.
The peer file is normally generated by the peer command in CONGEN, see section PEER Command. It is a text file and describes a set of molecules, each of which is given in three parts. Each of the parts consists of an initial line count, which is on a line by itself, followed by that many lines. The first part of a molecule is the color table. Only the color table of the first molecule is used. Next comes a list of atoms specifying the segment id, residue name, residue id, IUPAC atomic name, XYZ coordinates, radius, and color. Finally, a list of bond connections is given, one per line, where each bond connection is specified by atom numbers. In principle, such a specification could be made by hand.
There are two major modes of operation for peer, one where no molecules are selected, the other when some are. The difference between these two modes is the interpretation of the dials, as we shall describe below.
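The counted layout described above (a line count on its own line, followed by that many lines, three times per molecule) can be read with a few lines of Python. This is only a sketch: the function names are hypothetical, the records are kept as raw text because the exact field layout is not given here, and skipping blank lines between molecules is an assumption.

```python
def read_counted_part(lines, start):
    """Return (records, next_index) for one counted part starting at lines[start]."""
    count = int(lines[start].strip())  # the count sits on a line by itself
    first = start + 1
    return lines[first:first + count], first + count

def read_peer_molecules(lines):
    """Split the lines of a peer file into molecules of three counted parts."""
    molecules = []
    pos = 0
    while pos < len(lines):
        if not lines[pos].strip():
            pos += 1  # assumption: tolerate blank lines between molecules
            continue
        colors, pos = read_counted_part(lines, pos)  # color table
        atoms, pos = read_counted_part(lines, pos)   # atom records
        bonds, pos = read_counted_part(lines, pos)   # bond connections
        molecules.append({"colors": colors, "atoms": atoms, "bonds": bonds})
    return molecules
```

A caller would pass the result of `open(name).read().splitlines()`; per the description above, only the color table of the first molecule would then be used.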
The dials are used to control either the entire view or to move individual molecules. When no molecules are selected, the dials have the following interpretation:
X rotation    o    o    X translation
Y rotation    o    o    Y translation
Z rotation    o    o    Z translation
Slab          o    o    Scale
The rotations and translations apply to the entire view.
Slab and Z translation apply to clipping and display with regard to the front and rear of the viewing box. Slab controls the size of the viewing box using units of the vertical dimension of the window. When this value is changed by the dials, it is displayed in the window title. The Z translation specifies the position of the molecule within this viewing box. Its value is also displayed when this dial is changed.
You can make the molecule completely disappear by either shrinking the viewing box down too small, or by z translating the molecules out of the box.
When some molecules are selected, the interpretation is similar except that Slab and Scale have no effect, and the translations and rotations apply to the selected molecules only. The origin of rotation can be set with the set origin menu item.
The switch box is used to select molecules. When a molecule is selected, its color changes to the last color specified in the color table. Also, the light on the switch turns on. The switches act as toggles, so that successive operations of a switch select and then deselect the molecule.
The left mouse button is used for atom identification. Move the cursor to the center of any atom, and touch the left mouse button. All atoms fairly close to the cursor will be displayed in the text window as well as the title bar. Note: it is difficult to pick an atom when drawn as a sphere because the pick area is very small compared to the usual sphere size. It is usually helpful to switch to the vector drawing before picking. The center mouse button has its default action of window moving. The right mouse button brings up the pop-up menu.
In the event that dials and buttons are not available, keys on the keyboard may be used for those functions. The following set of keys are supported:
The right mouse button brings up a large menu of options for controlling peer. In the commands which specify how a molecule is drawn, the behavior of peer depends on whether any objects are selected. If no objects are selected, then all molecules are drawn in the specified way. However, if any objects are selected, then only the selected objects are drawn according to the command. For example, if you wanted to draw a transparent space filling view of a molecule on top of a vector drawing of the same molecule, you would do the following:
The pop-up menu options are as follows:
in the sphere drawing commands, see section Adjustment of the Lighting and Shading.
The window where peer was invoked is used for this.
The command, peercg, runs a simple awk script to convert a peer file into a CONGEN formatted coordinate file. The usage is as follows:
peercg peer-file congen-coordinate-file
The construction of the backbone energy maps and the proline constructors is performed by a number of programs and user subroutines added to CONGEN, and is orchestrated by a makefile. The directory controlling the process is the `emap' directory under the `CGP' directory. To rebuild all these files within this `emap' directory, change your default directory to that one, and issue either a make all command on UNIX or an mms all command on VMS. To install new copies of these files in `CGDATA', use a target of install for the make or mms.
The pieces of the construction process are described briefly below:
PDM88 is a program written by Donald E. Williams which is used in the calculation of charges. See section GAUSSIAN Command -- Invoke Gaussian Program, for bibliographic references and for complete information on the interface from CONGEN to this program.
This version of the program is very similar to the program published in QCPE. It has been modified to detect I/O errors so that it will fail cleanly if the Gaussian job fails.
PDGRID is a program written by Donald E. Williams which is used in the calculation of charges. See section GAUSSIAN Command -- Invoke Gaussian Program, for bibliographic references and for complete information on the interface to this program.
This version of the program is very similar to the program published in QCPE. It has been modified to accept an external specification of radii, and to detect I/O errors.
There are a number of commands which manipulate data in a general way. Similar tools are available on most Unix systems, but most VMS systems lack them, and there are subtleties that make some of these programs more useful than similar commands.
sortn and bigsortn sort files based on numbers found in each line in the file. The operation is specified through the command line, or if the command line is blank, the operations are specified interactively. sortn will sort a file containing no more than 25000 lines, each containing no more than 133 characters. For bigsortn, the limit is 100000 lines.
The command line syntax is
sortn [/r] input-file output-file col1 size1 [col2 size2 ...]
coln are the positions of the data, and sizen is the number of characters in each data entry. The output file is sorted based on these keys after they have been converted to numbers. If a field cannot be converted, then zero is used instead. If input-file is specified as -, then standard input is used. If output-file is specified as -, then standard output is used. The flag, /r, specifies that the sorting should be reversed, i.e. biggest first.
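The keyed sort described above can be sketched in Python as follows. This is an illustration rather than the sortn implementation: the function names are hypothetical, and treating coln as a 1-based starting column is an assumption. The fallback to zero for unconvertible fields follows the text.

```python
def field_value(line, col, size):
    """Convert the field at column `col` (assumed 1-based), `size` characters wide."""
    text = line[col - 1:col - 1 + size].strip()
    try:
        return float(text)
    except ValueError:
        return 0.0  # zero is used when the field cannot be converted

def sort_by_numbers(lines, keys, reverse=False):
    """Sort lines by the numeric fields named in keys, a list of (col, size) pairs."""
    return sorted(
        lines,
        key=lambda line: tuple(field_value(line, c, s) for c, s in keys),
        reverse=reverse,  # corresponds to the /r flag: biggest first
    )
```

For example, `sort_by_numbers(lines, [(4, 3)])` orders the lines by the number found in columns 4 through 6, with later key pairs breaking ties.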
extract extracts columns of data from a file and places them in another file. This program is similar to the Unix program, cut, and was primarily written to provide that functionality for VMS. The operation is specified through the command line, or if blank, the operations are specified interactively.
The command line syntax is
extract input-file output-file col1 size1 [col2 size2 ...]
coln are the positions of the data, and sizen is the number of characters in each data entry. The output file consists of the columns concatenated together in the order specified with no blanks in between. If input-file is specified as -, then standard input is used. If output-file is specified as -, then standard output is used.
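As an illustration of the column concatenation just described, the following Python sketch builds one output line from one input line. The function name is hypothetical, coln is assumed to be a 1-based starting column, and a line shorter than a requested field simply contributes fewer characters here, which may not match what extract itself does.

```python
def extract_columns(line, fields):
    """Concatenate fixed-width fields from line, in the order given.

    fields is a list of (col, size) pairs; col is assumed to be 1-based.
    """
    return "".join(line[col - 1:col - 1 + size] for col, size in fields)
```

Since the fields are joined in the order specified, columns can be reordered as well as selected: `extract_columns("abcdefghij", [(8, 2), (1, 3)])` places columns 8 and 9 ahead of columns 1 through 3.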
histo computes a histogram from the data in a file. The command line syntax is
histo input-file
If input-file is specified as a -, then standard-input will be used. Every word in the file which can be interpreted as a number is used.
As the program executes, you will be asked for the dimensions of the graph. The histogram is sent to standard output, and is designed to be printed or displayed on a character device. histo uses the same code as the HISTO command in the Analysis facility, see section HISTO Command -- Print a Histogram.
scat computes a scatter plot from the data in a file. The command line syntax is
scat input-file
If input-file is specified as -, then standard input will be used. The data is free form, mixed with non-numeric input. Each line containing two numbers is used with the first two numbers going into the plot. scat uses the same code as the 2DPLOT command in the Analysis Facility, see section 2DPLOT Command -- Make a Scatter Plot.
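The way points are picked out of free-form input can be sketched in Python. This is not the scat code: the function names are hypothetical, words are assumed to be separated by white space, and Python's `float` is used as the test for "is a number", which accepts a few words (such as `inf`) that the real program may not.

```python
def to_number(word):
    """Return word as a float, or None if it is not numeric."""
    try:
        return float(word)
    except ValueError:
        return None

def scatter_points(lines):
    """Collect (x, y) from each line that holds at least two numbers."""
    points = []
    for line in lines:
        numbers = [v for v in map(to_number, line.split()) if v is not None]
        if len(numbers) >= 2:
            # Only the first two numbers on the line go into the plot.
            points.append((numbers[0], numbers[1]))
    return points
```

Lines with fewer than two numbers are skipped, so headings and comments mixed into the data do not need to be removed first.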
numdiff is used to compare two text files which contain numbers. The files must have the same number of lines, and must have the same structure for this command to work. The command line syntax is
numdiff file1 file2
Neither file may be specified as a hyphen (-); explicit names must be used. numdiff reads a line from each file, translates all punctuation to spaces, and then reads each word on the line. An attempt is made to convert each word to a floating point number, and the minimum of the absolute and relative difference in magnitude is computed. If this number is larger than 0.01, then a message is printed and the two offending lines are displayed; otherwise, the difference is used to calculate the maximum difference over the entire pair of files. Large differences are not used in the calculation of the maximum, so differences in dates and CPU times will usually not affect the results. Messages are printed if numbers are not found in equivalent positions or if the number of words in a line does not match.
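The difference measure described above can be sketched in Python. This is an illustration, not the numdiff source: the function names are hypothetical, and the relative difference is assumed to be the absolute difference divided by the larger of the two magnitudes, which the text does not spell out. The 0.01 threshold and the exclusion of large differences from the maximum follow the description.

```python
def difference(a, b):
    """Minimum of the absolute and relative difference between a and b."""
    absolute = abs(a - b)
    largest = max(abs(a), abs(b))
    if largest == 0.0:
        return 0.0  # both values are zero
    relative = absolute / largest  # assumed definition of relative difference
    return min(absolute, relative)

def compare_values(pairs, threshold=0.01):
    """Return (maximum small difference, list of pairs exceeding threshold)."""
    max_small = 0.0
    large = []
    for a, b in pairs:
        d = difference(a, b)
        if d > threshold:
            large.append((a, b))  # reported, but left out of the maximum
        else:
            max_small = max(max_small, d)
    return max_small, large
```

Taking the minimum of the two measures keeps both very large and very small numbers from being flagged over insignificant changes: 100.0 against 100.5 passes on its relative difference, while 0.001 against 0.002 passes on its absolute difference.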
ndiffpost is an alternate program for comparing two text files which contain numbers. To use ndiffpost, you first use the diff program to compare the files and then use ndiffpost to calculate numerical differences. The command line syntax is
ndiffpost [cutoff=real] [input-file]
ndiffpost goes through all of the difference blocks and reports the maximum absolute or relative differences found in each block. It is very handy for comparing the results of a program after a small change is made to a numerical calculation.
The program expects the input-file to be the output of an ordinary diff command. Do not use any other options (such as -c) with the diff program. The cutoff parameter specifies that any block whose maximum difference is smaller than the cutoff will not be printed. It is useful for reducing the clutter in an output file.
An example command execution would be
diff file1 file2 | ndiffpost cutoff=1.0e-5
This program is not available on any machine where awk is not provided. Most VAX/VMS machines do not have it.