The CONGEN program provides several commands to assist in a conformational search. The SPLICE command changes the sequence of a structure, which greatly simplifies the modeling of mutations. The XCONF command reads a single conformation from a conformation file. The MERGE CG command manipulates conformation files, either to merge them or to divide and recombine them.
SPLIce [NBXMod int] {            segid resid resid repeat(res resid) }
                    { RESEquence segid resid resid repeat(res)       }
The SPLICE command replaces one set of residues with another set of residues. The coordinate set is shuffled to preserve backbone positions to the extent possible. Insertions of residues can be performed by splicing multiple residues in place of one residue. The residue topology file used to construct the PSF must be present when this command is used, because the command regenerates the PSF. A corollary is that the SPLICE command can only be used when the entire PSF was generated from a single RTF. Also, any editing of the PSF (see section EDIT -- Edits the PSF and Hydrogen Bonds) will be lost. More details on the implementation are given below.
The residues to be replaced are specified by a segment identifier followed by two residue identifiers which start and end the changed segment. If only one residue is to be changed, then both residue identifiers will be the same.
The specification of replacement residues depends on the presence of the RESEQUENCE option. If RESEQUENCE is not specified, then you must specify both the residue names and the residue identifiers for all replaced residues. The advantage of this approach is that residue identifiers outside of the changes are consistent. If RESEQUENCE is specified, then you must specify only the new residue names. All residues within the segment will be renumbered from the first residue.
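The two ways of specifying replacement residues can be illustrated with a small Python sketch. It is not CONGEN code: the function name splice_sequence and the data layout are hypothetical, residue identifiers are assumed to be integers when RESEQUENCE is used, and renumbering is assumed to start from the identifier of the first residue in the segment.

```python
def splice_sequence(segment, first, last, replacement, resequence=False):
    """Return the new (name, resid) list for one segment.

    segment      -- list of (residue_name, residue_id) pairs, in order
    first, last  -- residue ids bounding the span to replace (inclusive)
    replacement  -- (name, resid) pairs, or bare names when resequence=True
    """
    ids = [resid for _, resid in segment]
    start, stop = ids.index(first), ids.index(last)
    if resequence:
        # Only names are given; the whole segment is renumbered sequentially,
        # starting from the identifier of its first residue (assumption).
        names = ([name for name, _ in segment[:start]] + list(replacement)
                 + [name for name, _ in segment[stop + 1:]])
        base = segment[0][1]
        return [(name, base + i) for i, name in enumerate(names)]
    # Names and identifiers are both given; residues outside the span
    # keep their original identifiers.
    return segment[:start] + list(replacement) + segment[stop + 1:]


seg = [("ALA", 1), ("GLY", 2), ("SER", 3), ("VAL", 4)]
explicit = splice_sequence(seg, 2, 2, [("TRP", 2), ("LEU", "2A")])
renumbered = splice_sequence(seg, 2, 2, ["TRP", "LEU"], resequence=True)
```

In the explicit form SER and VAL keep identifiers 3 and 4, while in the RESEQUENCE form they shift to 4 and 5 because of the inserted residue.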
The shuffling of coordinates is an important feature of this command. During the resequencing operation, all backbone positions (atoms not constructed by the sidechain degree of freedom operator) are conserved for residues that are in the same position relative to the N-terminal side of the splice. Thus, for an equal length change, the backbone will be preserved. For insertions or deletions, the C-terminal side of the splice will be wrong, although the anchor for a CONGEN search (atoms CA, C, and O) will be preserved. If a splice results in no change in the residue name, then all the sidechain atoms are copied. In addition, when a glycine is changed to any other residue, the position of CB is constructed assuming perfect geometry for an amino acid as specified in the parameter set.
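The per-residue rules above can be summarized, for the equal length case only, in a short Python sketch. It is an illustration of the stated rules rather than CONGEN's implementation; the function name plan_coordinate_copy and the result strings are hypothetical, and insertions and deletions are deliberately not modeled.

```python
def plan_coordinate_copy(old_names, new_names):
    """Describe what is kept at each position of an equal length splice.

    Positions are counted from the N-terminal end of the splice, so new
    residue i is matched with old residue i.
    """
    if len(old_names) != len(new_names):
        raise ValueError("only equal length changes are modeled in this sketch")
    plan = []
    for old, new in zip(old_names, new_names):
        if old == new:
            # Same residue name: sidechain atoms are copied as well.
            plan.append((new, "backbone and sidechain"))
        elif old == "GLY":
            # Glycine has no CB, so CB is built from ideal geometry.
            plan.append((new, "backbone, CB constructed"))
        else:
            plan.append((new, "backbone only"))
    return plan


plan = plan_coordinate_copy(["SER", "GLY", "VAL"], ["SER", "ALA", "LEU"])
```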
The command is implemented using the equivalent of the GENERATE command (see section The Generate Command - Construct a Segment of the PSF) after modification of the sequence. The NBXMOD option controls the automatic generation of non-bonded exclusions when the GENERATE command is simulated. See section NBXMOD -- Automatic Generation of Non-bonded Exclusions, for more details. The first operation performed after the SPLICE command has been successfully parsed is to shuffle coordinates and to record the position of all disulphides. Then, the PSF is erased and completely regenerated using the new sequence. The disulphides are then added back, and the coordinates for the backbone and endpoints are corrected. The internal coordinates are then cleared, so that a new set of construction rules can be generated (see section The Internal Coordinate Commands). Finally, the hydrogen bond list, non-bonded list, fixed constraints, harmonic atom constraints, and dihedral angle constraints are all cleared. Any automatic generation options (see section The Generate Command - Construct a Segment of the PSF) are also applied.
XCONf {[UNIT] unit} { [ BEST  ]     } [ESURF real]
                    { [ WORSt ] int }
                    { [NUMBer ]     }
The XCONF command reads a conformation from a conformation file, see section Conformations File. The conformation can be specified by number, by evaluation ranking, or by accessible surface ranking. Accessible surface ranking was the method used in the paper, R. E. Bruccoleri, E. Haber, J. Novotny, "Structure of Antibody Hypervariable Loops Reproduced by a Conformational Search Algorithm", Nature 335, 564-568 (1988); Nature 336, 1266 (1988).
The three possible ways of selecting conformations from the file are controlled by the BEST, WORST, NUMBER, and ESURF options. If NUMBER is specified, or if none of BEST, WORST, or NUMBER is specified, then the integer is used as the sequential conformation number in the file. This number may be specified as 0, which means that the reference coordinate set in the conformation file should be used (see section Conformations File, for a description). If BEST or WORST is specified and ESURF is not specified, then the integer which follows BEST or WORST gives the rank ordering in evaluation terms. BEST directs the program to use the ordering with the lowest energy first, and WORST directs the program to use the highest energy first. If BEST or WORST is specified and ESURF is specified, then the accessible surfaces of the conformations within ESURF of the minimum will be calculated (regardless of the settings of BEST or WORST), and the selection will be based on the rank order of the accessible surfaces of this subset. This latter selection is controlled by the BEST or WORST keywords. The surfaces are calculated in the context of the entire system.
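The selection rules can be restated as a short Python sketch. This is only a model of the behavior described above, not CONGEN's code: the function select_conformation and its arguments are hypothetical, and the accessible surfaces are passed in as a precomputed list, whereas the program calculates them only for the conformations inside the ESURF window.

```python
def select_conformation(energies, n, mode="NUMBER", esurf=None, surfaces=None):
    """Return the 1-based number of the chosen conformation (0 = reference)."""
    if mode == "NUMBER":
        if n > len(energies):
            raise IndexError("conformation number too large")
        return n  # 0 selects the reference coordinate set
    numbers = range(1, len(energies) + 1)
    if esurf is None:
        # Rank every conformation by its evaluation (energy).
        key = lambda i: energies[i - 1]
    else:
        # Keep conformations within esurf of the minimum energy,
        # then rank that subset by accessible surface.
        cutoff = min(energies) + esurf
        numbers = [i for i in numbers if energies[i - 1] <= cutoff]
        key = lambda i: surfaces[i - 1]
    ranked = sorted(numbers, key=key, reverse=(mode == "WORST"))
    return ranked[n - 1]


energies = [5.0, 1.0, 3.5, 2.0]
surfaces = [10.0, 40.0, 20.0, 30.0]
second_best = select_conformation(energies, 2, mode="BEST")
smallest_surface = select_conformation(energies, 1, mode="BEST",
                                       esurf=3.0, surfaces=surfaces)
```

With these made-up values, the first conformation has the smallest surface overall but is never considered when ESURF is 3, because its energy lies more than 3 units above the minimum.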
The conformation file is specified by the unit number given with the UNIT operand. Only those atoms which were constructed in the search are read.
N.B. Make sure that the coordinates of the atoms outside of the constructed atoms are exactly the same as they were when CONGEN was run. Otherwise, the connections between the constructed atoms and their surroundings will be distorted. Also, the program will die if the conformation number is too large.
In these examples, let's assume that the evaluation used in the conformational search was ENERGY.
To select the ninth conformation from a conformation file on unit 60:
XCONF UNIT 60 NUMBER 9
To select the second lowest energy conformation from the conformation file on unit 50:
XCONF UNIT 50 BEST 2
To select the conformation which has the lowest surface area among the conformations within 3 kcal/mole of the lowest energy conformation in the file on unit 43:
XCONF UNIT 43 BEST 1 ESURF 3
MERGE CG repeat(IN unit [[int]:[int]]) OUT unit  [       {MAIN} ]
         [SELEct atom-selection END] [TITLE]     [ FIRST {COMP} ]
                                                 [       {unit} ]
If the TITLE option is used, then a title must follow this command.
The MERGE CG command reads a set of conformation files and writes a selected subset of the conformations to a single file. It is possible to change the initial title of the file, the reference coordinate set, and the atoms stored. See section Conformations File, for details on the contents of the file.
The conformation files to be read are specified using the IN operands. Each IN operand consists of the Fortran unit from which the file is to be read, followed by an optional specification of the range of conformations to be read. The specification gives the starting and final conformation numbers. If either number is omitted, the program assumes the beginning or the end of the conformation file, as appropriate.
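The defaulting of an omitted range bound can be shown with a small Python sketch. The helper parse_range is hypothetical and is not how CONGEN parses its command line. It assumes that conformations are numbered from 1 and that both bounds are inclusive.

```python
def parse_range(spec, n_conformations):
    """Turn an optional 'start:end' string into inclusive 1-based bounds."""
    if not spec:
        # No range given: read the whole file.
        return 1, n_conformations
    start_text, _, end_text = spec.partition(":")
    start = int(start_text) if start_text else 1
    end = int(end_text) if end_text else n_conformations
    return start, end


whole_file = parse_range("", 10)
from_fifth = parse_range("5:", 10)
first_three = parse_range(":3", 10)
```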
The conformation file to be written is specified by its Fortran unit number using the keyword OUT. You can change the opening title in the file by specifying the TITLE keyword, and following the command by a title.
The reference coordinate set for the output conformation file can be changed by using the FIRST option. You can change it to the current coordinate set (use MAIN keyword), the comparison coordinate set (use COMP keyword), or the reference coordinate set from one of the input files (use its unit number after FIRST).
By using the atom selection, one can eliminate atoms from the conformations. Only the selected atoms are written to the merged file. By default, all atoms are selected.
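Taken together, the IN ranges, the FIRST option, and the atom selection amount to the following operation, sketched here in Python on in-memory lists. The function merge_conformations and its data layout are hypothetical and say nothing about the actual file format. The caller supplies the reference set explicitly, since it may come from the current coordinates, the comparison coordinates, or one of the input files.

```python
def merge_conformations(inputs, reference, keep=None):
    """Combine conformation sets into one output set.

    inputs    -- list of (conformations, (start, end)) pairs; each
                 conformation is a list of per-atom coordinates
    reference -- reference coordinate set to store in the output
    keep      -- atom indices to write; None keeps every atom
    """
    def pick(atoms):
        return list(atoms) if keep is None else [atoms[i] for i in keep]

    merged = []
    for conformations, (start, end) in inputs:
        # Ranges are 1-based and inclusive, as in the IN operand.
        merged.extend(pick(c) for c in conformations[start - 1:end])
    return pick(reference), merged


file_a = [["a1x", "a1y", "a1z"], ["a2x", "a2y", "a2z"], ["a3x", "a3y", "a3z"]]
file_b = [["b1x", "b1y", "b1z"], ["b2x", "b2y", "b2z"]]
out_reference, out_conformations = merge_conformations(
    [(file_a, (2, 3)), (file_b, (1, 1))],
    reference=["r_x", "r_y", "r_z"],
    keep=[0, 2],
)
```

The output holds the last two conformations of the first input followed by the first conformation of the second, each reduced to the two selected atoms.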