Cooperative Interactions of the Gene 5 Protein

Using the refined molecular structure of the Gene 5 DNA Binding Protein (G5BP) and the mechanism of DNA binding deduced from a variety of experimental techniques (G. D. Brayer and A. McPherson, J. Mol. Biol. 169, 565, 1983; G. D. Brayer and A. McPherson, Biochemistry 23, 340, 1984), we have modeled the contiguous, linear aggregation of G5BP dimers along two opposing single strands of DNA. Using both automated graphics systems and systematic calculations of intermolecular contacts between adjacent units, we have optimized the fit of complementary protein surfaces in the presence of DNA. We propose that a minor conformational change involving residues 38-43, triggered by the binding of nucleic acid, relieves several critical steric contacts and permits otherwise extensively complementary surfaces to form an interface. The bonding between surfaces on adjacent G5BP units is the primary source of the cooperativity of binding observed for G5BP. The interacting amino acid residues at the interface are described.


Introduction
The gene 5 product of bacteriophage fd is elaborated late in the infection cycle to produce termination of DNA replication (3,4). It is a small DNA binding protein of 10,000 daltons molecular weight whose sequence is known (5,6). The protein exists in solution as a dimer (7) and the two identical subunits are closely interlocked and related by a dyad axis of symmetry. As illustrated by the drawing in Figure 1, it binds preferentially to single stranded regions of normally duplex native DNA to induce progressive separation of the strands. Alternatively, it crosslinks opposite sides of a circular DNA single strand to produce a long helical filament of DNA and protein (3,8,9).
A distinctive characteristic of the gene 5 DNA unwinding protein is that it binds in a highly cooperative manner to linear strands of DNA (3,7). Individual gene 5 protein molecules associate in a contiguous fashion along the nucleic acid chain as opposed to interspersing themselves at random along its length. Binding of one gene 5 protein molecule to the strand of DNA influences others to bind immediately adjacent to it rather than at some distance away. This is presumed to be a consequence of favorable protein-protein interactions between adjacent gene 5 molecules. The structural mechanism for these attractive interactions must be assumed absent or inoperative when no DNA is at hand, for the protein does not spontaneously aggregate in solution even at high concentrations but exists only as a discrete dimer species. Thus linear strands of DNA must play a direct role in the cooperative phenomenon, and indeed it appears that even short fragments of DNA, such as oligonucleotides, may also promote aggregate formation (10,11).
We have determined the structure of the native gene 5 molecule in the crystalline state and have arrived at a refined and detailed model of the protein in which we can accurately place all of the amino acid residues (1). A drawing of the polypeptide backbone using only α-carbon atoms is shown in Figure 2. We have, in addition, deduced the DNA-protein contact area and analyzed its structural properties (2). Given the refined protein structure, the protein-DNA contact area, the helical parameters that describe the gross features of the gene 5-phage DNA complex and which are known from other studies, and using a computer based search and evaluation of sterically acceptable atomic contacts at subunit interfaces for all possible orientations, we have produced the model of the helical nucleoprotein complex seen in Figure 3. The contact analytical procedure is summarized below and is described more completely in McPherson and Brayer (12), as are the details of the helical model. The native refined structure was employed throughout the analysis of helical possibilities and no conformational changes were invoked in order to achieve good fit at the interfaces. As described later, however, one relatively minor change in the region of tyrosine 41 does produce an improved fit and is, as we describe, likely to be the key to the cooperative aggregation phenomenon. In this model we have implicitly identified the interfaces that relate the adjacent gene 5 molecules when bound to DNA as well as the disposition of the DNA actually spanning the junction. From an examination of the contact area using computer graphics systems, actual Kendrew models, and a quantitative calculation of allowed and prohibited contacts, we believe we can reasonably propose an underlying structural and mechanistic basis for the cooperative effect.

Procedures
The structure of the protein was solved to 2.3 Å resolution using multiple isomorphous replacement techniques and refined using restrained least squares methods (13,14) to a final residual of 0.21. The details of the structure analysis are presented elsewhere (1) and will not be repeated here. Suffice it to say that the electron density maps of the refined structure resolved the positions of the amino acid residues that comprise the molecule and that little remained in doubt regarding orientation of side chains or tertiary and secondary interactions. Furthermore, an analysis of the thermal parameters indicated certain regions of the molecule likely to be conformationally permissive as well as those likely to be rigid and structurally stable. One of these more mobile regions may, we believe, play a significant role in the cooperative interactions. From the electron density map we constructed a Kendrew wire model of the gene 5 protein in a conventional Richards optical comparator and adjusted the model as dictated by the course of refinement to assure accuracy and maintain consistency. Hence, the final Kendrew model represented the refined coordinates as best they could be implemented. We also constructed the relevant portions of a second gene 5 monomer within the dyad related pair, so that we could examine the protein as it truly exists, as a dimer. We eventually modeled the DNA single strand into the nucleic acid binding site of the molecule so that the DNA actually spanning the junction between gene 5 protein molecules contiguously bound along the polynucleotide chain could be examined. This modeling procedure was not based on direct visualization of the protein-nucleic acid complex, but relied on stereochemical considerations, some difference Fourier syntheses, and experimental data from other techniques as described in Brayer and McPherson (2).
Any conformational alterations suggested by the modeling or by other data were also applied to the Kendrew model and these alternatives idealized to maintain compatibility with the remainder of the molecule.
A careful examination of the areas under study was made on an Evans and Sutherland automated graphics system in the laboratory of J. Kraut at UC-San Diego. This was confined to examining images composed only of the atomic bonds. The structure of the interface regions using true atomic radii was further examined and analyzed using an Advanced Electronics Design Model 767 color raster graphics system interfaced to a PDP 11/34 in our own laboratory.
Although extremely useful in guiding the modeling process, graphics systems are essentially qualitative in nature and do not always reveal otherwise serious and irreconcilable atomic interactions between the various macromolecular components.
To obviate this problem, for all contiguous juxtapositions of gene 5 molecules along the DNA chain, distances between all nonbonded atoms were computed and violations of van der Waals contacts declared. At the same time, the degree of complementarity and the extent of favorable interactions were defined by the number of stereochemically acceptable, close contacts between atoms on different macromolecules. The approach we employed and the assumptions, although different in specific detail, were not appreciably different from the analysis carried out by Levinthal et al. (15) to model the interactions between hemoglobin molecules that promote formation of helical fibers in sickle cell anemia. The details of this atomic contact analysis will be presented elsewhere, and it is sufficient here to say that the prohibited atomic contacts were dominant in delineating plausible models, while the nature and number of acceptable, favorable contacts were useful in defining the probability of a model. Calculations were carried out for approximately 18,000 surface contact possibilities over a fine grid of the parameters required to specify the helix. These variables were (i) the radius of the helix, (ii) the repeat distance along the helix or pitch, (iii) the number of gene 5 dimers per turn of the helix, and (iv) the orientation of the gene 5 dimer as determined by its rotation about the intermolecular dyad axis.
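The four helical variables listed above lend themselves to a simple systematic search. The following is a minimal sketch of how such a grid might be enumerated and how one dimer could be placed at a given position on the helix. It is not the authors' program. The function names, the choice of the local x axis as the dyad axis, and the assumption that the dyad points radially outward from the helix axis are all hypothetical and introduced only for illustration.

```python
import itertools
import math


def helix_transform(point, index, radius, pitch, units_per_turn, dyad_rotation_deg):
    """Map a point from a dimer's local frame onto the helix.

    Assumed (hypothetical) local frame: the dimer's dyad axis is the local
    x axis, and it points radially outward from the helix axis (z).
    """
    x, y, z = point
    # (iv) rotate the dimer about its own dyad (local x) axis
    a = math.radians(dyad_rotation_deg)
    y, z = y * math.cos(a) - z * math.sin(a), y * math.sin(a) + z * math.cos(a)
    # (i) move the dimer out to the helix radius along the dyad
    x += radius
    # (ii, iii) screw operation: rotation about z plus rise along z
    phi = 2.0 * math.pi * index / units_per_turn
    rise = pitch * index / units_per_turn
    return (
        x * math.cos(phi) - y * math.sin(phi),
        x * math.sin(phi) + y * math.cos(phi),
        z + rise,
    )


def parameter_grid(radii, pitches, units_per_turn_values, rotations_deg):
    """Yield every (radius, pitch, units per turn, rotation) combination."""
    return itertools.product(radii, pitches, units_per_turn_values, rotations_deg)
```

In a search of this kind, each grid point would be used to place two consecutive dimers (index 0 and index 1), and the resulting pair would then be scored by its interatomic contacts.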

Rationale and Analysis
The degree of cooperativity shown by gene 5 molecules toward single stranded DNA may be defined in a simple manner as the difference in likelihood of a gene 5 molecule binding to a site on the DNA immediately adjacent to, and presumably in contact with, a gene 5 molecule already bound, as opposed to its binding to an isolated site on the DNA where it lacks any protein-protein contact. From a variety of equilibrium measurements the likelihood of contiguous protein binding is between 60 and 1000 fold higher than for isolated site binding (3,7). Thus the association constant for gene 5 to a tetranucleotide such as d(pA)4, which might be expected to roughly fill the protein-DNA binding site, is 0.1-3.0 x lQ+D, while the equivalent association constant for an octanucleotide, d(pA)8, is 5 x 10^8, and for the single stranded phage DNA it is lQH (8).
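To give a sense of scale, the 60 to 1000 fold preference for contiguous binding can be converted into an equivalent interaction free energy using the standard relation ΔG = -RT ln ω, where ω is the cooperativity factor. This conversion is not part of the original analysis. The temperature of 298.15 K and the function name are assumptions made for illustration.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal / (mol K)


def cooperativity_free_energy(omega, temperature_k=298.15):
    """Free energy (kcal/mol) equivalent to a cooperativity factor omega.

    Uses delta G = -R T ln(omega); the default temperature is an assumption.
    """
    return -R_KCAL * temperature_k * math.log(omega)


for omega in (60, 1000):
    print(omega, round(cooperativity_free_energy(omega), 2))
```

Under these assumptions the observed range corresponds to roughly 2.4 to 4.1 kcal/mol of favorable free energy per contiguous contact, a magnitude that a modest number of hydrophobic contacts, salt bridges, or hydrogen bonds could plausibly supply.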
The source of the cooperativity may reside entirely in interactions that occur between the adjacent protein molecules along the DNA strand, it may arise from a change in the overall protein-DNA interaction that occurs upon contiguous versus isolated protein binding, or it may derive from some combination of the two.
The gene 5 protein exists in solution predominantly as a dimeric species and does not apparently form higher aggregates (16,7) even at high concentrations. Indeed it would seem unreasonable for the protein molecules to expose, in their native state, a fixed surface strongly attracted to available complements on other molecules. Such units would polymerize to form long helical arrays, thereby invalidating their physiological function, and such arrays have not in fact been observed. In order for an attractive surface to be absent in the unliganded state and present in the liganded state, i.e., available for interfacing to an adjacent gene 5 surface, one of three possibilities would seem to be likely. These are:
(a) DNA binding produces a major conformational change in the protein and these structural alterations result in the creation or assembly of a completely new surface area which possesses the attractive or adhesive characteristic. That is, amino acid side chains are realigned so that a structural basis for cooperative interaction is created.
(b) A latent binding surface always exists in the protein but is masked by other structural elements. Upon DNA binding a major conformational change occurs that removes the structure obscuring the cooperative binding surface. An analogy is the removal of backing paper from an adhesive label.
(c) The cooperative binding surface, the adhesive face of the protein, is always present in both the free and the liganded molecule. There may exist, however, an obstruction or blocking group that prevents two otherwise complementary surfaces from properly meshing and forming the network of favorable interactions that weld the two together. An analogy here is a key that is perfect in every other way, but cannot release the lock because it has one additional, albeit minute, tooth. Just as removal of the offending tooth allows two now perfectly complementary surfaces to complex, removal of a single obstructing group from a salient position on the otherwise adhesive protein surface would allow cooperative binding of two adjacent gene 5 molecules.
Both the first and second mechanisms would require a substantial conformational change in the protein to either create in one case, or expose in the other, a cooperative surface. For this reason we believe both these possibilities to be unlikely. The physical and chemical studies that have been conducted, using a variety of techniques, substantiate the view that there is no appreciable conformational change in the protein upon binding of oligonucleotides, only subtle changes and these affecting specific residues. Day has shown, for example, that there are no changes in the CD or ORD spectrum of gene 5 protein upon complexation with DNA that would suggest any substantial change in secondary structure (17). NMR studies by Coleman et al. (18) on α-CH and on aliphatic methyl groups suggest as well that the gene 5 protein must contain a large percentage of fixed structure without extensive regions of flexible polypeptide chain. This is supported as well by examination of the structure of the protein dimer which shows it to be quite rigid and able to exhibit only rather limited dynamic effects (1). Analysis of the distribution and magnitude of thermal parameters in the molecule also indicates that, except for two specific polypeptide segments, the backbone is essentially immobile and rigid. We further note that in the only thoroughly studied case of a tight interaction between two complementary protein surfaces, that of trypsin with pancreatic trypsin inhibitor, very little conformational change was exhibited by either molecule over the contact area when compared to native (19).
The third possibility, although it too requires some molecular change, demands much less, and utilizes large and spatially fixed surfaces to express the complementary interactions. We believe that the cooperativity of binding is inherent in the native structure and that it is expressed by the repositioning of one or a very few obstructing side chains.
That the cooperativity which characterizes the protein-protein interactions between adjacent gene 5 molecules is a consequence of binding to DNA seems clear. Crosslinking with suberimidate in the presence of a series of different oligonucleotides followed by SDS-PAGE demonstrated the existence of higher aggregates containing at least eight monomers or four dimers (10). These aggregates were shown to be absent when no nucleic acid was present. It has also been found that when gene 5 protein is combined with oligomers four to eight nucleotides in length, essentially irrespective of sequence, and subsequently crystallized, the crystals invariably contain an aggregate of twelve gene 5 monomers as their asymmetric unit (11). These findings are in addition to the obvious and direct observation that in the presence of linear single stranded DNA contiguous aggregates appear and in its absence they do not. Thus any explanation of the cooperativity phenomenon must have a structural basis that accounts for the trigger effect of the DNA.

Results
The atomic contact analysis by which we delineated a model for the gene 5 protein-DNA helical complex, as well as the more qualitative investigations and trials using automated graphics systems, invariably optimized the fit of two adjacent gene 5 dimers by minor variations of a common set of contacts. That is, the greatest degrees of complementarity between contiguous gene 5 molecules always relied upon similar groups of nonbonded atomic contacts at the interfaces and, therefore, the few remaining prohibited contacts generally involved subsets of the same atoms. When the native structure of the gene 5 protein was used in this contact analysis, the minimum number of prohibited contacts that could be obtained by any close juxtaposition of two dimers was thirteen. When thirteen prohibited contacts were allowed, there was otherwise excellent complementarity between the interfacing surfaces as shown by the simultaneous presence of 135 allowed van der Waals contacts. As we will discuss below, this included a large number of potentially favorable atomic interactions.
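The tally of prohibited versus allowed contacts described above can be illustrated with a short distance-based classifier. This is only a sketch under stated assumptions, not the authors' procedure, whose details are given elsewhere. The radii, the `tolerance` and `shell` thresholds, and the function name are all hypothetical values chosen for illustration.

```python
import math

# Illustrative van der Waals radii in angstroms (assumed, not the authors' values).
VDW_RADIUS = {"C": 1.7, "N": 1.55, "O": 1.52, "S": 1.8}


def classify_contacts(atoms_a, atoms_b, tolerance=0.4, shell=1.0):
    """Count prohibited and allowed contacts between two molecules.

    Each atom is (element, (x, y, z)). A pair is prohibited when its distance
    is shorter than the sum of the radii minus `tolerance`; it is an allowed
    close contact when the distance lies between that limit and the sum of
    the radii plus `shell`. Both thresholds are hypothetical.
    """
    prohibited = 0
    allowed = 0
    for elem_a, pos_a in atoms_a:
        for elem_b, pos_b in atoms_b:
            distance = math.dist(pos_a, pos_b)
            contact = VDW_RADIUS[elem_a] + VDW_RADIUS[elem_b]
            if distance < contact - tolerance:
                prohibited += 1
            elif distance <= contact + shell:
                allowed += 1
    return prohibited, allowed
```

A scoring scheme in this spirit would reject or penalize any juxtaposition with prohibited contacts first, and only then rank the survivors by their number of allowed close contacts, mirroring the priority the text gives to prohibited contacts in delineating plausible models.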
As shown in Table I, the disallowed (less than van der Waals radius) contacts were confined almost entirely to the tip of the β loop formed by residues glutamic acid 40, tyrosine 41 and proline 42, and occasionally the flanking residues on one molecule, in apposition to residues 48-52 and residues 66-70 of the adjacent dimer. Now the polypeptide segments 48-52 and 66-70, as inspection of the native molecule will show, are rigidly fixed in the protein and participate in β structure formation.
Residues 40-42, however, comprise one of only two flexible and mobile segments in the molecule, the other being residues 16-30 that form the "DNA binding loop". This is evidenced by the significantly higher than average thermal parameters as well as the scarcity of structural constraints on the tip of the loop (1). Thus we believe that the tip of this β loop, in particular residues 40-42, may be the crucial obstruction that must be removed and that otherwise prevents aggregation; that it may be of essential importance in regulating the cooperative polymerization that occurs upon DNA binding. It appears to occupy the necessary location and to have the potential for conformational variation that would be required to relieve the few prohibited contacts that otherwise prevent complementarity.
Tyrosine 41 is a residue that has been implicated by a wide variety of studies in direct interaction with bound DNA (2,8,20) and, therefore, it would be spatially sensitive to the presence of nucleic acid. In modeling the single strand of DNA to the binding site of the gene 5 protein, and based on considerations completely independent of the present analysis, we postulated that the tip of the β loop, and particularly tyrosine 41, does alter its conformation slightly in order to optimally bind DNA. This minor conformational change, discussed in more detail elsewhere (2), is shown in Figure 4. Indeed if we assume this minor alteration, which involves stacking of the tyrosine ring on a DNA base, then the atomic contact analysis shows the number of prohibited contacts to fall to zero and the number of acceptable van der Waals contacts to increase significantly. Thus we propose that in the absence of DNA there exist a few sterically unacceptable atomic contacts that prevent otherwise complementary surfaces from merging and producing aggregate formation. In the process of binding nucleotides, the tip of the β loop alters its conformation very slightly in order to optimize interactions between tyrosine 41 and a base of the DNA. This in turn has the consequence of removing the few blocking interactions, thereby permitting inherently favorable protein-protein interfaces to come into contact concomitant with self assembly into the helical structure. We envision the 40-42 loop to act as a two-position mechanical switch that allows or disallows cooperative protein interactions to take effect in response to the presence or absence of nucleic acid. Figure 5 is a drawing of two adjacent gene 5 protein monomers as they are juxtaposed in the model of the gene 5-phage DNA helical complex seen in Figure 3 and deduced from the contact analysis described above.
The DNA is in extended form and demonstrates the conformation conferred on it when fitted to the protein's binding site in accordance with the experimental data reviewed by Brayer and McPherson (2). As described there, the DNA was fitted to the protein surface in such a way as to obtain optimal charge neutralization, maintain stereochemical consistency, and utilize those interactions indicated by all other physical-chemical methods. These include optical and NMR spectroscopy, photo-crosslinking, and chemical modification. Figure 6 is a drawing of two consecutive and contiguous gene 5 dimers as they would appear when spanning two opposing single strands of DNA in the model helical complex. This same arrangement is shown in solid sphere form in Figure 7.
In addition to the favorable atomic interactions contributed by the complementary protein surfaces and triggered by the binding of DNA, we also have noted an interesting spatial interaction involving the DNA. The nucleotide lying at the extreme 3' end of the binding site is not tightly bound to the protein, though the base does stack upon the side group of tyrosine 41. It lies essentially in the gap or junction between two adjacent protein units. As seen in Figure 8, it occupies a cavity, created by protein, such that the top surface is contributed by one gene 5 unit and the bottom surface contributed by the other. It appears to us that the binding site for this nucleotide, in distinction to those for the remaining four, is of a composite nature and is only formed in its entirety when two gene 5 units bind contiguously.
There is an additional consequence of the formation of a binding site for one nucleotide between two adjacent monomers along the DNA strand. Although the stoichiometry of binding in this model is five nucleotides per gene 5 monomer, any given pentamer along the DNA strand actually comes in contact with a total of three different gene 5 monomers. In addition to the monomer that provides the majority of the binding groups, a phenylalanine 63' interaction arises from the second monomer within a dimer pair (see reference 2), that is, by the dyad related monomer binding chiefly the opposite strand, and the final interactions are supplied by the adjacent monomer along the same DNA strand. The interactions between contiguous gene 5 dimers along opposite DNA strands are, therefore, rather complicated. Through protein-protein contacts within a dimer or between contiguous dimers along strands, and through binding the two opposite strands of DNA, all monomers are woven together by an extensive network of intermolecular interactions. It would appear to us, therefore, that interpretation of data regarding aspects of DNA binding to gene 5 based on observations of gene 5 interactions with oligonucleotides of limited extent should be treated with some caution. As in all large biomolecular assemblies, there is likely to be an increasing degree of cooperativity as the size of the structure increases.
Figure 5. In (a) is a stereo drawing of the α-carbon backbones of two adjacent and contiguous G5BP monomers arrayed along a single strand of DNA which traces the same path as in the helical complex shown in Figure 3. This corresponds to a radius of about 35 Å. In (b) is a stereo drawing of the same two G5BP monomers with all atoms included but the DNA strand is absent.
We would like to return to a further consideration of the binding interactions that occur between two gene 5 protein dimers following the minor conformation change principally affecting tyrosine 41. The two complementary surfaces on the gene 5 protein are rather limited in the number of amino acid residues involved and restricted to specific segments of the polypeptide backbone. The amino acid residues involved are shown in Table II. One surface is created by residues 64-70, 79-87 and 50-53 and this surface is fairly rigid and unchanged by DNA binding. The second surface, to which we believe it is complementary, is constructed from residues between 1 and 11 and residues 40-43, the latter conformationally dependent as discussed above. Figure 9 shows the interface residues as they exist in the native structure when optimally juxtaposed and in the same orientation following the conformational change required to optimize DNA binding. One immediate observation is that two extensive hydrophobic areas have been brought together. On one molecule these are residues leucine 65, valine 84, valine 70, isoleucine 78, proline 85 and phenylalanine 68. On the other molecule are found isoleucine 2, proline 42, tyrosine 41, methionine 1 and alanine 11. Because the dimer state implies a doubling in number of all interactions, since both monomers of each pair
The insertion of one nucleotide into a pocket formed by the small gap between the two protein units is evident in this orientation. All atoms are represented as spheres corresponding to their relative van der Waals radius. The image was produced on an AED 767 color raster graphics system.
The acidic and basic amino acid residues of the gene 5 protein are distributed more or less uniformly over the molecular surface and are not grouped to exhibit distinctive negative and positively charged regions that would produce "Coulombic docking" and predominantly electrostatic interfaces (21).
In addition to the hydrophobic interactions, however, a number of individual salt bridges are possible between gene 5 molecules at the interfaces, and there are no prohibitive like charge pairs. The favorable interactions can only be presumed, since we have not observed them directly, and furthermore changes in the orientations of the amino acid side chains could alter the relations of the charges substantially either favorably or unfavorably. Based on calculated distances, however, at least five different salt bridges are possible and these are between (1) glu 40 and his 64', (2) lys 87 and either glu 5' or glu 40', (3) asp 79 and lys 7', (4) lys 69 and glu 40' or glu 5', and (5) arg 82 and asp 36'. None of these residues are otherwise engaged in either binding DNA or in other tertiary interactions, although histidine 64 does appear to be associated with groups on the second monomer within its own pair in the native state.
Figure 9. In (a) is a stereo drawing of the amino acid residues (1-11, 39-45) on one G5BP monomer and the residues (48-55, 63-87) on the second monomer if G5BP units are assembled to optimize the complementarity and fit of contiguous G5BP units along the DNA strands; that is, as in the model of the helix of Figure 3. When native G5BP molecules are used as in (a), there are six contacts made, chiefly involving tyrosine 41, that are considerably less than van der Waals distances and therefore prohibited. If, however, the small conformational change illustrated in Figure 4 is applied to optimize DNA binding, then the disposition of the residues at the interface between two contiguous G5BP monomers is as seen in (b). It is shown by comparison of these two drawings that the minor structural change produces relaxation of the steric hindrances and at the same time both preserves and enhances the degree of complementarity.
No attempt has been made here to optimize interface interactions to further improve the fit, but a number of possibilities are both evident and probable.
There are also large numbers of possible hydrogen bonds between adjacent gene 5 dimers. We note that six hydrogen bond donors are within 3.8 Å of equivalent acceptors per monomer or twelve per dimer. We have not, however, examined the detailed geometry of these groups to determine their degree of probability.
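A distance-only screen of the kind implied by the 3.8 Å criterion can be sketched as follows. This is an illustration rather than the authors' calculation. The function name and input layout are hypothetical, and, as in the text, no angular geometry is checked, so the result counts candidates rather than confirmed hydrogen bonds.

```python
import math


def count_hbond_candidates(donors, acceptors, cutoff=3.8):
    """Count donor atoms on one molecule that have at least one acceptor
    atom on the neighboring molecule within `cutoff` angstroms.

    `donors` and `acceptors` are sequences of (x, y, z) coordinates.
    Only distances are tested; donor-hydrogen-acceptor angles are ignored.
    """
    return sum(
        1
        for donor in donors
        if any(math.dist(donor, acceptor) <= cutoff for acceptor in acceptors)
    )
```

Because both monomers of a dimer contribute equivalent interfaces, a per-monomer count from such a screen would be doubled to give the per-dimer figure.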

Conclusion
The structure of the native gene 5 protein is compatible with a model for cooperative binding based upon the association of complementary protein surfaces. Favorable chemical interactions inherent to the surface of the molecules are permitted expression by the liganding of nucleic acid. By an essentially mechanical effect transmitted through residues at the tip of a β loop, mutually attractive surfaces are permitted to merge. The formation of this interface includes the creation of a binding site for one additional nucleotide, a binding site otherwise absent. The interface utilizes hydrophobic interactions, salt bridges and hydrogen bonds to maintain its integrity.
Cooperativity at the macromolecular level in biological systems is rather poorly understood, but is of profound consequence. It is the basis not only for such things as allostery in enzyme systems, but the creation and operation of multiprotein complexes such as microtubules or the DNA replication complex. It almost certainly guides complicated processes such as translation of mRNA into protein and is the controlling influence of biological self assembly as occurs between like elements in virus capsids or unlike elements in the ribosome. Thus a detailed description of the underlying physical features producing cooperative binding of the gene 5 protein to DNA may well contribute to our understanding of this broad class of interactions.