The present study reports on the solution structure of the guanine- and adenine-rich d(A(2)G(2)T(4)A(2)G(2)) 12-mer sequence, which forms a unique fold in moderate NaCl solution. Proton resonance assignments for this sequence, which contains a pair of AAGG repeats separated by a T(4) linker segment, were aided by site-specific (15)N-labeling of guanine and adenine bases, as well as site-specific incorporation of 2,6-diaminopurine and 8-bromoadenine for adenine, 8-bromoguanine, 7-deazaguanine and inosine for guanine, and uracil and 5-bromouracil for thymine. The solution structure, which was solved by a combined NMR and intensity-refined computational approach, consists of a diamond-shaped architecture formed through dimerization of a pair of d(A(2)G(2)T(4)A(2)G(2)) hairpins. This 2-fold symmetric structure contains a quadruplex core consisting of a pair of symmetry-related G(syn).G(syn).G(anti).G(anti) tetrads, in which adjacent strands have both parallel and anti-parallel neighbors, and connecting T(4) segments that form diagonal loops. Each of the G(syn).G(syn).G(anti).G(anti) tetrads forms a platform on which stacks a T(anti).[A(syn)-A(anti)] triad containing a novel A(syn)-A(anti) platform step and a reversed Hoogsteen A(syn).T(anti) pair. We observe both base-base and base-sugar stacking interactions, with the latter occurring at a sheared A-G step where the sugar of the A stacks on the purine plane of the G. Unexpectedly, the topology of this sheared A(anti)-G(syn) step has many similarities with the C(anti)-G(syn) step in left-handed Z-DNA. The T.(A-A) triad is sandwiched between the G-tetrad on one side and a reversed Hoogsteen A(anti).T(anti) pair on the other. This intercalative topology is facilitated by a zipper-like motif where the A(anti) residue of the triad is interdigitated within a stretched A(anti)-G(syn) step.
Our structural study reports on new aspects of A-A platforms, base triads, zipper-like interdigitation and sheared base steps, together with base-base and base-sugar stacking defining a diamond-like architecture for the d(A(2)G(2)T(4)A(2)G(2)) sequence. One can anticipate that mixed guanine-adenine sequences will exhibit a rich diversity of polymorphic architectures that will provide unique topologies for recognition by both nucleic acids and proteins.
Copyright 2000 Academic Press.