Journal of Theoretical Biology,
197(1):51-61 (March 7, 1999)
Abstract
Chargaff's first parity rule (%A=%T and %G=%C) is explained by the
Watson-Crick model for duplex DNA in which complementary base
pairs form individual accounting units. Chargaff's second parity rule
is that the first rule also applies to single strands of DNA. The limits
of accounting units in single strands were examined by moving
windows of various sizes along sequences and counting the relative
proportions of A and T (the W bases), and of C and G (the S bases).
Shuffled sequences account, on average, over shorter regions than
the corresponding natural sequences. For an E. coli segment, S base
accounting is, on average, contained within a region of 10 kb,
whereas W base accounting requires regions in excess of 100 kb.
Accounting requires the entire genome (190 kb) in the case of
Vaccinia virus, which has an overall "Chargaff difference" of only
0.086% (i.e. only one in 1162 bases does not have a potential pairing
partner in the same strand). Among the chromosomes of
Saccharomyces cerevisiae, the total Chargaff differences for the W
bases and for the S bases are usually correlated. In general,
Chargaff differences for a natural sequence and its shuffled
counterpart diverge maximally when 1 kb sequence windows are
employed. This should be the optimum window size for examining
correlations between Chargaff differences and sequence features
which have arisen through natural selection. We propose that
Chargaff's second parity rule reflects the evolution of genome-wide
stem-loop potential as part of short- and long-range accounting
processes which work together to sustain the integrity of various
levels of information in DNA.
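
The moving-window comparison described above can be illustrated with a minimal sketch. The abstract does not give the exact formula, so the definition used here is an assumption: the Chargaff difference is taken as the unpaired excess within one strand, |A-T| for the W bases and |C-G| for the S bases, expressed as a percentage of the bases in the window. This matches the "one in 1162 bases" reading of the 0.086% figure, but it may differ from the normalisation used in the paper. The function names, the non-overlapping default step, and the seeded shuffle are all hypothetical choices made for illustration.

```python
import random


def chargaff_differences(seq):
    """Return (W, S) Chargaff differences as percentages of counted bases.

    Assumed definition (not taken from the abstract): W = 100*|A-T|/N and
    S = 100*|C-G|/N, where N is the number of A, C, G and T in seq.
    """
    seq = seq.upper()
    a, t, c, g = (seq.count(base) for base in "ATCG")
    n = a + t + c + g
    if n == 0:
        return 0.0, 0.0
    return 100.0 * abs(a - t) / n, 100.0 * abs(c - g) / n


def windowed_differences(seq, window, step=None):
    """Chargaff differences for each full window moved along seq.

    Windows do not overlap unless a smaller step is given.
    """
    step = step or window
    return [
        chargaff_differences(seq[i:i + window])
        for i in range(0, len(seq) - window + 1, step)
    ]


def mean_differences(seq, window, step=None):
    """Average the W and S differences over all windows."""
    values = windowed_differences(seq, window, step)
    if not values:
        return 0.0, 0.0
    mean_w = sum(w for w, _ in values) / len(values)
    mean_s = sum(s for _, s in values) / len(values)
    return mean_w, mean_s


def shuffled(seq, seed=0):
    """Return seq with base order randomised and base composition unchanged."""
    bases = list(seq)
    random.Random(seed).shuffle(bases)
    return "".join(bases)
```

Shuffling preserves base composition, so the whole-sequence Chargaff difference of a shuffled sequence equals that of the natural one. Any divergence between the two appears only at smaller window sizes, which is why a comparison such as `mean_differences(seq, 1000)` against `mean_differences(shuffled(seq), 1000)` would be repeated over a range of window sizes.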