Biological Origins of Long-Range Correlations and Compositional Variations in DNA
D. Larhammar, C.A. Chatzidimitriou-Dreismann
Nucleic Acids Research, 21(22), 5167--5170 (1993)
Abstract
The occurrence of certain long-range correlations between nucleotides in DNA sequences of living
organisms has recently been reported. The biological origin of these correlations was unknown. The correlations
were proposed to be related to fractal structure and to differences between intron-containing and intron-less
sequences. We and others have reported that no consistent difference exists between intron-containing and
intron-less sequences. In agreement with this, we demonstrate here that the long-range correlations are
trivially equivalent to the varying ratio R between pyrimidines and purines (or any other nucleotide combinations)
in different regions of a DNA sequence. Moreover, we show that this variation of R has simple biological
explanations: Differences in base composition occur along most DNA sequences and are associated with (i)
simple repeats, (ii) differences in codon composition (due to the amino acid composition of the encoded protein),
(iii) changes in the direction of transcription (and thus also translation), and (iv) differences between protein- and
rRNA-encoding segments. Seven biological examples are given.
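
To make the central claim concrete, the following is a minimal sketch, not the authors' code, of how a regionally varying pyrimidine/purine ratio R can be measured and how it shows up as net drift in a purine/pyrimidine walk along the sequence. The function names, the window size, and the toy sequence are illustrative assumptions. The walk construction (+1 for a pyrimidine, -1 for a purine) is assumed here as one common way of displaying such correlations and is not described in the abstract itself.

```python
def pyrimidine_purine_ratio(seq):
    """Return R = (#C + #T) / (#A + #G) for a DNA string (hypothetical helper)."""
    seq = seq.upper()
    pyrimidines = seq.count("C") + seq.count("T")
    purines = seq.count("A") + seq.count("G")
    # Assumption: a window with no purines is reported as infinity.
    return pyrimidines / purines if purines else float("inf")


def windowed_ratios(seq, window):
    """R for consecutive, non-overlapping windows of the given length."""
    return [
        pyrimidine_purine_ratio(seq[i:i + window])
        for i in range(0, len(seq) - window + 1, window)
    ]


def dna_walk(seq):
    """Cumulative walk: +1 per pyrimidine, -1 per purine, other symbols skipped."""
    position = 0
    walk = []
    for base in seq.upper():
        if base in "CT":
            position += 1
        elif base in "AG":
            position -= 1
        else:
            continue  # ignore ambiguous symbols such as N
        walk.append(position)
    return walk


# Toy sequence (made up for illustration): a pyrimidine-rich region
# followed by a purine-rich region.
TOY_SEQ = "CTCTACTCTG" + "AGAGCAGAGT"

if __name__ == "__main__":
    print(windowed_ratios(TOY_SEQ, 10))  # R per 10-base window
    print(dna_walk(TOY_SEQ))             # rises in region 1, falls in region 2
```

In this toy case the walk climbs through the pyrimidine-rich window and descends through the purine-rich one, so the apparent long-range structure in the walk is simply a restatement of the change in R between the two regions.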