Long-range Correlations in Nucleotide Sequences

C-K. Peng, S.V. Buldyrev, A.L. Goldberger, S. Havlin, F. Sciortino, M. Simon, and H.E. Stanley

Nature, 356, 168-170 (March 12, 1992)

Abstract

DNA sequences have been analyzed using models, such as an n-step Markov chain, that incorporate the possibility of short-range nucleotide correlations. We propose here a method for studying the stochastic properties of nucleotide sequences by constructing a 1:1 map of the nucleotide sequence onto a walk, which we term a "DNA walk". We then use the mapping to provide a quantitative measure of the correlation between nucleotides over long distances along the DNA chain. Thus we uncover in the nucleotide sequence a remarkably long-range power-law correlation that implies a new scale-invariant property of DNA. We find such long-range correlations in intron-containing genes and in nontranscribed regulatory DNA sequences, but not in complementary DNA sequences or intron-less genes.
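
The abstract does not spell out the mapping rule or the correlation measure, so the following Python sketch is only an illustration under stated assumptions. It assumes a purine/pyrimidine rule (step +1 for C or T, step -1 for A or G) and uses the root-mean-square fluctuation of the walk displacement over a window of l nucleotides as the correlation measure. The names dna_walk and rms_fluctuation are hypothetical and chosen here for illustration.

```python
from itertools import accumulate
from math import sqrt

# Assumed step rule (not stated in the abstract): pyrimidine up, purine down.
PYRIMIDINES = {"C", "T"}
PURINES = {"A", "G"}


def dna_walk(sequence):
    """Return the cumulative displacement y(1..N) of the walk for a sequence."""
    steps = []
    for base in sequence.upper():
        if base in PYRIMIDINES:
            steps.append(1)
        elif base in PURINES:
            steps.append(-1)
        else:
            raise ValueError(f"unexpected symbol: {base!r}")
    return list(accumulate(steps))


def rms_fluctuation(walk, window):
    """Root-mean-square fluctuation of the displacement over `window` steps.

    Computes sqrt(<d^2> - <d>^2), where d = y(i + window) - y(i) and the
    average runs over all start positions i, with y(0) = 0.
    """
    y = [0] + list(walk)
    deltas = [y[i + window] - y[i] for i in range(len(y) - window)]
    if window < 1 or not deltas:
        raise ValueError("window must be between 1 and the walk length")
    mean = sum(deltas) / len(deltas)
    mean_sq = sum(d * d for d in deltas) / len(deltas)
    # Clamp tiny negative values caused by floating-point rounding.
    return sqrt(max(mean_sq - mean * mean, 0.0))


if __name__ == "__main__":
    walk = dna_walk("ACGTTTCCAGGA")
    for window in (1, 2, 4):
        print(window, rms_fluctuation(walk, window))
```

For an uncorrelated sequence, this fluctuation grows as the square root of the window length. A power-law growth with a different exponent, estimated for example from the slope of a log-log plot over many window sizes, would indicate correlations of the kind described above.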