Long-range Correlation and Partial 1/f^alpha Spectrum
in a Non-coding DNA Sequence
W. Li, K. Kaneko
Europhysics Letters, 17(7), 655--660 (1992)
Abstract
The mutual information function, an alternative to the
correlation function for symbolic sequences, and a "symbolic
spectrum" are calculated for a human DNA sequence consisting
mostly of intron segments, i.e., segments that do not code for proteins.
It is observed that the mutual information function of this
sequence decays very slowly, and the correlation length is extremely
long (at least 800 bases). The symbolic spectrum of the sequence
at very low frequencies can be approximated by 1/f^alpha,
where f is the frequency and alpha ranges from 0.5 to 0.85.
It is suggested that the existence of repetitive patterns in
the sequence is mainly responsible for the observed long-range
correlation. A possible connection between this long-range
correlation and that found in musical notes is also briefly discussed.
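
As a rough illustration of the two quantities named in the abstract, the following Python sketch computes a mutual information function and a symbolic spectrum for a symbol string. The abstract does not give the definitions, so the sketch assumes common ones: M(d) = sum over a, b of P_ab(d) log2[P_ab(d) / (P_a P_b)], where P_ab(d) is the probability of finding symbols a and b a distance d apart, and a spectrum formed by summing the power spectra of the per-symbol indicator sequences. The function names, the use of bits, the normalization, and the least-squares fit for alpha are all assumptions of this sketch, not necessarily the authors' procedure.

```python
import math
from collections import Counter

import numpy as np


def mutual_information(seq, d):
    """Mutual information (bits) between symbols d positions apart (assumed definition)."""
    n = len(seq) - d
    if d < 1 or n < 1:
        raise ValueError("d must satisfy 1 <= d < len(seq)")
    # Joint counts of (symbol at i, symbol at i + d) over all n valid pairs.
    joint = Counter(zip(seq[:n], seq[d:]))
    # Marginal counts taken from the same pairs, so they are consistent with the joint.
    left = Counter(seq[:n])
    right = Counter(seq[d:])
    mi = 0.0
    for (a, b), c in joint.items():
        # P_ab / (P_a * P_b) = (c/n) / ((left/n) * (right/n)) = c * n / (left * right)
        mi += (c / n) * math.log2(c * n / (left[a] * right[b]))
    return mi


def symbolic_spectrum(seq, alphabet="ACGT"):
    """Sum of the power spectra of the per-symbol indicator sequences (assumed definition)."""
    n = len(seq)
    arr = np.array(list(seq))
    power = np.zeros(n // 2 + 1)
    for s in alphabet:
        indicator = (arr == s).astype(float)
        indicator -= indicator.mean()  # remove the constant (zero-frequency) part
        power += np.abs(np.fft.rfft(indicator)) ** 2 / n
    freqs = np.fft.rfftfreq(n)  # frequency in cycles per base
    return freqs, power


def fit_alpha(freqs, power, f_max):
    """Estimate alpha in power ~ 1/f^alpha by a log-log line fit for 0 < f <= f_max."""
    mask = (freqs > 0) & (freqs <= f_max) & (power > 0)
    slope, _ = np.polyfit(np.log(freqs[mask]), np.log(power[mask]), 1)
    return -slope


if __name__ == "__main__":
    # Toy input only; a real analysis would load the DNA sequence from a file.
    toy = "ACGTTGCAACGTACGT" * 64
    decay = [mutual_information(toy, d) for d in range(1, 21)]
    freqs, power = symbolic_spectrum(toy)
    print(decay[:5], freqs[int(np.argmax(power))])
```

Plotting `mutual_information(seq, d)` against d on a logarithmic scale is one way to look for the slow decay described above, and `fit_alpha` applied only to the lowest frequencies gives an estimate of the exponent. The cutoff `f_max` is a free choice in this sketch, and the estimate can be sensitive to it.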