Entropies of Biosequences: The Role of Repeats
Hanspeter Herzel, Werner Ebeling, Armin O. Schmitt
Physical Review E, 50(6), 5061--5071 (1994)
Abstract
DNA sequences of higher organisms contain thousands of nearly
identical dispersed repetitive sequences. In order to understand
the effect of such repeats on word entropies, we construct a
model that can be treated analytically. The hypothetical
model sequences consist of independent equidistributed
symbols with randomly interspersed repeats. In conclusion,
we predict that the entropy of DNA sequences, which measures
their information content, is much lower than suggested
by earlier empirical studies.
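
The model described in the abstract can be illustrated numerically. The following is a minimal sketch, not the authors' analytical treatment: it builds a background of independent, equidistributed symbols, overwrites random positions with copies of one repeat, and compares word entropies of the two sequences. The four-letter alphabet, the sequence length, the repeat length, the number of copies, the use of exact (rather than nearly identical) copies, and the simple plug-in entropy estimate are all assumptions chosen for illustration, and the function names are hypothetical.

```python
import math
import random
from collections import Counter

ALPHABET = "ACGT"  # assumed four-letter alphabet


def random_sequence(length, rng):
    """Independent, equidistributed symbols (the repeat-free background)."""
    return "".join(rng.choice(ALPHABET) for _ in range(length))


def insert_repeats(background, repeat, copies, rng):
    """Overwrite random positions with exact copies of `repeat`.

    Assumption: copies replace background symbols (length is preserved)
    and may overlap each other.
    """
    seq = list(background)
    for _ in range(copies):
        start = rng.randrange(0, len(seq) - len(repeat) + 1)
        seq[start:start + len(repeat)] = repeat
    return "".join(seq)


def word_entropy(seq, n):
    """Plug-in estimate (in bits) of the entropy of words of length n.

    Uses observed frequencies of overlapping n-words; this naive
    estimate is biased when the sample is small relative to 4**n.
    """
    counts = Counter(seq[i:i + n] for i in range(len(seq) - n + 1))
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())


rng = random.Random(0)  # fixed seed for reproducibility
background = random_sequence(20000, rng)
repeat = random_sequence(200, rng)
with_repeats = insert_repeats(background, repeat, 50, rng)

# Columns: word length, entropy without repeats, entropy with repeats
for n in (1, 2, 4, 6):
    print(n, round(word_entropy(background, n), 3),
          round(word_entropy(with_repeats, n), 3))
```

In this toy setting the repeats concentrate probability on the words contained in the repeat, so the estimated entropy of longer words falls below the value for the repeat-free background.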