Entropies and Lexicographic Analysis of Biosequences
H. Herzel, W. Ebeling, A.O. Schmitt, M.A. Jimenez-Montano
Chapter 2 in From Simplicity to Complexity in Chemistry
and Beyond, eds. A. Muller, A. Dress, F. Vogtle
(Vieweg-Verlag, Braunschweig, 1995)
Abstract
This paper is devoted to the statistical and linguistic analysis
of biosequences. Information-theoretical tools (Rényi entropies,
mutual information) are introduced and applied to selected DNA
sequences (yeast chromosome III, Epstein-Barr virus genome).
Moreover, several techniques for the detection of long-range
correlations are reviewed, and possible sources of such correlations
are discussed. Finally, we study grammar representations of
sequences and illustrate this approach with a fragment of
mouse DNA.