A Standard Deviation Based Quantification Differentiates
Coding from Non-coding DNA Sequences and Gives Insight
to Their Evolutionary History
Y Almirant
Journal of Theoretical Biology,
196(3), 297-308 (Feb 7, 1999)
Abstract
A method for quantifying the randomness of nucleotide sequences is developed.
It introduces a standard-deviation-type quantity that involves locally computed
means and a length scale around which the clustering of nucleotides is assessed.
The value taken by this modified
standard deviation may distinguish between coding-rich and non-coding-rich
sequences. Moreover, the approach described herein allows the determination
of some minimal characteristics of an evolutionary scenario which can account
for the origin of the clustering in the nucleotide distribution of the different
parts of the genome.
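
The abstract does not give the formula for the modified standard deviation, so the following is only a minimal sketch of one possible reading, not the paper's definition. It assumes the sequence is cut into non-overlapping windows of a chosen length (standing in for the length scale), that the occurrences of one nucleotide are counted per window, and that each count is compared with the mean over a few neighbouring windows (standing in for the locally computed mean). The function name `local_deviation` and the parameters `window` and `neighbours` are hypothetical.

```python
import math


def local_deviation(seq, base, window, neighbours=2):
    """Root-mean-square deviation of per-window counts of `base`
    from a locally computed mean.

    Hypothetical definition for illustration only; the windowing and
    the neighbourhood used for the local mean are assumptions.
    """
    seq = seq.upper()
    n_windows = len(seq) // window
    if n_windows == 0:
        raise ValueError("sequence is shorter than one window")

    # Count occurrences of `base` in each non-overlapping window.
    counts = [
        seq[k * window:(k + 1) * window].count(base)
        for k in range(n_windows)
    ]

    total = 0.0
    for k, count in enumerate(counts):
        # Local mean over up to `neighbours` windows on each side,
        # truncated at the ends of the sequence.
        lo = max(0, k - neighbours)
        hi = min(n_windows, k + neighbours + 1)
        local_mean = sum(counts[lo:hi]) / (hi - lo)
        total += (count - local_mean) ** 2

    return math.sqrt(total / n_windows)


if __name__ == "__main__":
    evenly_spread = "ACGT" * 50          # A occurs once in every window of 4
    clustered = "A" * 50 + "C" * 50      # all A's packed into the first half
    print(local_deviation(evenly_spread, "A", window=4))
    print(local_deviation(clustered, "A", window=10))
```

Under these assumptions an evenly spread nucleotide yields a deviation of zero, while a sequence in which the nucleotide is clustered yields a positive value. Varying `window` probes clustering at different length scales.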