Recognition of Genes in DNA Sequence with Ambiguities

M. Borodovsky and J. McIninch

Biosystems, 30(1-3), 161-171 (1993)

Abstract

The search for genes in newly sequenced DNA is a well-known problem. Among other factors, gene searching is hampered by ambiguities in the sequence, which may remain experimentally unresolved for a long time. A computer method able to predict genes in a DNA sequence containing ambiguities has been developed, based on the non-homogeneous Markov chain technique. The reliability of the method has been tested on a set of sequences generated by a Monte Carlo procedure and on a set of 425 E. coli sequences with artificially introduced ambiguities.
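
As an illustration only, the following sketch shows one way a non-homogeneous (here, three-periodic) Markov chain could score a sequence that contains ambiguous symbols. The abstract does not state the model order, the parameter values, or how ambiguities are treated, so everything below is an assumption: a first-order chain, ambiguities written as IUPAC codes, and an ambiguous position handled by summing over the bases it may stand for. The names `log_likelihood` and `uniform_model` are hypothetical and the uniform parameters are placeholders, not trained values.

```python
import math

# IUPAC nucleotide codes mapped to the bases they may stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
BASES = "ACGT"


def uniform_model():
    """Placeholder parameters (assumption): every probability is 0.25."""
    initial = {b: 0.25 for b in BASES}
    table = {(a, b): 0.25 for a in BASES for b in BASES}
    # One transition table per codon position (index modulo 3).
    return initial, [dict(table), dict(table), dict(table)]


def log_likelihood(seq, initial, transitions):
    """Log-probability of seq under a three-periodic first-order chain.

    transitions[k][(a, b)] is P(b | a) when b sits at an index i with
    i % 3 == k. Ambiguous symbols are marginalized: the forward vector
    keeps probability mass only on bases compatible with each symbol.
    """
    forward = {b: (initial[b] if b in IUPAC[seq[0]] else 0.0) for b in BASES}
    log_total = 0.0
    for i in range(len(seq)):
        if i > 0:
            table = transitions[i % 3]
            allowed = IUPAC[seq[i]]
            forward = {
                b: (sum(forward[a] * table[(a, b)] for a in BASES)
                    if b in allowed else 0.0)
                for b in BASES
            }
        total = sum(forward.values())
        if total == 0.0:
            return float("-inf")
        # Rescale at every step to avoid underflow on long sequences.
        forward = {b: p / total for b, p in forward.items()}
        log_total += math.log(total)
    return log_total


initial, transitions = uniform_model()
unambiguous = log_likelihood("ACG", initial, transitions)
with_ambiguity = log_likelihood("ANG", initial, transitions)
```

In a gene-finding setting, a score of this kind would typically be computed under a coding model and a non-coding model and the two compared; with trained parameters the three transition tables would differ, which is what makes the chain non-homogeneous.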