Starts of Bacterial Genes: Estimating the Reliability
of Computer Predictions
Dmitrij Frishman (a),
Andrei Mironov (b),
Mikhail Gelfand (c)
(a) GSF-Forschungszentrum f. Umwelt und Gesundheit, Munich Information
Center for Protein Sequences am Max-Planck-Institut für Biochemie, Am
Klopferspitz 18, D-82152 Martinsried, Germany
(b) Laboratory of Mathematical Methods, National Center for Biotechnology
Information NIIGENETIKA, Moscow 113545, Russia
(c) Institute of Protein Research, Russian Academy of Sciences, Pushchino
142292, Russia
Gene, 234(2):257-265 (July 8, 1999)
Abstract
Exact mapping of gene starts is an important problem in the computer-assisted
functional analysis of newly sequenced prokaryotic genomes. We describe an
algorithm for finding ribosomal binding sites without a learning sample. This
algorithm is particularly useful for the analysis of genomes with few or no
experimentally mapped genes. There is a clear correlation between the ribosomal
binding site (RBS) properties of a given genome and the accuracy with which its
gene starts can be predicted. This correlation has considerable predictive power
and may be useful for estimating the expected success of future genome analysis
efforts.
We also demonstrate that the RBS properties depend on the phylogenetic position
of a genome.