Gene Recognition by Combination of Several Gene-Finding Programs

Katsuhiko Murakami and Toshihisa Takagi 1

1 Human Genome Center, Institute of Medical Science, University of Tokyo, 4-6-1 Shirokane-dai Minato-ku, Tokyo 108-8639 and
2 Central Research Laboratory, Hitachi Ltd, 1-280 Higashi-Koigakubo, Kokubunji-shi, Tokyo 185-8601, Japan

Bioinformatics, 14(8), 665-675 (September 1998)

Abstract

Motivation: A number of programs have been developed to predict eukaryotic gene structures in DNA sequences. However, gene finding remains a challenging problem.

Results: We explored how effectively the results of several gene-finding programs can be re-analyzed and combined. We studied several combination methods using four programs (FEXH, GeneParser3, GENSCAN and GRAIL2). With the HIGHEST-policy combination method or the BOUNDARY method, approximate correlation (AC) improved by 3-5% compared with the best single gene-finding program. From another viewpoint, the OR-based combination of the four programs is the most reliable way to determine whether a candidate exon overlaps a real exon, although it is less sensitive than GENSCAN for exon-intron boundaries. Our methods can easily be extended to combine other programs.
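
To make the OR-based combination and the AC measure concrete, the following is a minimal sketch, not the implementation used in this work. It assumes that OR-based combination means taking the nucleotide-level union of the coding regions predicted by each program, and that AC is the approximate correlation as commonly defined at the nucleotide level (the average of four conditional probabilities, rescaled to the range -1 to 1). The function names, the toy coordinates and the choice to raise an error when a term is undefined are all illustrative assumptions. The HIGHEST-policy and BOUNDARY methods are not sketched here because the abstract does not specify them.

```python
def exons_to_mask(exons, length):
    """Mark every nucleotide covered by an exon (1-based, inclusive coordinates)."""
    mask = [False] * length
    for start, end in exons:
        for i in range(start - 1, end):
            mask[i] = True
    return mask


def or_combine(masks):
    """Assumed OR rule: a nucleotide is coding if any program predicts it as coding."""
    return [any(column) for column in zip(*masks)]


def approximate_correlation(predicted, actual):
    """Nucleotide-level approximate correlation (AC) between two boolean masks."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)
    fp = sum(1 for p, a in pairs if p and not a)
    fn = sum(1 for p, a in pairs if not p and a)
    tn = sum(1 for p, a in pairs if not p and not a)
    denominators = [tp + fn, tp + fp, tn + fp, tn + fn]
    if 0 in denominators:
        # Simplification for this sketch: refuse to score when a term is undefined.
        raise ValueError("AC is undefined when a denominator is zero")
    acp = (tp / (tp + fn) + tp / (tp + fp) + tn / (tn + fp) + tn / (tn + fn)) / 4
    return (acp - 0.5) * 2


# Toy data (hypothetical): a 20-nt sequence with one real exon at positions 5-10.
length = 20
actual = exons_to_mask([(5, 10)], length)
program_a = exons_to_mask([(4, 8)], length)   # hypothetical prediction
program_b = exons_to_mask([(9, 12)], length)  # hypothetical prediction

combined = or_combine([program_a, program_b])
ac_a = approximate_correlation(program_a, actual)
ac_combined = approximate_correlation(combined, actual)
print(f"AC program A: {ac_a:.3f}, AC OR-combined: {ac_combined:.3f}")
```

In this toy case the union covers the whole real exon at the cost of a few extra false-positive nucleotides, which illustrates why an OR-style combination can be reliable for detecting overlap with a real exon while being less precise at exon-intron boundaries.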

Availability: We have developed a server program (Shirokane System) and a client program (GeneScope) to use the methods. GeneScope is available through a WWW site (http://gf.genome.ad.jp/).

Contact: katsu@ims.u-tokyo.ac.jp, takagi@ims.u-tokyo.ac.jp