Interpolated Markov models for eukaryotic gene finding
Salzberg SL, Pertea M, Delcher AL, Gardner MJ, Tettelin H
Computational gene finding research has emphasized the development of gene finders for bacterial and human DNA. This has left genome projects for some small eukaryotes without a system that addresses their needs. This paper reports on a new system, GlimmerM, that was developed to find genes in the malaria parasite Plasmodium falciparum. Because the gene density in P. falciparum is relatively high, the system design was based on a successful bacterial gene finder, Glimmer. The system was augmented with specially trained modules to find splice sites and was trained on all available data from the P. falciparum genome. Although a precise evaluation of its accuracy is impossible at this time, laboratory tests (using RT-PCR) on a small selection of predicted genes confirmed all of those predictions. With the rapid progress in sequencing the genome of P. falciparum, the availability of this new gene finder will greatly facilitate the annotation process.