TIGR Assembler: A New Tool for Assembling Large Shotgun Sequencing Projects
Genome Science and Technology. 1995 Jan 01; 1(1): 9-19.
A new approach to assembling large, random shotgun sequencing projects has been developed. The TIGR Assembler overcomes several major obstacles to assembling such projects: the large number of pairwise comparisons required, the presence of repeat regions, chimeras introduced in the cloning process, and sequencing errors. A fast initial comparison of fragments based on oligonucleotide content eliminates the need for a more sensitive comparison between most fragment pairs, greatly reducing computer search time. Potential repeat regions are recognized by identifying fragments that have more potential overlaps than expected given a random distribution of fragments. Repeat regions are handled by increasing the stringency of the match criteria and by assembling these regions last, so that the maximum information from nonrepeat regions can be used. The algorithm also incorporates a number of constraints, such as clone length and the placement of sequences from opposite ends of a clone. TIGR Assembler has been used to assemble the complete 1.8 Mbp Haemophilus influenzae (Fleischmann et al., 1995) and 0.58 Mbp Mycoplasma genitalium (Fraser et al., 1995) genomes.
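The two screening ideas described above, a fast oligonucleotide-content comparison and the flagging of fragments with unexpectedly many potential overlaps, can be illustrated with a minimal Python sketch. This is not the TIGR Assembler implementation. The function names, the oligonucleotide length `k`, the `min_shared` cutoff, and the "more than `factor` times the mean" rule are all assumptions chosen for illustration; the abstract does not state the actual parameters or the statistic used. Reverse-complement matches are ignored for brevity.

```python
from collections import defaultdict
from itertools import combinations


def kmers(seq, k):
    """Return the set of distinct length-k oligonucleotides in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def candidate_pairs(fragments, k=10, min_shared=3):
    """Fast prefilter: map (i, j) -> number of shared k-mers.

    Only pairs sharing at least min_shared k-mers are returned, so a
    more sensitive alignment would be run on these pairs alone.
    k and min_shared are illustrative values, not taken from the paper.
    """
    # Index each k-mer by the fragments that contain it.
    index = defaultdict(set)
    for fid, seq in enumerate(fragments):
        for km in kmers(seq, k):
            index[km].add(fid)

    # Count shared k-mers for every pair that co-occurs in the index.
    shared = defaultdict(int)
    for ids in index.values():
        for a, b in combinations(sorted(ids), 2):
            shared[(a, b)] += 1

    return {pair: n for pair, n in shared.items() if n >= min_shared}


def flag_repeat_candidates(pairs, n_fragments, factor=2.0):
    """Flag fragments with many more candidate overlaps than average.

    Assumption: "more than expected" is approximated here as more than
    factor times the mean number of candidate overlaps per fragment.
    """
    counts = [0] * n_fragments
    for a, b in pairs:
        counts[a] += 1
        counts[b] += 1
    mean = sum(counts) / n_fragments if n_fragments else 0.0
    return [i for i, c in enumerate(counts) if mean > 0 and c > factor * mean]
```

In such a scheme, the flagged fragments would be held back and assembled last under stricter match criteria, while the remaining candidate pairs proceed to the sensitive comparison. Note that the pair-counting step is only cheap when most k-mers occur in few fragments; a k-mer from a high-copy repeat appears in many fragments and generates many pairs, which is itself the signal used for flagging.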