Alignment of Phylogenetically Unambiguous Indels in Shewanella
J Comput Biol. 2009 Nov 01; 16(11): 1517-28.
Abstract: High levels of alignment error associated with gaps have generally led to gaps being excluded from phylogenetic analysis. Conserved insertions and deletions (indels) may in some cases be less subject to error than amino acid substitutions for inferring the history of genomes and identifying recently laterally transferred genes, but alignment error near gaps must be evaluated before indels are used as phylogenetic characters. A method is presented for evaluating the phylogenetic unambiguity of gaps in multiple sequence alignments by allowing a defined amount of pairwise alignment ambiguity. This work considers the bacterial genus Shewanella, which is of particular interest for applications in bioremediation and environmental engineering; understanding the genetic history of these species is vital for those applications. A set of pairwise dynamic programming alignments is constructed to test positions in multiple alignments for phylogenetic unambiguity, and a whole-genome scan is performed on protein sequences from 11 sequenced species of Shewanella. The splits defined by phylogenetically unambiguous indels are then used as characters for phylogenetic analysis, and the results are compared to a whole-genome maximum likelihood phylogeny. A comparable description of the history of the species is found, as well as a set of lateral gene transfer candidates undetectable by traditional analysis of amino acid substitutions. This analysis is applicable to other taxonomic units at all levels and has the potential to allow cataloging of clear genome-wide phylogenetic markers for taxonomic profiling down to the species level.
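To make concrete the idea of using indel-defined splits as phylogenetic characters, the following is a minimal Python sketch, not the authors' implementation. It assumes a multiple alignment is held as a dictionary from taxon name to gapped sequence, that an indel is given as a half-open column range, and that a taxon counts as "gapped" only if every column in that range is a gap. The taxon labels (sp1 to sp5), the toy sequences, and the function names indel_split, is_informative, and compatible are all hypothetical. The compatibility test is the standard one for bipartitions: two splits can sit on the same tree exactly when at least one of the four pairwise intersections of their sides is empty. The pairwise dynamic programming ambiguity test described in the abstract is not reproduced here.

```python
def indel_split(alignment, start, end):
    """Return (gapped, ungapped) taxon sets for alignment columns [start, end).

    Assumption: a taxon is 'gapped' only if every column in the range is '-'.
    """
    gapped = frozenset(
        taxon for taxon, seq in alignment.items()
        if set(seq[start:end]) == {"-"}
    )
    ungapped = frozenset(alignment) - gapped
    return gapped, ungapped


def is_informative(split):
    """A split is usable as a character only if both sides have >= 2 taxa."""
    side_a, side_b = split
    return len(side_a) >= 2 and len(side_b) >= 2


def compatible(split1, split2):
    """Two splits fit on one tree iff some pair of their sides is disjoint."""
    return any(
        not (x & y)
        for x in split1
        for y in split2
    )


# Toy alignment with hypothetical taxa; not real Shewanella data.
toy_alignment = {
    "sp1": "MKT--LAVG",
    "sp2": "MKT--LAVG",
    "sp3": "MKTQELAVG",
    "sp4": "MKTQELA-G",
    "sp5": "MKTQELA-G",
}

split_a = indel_split(toy_alignment, 3, 5)   # sp1, sp2 share a two-column gap
split_b = indel_split(toy_alignment, 7, 8)   # sp4, sp5 share a one-column gap

# A hand-built split that conflicts with split_a, as a laterally
# transferred gene might produce.
split_c = (frozenset({"sp2", "sp3"}), frozenset({"sp1", "sp4", "sp5"}))

print(compatible(split_a, split_b))  # both splits fit the same tree
print(compatible(split_a, split_c))  # these two cannot share a tree
```

In this sketch, a split that is incompatible with the majority of other indel splits would be the kind of signal that flags a gene as a lateral transfer candidate; how such conflicts are scored and thresholded in the paper is not stated in the abstract and is left open here.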