Clarke TH, Brinkac LM, Sutton G, Fouts DE
GGRaSP: A R-package for selecting representative genomes using Gaussian mixture models.
Bioinformatics (Oxford, England). 2018-04-14.
The vast number of available sequenced bacterial genomes occasionally exceeds the capacity of comparative genomic methods, or the collection is dominated by a single outbreak strain; in either case, a diverse and representative subset is required. Generating such a reduced subset currently requires a priori supervised clustering followed by sequence-only selection of medoid genomic sequences, independent of any additional genome metrics or strain attributes.