Clarke TH, Brinkac LM, Sutton G, Fouts DE

GGRaSP: A R-package for selecting representative genomes using Gaussian mixture models.

Bioinformatics (Oxford, England). 2018-04-14;

The vast number of available sequenced bacterial genomes occasionally exceeds the capacity of comparative genomic methods or is dominated by a single outbreak strain, so a diverse and representative subset is required. Generating this reduced subset currently requires a priori supervised clustering and sequence-only selection of medoid genomic sequences, independent of any additional genome metrics or strain attributes.
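
To illustrate what sequence-only medoid selection means here, the following is a minimal sketch, not GGRaSP's implementation. It assumes a precomputed pairwise genome distance matrix and cluster labels that were assigned beforehand; the function name `cluster_medoids` and the toy data are hypothetical. For each cluster it picks the genome with the smallest total distance to the other members of that cluster.

```python
def cluster_medoids(dist, labels):
    """Return {cluster label: index of the medoid genome}.

    dist   -- square, symmetric matrix (list of lists) of pairwise distances
    labels -- cluster label for each genome, assumed to be assigned a priori
    """
    medoids = {}
    for lab in sorted(set(labels)):
        members = [i for i, l in enumerate(labels) if l == lab]
        # The medoid minimises the summed distance to all cluster members.
        medoids[lab] = min(
            members, key=lambda i: sum(dist[i][j] for j in members)
        )
    return medoids


# Toy distance matrix for five hypothetical genomes in two clusters.
dist = [
    [0.0, 0.1, 0.2, 0.9, 0.8],
    [0.1, 0.0, 0.1, 0.9, 0.9],
    [0.2, 0.1, 0.0, 0.8, 0.9],
    [0.9, 0.9, 0.8, 0.0, 0.1],
    [0.8, 0.9, 0.9, 0.1, 0.0],
]
labels = ["A", "A", "A", "B", "B"]
medoids = cluster_medoids(dist, labels)
```

Because the choice depends only on sequence distances, attributes such as assembly quality or strain metadata play no role in which genome is picked, which is the limitation the abstract points out.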

PMID: 29668840