Library preparation methodology can influence genomic and functional predictions in human microbiome research
Jones MB, Highlander SK, Anderson EL, Li W, Dayrit M, Klitgord N, Fabani MM, Seguritan V, Green J, Pride DT, Yooseph S, Biggs W, Nelson KE, Venter JC
Observations from human microbiome studies are often conflicting or inconclusive. Many factors likely contribute to these issues, including small cohort sizes and differences in sample collection, handling, and processing. The field of microbiome research is moving from 16S rDNA gene sequencing to a more comprehensive genomic and functional representation through whole-genome sequencing (WGS) of complete communities. Here we performed quantitative and qualitative analyses comparing WGS metagenomic data from human stool specimens using the Illumina Nextera XT and Illumina TruSeq DNA PCR-free kits, and the KAPA Biosystems Hyper Prep PCR and PCR-free systems. Significant differences in taxonomy were observed among the four next-generation sequencing library preparations using a DNA mock community and a cell control of known concentration. We also found biases in error profiles and duplication rates, and a loss of reads representing organisms with high %G+C content, all of which can significantly impact results. As with all methods, the use of benchmarking controls revealed critical differences among methods that affect sequencing results and, in turn, study interpretation. We recommend that the community adopt PCR-free approaches to reduce PCR bias that affects calculations of abundance and to improve assemblies for accurate taxonomic assignment. Furthermore, the inclusion of a known-input cell spike-in control provides accurate quantitation of organisms in clinical samples.
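
To illustrate how a known-input cell spike-in can support quantitation, the following is a minimal sketch and not the procedure used in this study. It assumes that per-taxon read counts are already available and already normalized for genome length, and that reads scale linearly with cell number. The function name `absolute_abundance`, its parameters, and the example taxa and counts are hypothetical.

```python
def absolute_abundance(read_counts, spike_in_name, spike_in_cells):
    """Estimate cells per taxon by scaling reads to a spike-in of known input.

    read_counts    : dict mapping taxon name -> read count (assumed to be
                     genome-length normalized; hypothetical input format)
    spike_in_name  : key in read_counts for the spike-in organism
    spike_in_cells : number of spike-in cells added to the sample
    """
    spike_reads = read_counts[spike_in_name]
    if spike_reads <= 0:
        raise ValueError("spike-in organism has no reads; cannot scale")

    # Cells represented by one read, calibrated on the spike-in organism.
    cells_per_read = spike_in_cells / spike_reads

    # Apply the same factor to every other taxon in the sample.
    return {
        taxon: reads * cells_per_read
        for taxon, reads in read_counts.items()
        if taxon != spike_in_name
    }


# Hypothetical counts, for illustration only.
example_counts = {"spike_in": 1000, "taxon_A": 2000, "taxon_B": 500}
estimates = absolute_abundance(example_counts, "spike_in", 1_000_000)
```

Because every taxon is scaled by the factor calibrated on the spike-in, any library preparation bias that affects the spike-in differently from other organisms (for example, a %G+C-dependent loss of reads) would carry through to the estimates.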