Chromosome-Level Genome Assembly of a Human Fungal Pathogen Reveals Synteny among Geographically Distinct Species
Voorhies M, Cohen S, Shea TP, Petrus S, Muñoz JF, Poplawski S, Goldman WE, Michael TP, Cuomo CA, Sil A, Beyhan S
Histoplasma capsulatum, a dimorphic fungal pathogen, is the most common cause of fungal respiratory infections in immunocompetent hosts. Histoplasma is endemic in the Ohio and Mississippi River Valleys in the United States and is also distributed worldwide. Previous studies have revealed at least eight clades, each specific to a geographic location: North American classes 1 and 2 (NAm 1 and NAm 2), Latin American groups A and B (LAm A and LAm B), Eurasian, Netherlands, Australian, and African, and an additional distinct lineage (H81) composed of Panamanian isolates. Previously assembled genomes are highly fragmented; the highly repetitive G217B (NAm 2) strain, which has been used for most whole-genome-scale transcriptome studies, was assembled into over 250 contigs. In this study, we set out to fully assemble the repeat regions and characterize the large-scale genome architecture of Histoplasma species. We resequenced five strains (WU24 [NAm 1], G217B [NAm 2], H88 [African], G186AR [Panama], and G184AR [Panama]) using Oxford Nanopore Technologies long-read sequencing technology. Here, we report chromosomal-level assemblies for all five strains, which exhibit extensive synteny among the geographically distant isolates. The new assemblies revealed that a gene encoding a major regulator of morphology and virulence is duplicated in G186AR. In addition, we mapped previously generated transcriptome data sets onto the newly assembled chromosomes. Our analyses revealed that the expression of transposons and transposon-embedded genes is upregulated in yeast phase compared to mycelial phase in the G217B and H88 strains. This study provides an important resource for fungal researchers and further highlights the importance of chromosomal-level assemblies in analyzing high-throughput data sets.
Histoplasma species are dimorphic fungi causing significant morbidity and mortality worldwide. These fungi grow as mold in the soil and as budding yeast within the human host. Histoplasma can be isolated from soil in diverse regions, including North America, South America, Africa, and Europe. Phylogenetically distinct species of Histoplasma have been isolated and sequenced. However, for the commonly used strains, genome assemblies have been fragmented, leading to underutilization of genome-scale data. This study provides chromosome-level assemblies of the commonly used strains using long-read sequencing technology. Comparative analysis of these genomes shows largely conserved gene order within the chromosomes. Mapping existing transcriptome data onto these new assemblies reveals clustering of transcriptionally coregulated genes. The results of this study highlight the importance of obtaining chromosome-level assemblies in understanding the biology of human fungal pathogens.