Genome Sequencing of Clinically Important Strains of Entamoeba histolytica

The protozoan parasite Entamoeba histolytica causes invasive intestinal and extraintestinal infections in about 50 million people worldwide, resulting in up to 100,000 deaths annually. It remains a significant cause of human death in developing countries such as Bangladesh and Vietnam. However, four out of five E. histolytica infections remain asymptomatic. What determines the outcome of an E. histolytica infection is largely unknown.

The DNA content of E. histolytica grown in axenic cultures (in the absence of bacteria) is at least 10-fold higher than in xenic cultures. In turn, re-growth of axenized parasites in the presence of bacteria reduces the DNA content to the original xenic values. A similar reversible increase in genomic DNA content has been described during passage from cyst to trophozoite in E. invadens, a model organism for encystation. Evidence suggests that this variation in genome content results from the accumulation of multiple copies of the genome (polyploidy), although expansion or contraction of specific genomic regions, or other rearrangements, remain possible.

To date, only one genome sequence of an E. histolytica strain (HM-1:IMSS) is available, and that genome remains fragmented, with many gaps unclosed. This project involves sequencing 12 representative strains from 3 clinical groups:

  1. asymptomatic,
  2. diarrhea/dysentery, and
  3. amoebic liver abscess.

The strain set includes paired groups: axenic vs. xenic strains, and cyst-derived vs. trophozoite-derived strains.

The selection of these 12 strains will provide a much broader representation of species variation. The use of next-generation sequencing technologies (454 and Illumina) will allow sequencing of challenging regions of the genome that have not yet been adequately sequenced. The goal of this project is to identify genomic regions that are responsible for the variable outcome of infection, and to contribute to the overall improvement and closure of the current reference genome.

White Paper Access

The initial white paper submitted can be downloaded here. Because white papers are not always approved exactly as submitted, this document may not describe the project in its final form. Please contact us if you have any questions.


This project has been funded in whole or in part with federal funds from the National Institute of Allergy and Infectious Diseases, National Institutes of Health, Department of Health and Human Services under contract numbers N01-AI30071 and/or HHSN272200900007C.


William A. Petri Jr., PhD
Professor, University of Virginia

Ibne Karim M. Ali, PhD
University of Virginia


Related Research