Award #0604966 Award #0821966
The Medicago truncatula sequencing project was initiated with
a generous grant from the Samuel Roberts Noble Foundation
to the University of Oklahoma.
Beginning in 2003 (and renewed in 2006), the National Science Foundation and the European Union's
Sixth Framework Programme provided funding to complete sequencing of the euchromatic genespace.
Among the eight chromosomes in Medicago, six were sequenced in the US by the NSF-funded projects,
one (chromosome 5) was sequenced in France by Genoscope with funding from the EU and INRA, and
one (chromosome 3) was sequenced in the United Kingdom with funding from the EU and BBSRC.
Pseudomolecule assembly was greatly aided by the construction of an Optical Map by David Schwartz and colleagues at the Laboratory for Molecular and Computational Genomics at UW-Madison. Genome annotation was carried out by the International Medicago Genome Annotation Group (IMGAG), which involved participants from INRA-CNRS, JCVI/TIGR, NCGR, MIPS, MPIZ, UMN and VIB-Gent. With the publication of the genome paper in Nature in November 2011, the genome project was essentially completed.
At JCVI, we have continued to curate and improve the M. truncatula genome sequence and annotation. We have combined next-generation sequence data with the previous BAC-based assemblies to produce the current Mt4.0 release. Pseudomolecules were constructed from ALLPATHS-LG scaffolds on the basis of alignments to both the Optical and Genetic (Genotyping-by-Sequencing) maps. Where possible, high-quality contiguous BAC sequences were patched into the new pseudomolecules to close sequencing gaps in the ALLPATHS-LG assembly. Whereas the Mt3.5 release consisted of ~250 Mb in pseudomolecules and ~100 Mb of unanchored sequence, the Mt4.0 pseudomolecules now encompass approximately 360 Mb of sequence spanning 390 Mb, of which ~330 Mb aligns accurately with the Optical Map. Most of the sequences and genes that were previously in the unanchored portion of Mt3.5 have now been incorporated into the Mt4.0 pseudomolecules, leaving only ~20 Mb of unplaced sequence.
The new pseudomolecules were annotated by an in-house pipeline that combined Mt3.5 gene models and predictions from Augustus and FGENESH with expression data and protein matches, primarily using EVidenceModeler (EVM). Medtr identifiers have been preserved between Mt3.5 and Mt4.0. Many new identifiers have been created to replace the gene identifiers previously found on the unanchored contigs. Lookup tables are provided to allow easy navigation between the two data sets.
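
As a rough illustration of how such a lookup table might be used, the sketch below loads an identifier mapping into a dictionary and translates an old identifier to its new one. The table layout (two tab-delimited columns, old identifier then new identifier, with optional "#" comment lines) is an assumption made for this example and may not match the distributed files. The function name and the identifiers in the sample data are hypothetical placeholders.

```python
import csv
import io


def load_id_map(handle):
    """Read an assumed two-column, tab-delimited table: old_id <TAB> new_id."""
    mapping = {}
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank lines, comment lines, and rows without two columns.
        if len(row) < 2 or row[0].startswith("#"):
            continue
        old_id, new_id = row[0].strip(), row[1].strip()
        mapping[old_id] = new_id
    return mapping


# Made-up rows for demonstration only; these are not real gene identifiers.
EXAMPLE_TABLE = (
    "#old_release_id\tnew_release_id\n"
    "old_gene_A\tnew_gene_A\n"
    "old_gene_B\tnew_gene_B\n"
)

id_map = load_id_map(io.StringIO(EXAMPLE_TABLE))

# Translate identifiers, falling back to a marker when no mapping exists.
for query in ("old_gene_A", "old_gene_C"):
    print(query, "->", id_map.get(query, "no mapping found"))
```

To read a real table, pass an open file handle in place of the in-memory string, and adjust the column positions if the actual layout differs from the one assumed here.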