Award #0321460, Award #0604966, Award #0821966
Gene Families
Two methods were used to cluster proteins into families. The gene set used excludes genes classified as transposable elements (TEs).
TribeMCL clusters proteins into families by first running an all-versus-all BLAST search, then converting the pairwise similarity scores into a matrix that is partitioned with the Markov Cluster (MCL) algorithm. The ten largest families are:
| Family ID | Num. Family Members | Name |
|---|---|---|
| 1 | 861 | Receptor-like protein kinase |
| 2 | 840 | Disease resistance-like protein |
| 3 | 732 | Unknown protein/Cytochrome P450 |
| 4 | 564 | Leucine Rich Repeat family protein |
| 5 | 335 | Helicase-like protein |
| 6 | 331 | Cytochrome P450 |
| 7 | 325 | NBS-LRR disease resistance protein |
| 8 | 317 | Pentatricopeptide repeat-containing protein |
| 9 | 283 | Pentatricopeptide repeat-containing protein |
| 10 | 237 | 1-aminocyclopropane-1-carboxylate oxidase |

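
The matrix-based clustering step used by TribeMCL can be illustrated with a toy version of Markov clustering. The sketch below is not the TribeMCL code itself: the function names, the `inflation` default, and the `toy_scores` matrix are assumptions made for illustration, and a real run would derive the similarity scores from BLAST output (for example from E-values) rather than from hand-written numbers.

```python
# Toy Markov clustering (MCL) on a small, symmetric similarity matrix.
# All names and values here are hypothetical; this is a sketch, not TribeMCL.

def normalize_columns(m):
    """Scale each column so that it sums to 1 (column-stochastic matrix)."""
    n = len(m)
    out = [row[:] for row in m]
    for j in range(n):
        total = sum(m[i][j] for i in range(n))
        if total > 0:
            for i in range(n):
                out[i][j] = m[i][j] / total
    return out

def expand(m):
    """Expansion step: square the matrix (two-step random walks)."""
    n = len(m)
    return [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def inflate(m, r):
    """Inflation step: raise entries to the power r, then renormalize."""
    return normalize_columns([[v ** r for v in row] for row in m])

def mcl(similarity, inflation=2.0, iterations=50, eps=1e-6):
    """Return clusters as sorted lists of row indices."""
    n = len(similarity)
    # Add self-loops so every node keeps some weight on itself.
    m = [[similarity[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    m = normalize_columns(m)
    for _ in range(iterations):
        new = inflate(expand(m), inflation)
        delta = max(abs(new[i][j] - m[i][j])
                    for i in range(n) for j in range(n))
        m = new
        if delta < eps:
            break
    # Each row with remaining weight lists the members of one cluster.
    clusters = set()
    for i in range(n):
        members = frozenset(j for j in range(n) if m[i][j] > eps)
        if members:
            clusters.add(members)
    return sorted(sorted(c) for c in clusters)

# Hypothetical scores: proteins 0-2 resemble each other, as do proteins 3-4.
toy_scores = [
    [0.0, 1.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 0.0],
]

if __name__ == "__main__":
    print(mcl(toy_scores))
```

Higher `inflation` values tend to produce smaller, tighter clusters, which is the usual knob for trading family size against specificity in this kind of clustering.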
The JCVI Paralogous Families pipeline clusters proteins into families based on domain composition. Domains are identified first by HMM search and then by BLAST homology, and all members of a family share the same domain architecture. The ten largest families are:
| Family ID | Num. Family Members | Name |
|---|---|---|
| 1 | 562 | Unknown protein |
| 2 | 304 | Receptor-like protein kinase |
| 3 | 247 | Unknown protein |
| 4 | 168 | F-box/kelch-repeat protein |
| 5 | 155 | Helicase-like protein |
| 6 | 150 | Unknown protein |
| 7 | 150 | CCP |
| 8 | 132 | Unknown protein |
| 9 | 128 | Pentatricopeptide repeat-containing protein |
| 10 | 121 | Pentatricopeptide repeat-containing protein |
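
Grouping by domain architecture, as described for the JCVI pipeline, can be sketched as follows. This is only an illustration of the grouping idea, not the pipeline's actual code: the protein IDs are hypothetical, the domain assignments are invented, and treating an architecture as the ordered tuple of domain names is an assumption.

```python
from collections import defaultdict

def group_by_architecture(domain_hits):
    """Group proteins whose ordered domain lists are identical.

    domain_hits maps a protein ID to its list of domain names, ordered
    from N-terminus to C-terminus (an assumed input format).
    Returns (architecture, members) pairs, largest family first.
    """
    families = defaultdict(list)
    for protein_id, domains in domain_hits.items():
        families[tuple(domains)].append(protein_id)
    return sorted(
        ((arch, sorted(members)) for arch, members in families.items()),
        key=lambda item: (-len(item[1]), item[0]),
    )

# Hypothetical domain assignments, e.g. as parsed from HMM search results.
toy_hits = {
    "protA": ["LRR", "Pkinase"],
    "protB": ["LRR", "Pkinase"],
    "protC": ["F-box", "Kelch"],
    "protD": ["Pkinase"],
    "protE": ["LRR", "Pkinase"],
}

if __name__ == "__main__":
    for arch, members in group_by_architecture(toy_hits):
        print(len(members), "-".join(arch), members)
```

Keying on the full ordered tuple means a protein with only a kinase domain is kept separate from one with both LRR and kinase domains, which matches the idea that family members contain the same architecture rather than merely sharing one domain.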

