|Description||Properties of organisms whose genomes have been sequenced. These properties
include those whose state can be supported by inference from the genomic data
itself as well as those supported by experimental observations.
Properties are organized in a type of hierarchy called a Directed Acyclic
Graph (a DAG) in which child terms may have multiple parent terms. Thus,
for instance, a property for the biosynthesis of a particular cofactor may
be found as a child of both "biosynthesis" and "protein cofactors".
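The multiple-parent arrangement described above can be sketched in a few lines. This is only an illustration: the term "cofactor X biosynthesis" is a hypothetical placeholder, and the dictionary layout and the `ancestors` helper are assumptions rather than the database's actual schema.

```python
# Hypothetical child -> parents mapping; a term may list several parents,
# which is what distinguishes a DAG from a simple tree.
PARENTS = {
    "cofactor X biosynthesis": ["biosynthesis", "protein cofactors"],
    "biosynthesis": [],
    "protein cofactors": [],
}


def ancestors(term, parents=PARENTS):
    """Return every term reachable by following parent links upward."""
    seen = set()
    stack = list(parents.get(term, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


print(sorted(ancestors("cofactor X biosynthesis")))
```

Because the child term lists two parents, a query starting from either "biosynthesis" or "protein cofactors" would reach it.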
Properties come in several types: PATHWAY properties describe metabolic
pathways in which a series of enzymes convert substrates into products to
produce desirable substances and/or degrade undesirable ones. For each pathway, a set of
these steps has been determined to be both 1) unambiguously indicative of
the property and 2) as universally detectable as is practical among the
species believed to have this pathway active. The presence or absence of
these steps in a genome is used to judge the status of the property.
Those genomes with all of the set present are given the "YES" state, and those
with none are given the "NONE FOUND" state. Each pathway may be given
a threshold number of detected steps below which the state "NOT SUPPORTED"
is assigned. Otherwise, when some but not all steps are detected, the state
"INCOMPLETE" is assigned. Under
certain circumstances of additional evidence (such as the presence of a
complete alternative pathway) a "NONE FOUND" state may be strengthened to
"NO". SYSTEM properties work in essentially the same way, except that
they need not represent a metabolic pathway. In both of these cases the
"value" field holds the fraction of the state-determining steps that were
found in the genome. MARKER
properties are similar except that only a single type of protein is
detected and serves as an indicator for the presence of a property. Here
the value field holds the number of times the marker was detected in the
genome. In all three of these types, other proteins outside of the set
chosen for determining the state of the property may also be detected.
These may include proteins which are presently detectable only in some
species, auxiliary or variable components, regulatory components, etc.
Accessions of all of the detected proteins are stored for user retrieval.
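The state rules for PATHWAY and SYSTEM properties described above might be sketched as follows. This is a minimal illustration, not the system's actual implementation: the function name, its parameters, and the `strong_negative_evidence` flag are hypothetical, and the order in which the rules are checked (in particular, that zero detected steps yields "NONE FOUND" rather than "NOT SUPPORTED") is an assumption.

```python
def assign_state(found, required, threshold=0, strong_negative_evidence=False):
    """Return (state, value) for a PATHWAY or SYSTEM property.

    found     -- identifiers of steps detected in the genome
    required  -- identifiers of the steps chosen to determine the state
    threshold -- assumed per-property minimum number of detected steps
    strong_negative_evidence -- assumed flag standing in for additional
        evidence, e.g. a complete alternative pathway
    """
    required = set(required)
    if not required:
        raise ValueError("a property needs at least one required step")

    # Only steps in the state-determining set count toward the state.
    detected = len(set(found) & required)
    value = detected / len(required)  # fraction stored in the "value" field

    if detected == len(required):
        state = "YES"
    elif detected == 0:
        state = "NO" if strong_negative_evidence else "NONE FOUND"
    elif detected < threshold:
        state = "NOT SUPPORTED"
    else:
        state = "INCOMPLETE"
    return state, value


print(assign_state({"step1", "step2"}, {"step1", "step2", "step3", "step4"}, threshold=2))
```

Steps detected outside the required set are ignored here, matching the note that additional proteins may be detected without affecting the state.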
LITERATURE properties are set based on experimental evidence, not on the
content of the sequenced genome. Proteins associated with these properties
may be detected and are stored as above. These properties may also store a
number in the value field where appropriate.
QUANTITATIVE properties are calculated from the sequenced genome and store
a numeric quantity in the "value" field.
TAXONOMY properties store information about the taxonomic classification
of the organism. These are based on NCBI's taxonomy database and are
used for organizing the output of properties queries to look for trends
based on phylogeny.
SUMMARY properties are parent properties of one or more related properties
whose states are summarized in the state of the parent property.
CATEGORY properties do not have states, but organize the properties through
|[1] Haft DH, Selengut JD, Brinkac LM, Zafar N, White O. Genome Properties: a system for the investigation of prokaryotic genetic content for microbiology, genome annotation and comparative genomics. Bioinformatics. 2005 Feb 1;21(3):293-306. Epub 2004 Sep 3. PMID 15347579|
© J. Craig Venter Institute