Publication

The Viral MetaGenome Annotation Pipeline (VMGAP): an automated tool for the functional annotation of viral Metagenomic shotgun sequencing data.

In the past few years, the field of metagenomics has been growing at an accelerated pace, particularly in response to advancements in new sequencing technologies. The large volume of sequence data from novel organisms generated by metagenomic projects has triggered the development of specialized databases and tools focused on particular groups of organisms or data types. Here we describe a pipeline for the functional annotation of viral metagenomic sequence data. The Viral MetaGenome Annotation...


Publication

VIGOR extended to annotate genomes for additional 12 different viruses.

A gene prediction program, VIGOR (Viral Genome ORF Reader), was developed at the J. Craig Venter Institute in 2010 and has been successfully performing gene calling in coronavirus, influenza, rhinovirus and rotavirus for projects at the Genome Sequencing Center for Infectious Diseases. VIGOR uses sequence similarity search against custom protein databases to identify protein coding regions, start and stop codons and other gene features. Ribonucleic acid editing and other features are accurately...


Publication

Optimizing read mapping to reference genomes to determine composition and species prevalence in microbial communities.

The Human Microbiome Project (HMP) aims to characterize the microbial communities of 18 body sites from healthy individuals. To accomplish this, the HMP generated two types of shotgun data: reference shotgun sequences isolated from different anatomical sites on the human body and shotgun metagenomic sequences from the microbial communities of each site. The alignment strategy for characterizing these metagenomic communities using available reference sequence is important to the success of HMP...


Publication

Virus pathogen database and analysis resource (ViPR): a comprehensive bioinformatics database and analysis resource for the coronavirus research community.

Several viruses within the Coronaviridae family have been categorized as either emerging or re-emerging human pathogens, with Severe Acute Respiratory Syndrome Coronavirus (SARS-CoV) being the most well known. The NIAID-sponsored Virus Pathogen Database and Analysis Resource (ViPR, www.viprbrc.org) supports bioinformatics workflows for a broad range of human virus pathogens and other related viruses, including the entire Coronaviridae family. ViPR provides access to sequence records, gene and...


Publication

TIGRFAMs and Genome Properties in 2013.

TIGRFAMs, available online at /tigrfams, is a database of protein family definitions. Each entry features a seed alignment of trusted representative sequences, a hidden Markov model (HMM) built from that alignment, cutoff scores that let automated annotation pipelines decide which proteins are members, and annotations for transfer onto member proteins. Most TIGRFAMs models are designated equivalog, meaning they assign a specific name to proteins conserved in function from a common ancestral...


Publication

Next generation sequencing to define prokaryotic and fungal diversity in the bovine rumen.

A combination of Sanger and 454 sequences of small subunit rRNA loci were used to interrogate microbial diversity in the bovine rumen of 12 cows consuming a forage diet. Observed bacterial species richness, based on the V1-V3 region of the 16S rRNA gene, was between 1,903 and 2,432 species-level operational taxonomic units (OTUs) when 5,520 reads were sampled per animal. Eighty percent of species-level OTUs were dominated by members of the orders Clostridiales, Bacteroidales, Erysipelotrichales...


Publication

Whole genome studies of Tetrahymena.

Within the past decade, genomic studies have emerged as essential and highly productive tools to explore the biology of Tetrahymena thermophila. The current major resources, which have been extensively mined by the research community, are the annotated macronuclear genome assembly, transcriptomic data and the databases that house this information. Efforts in progress will soon improve these data sources and expand their scope, including providing annotated micronuclear and comparative genomic...


Publication

AntiFam: a tool to help identify spurious ORFs in protein annotation.

As the deluge of genomic DNA sequence grows, the fraction of protein sequences that have been manually curated falls. In turn, as the number of laboratories with the ability to sequence genomes in a high-throughput manner grows, the informatics capability of those labs to accurately identify and annotate all genes within a genome may often be lacking. These issues have led to fears about transitive annotation errors making sequence databases less reliable. During the lifetime of the Pfam protein...


Publication

Analysis of interspecies adherence of oral bacteria using a membrane binding assay coupled with polymerase chain reaction-denaturing gradient gel electrophoresis profiling.

Information on co-adherence of different oral bacterial species is important for understanding interspecies interactions within the oral microbial community. Current knowledge on this topic is heavily based on pairwise coaggregation of known, cultivable species. In this study, we employed a membrane binding assay coupled with polymerase chain reaction-denaturing gradient gel electrophoresis (PCR-DGGE) to systematically analyze the co-adherence profiles of oral bacterial species, and achieved a more...


Publication

The natural product domain seeker NaPDoS: a phylogeny based bioinformatic tool to classify secondary metabolite gene diversity.

New bioinformatic tools are needed to analyze the growing volume of DNA sequence data. This is especially true in the case of secondary metabolite biosynthesis, where the highly repetitive nature of the associated genes creates major challenges for accurate sequence assembly and analysis. Here we introduce the web tool Natural Product Domain Seeker (NaPDoS), which provides an automated method to assess the secondary metabolite biosynthetic gene diversity and novelty of strains or environments....