JCVI Annotation Service

Submission Guide

Please read the following information before submitting a genome to the Annotation Engine.

Acceptable genomes

The JCVI Annotation Service pipeline was built to process prokaryotic genomes. We are not currently accepting eukaryotic genomes. Viruses can be submitted; however, because our pipeline is optimized for prokaryotes, the results for viruses may not be optimal.

Number of molecules

We accept submissions of five or fewer molecules. If you have contigs or pieces of a molecule, you must combine them into a pseudomolecule.

Pseudomolecules

If your DNA sequence is in several contigs, concatenate the contigs and send us the concatenated file. You can separate the contigs with the following spacer sequence:

NNNNNCACACACTTAATTAATTAAGTGTGTGNNNNN

This spacer places stop codons in all six reading frames, so that no predicted gene can extend across a contig boundary.
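
As a minimal sketch of the concatenation described above, the following Python code joins contig sequences with the spacer and checks which reading frames of a sequence contain a stop codon. The function names (`build_pseudomolecule`, `frames_with_stop`, `reverse_complement`) are hypothetical helpers written for illustration; they are not part of the JCVI pipeline.

```python
# Spacer sequence taken verbatim from the guide.
SPACER = "NNNNNCACACACTTAATTAATTAAGTGTGTGNNNNN"
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase A/C/G/T/N sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def frames_with_stop(seq):
    """Return the frames (1..3 forward, -1..-3 reverse) that contain a stop codon."""
    frames = set()
    for strand, s in ((1, seq), (-1, reverse_complement(seq))):
        for offset in range(3):
            codons = (s[i:i + 3] for i in range(offset, len(s) - 2, 3))
            if any(codon in STOP_CODONS for codon in codons):
                frames.add(strand * (offset + 1))
    return frames


def build_pseudomolecule(contigs, spacer=SPACER):
    """Join contig sequences into one pseudomolecule, separated by the spacer."""
    return spacer.join(contig.upper() for contig in contigs)
```

Calling `frames_with_stop(SPACER)` is one way to confirm the six-frame property for yourself before building the file you send.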

Unfinished genomes/reannotation

You may submit unfinished genomes to the JCVI Annotation Service (see the pseudomolecule information above). Once the genome is complete, you can resubmit it for final annotation. However, we ask that you submit the same genome at most two times. If you resubmit a genome, all automated annotation from the first submission will be deleted and completely regenerated. We are not able to copy any manual annotation that you might have completed on the first submission onto the second submission's automated annotation.

File format

Please make sure your FASTA file is saved as plain text, not as a Word document (or any other word processor file type).

Besides A, T, G, and C, the only acceptable character in your sequence is N. Please remove or replace any X's, spaces, or other characters in your sequence. Characters other than A, T, G, C, or N will delay the return of your data.
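
The character rule above can be checked before submission. The sketch below is a hypothetical helper, not a JCVI tool: it scans FASTA text and reports, per record, any characters outside A, T, G, C, and N. It treats lowercase letters as equivalent to uppercase, which is an assumption, since the guide does not say whether lowercase is accepted.

```python
ALLOWED = set("ATGCN")


def find_invalid_characters(fasta_text):
    """Map each FASTA header to the sorted list of disallowed characters found."""
    problems = {}
    header = None  # stays None for sequence lines that precede any header
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            header = line[1:].strip()
            continue
        # Assumption: case is ignored; spaces and all other symbols are flagged.
        bad = set(line.upper()) - ALLOWED
        if bad:
            problems.setdefault(header, set()).update(bad)
    return {name: sorted(chars) for name, chars in problems.items()}
```

An empty result means every sequence line contains only the accepted characters.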

Limits on submissions

We do not currently limit the number of genomes that a user may submit to the JCVI Annotation Service. However, we request that you do not submit a large number of genomes (more than three or four) all at once. Please note that if you submit more than two genomes, it will take longer to get your data.

Amount of time to finish your genome

The whole JCVI Annotation Service process usually takes four weeks. You will then be emailed your results.

Data confidentiality

The generated data will be stored in a database that is accessible only to the database administrators and the JCVI Annotation Service team. We will not put the data on the Comprehensive Microbial Resource (CMR) or make it public in any other way until the genome has been completed, published, and submitted to GenBank. At that point we will delete whatever we have in our database and repopulate it using the data from the public GenBank files. The data will then appear on the CMR a few months later. If your genome is never submitted to GenBank, it will not appear on the CMR unless we have your permission. The Annotation Engine data we generate will be delivered to you through a private FTP site, which will be deleted one month after delivery of your automated annotation.

Data supplied to the Annotation Engine user

Your data will be given back to you in three formats:

  1. A MySQL database plus associated files, which can be used with TIGR's manual annotation tool Manatee.
  2. A GenBank-style file of your JCVI Annotation Service data. This can be used with other tools such as Sanger's Artemis.
  3. A tab-delimited file of the JCVI Annotation Service data. Open this file in Excel or another spreadsheet program to view a summary of your data.

Annotation training

JCVI also offers the Prokaryotic Annotation and Analysis course. Although it is not required, we highly recommend that researchers who submit genomes to the JCVI Annotation Service attend this course to acquaint themselves with the process of prokaryotic annotation as done at JCVI. The course is offered four times a year and gives detailed instruction on JCVI's annotation pipeline, JCVI's manual annotation tool Manatee, and the use of JCVI's Comprehensive Microbial Resource (CMR).

More details on the JCVI Annotation Service

Further information and documentation on the JCVI Annotation Service are available here.

Data submission

If you are ready to submit a genome, click here to fill out the submission form.