NCBI WGS/NT/ENV-NT Databases
13.3 years ago
Lythimus ▴ 210

I am currently BLASTing against NCBI's NT database, but I am considering also using WGS and ENV-NT. I was given the impression that WGS was populated by pulling sequences from ENV-NT when they could be definitively assigned to a specific organism, but after looking at the file sizes it seems to be the reverse. Could someone clearly explain the differences between NT, ENV-NT and WGS, and maybe give an example of when I would, and possibly wouldn't, want to use specific databases or sets of databases?

Just assume whatever domain you are most familiar with and use it in your examples, please.

13.3 years ago
Alex ★ 1.5k

There is a useful page describing the NCBI databases. In short:

nt contains all GenBank + RefSeq nucleotide + EMBL + DDBJ + PDB sequences (excluding HTGS phases 0, 1 and 2, EST, GSS, STS, PAT and WGS). It is no longer "non-redundant". Because WGS is excluded, a search against nt alone will not hit WGS contigs.

wgs is a collection of partially assembled sequences from the genome centers. These are contigs assembled directly from whole genome shotgun sequencing.

env_nt contains DNA sequenced directly from environmental samples, with all organisms mixed together (e.g. the Sargasso Sea and Mine Drainage projects).

Additionally, check:

est contains sequence data and other information on "single-pass" cDNA sequences, or "Expressed Sequence Tags". More here http://www.ncbi.nlm.nih.gov/pubmed/8401577?dopt=Abstract

htgs - unfinished High Throughput Genomic Sequences: phases 0, 1 and 2 (finished, phase 3 HTG sequences are in nt). About phases here.

gss - Genome Survey Sequence, includes single-pass genomic data, exon-trapped sequences, and Alu PCR sequences. More here.

sts - contains sequence data for short genomic landmark sequences or Sequence Tagged Sites. More here
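Since nt, wgs and env_nt do not overlap, you may want to search more than one of them in a single run. Here is a minimal sketch of how that could look with BLAST+, assuming `blastn` is installed and the databases are available locally under the names `nt` and `env_nt`. The function name, file names and E-value cutoff are made up for illustration. BLAST+ takes several databases as one space-separated `-db` value.

```python
import shlex


def build_blastn_command(query, databases, out_path, evalue=1e-5):
    """Return a blastn argument list that searches several databases at once.

    The helper itself is hypothetical; only the blastn flags are real.
    """
    if not databases:
        raise ValueError("at least one database name is required")
    return [
        "blastn",
        "-query", query,
        # Multiple databases go into ONE argument, separated by spaces.
        "-db", " ".join(databases),
        "-evalue", str(evalue),
        "-outfmt", "6",  # tabular output
        "-out", out_path,
    ]


# Hypothetical file names; adjust to your own data.
cmd = build_blastn_command("reads.fasta", ["nt", "env_nt"], "hits.tsv")
print(shlex.join(cmd))
```

To actually run the search you could pass `cmd` to `subprocess.run(cmd, check=True)`. Keep in mind that adding databases enlarges the search space, so E-values for the same alignment will change.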


All WGS sequences: yes (they come from genome projects). ENV_NT sequences: usually not.
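If you want to check this on your own hits, one option is to ask BLAST+ for the subject taxonomy IDs and tally them. This is only a sketch: it assumes the search was run with `-outfmt "6 qseqid sseqid pident evalue bitscore staxids"`, that the taxonomy column is the last one, and that hits without taxonomy show up as `N/A`. The function name and the example rows are invented.

```python
from collections import Counter


def count_hits_by_taxid(lines):
    """Tally tabular BLAST hits by the taxid in the last column.

    Assumes the last tab-separated field is 'staxids', which may hold
    several IDs separated by ';' or 'N/A' when no taxonomy is attached.
    """
    counts = Counter()
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        taxids = line.split("\t")[-1]
        if taxids in ("", "N/A"):
            counts["unclassified"] += 1
            continue
        for taxid in taxids.split(";"):
            counts[taxid] += 1
    return counts


# Invented rows, for illustration only.
example = [
    "read1\tsubjA\t98.5\t1e-50\t200\t9606",
    "read2\tsubjB\t91.0\t1e-20\t110\tN/A",
    "read3\tsubjC\t95.2\t1e-30\t150\t562;83333",
]
print(count_hits_by_taxid(example))
```

Note that an environmental hit may still carry a taxid; it will typically point to a generic metagenome entry rather than a specific organism.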


This may be a silly question: are sequences in WGS and ENV_NT taxonomically classified (as in NT), or are they associated only with the environment from which they were harvested?
