Genome Reference For BAM Files In Ensembl Vs UCSC?
12.9 years ago

What is the recommended reference sequence ID format for generating BAM files that will be attached remotely to UCSC and Ensembl? From what I've seen so far, it seems one likes to have 'chr'N sequence IDs, and the other doesn't want 'chr' in front of the sequence ID.

bam ensembl ucsc reference next-gen sequencing • 3.8k views
12.9 years ago

I like having the 'chr' prefix because it makes it easier for me to clearly identify the chromosome column, to join the information with the UCSC data, to 'grep' a chromosome, etc.

12.9 years ago

As far as I know, there is no official recommendation for any particular contig ID format; it's just a matter of being consistent throughout your entire pipeline.

Take our own experience, for instance: we work with BED files from Agilent that define our resequencing regions using UCSC's nomenclature (chrN), and afterwards we use annotation tools that rely on the same naming. So when we built our reference multi-FASTA sequence from UCSC's FASTA contigs, we kept their nomenclature.

I guess you just have to decide which convention helps you the most, weighing the advantages and disadvantages of each, as Pierre did.
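
To make the "be consistent" point concrete, here is a minimal sketch of a check that every sequence name used in a BED file also exists in the reference FASTA. The function names (`fasta_names`, `bed_names`, `missing_from_reference`) are made up for illustration, and the sketch assumes the files are small enough to pass around as lists of lines; it is not tied to any particular tool.

```python
def fasta_names(lines):
    """Collect sequence IDs from FASTA header lines such as '>chr1 some description'."""
    names = set()
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:  # skip a bare '>' with no ID
                names.add(fields[0])
    return names


def bed_names(lines):
    """Collect the first column of a BED file, skipping blank and header lines."""
    names = set()
    for line in lines:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        names.add(line.split()[0])
    return names


def missing_from_reference(bed_lines, fasta_lines):
    """Return BED sequence names that are absent from the reference, sorted."""
    return sorted(bed_names(bed_lines) - fasta_names(fasta_lines))
```

For real files you would pass open file handles instead of lists, e.g. `missing_from_reference(open("regions.bed"), open("ref.fa"))` (file names hypothetical). A non-empty result usually means the two files follow different naming conventions.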

12.9 years ago

We should probably ask both UCSC and the Ensembl website to offer the option of using both. Or perhaps someone could come up with a way of adding synonyms when indexing the sequence, so that you can use several synonyms for the seq_id when using tabix or custom tracks in a genome browser.

I am using BAM files produced by different labs, and the seq_ids for the reference come as chr[0-9XY], [0-9XY] and even Chr[0-9XY], so I have three different references for use with tabix and other tools.

I also prefer to have a 'chr' prefix for the FASTA, as long as I don't need to sort the multi-FASTA entries by chromosome ;-)
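
In the meantime, the three spellings can be mapped onto one convention before comparing files. The following is only a rough sketch: the function names are hypothetical, it covers the main chromosomes only (not unplaced scaffolds or alternate contigs), and the mitochondrial mapping (chrM in UCSC style, MT in Ensembl style) is an addition on my part that is not among the IDs listed above.

```python
import re

# Matches a leading 'chr' in any capitalisation ('chr', 'Chr', 'CHR').
_PREFIX = re.compile(r"^chr", re.IGNORECASE)


def to_ensembl_style(seq_id):
    """Strip the prefix: 'chr1', 'Chr1' or '1' all become '1'; 'chrM' becomes 'MT'."""
    bare = _PREFIX.sub("", seq_id)
    return "MT" if bare.upper() == "M" else bare


def to_ucsc_style(seq_id):
    """Add a lowercase 'chr' prefix: '1' or 'Chr1' become 'chr1'; 'MT' becomes 'chrM'."""
    bare = to_ensembl_style(seq_id)
    return "chrM" if bare == "MT" else "chr" + bare
```

Note that this only rewrites names. It does not guarantee that the underlying sequences are identical between two references, so it is worth checking sequence lengths as well before mixing files.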
