Chromosome Names in genome are incompatible with annotations
7.1 years ago
serpalma.v ▴ 80

Dear community,

While creating an index for the bovine genome with STAR, the process fails because the chromosome names in the annotation file (Bos_taurus.UMD3.1.87.gtf) do not match those in the reference file (UMD3.1_chromosomes.fa). For example, chromosome 10 is named "10" in the annotation but "gnl|UMD3.1|GK000010.2 Chromosome 10 AC_000167.1" in the reference; both should be "10".

Apparently, the solution is to change the names in the reference file. Could you suggest a tool or a one-liner that converts these names into the plain chromosome numbers?

Also, would renaming the chromosomes affect downstream processing of my results?

I have searched through other threads and couldn't find a better answer than the one given in "Renaming Entries In A Fasta File", but that approach renames the chromosomes in the reference file based on the order in which they appear.

Cheers!

STAR alignment • 1.5k views
sed 's/gnl|UMD3\.1|GK000010\.2 Chromosome 10 AC_000167\.1/10/' UMD3.1_chromosomes.fa
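
To rename every chromosome in one pass instead of one header at a time, a regular expression can capture whatever follows "Chromosome " in each header. This is only a minimal sketch: it assumes all headers in UMD3.1_chromosomes.fa follow the same pattern as the chromosome 10 example, which should be checked first (for instance with `grep '^>'`). The function name `rename_fasta_headers` and the output file name are hypothetical.

```shell
# Hypothetical helper: reads FASTA on stdin, writes FASTA on stdout.
# Header lines of the form ">... Chromosome <name> ..." become "><name>".
# Headers without "Chromosome " and all sequence lines pass through unchanged.
rename_fasta_headers() {
    sed -E 's/^>.*Chromosome ([^ ]+).*/>\1/'
}

# Example usage (the output file name is an assumption):
# rename_fasta_headers < UMD3.1_chromosomes.fa > UMD3.1_chromosomes.renamed.fa
```

Only the header lines are rewritten, so the sequences themselves stay the same. It is worth comparing the new names against the first column of the GTF file before building the index.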