Tool for replacing variant bases with the reference in aligned reads, i.e., anonymizing BAM files?
7.8 years ago

Hi, I am looking for a tool that takes a BAM file with aligned reads and the reference genome as input, and outputs a BAM file in which every variant (non-reference base call) is replaced with the reference base, while the alignment itself is kept.

I need this to de-sensitize BAM files so that I am allowed to distribute them more freely, typically in troubleshooting situations where the alignment matters more than the variants.

I was not able to find such a tool, or such an option in the tools I am familiar with. While writing this I realized it might be trickier than I thought: what should be done with indels?
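To make the indel question concrete, here is a minimal sketch of the per-read logic in plain Python. It is not an existing tool: the function name `mask_read`, the choice to overwrite inserted and soft-clipped bases with `N`, and passing the reference as a plain string are all assumptions made for illustration. Aligned bases (CIGAR `M`, `=`, `X`) are copied from the reference, deletions and skips only advance the reference position, and the CIGAR string itself is left unchanged.

```python
import re

# One CIGAR element: a length followed by an operation code.
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def mask_read(seq, cigar, pos, ref, fill="N"):
    """Return `seq` with sample-specific bases removed (hypothetical helper).

    seq   : read sequence as stored in the alignment record
    cigar : CIGAR string, e.g. "3M1I2M2D2M"
    pos   : 0-based leftmost reference position of the alignment
    ref   : reference sequence of the contig, as a plain string
    fill  : base written for inserted and soft-clipped positions (assumption)
    """
    out = []
    read_i = 0   # how many read bases have been consumed
    ref_i = pos  # current position on the reference
    for length, op in CIGAR_RE.findall(cigar):
        n = int(length)
        if op in "M=X":
            # Aligned block: copy the reference, dropping any mismatches.
            out.append(ref[ref_i:ref_i + n])
            read_i += n
            ref_i += n
        elif op in "IS":
            # Inserted or soft-clipped bases have no reference counterpart.
            out.append(fill * n)
            read_i += n
        elif op in "DN":
            # Deletion or skipped region: nothing stored in the read.
            ref_i += n
        # H and P consume neither read nor reference.
    if read_i != len(seq):
        raise ValueError("CIGAR does not match the read length")
    return "".join(out)


ref = "ACGTACGTACGT"
print(mask_read("GTTACGCA", "3M1I2M2D2M", 2, ref))  # GTANCGCG
```

Note that even with this approach the CIGAR still shows where insertions and deletions are, so indels are hidden in content but not in position. Optional tags such as MD and NM describe mismatches as well and would need to be dropped or recomputed.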

Feedback appreciated.

RNA-Seq • 1.5k views

It sounds similar to "Create a dummy bam file from a bed coordinates and ref fasta.", where the BED coordinates can be obtained from the existing BAM file.

It is not only about indels: what will you do with the base quality scores when you replace a base with the reference? Especially when you encounter an 'N' in a read.
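One possible answer, sketched here purely as an assumption and not as an established practice, is to overwrite every quality in the read with the same value, so that replaced positions (including former 'N' calls) cannot be told apart from untouched ones. The helper name `flat_quality_string` and the default of Phred 30 are made up for illustration; the encoding is the usual Phred+33 text form.

```python
def flat_quality_string(read_length, phred=30):
    """Return a Phred+33 quality string with one constant score (hypothetical helper)."""
    if not 0 <= phred <= 93:
        raise ValueError("Phred score must fit the printable Phred+33 range")
    return chr(phred + 33) * read_length


print(flat_quality_string(8))  # ????????
```

Keeping the original qualities only at unchanged positions would reveal which bases were edited, which is why the sketch flattens the whole read.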


After mapping, the base quality doesn't really matter anymore, assuming you are only going to use the edited BAM file for differential expression analysis...
