Extract read names from BAM files
5.1 years ago
Mathias_H ▴ 20

Hello everyone, I need to extract the names of the reads in a BAM file. The result should be a text file containing all the read names. I did not find any command in samtools or Picard that does this. I am working in R, so ideally the solution could be integrated into an R pipeline.

Can anyone help me?

RNA-Seq R • 8.7k views
5.1 years ago
GenoMax 141k

Not a solution in R but this should do it:

samtools view your.bam | cut -f1 | sort | uniq > read_names
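
Some context on why the deduplication step matters: in paired-end data both mates share a read name, and a read with secondary or supplementary alignments appears on several lines, so column 1 of the `samtools view` output can contain repeats. Below is a minimal sketch that uses mock tab-separated records in place of real `samtools view` output (the `mock_sam` function, read names, and field values are made up for illustration). It also uses `sort -u`, which should behave like `sort | uniq` here.

```shell
# Mock SAM-like records standing in for `samtools view your.bam` output.
# All names and values here are hypothetical.
mock_sam() {
  printf 'readB\t99\tchr1\t100\n'
  printf 'readB\t147\tchr1\t250\n'
  printf 'readA\t0\tchr2\t500\n'
}

# Keep column 1 (the read name), then sort and drop duplicates in one step.
mock_sam | cut -f1 | sort -u
```

With real data, replacing `mock_sam` with `samtools view your.bam` and redirecting to a file gives the same result as the command above.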

Thank you! Looks like it worked.

5.1 years ago
Asaf 10k

samtools view file.bam | cut -f 1 ?

5.1 years ago

I found this on Stack Exchange; it is faster because it avoids sorting:

samtools view mine.bam | cut -f 1 | awk '!x[$0]++' > read.names.txt
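
For anyone wondering how the `awk '!x[$0]++'` part works: `x` is an associative array keyed on the whole line. The first time a line is seen its count is 0, so `!x[$0]` is true and the line is printed; the `++` then bumps the count so later copies are skipped. The output keeps first-seen order rather than sorted order, and awk holds every distinct name in memory. A small sketch with made-up read names (`mock_names` is a hypothetical stand-in for `samtools view mine.bam | cut -f 1`):

```shell
# Hypothetical read names with repeats, standing in for
# `samtools view mine.bam | cut -f 1`.
mock_names() {
  printf 'readB\nreadA\nreadB\nreadC\nreadA\n'
}

# Print a line only the first time it appears; x[$0] counts earlier occurrences.
mock_names | awk '!x[$0]++'
```

This should print readB, readA, readC, one per line, in that order.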

Please add a link to the Stack Exchange post.

5.1 years ago

If your BAM file is small enough to load into memory, you can do this in R:

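library(Rsamtools)
# Load only the read-name field to limit memory use
bam <- scanBam("input.bam", param = ScanBamParam(what = "qname"))
# Write the unique read names to a text file, one per line
writeLines(unique(bam[[1]]$qname), "read_names.txt")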

For larger files, the command-line answers are preferable.
