Get Special Mutation Site'S Special Allele Reads
4
0
Entering edit mode
11.4 years ago

I have done GATK's UnifiedGenotyper and get some variations, One of these variations is really the one I'm interested in, that's a T>G mutation(T is the reference base, G is the query reads base), T:6890, while G gets 162.

I want to get the 162 reads with G on the mutation site, how can I done this, I have the vcf files and bam files which were generated by GATK. could any one kindly help me? thank you!

gatk • 2.2k views
ADD COMMENT
1
Entering edit mode
11.4 years ago

You can specify the region to output using samtools view reads.bam chrN:NNNNNNN-NNNNNNN. Just make sure you are using the correct format for the chromosome name (it may be chrN or just N, where N is the chromosome number). Send the samtools output to a file using stdout redirection, and you're almost there.

ADD COMMENT
1
Entering edit mode

how will he only get the 162 reads with G on the mutation site using samtools view ?

ADD REPLY
1
Entering edit mode

That's why you're "almost there". I was going to suggest using grep 'NNGN' on the output of the reads, but that is not guaranteed to be specific, or match all possible instances of this site. Your solution is clearly appropriate if all the OP wants to do is visually inspect the reads, but I suspect that "get the 162 reads" means actually getting the reads for a further operation.

ADD REPLY
0
Entering edit mode

Yes, with your help, I 'm almost there, I can write some scripts to check the special site's base to determine whether to report that read, but that would make me to spend lots of time, if there were some tools that could do this would be better.

ADD REPLY
0
Entering edit mode

You might use samtools mpileup to get base calls and quality scores for that site, but I'm not exactly sure what information you would like to "determine whether to report that read".

ADD REPLY
0
Entering edit mode
11.4 years ago

you can visually get those reads using IGV.

ADD COMMENT
0
Entering edit mode
11.4 years ago

You could grep the fastq for that sequence. You will only get the reads that perfectly match your grep sequence, but if a lot of the reads that have the alternate allele have other differences, that's worth knowing too.

ADD COMMENT
0
Entering edit mode
11.4 years ago

Another answer: I wrote a java tool for another question

position of mismatches per read from a sam/bam file

It might answers your needs.

ADD COMMENT

Login before adding your answer.

Traffic: 1762 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6