Hi, I ran the following code in samtools version 1.3.1 and I am confused by the output, leaving me a little worried that I have missing data. So here's hoping someone can prove me wrong or right!
----------
samtools mpileup -q 250 -Q 20 -f dmel-all-chromosome-r6.01.fasta -b Sample1Sample2.txt | gzip > Sample1Sample2.mpileup.gz
----------
I set -q 250 because the RNA aligner I used, STAR, sets mapping quality to 250 for unique alignments. The files listed in Sample1Sample2.txt were two BAM files, one for each sample. The reason I am confused about the output is the presence of < and > symbols. Here's an example of a line from my mpileup:
2R 14442944 G 48 >><>>><<<<<<<<<><<<><<<<><<<<<>><>>><><<>>>>>><. FFBFFFF<FFFFFFFFFFFFFFFFFFFFFFFBBFFFFFFFFFF<FFFF 75 <<<><<><<<<<<<<<<<<<<<<<<<<<<<<<>>><<<<<<<<<>>>><<>>>>>>>>>><>><>><>><><>>< FFF<FFF<FFBBBBBBBFBBBFBFFFFFFFFFFFFFFFFFFFFFFFFFFBFFFFFFFFFF<FBFFFFFFFFFFFF
Why do < and > appear in the fifth column alongside the base calls (e.g. ".") rather than in the sixth column, as depicted in this article? I thought it might have something to do with symbolising spliced reads, as I am working with RNA-seq, but I just don't know.
Thanks in advance, Chris
(Edit: sorry, the text editor is making the line of pileup look funny, ignore the br= and p= stuff)
Visualize the alignments with IGV and check for an insertion/deletion at 2R:14442944.
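For what it's worth, the samtools pileup format uses > and < in the base column to mark a reference skip (a CIGAR N operation, which is what a spliced aligner emits for an intron), on the forward and reverse strand respectively. If you want to see how much of a column is made up of skips before opening IGV, here is a minimal Python sketch that tallies the symbols in one base-call string. The function name and category labels are my own (hypothetical, not part of samtools), and it assumes the standard pileup encoding: ^ followed by a mapping-quality character, $ for read ends, and +N/-N followed by N indel bases.

```python
import re
from collections import Counter


def tally_pileup_bases(bases: str) -> Counter:
    """Count symbol classes in one pileup base-call column (hypothetical helper)."""
    counts = Counter()
    i = 0
    while i < len(bases):
        ch = bases[i]
        if ch == "^":
            # Read start: '^' is followed by one mapping-quality character.
            i += 2
            continue
        if ch == "$":
            # Read end marker carries no base call.
            i += 1
            continue
        if ch in "+-":
            # Indel: sign, decimal length, then that many bases.
            m = re.match(r"\d+", bases[i + 1:])
            if m is None:
                raise ValueError(f"malformed indel at offset {i}")
            counts["insertion" if ch == "+" else "deletion_start"] += 1
            i += 1 + len(m.group()) + int(m.group())
            continue
        if ch in ".,":
            counts["match"] += 1
        elif ch in "<>":
            counts["ref_skip"] += 1
        elif ch == "*":
            counts["deleted"] += 1
        else:
            counts["mismatch"] += 1
        i += 1
    return counts


if __name__ == "__main__":
    # Fifth column of the first sample from the line posted above.
    column = ">><>>><<<<<<<<<><<<><<<<><<<<<>><>>><><<>>>>>><."
    print(tally_pileup_bases(column))
```

A column dominated by ref_skip means most reads span the position inside a splice junction rather than covering it with sequence, so those reads are not lost, they just contribute no base call there.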