Hi,
I am quite new to bioinformatics and am stuck on a problem, so I would appreciate any ideas.
I am using the Badread tool to simulate Oxford Nanopore long reads, and then using BWA to map short reads to these simulated long reads, so the set of long reads acts as my reference in BWA.
My long reads (the reference sequences in BWA) have IDs like LR1, LR2, ..., LR16047. Next, I want to score each long read by the number of its bases that are covered by mapped short reads, divided by the total length of that long read. My idea is to take a binary array with the same length as the long read, initialized to 0, set the values to 1 at the positions where short reads have mapped, and then count the number of 1's. In the SAM file, I think the 4th column gives the position on the reference where a read has been mapped. Is it the start or the end position? I need both. Is there a way to find all the mapped positions for every long read (the reference in BWA)?
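To make the idea concrete, here is a rough Python sketch of the binary-array scoring described above. It rests on a few assumptions: that column 4 (POS) is the 1-based leftmost mapped position, as the SAM specification defines it; that the end of each alignment can be derived by walking the CIGAR string in column 6; and that the SAM file has `@SQ` header lines giving each long read's length. The function name `coverage_fractions` is just a placeholder of mine, not part of BWA or any library.

```python
import re

# One CIGAR element: a length followed by an operation code.
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")

def coverage_fractions(sam_lines):
    """Return {reference name: fraction of its bases covered by alignments}."""
    ref_len = {}   # reference name -> length, taken from @SQ header lines
    covered = {}   # reference name -> bytearray mask (1 = covered)
    for line in sam_lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("@"):
            if line.startswith("@SQ"):
                tags = dict(f.split(":", 1) for f in line.split("\t")[1:])
                ref_len[tags["SN"]] = int(tags["LN"])
            continue
        fields = line.split("\t")
        flag, rname, pos, cigar = int(fields[1]), fields[2], int(fields[3]), fields[5]
        # Skip unmapped records (flag bit 0x4) and records without a CIGAR.
        if flag & 4 or rname == "*" or cigar == "*":
            continue
        mask = covered.setdefault(rname, bytearray(ref_len[rname]))
        ref_pos = pos - 1  # assumption: POS is 1-based, so convert to 0-based
        for n, op in CIGAR_RE.findall(cigar):
            n = int(n)
            if op in "M=X":
                # Aligned bases: mark them as covered and advance.
                end = min(ref_pos + n, len(mask))
                mask[ref_pos:end] = b"\x01" * (end - ref_pos)
                ref_pos += n
            elif op in "DN":
                # Deletions/skips consume reference but are not marked covered.
                ref_pos += n
            # I, S, H, P do not consume reference positions.
    return {name: sum(covered.get(name, b"")) / length
            for name, length in ref_len.items()}
```

It could be used along the lines of `with open("aln.sam") as fh: scores = coverage_fractions(fh)`, where `aln.sam` stands in for the BWA output file. Counting only `M`, `=`, and `X` as covered is my own choice here; deleted reference bases could be counted as well if that fits the scoring better.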
Thank you.