Question

Unique alignments from the Y-chromosome

2.2 years ago

christoph.kreitzer93 • 0

Hi,

We have sequencing data (targeted panel; ~500 genes) that was aligned to the human reference hg19 with BWA-MEM. We are interested in retrieving the aligned reads (from the respective BAM files) that are UNIQUE to the Y-chromosome.

The problem: the Y-chromosome contains many repetitive (duplicated) sequences (ampliconic regions), and many coding genes (roughly 78) have an X-linked homologue with 60-90% nucleotide similarity. Since we want to call Y-chromosome losses, we are, broadly speaking, interested in sequencing coverage ratios between the tumor and the normal samples.
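To make the coverage-ratio idea concrete, here is a minimal sketch of how such a ratio could be computed. It assumes that Y coverage in each sample is first normalised to that sample's autosomal coverage (to correct for sequencing depth); that normalisation step and the function name `y_log2_ratio` are illustrative assumptions, not an established method.

```python
import math


def y_log2_ratio(tumor_y, tumor_auto, normal_y, normal_auto):
    """Return log2 of depth-normalised Y coverage, tumor vs. normal.

    All arguments are mean coverages (hypothetical inputs): chrY target
    regions and autosomal target regions, for tumor and normal.
    """
    if tumor_auto <= 0 or normal_auto <= 0 or normal_y <= 0:
        raise ValueError("autosomal and normal chrY coverage must be positive")
    if tumor_y < 0:
        raise ValueError("coverage cannot be negative")
    if tumor_y == 0:
        # No Y reads at all in the tumor: report it as minus infinity.
        return float("-inf")
    tumor_norm = tumor_y / tumor_auto      # Y relative to autosomes, tumor
    normal_norm = normal_y / normal_auto   # Y relative to autosomes, normal
    return math.log2(tumor_norm / normal_norm)
```

A clearly negative value would point towards a Y loss in the tumor, but what counts as "clearly negative" depends on tumor purity and panel noise, so no cutoff is suggested here.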

Given the nature of the Y-chromosome, there are now a few questions:

How can I filter aligned reads so that I can guarantee that they are unique to the Y-chromosome?
- Does it make sense to keep only reads with MAPQ > 30 (to ensure that those reads unambiguously come from the Y-chromosome)?
- Or should secondary alignments be excluded, e.g. samtools view -F 256 <bam.in>?
- There are some tags (e.g. XA, alternative hits) described, but I am not sure whether they are still supported (the most recent documentation I found was about 10 years old).
- Moreover, given that the X- and Y-chromosome show great similarity, how can I make sure that those reads do not map equally well to the X-chromosome?
- Is there a way to filter out reads that map more than once to the Y-chromosome?
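To make the questions above concrete, this is roughly the kind of filter I have in mind, written as a small sketch that works on SAM text lines (e.g. the output of samtools view). It assumes the contig is named "chrY" (as in UCSC hg19; it would be "Y" in other builds) and that BWA-MEM wrote the XA (alternative hits) and SA (supplementary alignment) tags. The function name and the cutoff are my own placeholders, and passing this filter is not a guarantee of uniqueness.

```python
# Flag bits as defined in the SAM specification.
UNMAPPED = 0x4
SECONDARY = 0x100
DUPLICATE = 0x400
SUPPLEMENTARY = 0x800


def is_confident_chry_read(sam_line, chrom="chrY", mapq_cutoff=30):
    """Return True if a SAM record looks like a confident, single-hit chrY read."""
    if sam_line.startswith("@"):
        return False  # header line, not an alignment record
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])
    rname = fields[2]
    mapq = int(fields[4])
    # Optional fields start at column 12 and look like TAG:TYPE:VALUE.
    tags = {field.split(":", 1)[0] for field in fields[11:]}

    if rname != chrom:
        return False
    if flag & (UNMAPPED | SECONDARY | SUPPLEMENTARY | DUPLICATE):
        return False
    if mapq <= mapq_cutoff:  # keep only MAPQ > cutoff
        return False
    # Any reported alternative hit (XA) or split alignment (SA) is treated
    # as ambiguous, whether it points to chrX or elsewhere on chrY.
    if "XA" in tags or "SA" in tags:
        return False
    return True
```

The MAPQ and flag part of this should correspond to samtools view -q 31 -F 3332 <bam.in> (3332 = 4 + 256 + 1024 + 2048); the XA/SA check would still have to be done on the tags.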

I know there is a lot of information on MAPQ values etc., but also a lot of ambiguity.

Many thanks for your help and advice.

samtools bwa-mem • 291 views

updated 10 months ago by Ram 43k • written 2.2 years ago by christoph.kreitzer93 • 0