Unique alignments from the Y-chromosome
2.2 years ago

Hi,

We have sequencing data (targeted panel; ~500 genes) that was aligned with BWA-MEM to the human reference hg19. We are interested in retrieving aligned reads (from the respective BAM files) that are UNIQUE to the Y chromosome.

The problem: the Y chromosome contains many repetitive (duplicated) sequences (ampliconic regions), and many coding genes (roughly 78) have an X-linked homologue with 60-90% nucleotide similarity. Since we want to call Y-chromosome losses, we are, broadly speaking, interested in sequencing coverage ratios between the tumor and normal samples.

Given the nature of the Y chromosome, a few questions arise:

  • How can I filter aligned reads so that I can guarantee they are unique to the Y chromosome?
  • Does it make sense to keep only reads with MAPQ > 30 (to ensure that they unambiguously come from the Y chromosome)?
    • e.g. samtools view -q 31 -F 256 <bam.in> (-q sets the minimum MAPQ; -F 256 on its own only excludes secondary alignments)
    • Some tags (e.g. XA, alternative hits) are described, but I am not sure whether they are still supported (the most recent documentation I found was roughly 10 years old).
    • Moreover, given that the X and Y chromosomes are highly similar, how can I make sure that those reads do not map equally well to the X chromosome?
    • Is there a way to filter out reads that map to more than one location on the Y chromosome?

I know there is a lot of information available on MAPQ values, etc., but there is also a lot of ambiguity.

Many many thanks for your help and advice

samtools bwa-mem • 291 views