Question

Alignment of RNASeq with STAR for variant calling

1

4 months ago

Gilles ▴ 10

I am trying to perform variant calling for specific genes using the Apis mellifera samples in the NCBI SRA database (roughly 8,000 samples). This collection contains both whole-genome sequencing (WGS) samples and RNA-seq samples. I am using STAR to align the RNA-seq samples to the reference genome; however, the alignment quality varies widely between samples, with the fraction of uniquely mapped reads ranging from 5% to 95%. Is it recommended to filter out samples based on alignment quality before variant calling, and if so, what threshold would be appropriate?

Many thanks,
Gilles

STAR variant-calling RNA-seq alignment • 568 views

updated 4 months ago by Ram 43k • written 4 months ago by Gilles ▴ 10

0

Why are you using the RNA-seq data for variant calling when you have WGS data?

4 months ago by Ram 43k

0

The RNA-seq samples either cover regions of the world that the WGS samples do not, or serve as additional data on top of my core WGS dataset.

4 months ago by Gilles ▴ 10

score 1 · Answer 1 · 2024-01-19

1

4 months ago

Ram 43k

OK, got it. There are GATK best practices for calling variants from RNA-seq data, but there are a lot of caveats as well. Here's the best practices page: https://gatk.broadinstitute.org/hc/en-us/articles/360035531192-RNAseq-short-variant-discovery-SNPs-Indels-
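
On the filtering part of the question, here is a minimal sketch (not something from the GATK page) of how the uniquely mapped percentage could be collected from each sample's STAR `Log.final.out` and used to split samples by a cutoff. The function names, the directory layout and the `min_pct` value are all assumptions for illustration; the cutoff in particular is a placeholder, not a recommended threshold.

```python
import re
from pathlib import Path

# STAR reports this value in each sample's Log.final.out, e.g.
#   "Uniquely mapped reads % |	89.50%"
_UNIQ_RE = re.compile(r"Uniquely mapped reads %\s*\|\s*([\d.]+)%")


def uniquely_mapped_pct(log_text: str) -> float:
    """Return the 'Uniquely mapped reads %' value from Log.final.out text."""
    match = _UNIQ_RE.search(log_text)
    if match is None:
        raise ValueError("no 'Uniquely mapped reads %' line found")
    return float(match.group(1))


def collect(star_dir: str) -> dict[str, float]:
    """Map sample name -> uniquely mapped % for every *Log.final.out file.

    Assumes (hypothetically) that each file is named <sample>Log.final.out
    or sits in a directory named after the sample.
    """
    pcts = {}
    for path in Path(star_dir).rglob("*Log.final.out"):
        prefix = path.name[: -len("Log.final.out")].rstrip("._")
        sample = prefix or path.parent.name
        pcts[sample] = uniquely_mapped_pct(path.read_text())
    return pcts


def split_by_threshold(pcts: dict[str, float], min_pct: float):
    """Split samples into (kept, dropped) by uniquely mapped percentage."""
    kept = sorted(s for s, p in pcts.items() if p >= min_pct)
    dropped = sorted(s for s, p in pcts.items() if p < min_pct)
    return kept, dropped
```

Something like `kept, dropped = split_by_threshold(collect("star_out"), min_pct=50.0)` would then give the two sample lists, where `star_out` and `50.0` are made-up values. It is probably worth looking at the distribution of percentages across all samples before settling on any cutoff.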

4 months ago by Ram 43k