How to set reporting options for RNA-seq read alignment with hisat2?
7.7 years ago
epigene ▴ 590

Hi, I'm new to analyzing RNA-seq data. I started by using hisat2 to align RNA-seq reads. My main goal is differential gene expression analysis comparing multiple control samples against case samples.

I did a test run of hisat2 with basically these options: --dta-cufflinks --rna-strandness. I noticed that the number of alignments in the BAM file is greater than the number of reads in the original FASTQ file. Puzzled by this, I searched around and found the option -k, which has a default value of 5. So there can be up to 5 alignments for one read.

-k <int>
It searches for at most <int> distinct, primary alignments for each read. Primary alignments mean alignments whose alignment score is equal or higher than any other alignments.

I think this is the reason the number of alignments is greater than the number of reads.
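
One way to check this is to count alignment records per read directly from the SAM output. The snippet below is only a minimal sketch: the function name `count_reads_and_alignments` and the example records are made up for illustration, and it assumes the aligner flags extra alignments of the same read with the SAM secondary bit (256) and keeps the same read name on each record.

```python
from collections import Counter

MATE_BITS = 0x40 | 0x80  # SAM flag bits: first / second mate of a pair


def count_reads_and_alignments(sam_lines):
    """Return (n_reads, n_records, per_read) for an iterable of SAM lines.

    A "read" is identified by its name plus its mate bits, so the two
    mates of a pair are counted as separate reads.
    """
    per_read = Counter()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        qname, flag = fields[0], int(fields[1])
        per_read[(qname, flag & MATE_BITS)] += 1
    return len(per_read), sum(per_read.values()), per_read


# Hypothetical records: r1 aligns once, r2 aligns to three places.
EXAMPLE_SAM = [
    "@HD\tVN:1.0\tSO:unsorted",
    "r1\t0\tchr1\t100\t60\t50M\t*\t0\t0\t*\t*\tNH:i:1",
    "r2\t0\tchr1\t500\t1\t50M\t*\t0\t0\t*\t*\tNH:i:3",
    "r2\t256\tchr2\t900\t1\t50M\t*\t0\t0\t*\t*\tNH:i:3",
    "r2\t256\tchr5\t300\t1\t50M\t*\t0\t0\t*\t*\tNH:i:3",
]

n_reads, n_records, per_read = count_reads_and_alignments(EXAMPLE_SAM)
print(f"{n_reads} reads, {n_records} alignment records")
# prints: 2 reads, 4 alignment records
```

For a real BAM file the same lines could be streamed from `samtools view`; if the record count exceeds the read count only because some reads appear several times, multiple reported alignments per read are the likely explanation.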

So I'm curious what the ideal -k would be, and how this option affects downstream analysis such as gene counts.

Thanks!

RNA-Seq hisat2 • 2.3k views
7.7 years ago

Gene counting tools commonly ignore multimapping reads. That's a pity, but a sensible decision, since these reads cannot be reliably attributed to a single gene. However, you can rescue some of them using the method described here: https://genomebiology.biomedcentral.com/articles/10.1186/s13059-015-0734-x
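
To make "ignoring multimapping reads" concrete, here is a rough sketch of keeping only uniquely aligned records. It assumes each alignment carries an `NH:i:<n>` tag giving the number of reported alignments for that read, which is how SAM-based counters typically recognize multimappers; the helper names `nh_value` and `keep_unique` and the example records are hypothetical, and this is not the code of any particular counting tool.

```python
def nh_value(sam_line):
    """Return the integer value of the NH tag, or None if it is absent."""
    fields = sam_line.rstrip("\n").split("\t")
    for tag in fields[11:]:  # optional tags start at column 12
        if tag.startswith("NH:i:"):
            return int(tag[len("NH:i:"):])
    return None


def keep_unique(sam_lines):
    """Keep header lines and alignments with NH == 1.

    Records without an NH tag are dropped too, since their
    multiplicity is unknown (a conservative choice for this sketch).
    """
    kept = []
    for line in sam_lines:
        if line.startswith("@") or nh_value(line) == 1:
            kept.append(line)
    return kept


# Hypothetical records: r1 is unique, r2 has three reported alignments.
EXAMPLE_SAM = [
    "@HD\tVN:1.0\tSO:unsorted",
    "r1\t0\tchr1\t100\t60\t50M\t*\t0\t0\t*\t*\tNH:i:1",
    "r2\t0\tchr1\t500\t1\t50M\t*\t0\t0\t*\t*\tNH:i:3",
    "r2\t256\tchr2\t900\t1\t50M\t*\t0\t0\t*\t*\tNH:i:3",
    "r2\t256\tchr5\t300\t1\t50M\t*\t0\t0\t*\t*\tNH:i:3",
]

unique_only = keep_unique(EXAMPLE_SAM)
print(len(unique_only) - 1, "uniquely aligned record(s) kept")
# prints: 1 uniquely aligned record(s) kept
```

Under this kind of filter, a read is discarded whether it has 2 or 5 reported alignments, so raising or lowering -k mostly changes how many records are thrown away rather than what gets counted.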

You should only change the default values if you are confident in what you are doing. If you don't know what the ideal value is, the default is probably just fine. Otherwise it wouldn't be the default.


Thanks for the input. I'm not that confident in what I'm doing yet, as I don't have a good understanding of how each step works. If downstream analysis ignores multimapping reads, then this option won't affect the gene counts later.
