Question: Mapping virus RNA-seq reads with Salmon

I did RNA-seq on mammalian cells infected with a poxvirus, so my read files contain both host and virus reads. I want to align the reads to both the host and the viral genome. I was thinking I could concatenate the host and virus genomes into one file and run Salmon against the concatenated reference. However, Salmon recommends a transcriptome file as its reference, and transcriptome files are not available for viruses. Virus genomes are available from NCBI in GenBank or GFF3 format. Is there any way to convert these formats into something Salmon can use? Or is there a workaround for using a virus genome as the reference in Salmon?

Thanks

11 months ago lokraj2003 • 70 • updated 11 months ago Adrian Pelin ♦ 2.3k

You provide Salmon with a transcriptome FASTA file, so merging the human transcriptome FASTA with the poxvirus genome FASTA file should work.

Note: I linked to the coding region sequences. If you use my suggestion, it might also be wise to include the ncRNA sequences as well.
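
The merge described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `merge_references` and the `virus_` prefix are hypothetical, both inputs are assumed to be plain (uncompressed) FASTA files, and each header is cut down to its first whitespace-delimited token so that names stay short and unique.

```python
def merge_references(host_fasta, virus_fasta, out_fasta, virus_prefix="virus_"):
    """Concatenate two FASTA files into one reference.

    Host records keep their names; viral records get `virus_prefix`
    (a hypothetical naming convention) so they can be told apart later.
    Returns the number of records written.
    """
    seen = set()
    with open(out_fasta, "w") as out:
        for path, prefix in ((host_fasta, ""), (virus_fasta, virus_prefix)):
            with open(path) as handle:
                for line in handle:
                    if line.startswith(">"):
                        fields = line[1:].split()
                        if not fields:
                            raise ValueError(f"empty FASTA header in {path}")
                        name = prefix + fields[0]
                        # Duplicate names would make the quantification ambiguous.
                        if name in seen:
                            raise ValueError(f"duplicate sequence name: {name}")
                        seen.add(name)
                        out.write(f">{name}\n")
                    elif line.strip():
                        out.write(line.rstrip("\n") + "\n")
    return len(seen)
```

The merged file could then be indexed with something like `salmon index -t combined.fa -i combined_index`, where the file and index names are placeholders.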

11 months ago benformatics • 870

Thanks for the suggestion and links. I will probably try with ncRNA too.

11 months ago lokraj2003 • 70

Which poxvirus are you sequencing? I do this all the time with Vaccinia, except that I use HISAT and StringTie to quantify transcript levels. Whichever poxvirus you have, you can always extract the CDS regions and treat them as the transcriptome, since not much splicing happens in poxviruses.
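
Extracting the CDS regions could look roughly like the sketch below, which uses only the Python standard library. It assumes a genome FASTA plus a standard 9-column GFF3 file whose CDS rows carry an `ID` attribute and whose first column matches the first token of the FASTA header. The function names `read_fasta`, `extract_cds` and `write_fasta` are made up for this example, and the GFF3 phase column is ignored.

```python
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def read_fasta(path):
    """Return {sequence_id: sequence} for a plain FASTA file."""
    chunks, name = {}, None
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                name = line[1:].split()[0]
                chunks[name] = []
            elif line and name is not None:
                chunks[name].append(line)
    return {key: "".join(parts) for key, parts in chunks.items()}


def extract_cds(genome_fasta, gff3_path):
    """Return {cds_id: nucleotide sequence} for every CDS feature in the GFF3."""
    genome = read_fasta(genome_fasta)
    segments = defaultdict(list)  # cds_id -> [(start, end, strand, seqid)]
    with open(gff3_path) as handle:
        for line in handle:
            if line.startswith("##FASTA"):
                break  # an embedded sequence section ends the annotation
            if line.startswith("#") or not line.strip():
                continue
            cols = line.rstrip("\n").split("\t")
            if len(cols) != 9 or cols[2] != "CDS":
                continue
            attrs = dict(kv.split("=", 1) for kv in cols[8].split(";") if "=" in kv)
            cds_id = attrs.get("ID") or f"{cols[0]}:{cols[3]}-{cols[4]}"
            segments[cds_id].append((int(cols[3]), int(cols[4]), cols[6], cols[0]))
    cds = {}
    for cds_id, segs in segments.items():
        segs.sort()  # join multi-part CDS in genomic order
        strand, seqid = segs[0][2], segs[0][3]
        # GFF3 coordinates are 1-based and inclusive.
        seq = "".join(genome[seqid][start - 1:end] for start, end, _, _ in segs)
        if strand == "-":
            seq = seq.translate(_COMPLEMENT)[::-1]  # reverse complement
        cds[cds_id] = seq
    return cds


def write_fasta(records, path):
    """Write {name: sequence} as FASTA, one sequence per line."""
    with open(path, "w") as out:
        for name, seq in records.items():
            out.write(f">{name}\n{seq}\n")
```

A call such as `write_fasta(extract_cds("virus.fa", "virus.gff3"), "virus_cds.fa")`, with placeholder file names, would produce a FASTA file that can stand in for the viral transcriptome.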

11 months ago Adrian Pelin ♦ 2.3k

I am using Orf virus. Is there a script to combine the virus CDS with the host genome, or do you do it manually? Also, what do you do after getting the read counts? I mean, what tool do you use to separate viral and host reads?

Thank you

11 months ago lokraj2003 • 70

I would use BBSplit to separate the host (sheep?) reads from the virus reads. It can map to multiple references at once. The example command below creates separate files for the E. coli reads and the Salmonella reads, as well as files for reads that map to neither; substitute your own host and virus references.

bbsplit.sh in1=reads1.fq in2=reads2.fq ref=ecoli.fa,salmonella.fa basename=out_%.fq outu1=clean1.fq outu2=clean2.fq
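
If reads are instead quantified with Salmon against a combined host plus virus reference, the split can also be read off Salmon's `quant.sf` table rather than from separated FASTQ files. The sketch below is one possible way to do that. It assumes the viral sequence names in the reference start with a recognisable prefix (`virus_` here is a hypothetical convention), and `tally_reads` is an illustrative name, not part of any tool.

```python
import csv


def tally_reads(quant_sf, virus_prefix="virus_"):
    """Sum the NumReads column of a Salmon quant.sf file by origin.

    A row counts as viral when its Name starts with `virus_prefix`
    (an assumed naming convention); everything else counts as host.
    """
    totals = {"host": 0.0, "virus": 0.0}
    with open(quant_sf, newline="") as handle:
        # quant.sf is tab-separated with a header row:
        # Name, Length, EffectiveLength, TPM, NumReads
        for row in csv.DictReader(handle, delimiter="\t"):
            origin = "virus" if row["Name"].startswith(virus_prefix) else "host"
            totals[origin] += float(row["NumReads"])
    return totals
```

Dividing `totals["virus"]` by the sum of both values gives a rough estimate of the viral fraction among the reads Salmon assigned.
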
11 months ago Adrian Pelin ♦ 2.3k
