What does the output of RNA-seq alignment to a reference transcriptome (FASTA+GTF) look like in SAM format?
3.0 years ago

Hello, I am struggling with the alignment step of an RNA-seq pipeline. My data is the whole transcriptome of a prokaryote, sequenced with the nanopore direct RNA sequencing method. The manuals and tutorials I found for alignment to a reference transcriptome cover short reads only. I tried using minimap2 to align the nanopore RNA reads (FASTQ) to a reference genome (FASTA) together with a gene annotation file (GTF); this tool requires converting the GTF/GFF3 to BED first. However, the SAM output I got is exactly the same as when aligning to the reference genome alone (FASTA only). So my question is: what difference should I expect in the SAM file between mapping to the reference genome (FASTA) only and mapping to the reference genome (FASTA) with a gene annotation file (GTF or GFF)? Thanks in advance!

RNA-seq aligment nanopore • 1.9k views

Can you share the command you used for alignment with minimap2? Note that minimap2 writes PAF by default and produces SAM only when -a is given.


1. FASTA+GTF

paftools.js gff2bed [my.gtf] > [my.bed]

minimap2 -a -x map-ont --junc-bed [my.bed] [my.fasta] [my.fastq] > [my.sam]

2. FASTA only

minimap2 -a -x map-ont [my.fasta] [my.fastq] > [my.sam]
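
One possible reason the two SAM files come out identical: as far as the minimap2 documentation describes it, --junc-bed supplies annotated junctions to spliced alignment, and map-ont is a genomic preset, so the option would likely have no effect there. The minimap2 README lists -ax splice -uf -k14 for noisy nanopore direct RNA reads. Below is a minimal Python sketch that only assembles such a command line. The helper name build_minimap2_cmd is hypothetical, and the file names are the placeholders from the commands above. For a prokaryote with essentially no introns, even the splice preset may produce alignments close to the plain genomic ones.

```python
import shlex


def build_minimap2_cmd(ref_fasta, reads_fastq, junc_bed=None):
    """Assemble (but do not run) a minimap2 command for direct RNA reads.

    Hypothetical helper for illustration; flags follow the minimap2 README
    example for nanopore direct RNA-seq.
    """
    # -a        : write SAM instead of the default PAF
    # -x splice : splice-aware preset (where --junc-bed is documented to apply)
    # -uf       : look for splice signals on the transcript strand only
    # -k14      : shorter k-mer, suggested for noisy direct RNA reads
    cmd = ["minimap2", "-a", "-x", "splice", "-uf", "-k14"]
    if junc_bed is not None:
        # BED produced beforehand with: paftools.js gff2bed my.gtf > my.bed
        cmd += ["--junc-bed", junc_bed]
    cmd += [ref_fasta, reads_fastq]
    return cmd


if __name__ == "__main__":
    # Print the command so it can be inspected or pasted into a shell.
    print(shlex.join(build_minimap2_cmd("my.fasta", "my.fastq", "my.bed")))
```

The returned list can be passed to subprocess.run with stdout redirected to the SAM file, assuming minimap2 is installed and on PATH.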


There is a published pipeline from ONT that you should consider: https://github.com/nanoporetech/pipeline-transcriptome-de

Leaving this link here as a reference for future visitors.

3.0 years ago

Since prokaryotes show little if any splicing, the RNA-seq alignment will not look much different from a DNA alignment. But you can still use the GTF for gene counting.
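
To make the gene counting idea concrete, here is a minimal sketch in plain Python, assuming an uncompressed SAM file and a GTF with "gene" features carrying a gene_id attribute. The function names are hypothetical, and the rule used (assign each primary alignment to the first gene containing its leftmost mapped position, ignoring strand) is a deliberate simplification. Dedicated counters such as featureCounts or htseq-count handle overlaps, strand, and ambiguity properly.

```python
from collections import Counter


def parse_gtf_genes(gtf_lines, feature="gene"):
    """Return a list of (seqname, start, end, gene_id) for the chosen feature type."""
    genes = []
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != feature:
            continue
        gene_id = None
        # Attributes look like: gene_id "b0001"; gene_name "thrL";
        for attr in cols[8].split(";"):
            attr = attr.strip()
            if attr.startswith("gene_id "):
                gene_id = attr.split(" ", 1)[1].strip('"')
        if gene_id is not None:
            genes.append((cols[0], int(cols[3]), int(cols[4]), gene_id))
    return genes


def count_reads_per_gene(sam_lines, genes):
    """Count primary alignments whose leftmost position falls inside a gene."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):  # SAM header line
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 11:
            continue
        flag = int(cols[1])
        # Skip unmapped (0x4), secondary (0x100) and supplementary (0x800) records.
        if flag & 0x904:
            continue
        rname, pos = cols[2], int(cols[3])
        for seqname, start, end, gene_id in genes:
            if rname == seqname and start <= pos <= end:
                counts[gene_id] += 1
                break  # simplification: first matching gene wins
    return counts
```

With real files this could be called as count_reads_per_gene(open("my.sam"), parse_gtf_genes(open("my.gtf"))). Some prokaryotic annotations only carry CDS rows, in which case feature="CDS" may be the better choice.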


Thank you. Do you have any recommended tools for long-read alignment that generate output as a GTF/GFF file? I'm not sure whether hisat2 and stringtie2 work for long-read data. Thanks in advance :)


Do you have any recommended tools for long-read alignment that generate output as a GTF/GFF file?

The output of an alignment is a SAM/BAM file, not GTF/GFF. GTF/GFF files contain known information about gene models, which is used together with a SAM/BAM alignment file to do read counting.
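
To illustrate the distinction, the sketch below splits one SAM record and one GTF record into their named columns: a SAM line describes where a single read aligned, while a GTF line describes a single annotated feature. The two example lines are made up for illustration, and the helper names are hypothetical.

```python
# The 11 mandatory SAM columns and the 9 GTF columns.
SAM_FIELDS = ["QNAME", "FLAG", "RNAME", "POS", "MAPQ", "CIGAR",
              "RNEXT", "PNEXT", "TLEN", "SEQ", "QUAL"]
GTF_FIELDS = ["seqname", "source", "feature", "start", "end",
              "score", "strand", "frame", "attribute"]


def parse_sam_record(line):
    """Map the mandatory SAM columns to names; optional tags go under 'tags'."""
    cols = line.rstrip("\n").split("\t")
    record = dict(zip(SAM_FIELDS, cols[:11]))
    record["tags"] = cols[11:]
    return record


def parse_gtf_record(line):
    """Map the nine GTF columns to names."""
    cols = line.rstrip("\n").split("\t")
    return dict(zip(GTF_FIELDS, cols))


if __name__ == "__main__":
    # Made-up records, for illustration only.
    sam_line = "read1\t0\tchr1\t150\t60\t8M\t*\t0\t0\tACGTACGT\t*\tNM:i:0"
    gtf_line = 'chr1\texample\tgene\t100\t200\t.\t+\t.\tgene_id "g1";'
    read = parse_sam_record(sam_line)
    gene = parse_gtf_record(gtf_line)
    print(read["QNAME"], "aligned at", read["RNAME"], read["POS"], read["CIGAR"])
    print(gene["feature"], "spans", gene["seqname"], gene["start"], gene["end"])
```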
