Aligning reads to genome together with transcriptome
8.2 years ago
EVR ▴ 610

Hi,

I am new to genome alignment. I have mRNA reads, a de novo transcriptome and a genome. I would like to map the reads to the genome together with the transcriptome, and later I want to find the location of certain transcripts in the genome. For example, I would like to know the location of transcript A in the genome and the number of reads mapped to transcript A at that location.

How can this be done? Kindly guide me.

RNA-Seq Genome Transcriptome alignment • 3.0k views

Which species are you studying? If you have the genome, you may also have the associated gene annotation. In that case you could align the reads against the genome and count the reads for each transcript using featureCounts (for example).
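
A minimal sketch of that counting step, assuming the reads have already been aligned to the genome and stored as a sorted BAM file. All file names are placeholders, and the feature type (`exon`) and grouping attribute (`Parent`) are assumptions that must match what the GFF3 file actually contains. The code only builds the command line; running it requires featureCounts (part of the Subread package) to be installed.

```python
import shlex

def featurecounts_command(bam, annotation, output,
                          feature_type="exon", group_attribute="Parent",
                          threads=4):
    """Build a featureCounts command line as a list of arguments.

    File names are placeholders. feature_type is matched against
    column 3 of the annotation and group_attribute against column 9;
    both are assumptions that depend on the annotation at hand.
    """
    return [
        "featureCounts",
        "-T", str(threads),      # number of threads
        "-a", annotation,        # annotation file (GTF/GFF)
        "-t", feature_type,      # feature type to count reads over
        "-g", group_attribute,   # attribute used to group features
        "-o", output,            # output count table
        bam,                     # aligned reads
    ]

cmd = featurecounts_command("aligned.sorted.bam", "genome.gff3", "counts.txt")
print(shlex.join(cmd))
# To execute it (featureCounts must be on PATH):
# import subprocess; subprocess.run(cmd, check=True)
```

In a GFF3 file the `Parent` of an exon is usually its transcript, so grouping exons by `Parent` should give one count per transcript; check this against your own file before relying on it.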

8.2 years ago
michael.ante ★ 3.8k

Hi Tom,

To start, I'd refer you to the TopHat2 / Cufflinks protocol paper. Most aligners, such as TopHat2, STAR, BBMap and HISAT, take your mRNA-Seq reads and align them to the genome, including across exon-exon junctions.

Afterwards, you need software such as Cufflinks, StringTie or Mix2 to estimate the transcript abundances.
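
As a rough sketch of the two stages (spliced alignment, then abundance estimation), the snippet below builds the command lines for one possible tool combination: HISAT2 (the successor of HISAT) for alignment and StringTie for quantification. The choice of tools, the use of samtools for sorting, the single-end read layout and all file names are assumptions made for illustration; the commands are only printed, not executed.

```python
import shlex

def align_and_quantify_commands(genome_fasta, reads_fastq, annotation, prefix):
    """Return the command lines for a single-end align-then-quantify run.

    All paths are placeholders; nothing is executed here.
    """
    index = prefix + "_index"
    sam = prefix + ".sam"
    bam = prefix + ".sorted.bam"
    return [
        # 1. Build the genome index for the aligner.
        ["hisat2-build", genome_fasta, index],
        # 2. Spliced alignment of the reads (--dta tailors the output
        #    for downstream transcript assemblers such as StringTie).
        ["hisat2", "--dta", "-x", index, "-U", reads_fastq, "-S", sam],
        # 3. Coordinate-sort the alignments.
        ["samtools", "sort", "-o", bam, sam],
        # 4. Estimate abundances for the annotated transcripts only (-e).
        ["stringtie", bam, "-G", annotation, "-e", "-o", prefix + ".quant.gtf"],
    ]

commands = align_and_quantify_commands(
    "genome.fa", "reads.fastq", "genome.gff3", "sample"
)
for command in commands:
    print(shlex.join(command))
```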

Cheers,

Michael

8.2 years ago
iraun 6.2k

As @NicoBxI has pointed out, the most straightforward way is to check whether your genome has a published annotation file (GTF, GFF3, ...). If it does, that file gives the coordinates of the transcripts, and you can map the reads against the genome and quantify the number of reads associated with each transcript using featureCounts with the annotation file. If the genome has not been annotated yet, I would try to build an annotation file from the transcripts of the transcriptome. Using PASA, for example, you can get a GTF/GFF3 file by giving the genome and the transcriptome as input.
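
To illustrate what "coordinates of the transcripts" looks like in practice, here is a small sketch that reads a GFF3 file, tallies the feature types it contains, and collects the location of each transcript. In GFF3, column 1 is the sequence name (for example a scaffold) and column 3 is the feature type; transcripts are often typed `mRNA` rather than `transcript`, which is why both are accepted below. The embedded records are made up for illustration and are not real data.

```python
import io
from collections import Counter

# Tiny made-up GFF3 excerpt, for illustration only (not real data).
EXAMPLE_GFF3 = """\
##gff-version 3
scaffold2353\texample\tgene\t100\t900\t.\t+\t.\tID=gene1
scaffold2353\texample\tmRNA\t100\t900\t.\t+\t.\tID=mrna1;Parent=gene1
scaffold2353\texample\texon\t100\t400\t.\t+\t.\tID=exon1;Parent=mrna1
scaffold2353\texample\texon\t600\t900\t.\t+\t.\tID=exon2;Parent=mrna1
"""

def parse_gff3(handle):
    """Yield (seqid, type, start, end, strand, attributes) per feature line."""
    for line in handle:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines, directives and comments
        cols = line.split("\t")
        if len(cols) != 9:
            continue  # not a well-formed feature line
        seqid, _source, ftype, start, end, _score, strand, _phase, attr_text = cols
        # Column 9 holds key=value pairs separated by semicolons.
        attrs = dict(
            field.split("=", 1) for field in attr_text.split(";") if "=" in field
        )
        yield seqid, ftype, int(start), int(end), strand, attrs

features = list(parse_gff3(io.StringIO(EXAMPLE_GFF3)))

# Which feature types does the file use (column 3)?
type_counts = Counter(feature[1] for feature in features)
print(dict(type_counts))

# Location of every transcript-level feature, keyed by its ID.
transcripts = {
    attrs["ID"]: (seqid, start, end, strand)
    for seqid, ftype, start, end, strand, attrs in features
    if ftype in ("mRNA", "transcript") and "ID" in attrs
}
print(transcripts)
```

For a real annotation, replace `io.StringIO(EXAMPLE_GFF3)` with `open("genome.gff3")` (a placeholder file name).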


Hi,

I am working on a non-model organism that has a genome and an associated gff3 file. But the de novo assembled transcriptome, obtained from Trinity, has different transcript names. For example, the genome gff3 file has scaffold2353, scaffold3667, etc., and my transcriptome has headers like mm_tr_v3_1789, mm_tr_v3_198, etc.

Also, no feature called "transcript" is defined in the gff3 file for my genome. In this situation, how could I map this transcriptome and find the location of a particular transcript (say mm_tr_v3_1789) in the genome?

Please guide me. Thanks in advance.


Hi EVR,

Which approach did you finally use for your analysis? It would be really good if you could share it.

Thanks

