Question: Relative transcript expression

Hi,

I have been trying to understand the relative expression of two transcripts from a gene.

Let's say I have a gene with 6 exons that produces two transcripts: isoform 1 with all 6 exons, and isoform 2 with exons 1, 2, 3, 4 and 6.

I have BAM files from STAR and I don't want to redo the alignment, so I would really appreciate it if anyone could suggest a tool that will quantify these two isoforms.

Thanks in advance.

Asked by Govardhan Anande, 21 months ago (updated 20 months ago by WouterDeCoster)

Try MISO as well.

Reply by cpad0112, 20 months ago

I have MISO results and, as you know, MISO only considers the alternative exon along with its upstream and downstream exons, not the entire transcript.

Reply by Govardhan Anande, 20 months ago

Hi Govardhan

Basically, there are 2 steps

  1. The identification of the transcripts.
  2. Estimating the "relative" abundance of those transcripts in your sample.

When you say you already have the isoforms in hand, I believe you are already done with step #1.

So, if you have the BAM files and the corresponding reference annotation (GTF) in hand, you can run StringTie to estimate the abundances (step #2).

If you are not yet done with step #1, then you will have to run StringTie twice, as described below:

  • first, with the BAM files and the reference GTF file, to perform a "reference guided" transcriptome assembly.
  • second, taking the consensus set of transcripts from all samples as the reference, to estimate their abundance.
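
The two passes above could be sketched roughly as follows. This is only an illustration under stated assumptions: the sample names, `reference.gtf`, and every output file name are hypothetical placeholders, and the `run` helper is an addition that prints each command instead of executing it unless `DRY_RUN=0` is set. The StringTie options used are `-G` (reference annotation), `-o` (output GTF), `--merge` (combine per-sample assemblies) and `-e` (quantify only the transcripts given with `-G`).

```shell
#!/usr/bin/env bash
# Sketch only: all file and sample names are hypothetical placeholders.
# DRY_RUN=1 (the default) prints the commands; DRY_RUN=0 executes them.

run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

REF_GTF="reference.gtf"      # annotation matching the genome used for alignment
SAMPLES="sampleA sampleB"    # expects sampleA.sorted.bam and sampleB.sorted.bam

stringtie_two_pass() {
    ASSEMBLED=""
    # Pass 1: reference-guided assembly, one GTF per sample
    for s in $SAMPLES; do
        run stringtie -G "$REF_GTF" -o "$s.assembled.gtf" "$s.sorted.bam"
        ASSEMBLED="$ASSEMBLED $s.assembled.gtf"
    done
    # Merge the per-sample assemblies into one consensus transcript set
    run stringtie --merge -G "$REF_GTF" -o merged.gtf $ASSEMBLED
    # Pass 2: -e limits quantification to the transcripts in merged.gtf
    for s in $SAMPLES; do
        run stringtie -e -G merged.gtf -o "$s.quant.gtf" "$s.sorted.bam"
    done
}

stringtie_two_pass
```

If the transcripts are already known (step #1 done), only the last loop would be needed, with your own GTF in place of `merged.gtf`.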

By abundance, I mean the FPKM or TPM values (or your favourite metric) which stringtie will generate for you.
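
To turn those per-transcript values into a relative measure for the isoforms of one gene, one option is to divide each transcript's TPM by the summed TPM of its gene. The sketch below assumes a StringTie output GTF whose `transcript` lines carry `gene_id`, `transcript_id` and `TPM` attributes. The function name `isoform_fractions` is hypothetical, and the toy records use made-up values purely to show the arithmetic.

```shell
#!/usr/bin/env bash
# Print gene, transcript, TPM and the transcript's share of its gene's total TPM.
isoform_fractions() {
    awk -F '\t' '
        $3 == "transcript" {
            gene = tx = tpm = ""
            n = split($9, attrs, ";")
            for (i = 1; i <= n; i++) {
                a = attrs[i]
                sub(/^ +/, "", a)
                split(a, kv, " ")           # kv[1] = key, kv[2] = quoted value
                v = kv[2]; gsub(/"/, "", v)
                if (kv[1] == "gene_id") gene = v
                else if (kv[1] == "transcript_id") tx = v
                else if (kv[1] == "TPM") tpm = v
            }
            if (tx == "" || tpm == "") next  # skip records without a TPM value
            count++
            genes[count] = gene; ids[count] = tx; tpms[count] = tpm + 0
            total[gene] += tpm
        }
        END {
            for (i = 1; i <= count; i++) {
                frac = (total[genes[i]] > 0) ? tpms[i] / total[genes[i]] : 0
                printf "%s\t%s\t%.2f\t%.3f\n", genes[i], ids[i], tpms[i], frac
            }
        }
    ' "$1"
}

# Toy example with made-up TPM values (not real data)
demo=$(mktemp)
{
    printf 'chrX\tStringTie\ttranscript\t100\t900\t1000\t+\t.\tgene_id "geneX"; transcript_id "iso1"; cov "30.0"; FPKM "12.0"; TPM "30.0";\n'
    printf 'chrX\tStringTie\texon\t100\t200\t1000\t+\t.\tgene_id "geneX"; transcript_id "iso1"; exon_number "1"; cov "30.0";\n'
    printf 'chrX\tStringTie\ttranscript\t100\t900\t1000\t+\t.\tgene_id "geneX"; transcript_id "iso2"; cov "10.0"; FPKM "4.0"; TPM "10.0";\n'
} > "$demo"
isoform_fractions "$demo"
rm -f "$demo"
```

With the toy values, iso1 gets 30 / (30 + 10) = 0.750 and iso2 gets 0.250. On real data the function would be called as `isoform_fractions sample.quant.gtf`, where the file name is again a placeholder.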

NOTE: StringTie is part of the new tuxedo protocol.

Answer by Vijay Lakhujani, 21 months ago

Hi Vijay,

Thank you.

Yes, I have identified the transcripts and generated a GTF file of the two transcripts. Now I am trying to get the relative abundance, but I am getting: "Error: could not any valid reference transcripts in Demo.gtf (invalid GTF/GFF file?)"

My GTF looks like:

chrX protein_coding exon XXX507 XXX637 . + . gene_id "geneX"; transcript_id "isoX"; gene_name "geneX";
chrX protein_coding CDS XXX507 XXX637 . + . gene_id "geneX"; transcript_id "isoX"; gene_name "geneX";
chrX protein_coding exon XXX612 XXX724 . + . gene_id "geneX"; transcript_id "isoX"; gene_name "geneX";
chrX protein_coding CDS XXX612 XXX724 . + . gene_id "geneX"; transcript_id "isoX"; gene_name "geneX";
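
One thing that may be worth ruling out with a hand-made GTF is its basic structure: the format expects nine tab-separated columns, and a file whose columns are separated by spaces (for example after copy and paste) can be rejected by parsers. The check below is a minimal sketch and not StringTie's own validation. The function name `check_gtf` is hypothetical, and it only tests the column count, integer start and end coordinates, and the presence of a `transcript_id` on exon lines.

```shell
#!/usr/bin/env bash
# Report structural problems in a GTF file; returns non-zero if any are found.
check_gtf() {
    awk -F '\t' '
        /^#/ || /^[ \t]*$/ { next }    # skip comments and blank lines
        NF != 9 {
            printf "line %d: %d tab-separated fields, expected 9\n", NR, NF
            bad++; next
        }
        ($4 !~ /^[0-9]+$/) || ($5 !~ /^[0-9]+$/) || (($4 + 0) > ($5 + 0)) {
            printf "line %d: bad coordinates %s-%s\n", NR, $4, $5
            bad++
        }
        $3 == "exon" && $9 !~ /transcript_id "[^"]+"/ {
            printf "line %d: exon without transcript_id\n", NR
            bad++
        }
        END { if (bad) exit 1 }
    ' "$1"
}
```

It could be run as `check_gtf Demo.gtf && echo "structure looks OK"`. A clean result does not guarantee that StringTie will accept the file, only that these particular problems are absent.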

Reply by Govardhan Anande, 21 months ago

Please share the exact commands for:

  • mapping
  • this step (abundance)

Reply by Vijay Lakhujani, 21 months ago

Alignment

STAR --runMode alignReads --outSAMtype BAM SortedByCoordinate --runThreadN 10 --genomeDir $FastaIndex --readFilesIn $R1 $R2

I just started with a basic command for abundance:

~/stringtie-1.3.4d.Linux_x86_64/stringtie Aligned.sortedByCoord.out.bam -G Demo.gtf

Reply by Govardhan Anande, 21 months ago

What is the output of this?

stringtie -G reference.gtf -o out.gtf sample.sorted.bam

reference.gtf = GTF file for the corresponding reference genome you are using

out.gtf = the output GTF file that StringTie will generate for you

sample.sorted.bam = coordinate sorted bam file

This step is the assembly step. The out.gtf will have the information of the assembled transcripts.

Once you are done with this, the next step is abundance estimation, which I'll share later.

Reply by Vijay Lakhujani, 21 months ago

Why do I need to use the reference GTF when I can use the GTF of my two transcripts?

Is it something StringTie requires? The output of the above command is a GTF, i.e.:

chrM StringTie transcript 1 16571 1000 . . gene_id "STRG.1"; transcript_id "STRG.1.1"; cov "20872.708984";
chrM StringTie exon 1 16571 1000 . . gene_id "STRG.1"; transcript_id "STRG.1.1"; exon_number "1"; cov "20872.708984";

Reply by Govardhan Anande, 21 months ago

For quantification of transcripts you could also look at fast alignment-free approaches such as Salmon.
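
As a rough sketch of what that could look like: Salmon's mapping-based mode works from the FASTQ reads and a transcript FASTA rather than from a genome-aligned BAM. Every file name below is a hypothetical placeholder, the `run` helper is an addition that only prints the commands unless `DRY_RUN=0` is set, and building the index from just the transcripts in `Demo.gtf` is an assumption made to mirror this thread; a full transcriptome is the more usual choice.

```shell
#!/usr/bin/env bash
# Sketch only: all file names are hypothetical placeholders.
# DRY_RUN=1 (the default) prints the commands; DRY_RUN=0 executes them.

run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

salmon_quant_sketch() {
    # Extract spliced transcript sequences for the annotated isoforms
    run gffread -w transcripts.fa -g genome.fa Demo.gtf
    # Build the Salmon index over those transcript sequences
    run salmon index -t transcripts.fa -i salmon_index
    # Quantify directly from paired-end reads; -l A auto-detects the library type
    run salmon quant -i salmon_index -l A -1 R1.fastq.gz -2 R2.fastq.gz -p 10 -o salmon_out
}

salmon_quant_sketch
```

The per-transcript estimates, including TPM, are written to `quant.sf` inside the output directory.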

Answer by WouterDeCoster, 20 months ago

It's also worth adding here that Salmon needs a tool called Wasabi to convert its output into an h5 structure, ready for differential isoform modelling in Sleuth.

Reply by andrew.j.skelton73, 20 months ago

"Why do I need to use reference GTF when I can use gtf of two transcripts??"

A reference is required when you are performing a "reference guided assembly". StringTie takes the genomic feature information from the reference GTF file. Are you trying to do a de novo assembly?

"Is it something StringTie requires??"

It's optional; StringTie can also perform de novo assembly.

Answer by Vijay Lakhujani, 21 months ago

I am working on human samples, so I just need the expression of each transcript from one gene.

Thanks, Govardhan

Reply by Govardhan Anande, 21 months ago

Did you try RSEM?

Reply by Vijay Lakhujani, 21 months ago

Again, the problem with RSEM is the alignment. I have BAM files from STAR and they are not compatible with RSEM, and the same goes for Cufflinks. Honestly, I can't afford realignment, so I am trying to find a way to use what I have at the moment. Anyway, thank you for your help and time.

Reply by Govardhan Anande, 21 months ago

If "time" is the concern, then you can try HISAT2 for alignment, but the call is yours! You're welcome. I'll be glad if you share the final thing that helped.

Reply by Vijay Lakhujani, 20 months ago

Govardhan, STAR is compatible with Cufflinks. Please paste an error snippet if you get one, so that I can help with it.

Reply by Jeffin Rockey, 20 months ago

Jeffin, you are right: Cufflinks accepts the STAR BAM files, but the results are different. I tried feeding it one sample's BAM from TopHat and one from STAR. Anyway, I got splicing information from various tools and now I am planning to use those values.

I posted this question here because I wish to compare the expression of entire transcripts rather than of individual alternative exons.

Reply by Govardhan Anande, 20 months ago
