How to best use the transcript merge step of tools like StringTie?
5.4 years ago

As per the manual, StringTie can be used with --merge to generate a non-redundant set of transcripts observed in all the RNA-Seq samples assembled previously. The stringtie --merge mode takes as input a list of all the assembled transcript files (in GTF format) previously obtained for each sample, as well as a reference annotation file (-G option) if available. I have a few questions about this.

  1. I have 60 samples from rat: 10 different organs with 6 technical replicates per organ. What is the best way to --merge: all 60 samples at once, or a separate --merge for each organ?
  2. What is the significance of --merge? An example would help.
  3. What is the next step after merging?

Thanks

RNA-Seq next-gen • 2.5k views
5.4 years ago

I will try to answer:

Q1) If you can run all samples in one go (and have an equal number of samples in each organ), I would do that. Otherwise you can follow the approach used in the recent CHESS paper, where the authors of StringTie merged first within each organ and then across organs (due to computational limits).
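
To make the two options concrete, here is a minimal sketch that writes the GTF list files and builds the `stringtie --merge` command lines for either strategy. The helper names (`merge_command`, `plan_merges`) and all file names are made up for illustration; only `--merge`, `-G` and `-o` are StringTie options. It builds the commands but does not run them.

```python
from pathlib import Path


def merge_command(gtf_list, out_gtf, reference_gtf=None):
    """Return the argument list for one `stringtie --merge` run."""
    cmd = ["stringtie", "--merge"]
    if reference_gtf is not None:
        # -G adds the reference annotation to the merge, if you have one.
        cmd += ["-G", str(reference_gtf)]
    cmd += ["-o", str(out_gtf), str(gtf_list)]
    return cmd


def plan_merges(sample_gtfs, workdir, reference_gtf=None, per_organ=False):
    """Write the GTF list files and return the merge commands in run order.

    sample_gtfs maps an organ name to the list of per-sample GTF paths.
    With per_organ=False all samples go into a single merge.
    With per_organ=True each organ is merged first, then the organ-level
    GTFs are merged across organs (hypothetical file naming).
    """
    workdir = Path(workdir)
    workdir.mkdir(parents=True, exist_ok=True)
    commands = []

    if per_organ:
        final_inputs = []
        for organ, gtfs in sorted(sample_gtfs.items()):
            list_file = workdir / f"{organ}.mergelist.txt"
            list_file.write_text("\n".join(map(str, gtfs)) + "\n")
            organ_gtf = workdir / f"{organ}.merged.gtf"
            commands.append(merge_command(list_file, organ_gtf, reference_gtf))
            final_inputs.append(organ_gtf)
    else:
        final_inputs = [g for _, gtfs in sorted(sample_gtfs.items()) for g in gtfs]

    final_list = workdir / "all.mergelist.txt"
    final_list.write_text("\n".join(map(str, final_inputs)) + "\n")
    commands.append(
        merge_command(final_list, workdir / "all.merged.gtf", reference_gtf)
    )
    return commands
```

For 10 organs with 6 replicates each, `per_organ=False` yields one command over 60 GTFs, and `per_organ=True` yields 10 organ-level merges followed by one merge across organs. Each command can then be executed with `subprocess.run(cmd, check=True)`.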

Q2) I interpret this as asking why you would want to merge. You merge because in the end you want the same set of transcripts/genes quantified in all your samples. This is necessary for two reasons: 1) otherwise you do not know which transcript in one sample corresponds to which transcript in another, so you cannot compare the two samples; 2) if the set of quantified transcripts is not the same, you introduce a systematic bias into both samples.
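
Since an example was asked for, here is a toy illustration of point 1. It is not StringTie's merge algorithm: it assumes two transcripts are "the same" only when chromosome, strand and intron chain match exactly, whereas the real merge also handles compatible and partially overlapping transcripts. All IDs and coordinates are invented.

```python
def merge_assemblies(assemblies):
    """Give transcripts with identical structure one shared ID.

    assemblies maps sample -> {local_id: (chrom, strand, intron_chain)}.
    Returns (catalogue, id_maps): catalogue maps structure -> merged ID,
    id_maps maps sample -> {local_id: merged_id}.
    """
    catalogue = {}
    id_maps = {}
    for sample in sorted(assemblies):
        id_maps[sample] = {}
        for local_id, structure in sorted(assemblies[sample].items()):
            if structure not in catalogue:
                # "MERGED.n" is a made-up naming scheme for this sketch.
                catalogue[structure] = f"MERGED.{len(catalogue) + 1}"
            id_maps[sample][local_id] = catalogue[structure]
    return catalogue, id_maps


# Per-sample assemblies number their transcripts independently, so the
# same local ID can mean different transcripts in different samples.
assemblies = {
    "liver": {
        "STRG.1.1": ("chr1", "+", ((100, 200), (300, 400))),
        "STRG.2.1": ("chr2", "-", ((50, 80),)),
    },
    "brain": {
        "STRG.1.1": ("chr2", "-", ((50, 80),)),
        "STRG.5.1": ("chr3", "+", ((10, 20),)),
    },
}

catalogue, id_maps = merge_assemblies(assemblies)
```

Before merging, `STRG.1.1` in liver and `STRG.1.1` in brain are unrelated transcripts, while liver `STRG.2.1` and brain `STRG.1.1` are the same transcript under different names. After merging, the latter two share one ID, and every sample can be quantified against the same three-transcript catalogue.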

Q3) You need to re-quantify all your samples using the combined transcriptome. Follow the instructions/guide here and you should be fine.
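
As a rough sketch of the re-quantification step, the helper below builds one `stringtie` command per sample against the merged GTF. `-e` restricts estimation to the transcripts given with `-G`, and `-B` writes the Ballgown `*.ctab` tables next to the output GTF. The function name, the directory layout and the file names are assumptions for illustration, not something prescribed by StringTie.

```python
from pathlib import Path


def requantify_command(bam, merged_gtf, out_dir, threads=1):
    """Return the argument list to re-estimate abundances for one sample.

    The output goes to <out_dir>/<sample>/<sample>.gtf, where <sample> is
    the BAM file name without its extension (an assumed layout).
    """
    sample = Path(bam).stem
    out_gtf = Path(out_dir) / sample / f"{sample}.gtf"
    return [
        "stringtie",
        "-e",                     # only quantify transcripts from -G
        "-B",                     # also write Ballgown table files
        "-p", str(threads),       # number of threads
        "-G", str(merged_gtf),    # the merged transcriptome
        "-o", str(out_gtf),
        str(bam),
    ]
```

Calling it once per aligned BAM, for example `requantify_command("bams/liver_rep1.bam", "all.merged.gtf", "quant", threads=4)`, and passing the result to `subprocess.run(cmd, check=True)` gives one output directory per sample, all quantified against the same transcript set.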

Note that:

  1. If you want to do differential expression, I suggest using tximport to get the data into R rather than the script StringTie supplies via its homepage. Using tximport you can follow this DE analysis guide.
  2. If you are interested, this data can also be used to identify and analyze isoform switches with predicted functional consequences using my R package IsoformSwitchAnalyzeR. For examples of the types of analysis that can be done, take a look at this section of the vignette.

Thank you for this helpful explanation and suggestion too.

No problem. If you like it you can always give it a thumbs up :-)

Sorry, I just forgot, I would love to do that. Thank you.
