Dear All,
I’m starting to use Cufflinks after a TopHat2 alignment. Looking at the manual, I can see that there are several options for this tool, but what is not clear to me is whether I have to use the transcriptome or the genome as the reference. I ran the program with this command line:
cufflinks -p 4 -N -g Homo_sapiens.GRCh37.75.gtf file.bam -o cuffresults
As you can see, I used the reference genome annotation. Is that correct? I’m also wondering whether I can use a transcriptome annotation downloaded from Ensembl (Homo_sapiens.GRCh38.cdna.all.fa.gz), or whether I have to build it from the reference genome?
Thank you
Please use the "question" post type if you are asking questions, and not "Forum". I have converted your post for you this time (and your previous posts...), but please keep this in mind for further posts.
You should probably not be using TopHat2; see this tweet from Lior Pachter, one of the creators of TopHat and Cufflinks.
Hi Kristoffer, thank you for the recommendation. At the moment my aim is to become more familiar with the code and commands used in tools for RNA-seq analysis. I'm also considering HISAT, and to be honest featureCounts is currently at the top of my list.
Even if you just want gene expression, I would recommend doing transcript-level quantification; it gives more accurate gene-level estimates (see this blog). For considerations regarding transcript-level quantification, check this section of my vignette.
Btw, even if you decide on using featureCounts you still need files mapped to the genome, so the HISAT run is still needed :-)
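For reference, a rough sketch of that route (genome alignment with HISAT2, then gene-level counting with featureCounts) could look like the following. All file names and the index prefix are hypothetical placeholders, single-end reads are assumed, and the HISAT2 index is assumed to have been built beforehand with hisat2-build. The script only prints the commands by default, so they can be checked before anything is run.

```shell
#!/bin/sh
# Hypothetical inputs: adjust to your own data.
INDEX=grch37_index                 # prefix of a HISAT2 index built with hisat2-build (assumed)
READS=reads.fastq                  # single-end reads (assumed)
GTF=Homo_sapiens.GRCh37.75.gtf     # annotation matching the genome build used for the index
THREADS=4

# DRY_RUN=1 (default) only prints each command; set DRY_RUN=0 to execute them.
DRY_RUN=${DRY_RUN:-1}
run() {
  if [ "$DRY_RUN" = 1 ]; then
    echo "$*"
  else
    "$@"
  fi
}

# 1. Align reads to the genome (not to the cDNA FASTA).
run hisat2 -p "$THREADS" -x "$INDEX" -U "$READS" -S aligned.sam

# 2. Sort the alignments and convert them to BAM.
run samtools sort -@ "$THREADS" -o aligned.sorted.bam aligned.sam

# 3. Count reads per gene using the GTF annotation.
run featureCounts -T "$THREADS" -a "$GTF" -o gene_counts.txt aligned.sorted.bam
```

For paired-end data, hisat2 takes -1 and -2 instead of -U, and featureCounts needs -p. The annotation and the genome index should come from the same assembly, so a GRCh37 GTF goes with a GRCh37 index.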