Hi, I am new to RNA-seq and not a bioinformatician, so please accept my apologies if these are basic questions.

After Illumina paired-end mRNA sequencing of 6 brain tissue samples (3 test, 3 control), de novo assembly with Trinity, and DEG analysis with bowtie2, we got:

1. A high number of very similar contigs (putative isoforms). Strangely, within each cluster of isoforms, some contigs are significantly differentially expressed in one direction while others are significantly differentially expressed in the opposite direction. I don't understand how this is possible. Given that the contigs are more than 90% similar, often >99%, and assuming that reads mapping perfectly to multiple contigs are distributed randomly among them, shouldn't the read counts of very similar contigs also be similar? The end result is that at the pathway analysis step we end up with DEGs that appear to be up- and down-regulated at the same time (because isoforms with the same functional annotation have counts pointing in opposite directions).

2. A significant number of reverse-complement sequences. In this case the counts are similar and point in the same direction. However, I don't understand how these reverse-complement sequences end up in the unigene list.
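To make the two expectations in points 1 and 2 concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the function names are hypothetical, the sequences are toy examples, and the uniform random assignment is the assumption from the question, not a description of how bowtie2 or any particular quantification tool actually handles multi-mapped reads.

```python
import random
from typing import List

# Translation table for complementing DNA bases (N stays N).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def is_reverse_complement_pair(contig_a: str, contig_b: str) -> bool:
    """True if contig_a is exactly the reverse complement of contig_b."""
    return contig_a.upper() == reverse_complement(contig_b).upper()


def assign_multimapped_reads(n_reads: int, n_contigs: int = 2, seed: int = 0) -> List[int]:
    """Assign each multi-mapped read to one contig uniformly at random.

    ASSUMPTION: every read maps equally well to all n_contigs, as in the
    question's premise. Returns the read count per contig.
    """
    rng = random.Random(seed)
    counts = [0] * n_contigs
    for _ in range(n_reads):
        counts[rng.randrange(n_contigs)] += 1
    return counts


if __name__ == "__main__":
    # Toy pair of contigs that are reverse complements of each other.
    print(is_reverse_complement_pair("ATGCCG", "CGGCAT"))  # True
    # Under the uniform-assignment assumption the two counts come out close.
    print(assign_multimapped_reads(10_000))
```

Under this assumption the counts for near-identical contigs should be roughly equal, which is why opposite-direction results within one isoform cluster look surprising. The first function is a quick way to flag exact reverse-complement pairs in a contig list.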
Since you are creating a de novo assembly, I assume you are not working with a species that has a known reference genome?
Yes, it is a de novo assembly without a reference genome.