7.3 years ago
ddzhangzz
One of the reads in my RNA-Seq FASTQ file looks like this:
@7001458:226:C989WANXX:3:1102:17546:38724 1:N:0:CGATGT
GATCGGAAGAGCACACGTCTGAACTCCAGTCACCGATGTATCTCGTATGCCGTCTTCTGCTTGAAAAAAAAAACAAAAAAATAAGCAGAGTTGTCAAAGTAAAAACAAAACAAAAAATAATAAGAA
+
BBBBBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF<<FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF/<7////7/</</<///7///7//////7//7/7//77///////7///////
Because the reads are paired-end, I trimmed the adapters with cutadapt like this:
$ cutadapt -a AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC \
-A AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT \
-o $out1 -p $out2 \
$input1 $input2
The output shows that the entire sequence was removed:
@7001458:226:C989WANXX:3:1102:17546:38724 1:N:0:CGATGT
+
The read does contain the adapter sequence (GATCGGAAGAGCACACGTCTGAACTCCAGTCAC), but why was the whole read trimmed down to nothing? If I manually remove the adapter
GATCGGAAGAGCACACGTCTGAACTCCAGTCAC
from the read, shouldn't the trimmed sequence be
CGATGTATCTCGTATGCCGTCTTCTGCTTGAAAAAAAAAACAAAAAAATAAGCAGAGTTGTCAAAGTAAAAACAAAACAAAAAATAATAAGAA?

Adapter:
AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC
Read:
GATCGGAAGAGCACACGTCTGAACTCCAGTCAC CGATGTATCTCGTATGCCGTCTTCTGCTTGAAAAAAAAAACAAAAAAATAAGCAGAGTTGTCAAAGTAAAAACAAAACAAAAAATAATAAGAA
The first segment of the read (everything before the space) is the adapter, minus its leading A. With a 3' adapter (-a), cutadapt removes the adapter and everything to its right. In this case, that is the entire read. The sequence to the right of the adapter is not genomic.
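To make the trimming rule concrete, here is a minimal Python sketch of "remove the adapter and everything to its right". It is an illustration only, not cutadapt's actual algorithm: cutadapt uses error-tolerant alignment, whereas this sketch does an exact string search for the 33 adapter bases that appear in the read. The function name `trim_3prime` is hypothetical. The sketch also checks one thing that can be read directly from the post: the six bases immediately after the adapter are the same as the index in the read header (CGATGT).

```python
# Simplified illustration of 3' adapter trimming (exact match only).
# cutadapt itself uses error-tolerant alignment; this is NOT its real algorithm.

HEADER = "@7001458:226:C989WANXX:3:1102:17546:38724 1:N:0:CGATGT"
READ = "GATCGGAAGAGCACACGTCTGAACTCCAGTCACCGATGTATCTCGTATGCCGTCTTCTGCTTGAAAAAAAAAACAAAAAAATAAGCAGAGTTGTCAAAGTAAAAACAAAACAAAAAATAATAAGAA"

# Adapter passed to cutadapt with -a in the post.
ADAPTER = "AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC"
# The read begins with the adapter minus its first base, so search for that part.
ADAPTER_IN_READ = ADAPTER[1:]


def trim_3prime(read: str, adapter: str) -> str:
    """Return the read with the adapter and everything after it removed.

    Hypothetical helper: if the adapter is not found, the read is returned unchanged.
    """
    pos = read.find(adapter)
    return read if pos == -1 else read[:pos]


pos = READ.find(ADAPTER_IN_READ)           # where the adapter starts in the read
trimmed = trim_3prime(READ, ADAPTER_IN_READ)
downstream = READ[pos + len(ADAPTER_IN_READ):]  # bases to the right of the adapter
index_from_header = HEADER.rsplit(":", 1)[-1]   # last colon-separated field

print("adapter starts at position:", pos)
print("length after trimming:", len(trimmed))
print("downstream starts with header index:", downstream.startswith(index_from_header))
```

Because the adapter starts at the very first base of the read, nothing is left to the left of it, which is why the trimmed record is empty. Bases that follow an adapter come from beyond the end of the sequenced fragment, and here they begin with the sample index from the header.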
Thanks! What do you mean by "not genomic"? How do you know that?