Trimming reads of ChIP-seq samples
5.8 years ago
GK1610 ▴ 110

I am using Trimmomatic to trim adaptor sequences from ChIP-seq FASTQ files. With adaptor file 1 (TruSeq3-PE-2.fa, the default file supplied with Trimmomatic), 95.77% of read pairs come out as BothSurviving. With adaptor file 2 (A2, see below), which adds the overrepresented sequences, only 80.45% of pairs are BothSurviving. The Dropped percentage is about 1% in both cases, and FastQC shows < 0.04% adaptor content for both outputs. FastQC reports 4% overrepresented sequences for Sample_001.with adaptorA1.fastq.gz and none for Sample_001.with adaptorA2.fastq.gz.

My question: when I run the alignment later, will the lower percentage of BothSurviving reads make the results worse, or will I get fewer peaks when I call peaks? Which adaptor file should I go for?

| Filename | InputReadPairs | BothSurviving | ForwardOnlySurviving | ReverseOnlySurviving | Dropped |
|---|---|---|---|---|---|
| Sample_001.with **adaptorA1**.fastq.gz | 101005952 | 96732909 (95.77%) | 2347954 (2.32%) | 1332466 (1.32%) | 592623 (0.59%) |
| Sample_001.with **adaptorA2**.fastq.gz | 101005952 | 81260458 (80.45%) | 1564086 (1.55%) | 16804902 (16.64%) | 1376506 (1.36%) |
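
To see where the missing pairs went, the counts in the table can be compared directly. The following is a minimal Python sketch, not part of any Trimmomatic tooling; the names `runs`, `percentages`, and `delta` are made up for illustration, and the numbers are copied from the table above.

```python
# Counts copied from the Trimmomatic summary above (read pairs).
INPUT_PAIRS = 101005952

runs = {
    "adaptorA1": {
        "both": 96732909,
        "forward_only": 2347954,
        "reverse_only": 1332466,
        "dropped": 592623,
    },
    "adaptorA2": {
        "both": 81260458,
        "forward_only": 1564086,
        "reverse_only": 16804902,
        "dropped": 1376506,
    },
}


def percentages(counts, total):
    """Return each category as a percentage of the input pairs."""
    return {key: round(100 * value / total, 2) for key, value in counts.items()}


for name, counts in runs.items():
    # Every input pair should land in exactly one category.
    assert sum(counts.values()) == INPUT_PAIRS
    print(name, percentages(counts, INPUT_PAIRS))

# Change in each category when switching from A1 to A2.
delta = {key: runs["adaptorA2"][key] - runs["adaptorA1"][key] for key in runs["adaptorA1"]}
print("A2 - A1:", delta)
```

With these numbers, the loss of roughly 15.47 million BothSurviving pairs is almost exactly matched by the gain in ReverseOnlySurviving. That suggests the extra sequences in A2 mostly cause the forward read of a pair to be discarded while its mate is kept, rather than whole pairs being dropped.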

Adaptor 1: TruSeq3-PE.fa

>PrefixPE/1
TACACTCTTTCCCTACACGACGCTCTTCCGATCT
>PrefixPE/2
GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT

Adaptor 2: TruSeq3-PE.fa with overrepresented sequences

>PrefixPE/1
TACACTCTTTCCCTACACGACGCTCTTCCGATCT
>PrefixPE/2
GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT
>TruSeqAdapterIndex1
5' GATCGGAAGAGCACACGTCTGAACTCCAGTCACATCACGATCTCGTATGCCGTCTTCTGCTTG
>TruSeqAdapterIndex2
5' GATCGGAAGAGCACACGTCTGAACTCCAGTCACCGATGTATCTCGTATGCCGTCTTCTGCTTG
>TruSeqAdapterIndex3
5' GATCGGAAGAGCACACGTCTGAACTCCAGTCACTTAGGCATCTCGTATGCCGTCTTCTGCTTG
>TruSeqAdapterIndex4
5' GATCGGAAGAGCACACGTCTGAACTCCAGTCACTGACCAATCTCGTATGCCGTCTTCTGCTTG
>TruSeqAdapterIndex5
5' GATCGGAAGAGCACACGTCTGAACTCCAGTCACACAGTGATCTCGTATGCCGTCTTCTGCTTG
>TruSeqAdapterIndex6
5' GATCGGAAGAGCACACGTCTGAACTCCAGTCACGCCAATATCTCGTATGCCGTCTTCTGCTTG
>TruSeqAdapterIndex7
5' GATCGGAAGAGCACACGTCTGAACTCCAGTCACCAGATCATCTCGTATGCCGTCTTCTGCTTG
>TruSeqAdapterIndex8
5' GATCGGAAGAGCACACGTCTGAACTCCAGTCACACTTGAATCTCGTATGCCGTCTTCTGCTTG
>TruSeqAdapterIndex9
5' GATCGGAAGAGCACACGTCTGAACTCCAGTCACGATCAGATCTCGTATGCCGTCTTCTGCTTG
>TruSeqAdapterIndex10
5' GATCGGAAGAGCACACGTCTGAACTCCAGTCACTAGCTTATCTCGTATGCCGTCTTCTGCTTG
>TruSeqAdapterIndex11
5' GATCGGAAGAGCACACGTCTGAACTCCAGTCACGGCTACATCTCGTATGCCGTCTTCTGCTTG
>TruSeqAdapterIndex12
5' GATCGGAAGAGCACACGTCTGAACTCCAGTCACCTTGTAATCTCGTATGCCGTCTTCTGCTTG
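
The index entries as posted begin with `5' `, which is not nucleotide sequence. Whether that prefix is in the actual file or only in this post is not known, and how Trimmomatic treats such lines is not verified here. As a sanity check, a small Python sketch like the one below can flag sequence lines in an adaptor FASTA that contain anything other than A, C, G, T, or N. The function name `invalid_sequence_lines` and the `example` text are hypothetical, chosen only for illustration.

```python
import re

# Assumption: a sequence line should contain only these characters.
VALID_SEQ = re.compile(r"^[ACGTNacgtn]+$")


def invalid_sequence_lines(fasta_text):
    """Return (line_number, line) for sequence lines with unexpected characters."""
    bad = []
    for number, line in enumerate(fasta_text.splitlines(), start=1):
        line = line.strip()
        if not line or line.startswith(">"):
            continue  # skip blank lines and FASTA headers
        if not VALID_SEQ.match(line):
            bad.append((number, line))
    return bad


# Two entries taken from the adaptor file shown above.
example = """>PrefixPE/1
TACACTCTTTCCCTACACGACGCTCTTCCGATCT
>TruSeqAdapterIndex1
5' GATCGGAAGAGCACACGTCTGAACTCCAGTCACATCACGATCTCGTATGCCGTCTTCTGCTTG
"""

for number, line in invalid_sequence_lines(example):
    print(f"line {number}: unexpected characters in {line!r}")
```

To check a real file, the same function can be called on the file's contents, for example `invalid_sequence_lines(open("adaptors.fa").read())`, where `adaptors.fa` is a placeholder path.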
ChIP-Seq alignment • 2.1k views

If the data that is being trimmed is adapter sequence, it does not belong to your samples and should not be there when you do the analysis anyway.
