Adaptor trimming issue
6.5 years ago
1769mkc ★ 1.2k

I am trimming adaptors (the Illumina universal adaptor) from paired-end reads, using cutadapt to remove the adaptor sequence.

Alignment without adaptor trimming

Left reads:
          Input     :  46627933
           Mapped   :  29928631 (64.2% of input)
            of these:  11992814 (40.1%) have multiple alignments (8801 have >20)
Right reads:
          Input     :  46627933
           Mapped   :  29469536 (63.2% of input)
            of these:  11724006 (39.8%) have multiple alignments (8688 have >20)
63.7% overall read mapping rate.

Aligned pairs:  28562130
     of these:  11404825 (39.9%) have multiple alignments
                  155607 ( 0.5%) are discordant alignments
60.9% concordant pair alignment rate.

Alignment after adaptor trimming

Left reads:
          Input     :  46624601
           Mapped   :  44668679 (95.8% of input)
            of these:  29803908 (66.7%) have multiple alignments (56945 have >20)
Right reads:
          Input     :  46624601
           Mapped   :  43936907 (94.2% of input)
            of these:  29470503 (67.1%) have multiple alignments (56649 have >20)
Unpaired reads:
          Input     :       226
           Mapped   :       181 (80.1% of input)
            of these:        95 (52.5%) have multiple alignments (0 have >20)
95.0% overall read mapping rate.

Aligned pairs:  42110389
     of these:  28673148 (68.1%) have multiple alignments
                34996114 (83.1%) are discordant alignments
15.3% concordant pair alignment rate.

The cutadapt command that I used, for reference:

cutadapt -a AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC -A AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTAGATCTCGGTGGTCGCCGTATCATT -o HL60_trimmed1.fastq -p HL60_trimmed2.fastq FRED_6_150224_BC6BK7ANXX_P1881_1001_1_123bp.fastq FRED_6_150224_BC6BK7ANXX_P1881_1001_2_123bp.fastq

I cannot understand why the concordant pair alignment rate goes down after adaptor trimming.

Any suggestions or help would be highly appreciated.


You have a large number of multi-mappers. What kind of dataset is this?


It is an HL60 data set.


And is it RNA-seq? If so, did you enrich your RNA samples?


I'm not sure what enriching the RNA sample means. Could you explain it?


One reason for having multi-mappers in your dataset is the presence of rRNA in the reads. I think @cpad0112 is asking whether you know if these were removed by ribo-depletion or by some other method that enriches for the transcripts of actual interest.


Okay. My FastQC results only flag Illumina adaptors; I don't see anything else.

As for the "presence of rRNA in reads", I am not sure whether that is the case. Is there a way to check?


See rRNA detection (for contamination) in RNA-seq and threads linked from it.


Okay, I will look into it. But do you think that is the only issue leading to the low concordant pair rate?


It is one of the possibilities. I am not sure which aligner you are using, but if it requires the insert size as a parameter, are you providing a number that reflects the actual distribution in your data?


I used TopHat2 as my aligner.
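For TopHat2, the insert-size parameter in question is the expected mate inner distance (`-r/--mate-inner-dist`), which is the fragment length minus the bases covered by the two mates. A minimal sketch of that arithmetic follows; the function name and the fragment and read lengths are hypothetical and were not measured from this dataset.

```python
def mate_inner_distance(fragment_length, read1_length, read2_length):
    """Fragment length minus the bases covered by the two mates."""
    return fragment_length - (read1_length + read2_length)


# Hypothetical library: 300 bp fragments sequenced with 2 x 50 bp reads.
print(mate_inner_distance(300, 50, 50))    # 200

# Hypothetical short fragment: both mates read through the whole 100 bp
# fragment and are trimmed back to 100 bp each, so they overlap fully
# and the inner distance becomes negative.
print(mate_inner_distance(100, 100, 100))  # -100
```

Reads that contained adaptor sequence come from fragments shorter than the read length, so after trimming their mates overlap completely and the inner distance is negative. That may be far from the value the aligner was told to expect.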


Since you are processing PE data, you can use AfterQC (https://github.com/OpenGene/AfterQC) to cut adapters without having to supply the adapter sequences.

Just run:

python AfterQC/after.py -1 read1.fq -2 read2.fq

Moving to a comment since this is not addressing OP's question of why % concordant alignment is decreasing after trimming of adapters.

6.5 years ago

Normally when the concordant rate decreases dramatically after adapter-trimming it indicates that pairing was broken, which can happen if the files are trimmed independently and some reads were discarded. You could try running BBMap's reformat.sh like this:

reformat.sh in1=trimmed1.fq in2=trimmed2.fq vpair

...to verify that, according to the read names, the reads are still properly paired after trimming. I don't see anything wrong with your trimming command, though.
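If BBMap is not at hand, the same idea can be sketched in a few lines of Python. This is only an illustration of a name-based pairing check, not what `reformat.sh` does internally. It assumes uncompressed FASTQ files with strict 4-line records, and the function names are made up for this example.

```python
import itertools


def read_names(path):
    """Yield the read name from each 4-line FASTQ record."""
    with open(path) as handle:
        for i, line in enumerate(handle):
            if i % 4 == 0:
                # Keep the ID up to the first whitespace (drops "1:N:0:..."
                # style comments) and strip a trailing /1 or /2 mate suffix.
                fields = line[1:].split()
                name = fields[0] if fields else ""
                if name.endswith(("/1", "/2")):
                    name = name[:-2]
                yield name


def first_pairing_mismatch(path1, path2):
    """Return (record_index, name1, name2) for the first mismatch, else None.

    If one file has fewer records, the missing name is reported as None.
    """
    pairs = itertools.zip_longest(read_names(path1), read_names(path2))
    for index, (name1, name2) in enumerate(pairs):
        if name1 != name2:
            return index, name1, name2
    return None


# Example call (file names taken from the cutadapt command in the question):
# first_pairing_mismatch("HL60_trimmed1.fastq", "HL60_trimmed2.fastq")
```

A return value of `None` means every record in the first file has a same-named partner at the same position in the second file.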

In this case, I think the problem might be TopHat2/Bowtie2 calculating concordant pairs incorrectly. Before trimming, reads with adapters (which fully overlap) probably just did not map at all. After trimming, the newly mappable reads would map 100% overlapping; and indeed a lot of the reads that previously mapped with a few mismatches at the end (and not fully overlapping because of the adapter overhang) would now also map fully overlapping. Perhaps your version of TopHat2/Bowtie2 does not consider that concordant. I suggest you try a different aligner such as BBMap or STAR and see what it reports for the concordance rate.
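As a sanity check on the numbers in the question, the reported concordant rates can be reproduced as (aligned pairs minus discordant pairs) divided by input pairs. The formula is inferred from the posted summaries rather than taken from the TopHat2 source, so treat it as an assumption; the function name is made up for this sketch.

```python
def concordant_rate(aligned_pairs, discordant_pairs, input_pairs):
    """Percentage of input pairs that aligned and were not flagged discordant."""
    return 100.0 * (aligned_pairs - discordant_pairs) / input_pairs


# Counts copied from the two alignment summaries in the question.
before = concordant_rate(28562130, 155607, 46627933)
after = concordant_rate(42110389, 34996114, 46624601)

print(f"before trimming: {before:.1f}%")  # 60.9%
print(f"after trimming:  {after:.1f}%")   # 15.3%
```

Both values match the summaries, so the drop comes entirely from the roughly 35 million pairs that aligned but were classified as discordant, not from pairs failing to align.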
