Low coverage for bases near the end of target reference sequence
5.7 years ago
kspata ▴ 80

Hi All,

I have viral samples sequenced on a NextSeq (PE 150) with over 90 million reads. After trimming adapters and low-quality bases, I aligned the reads to a reference sequence of about 2750 bp. The bases at the end of the target sequence (positions 2735-2750) have very low coverage (0X, or less than 10X), while the average per-base coverage is 16700X. I used bowtie2 local alignment with default parameters.

Why am I getting low coverage near the end of the target sequence? How can I improve coverage for this region?

Thanks in advance!!!

alignment bowtie2 sequencing • 1.2k views

Is this an RNA virus? Did you start sample prep from RNA?


I believe it is a DNA virus, as sample prep was done using a complete DNA purification kit.


Is the drop in coverage sudden or smooth?

Never mind that: if it is just the last 15 bases, it is a sudden drop. In addition to genomax's suggestions, it could also be an artifact. Are there similar viruses with a sequenced genome? Did you BLAST your whole virus sequence on NCBI?


Two things to consider:

  • This may be a problem with aligners not being able to map reads (which may be much longer) to the last 15 bp of the reference.
  • You may have discarded reads covering the last 15 bases during your trim/cleaning process.
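
The first point can be illustrated with some simple arithmetic. This is only a sketch under stated assumptions: a linear reference, reads of one fixed length, read starts spread uniformly, and every read required to lie entirely inside the reference (no overhang, no soft clipping). The function names are made up for illustration and are not part of any aligner.

```python
def covering_starts(pos, ref_len, read_len):
    """Count read start positions whose read covers `pos` (1-based)
    while lying entirely inside a linear reference of length `ref_len`."""
    first = max(1, pos - read_len + 1)        # earliest start that still reaches pos
    last = min(pos, ref_len - read_len + 1)   # latest start that still fits in the reference
    return max(0, last - first + 1)


def relative_coverage(pos, ref_len, read_len):
    """Covering starts at `pos` as a fraction of `read_len`,
    the most starts any single position can have."""
    return covering_starts(pos, ref_len, read_len) / read_len
```

With `ref_len=2750` and `read_len=150`, an interior position can be covered from 150 different starts, position 2735 from 16, and position 2750 from only 1. Under these assumptions some tapering toward the end is expected, but it would not by itself bring the last bases down to 0X, and trimmed reads shorter than 150 bp would change the numbers.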

Thank you for the response. I did two-level trimming (trim_galore and sickle), which resulted in the loss of reads aligning at the end of the target sequence. I re-trimmed the raw reads with trim_galore only and got less than 10X coverage at the ends of the target sequence. The coverage analysis still shows 0X coverage at base positions 2749-2750.

How can I align partial reads only to the ends of the target reference sequence to get more than 0X coverage? Will this approach work, and how can I use bowtie2 to do this?


Is the genome of your virus circular or linear?

How are you looking at the coverage?
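
One way to make the coverage check reproducible is to list the low-depth positions directly from per-base depth output. The sketch below assumes the three-column, tab-separated layout that `samtools depth` prints for a single BAM file (reference name, 1-based position, depth); the function name and the default threshold of 10 are illustrative choices, not something taken from this thread's pipeline.

```python
def low_coverage_positions(depth_lines, threshold=10):
    """Return (reference, position, depth) tuples with depth below `threshold`.

    `depth_lines` is an iterable of tab-separated lines in the three-column
    layout printed by `samtools depth` for a single BAM file:
    reference name, 1-based position, depth.
    """
    low = []
    for line in depth_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and an optional header line
        ref, pos, depth = line.split("\t")[:3]
        if int(depth) < threshold:
            low.append((ref, int(pos), int(depth)))
    return low
```

It could be run on a file written by something like `samtools depth -a aln.bam > depth.tsv` (both file names are placeholders), where `-a` asks samtools to report zero-depth positions too, so that 0X bases are not silently missing from the output.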


kspata: Have you considered the possibility that your viral strain has a small deletion at those two base pairs? If so, what you are seeing is real for your strain.
