Calling large indels from NGS
2
0
Entering edit mode
8.5 years ago
Adrian Pelin ★ 2.6k

Hello,

I have population sequencing (125bp SE) of an organism, and I have managed to find large indels spanning several kb with bbmap tools. See here how the data looks when the .sam file is visualized.

http://imagebin.ca/v/2KVNpMvDWJCw

It looks like some might even be supported by 2 reads or more (same start same stop).

Is there anyway to call these? What software would you recommend? I tried freebayes, but I can't get any calls for some reason.

Adrian

EDIT:

I tried the following commands with samtools:

samtools mpileup -d 250 -m 1 -E --BCF --output-tags DP,DV,DP4,SP -f RefAnnotated_SPades_HXE-3.fasta -o mapped_AdaptTrim_trial1.bcf mapped_AdaptTrim.sort.bam
bcftools index mapped_AdaptTrim_trial1.bcf
bcftools call --skip-variants snps --multiallelic-caller --variants-only -O v mapped_AdaptTrim_trial1.bcf -o mapped_AdaptTrim_trial1_vcf.vcf

However, the detected indels are not too long, maximum 50-100bp. The ones I am seeing in the image posted are a few kb in length.

NGS large-indels samtools • 3.8k views
ADD COMMENT
0
Entering edit mode

Do those long thin lines represent deletions? If so, they mostly look like false mappings. Having just two supporting reads amongst those many false hits is too weak to be useful.

ADD REPLY
0
Entering edit mode

Yes they are deletions. In one case, there are 14 reads supporting one 11kb deletion (different sample). I got freebayes to call long deletions, however, the problem that I am running into is that if freebayes called a deletion from position X to position Y, it will not call a smaller deletion that is located between positions X and Y, even though that one might have more reads supporting it. I just need something to call them, then I can try PCR.

ADD REPLY
0
Entering edit mode

You'd better try lumpy/delly first, using their preferred mappers. They look at split reads as well as read pairs. They are peer-reviewed and independently evaluated. I know too little about bbmap split read to give any useful advice.

ADD REPLY
0
Entering edit mode

Will do, however, at least lumpy takes as input bwa-mem. I tried it, and it only find deletions that are maximum 20bp in 125bp reads under default parameters. Is it safe to tweak them, or do lump/delly take into account unmapped reads as well?

ADD REPLY
0
Entering edit mode

bwa-mem will produce two hits for split reads. At least lumpy uses this information. Giving two hits is more general. You can't put inversions or translocations in one sam line.

ADD REPLY
0
Entering edit mode

Ok fair point. I just successfully ran lumpy, and it identified 2 candidates over 10kb, one supported by 4 split reads, and one supported by 20 or so split reads and a few PE reads.

Is there any way to run it only on SE reads (as to only look for split reads)? My fragment length was ridiculously low, I ended up merging most of my reads and treating them as SE. I suppose I could trick the software by creating fake PE reads (reverse compliment my SE reads and call them R2), but I hate doing that.

ADD REPLY
0
Entering edit mode

Delly does support SE reads for InDel detection since version v0.7.1.

ADD REPLY
1
Entering edit mode
8.5 years ago
h.mon 35k

Subread can detect indels up to 200bp, according to its user guide.

Maybe you will have better luck with CNV detection software? People on my lab use CNVnator and they seem to like it.

ADD COMMENT
0
Entering edit mode

Since the coverage of these indels is so small, sometimes just a few reads out of a 3000x genome coverage, I am afraid CNV-related software will not pick up these differences in coverage. I am wondering if I can use some sort of RNA prediction of introns for this.

ADD REPLY
0
Entering edit mode
8.5 years ago
nchuang ▴ 260

MELT should be released soon and that was featured in the recent Nature paper. Maybe it's out already?

ADD COMMENT
0
Entering edit mode

Can't find it anywhere, seems a lot of tools designed for evaluating DNA melting temperatures start with or have melt in them.

ADD REPLY

Login before adding your answer.

Traffic: 1505 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6