Does read group id need to be unique between multiple bam files?
22 months ago
antmantras ▴ 80

Hi.

I want to run a variant-calling analysis on some WGS data. I did not set read groups at the alignment step, so I now want to add them to all the resulting BAM files with GATK AddOrReplaceReadGroups. Most samples have a single BAM file, produced by aligning their paired reads against the reference, but several have two because the sequencing service generated additional reads afterwards. The fq files are named like this:

Sample16_EDSW220011991-3a_HJWTGDSX3_L3_1.fq.gz
Sample16_EDSW220011991-3a_HJWTGDSX3_L3_2.fq.gz
Sample17_EDSW220011992-3a_HJWTGDSX3_L3_1.fq.gz
Sample17_EDSW220011992-3a_HJWTGDSX3_L3_2.fq.gz
Sample17_EDSW220011992-3a_HL5LKDSX3_L3_1.fq.gz
Sample17_EDSW220011992-3a_HL5LKDSX3_L3_2.fq.gz

Example read header from an fq file: @A00709:345:HJWTGDSX3:3:1101:1127:1000 1:N:0:ATCGCTTG+NTTCGTAC
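For reference, the flowcell and lane used later as the read group label can be pulled straight out of a read header like the example above. This is a minimal sketch, assuming the colon-separated layout seen in that header (instrument, run, flowcell, lane, ...); `flowcell_lane` is a hypothetical helper name, not part of any tool.

```shell
# Print "<flowcell>.<lane>" for an Illumina-style FASTQ header line.
# Assumption: fields are colon-separated as @instrument:run:flowcell:lane:...
flowcell_lane() {
    # Keep fields 3 and 4 (flowcell, lane), then join them with a dot.
    printf '%s\n' "$1" | cut -d: -f3,4 | tr ':' '.'
}

flowcell_lane '@A00709:345:HJWTGDSX3:3:1101:1127:1000 1:N:0:ATCGCTTG+NTTCGTAC'
# prints HJWTGDSX3.3
```

In practice the header would come from the first line of the fq file, for example with `gzip -dc file.fq.gz | head -n 1`.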

I assigned a read group to each BAM file and, where a sample had more than one BAM file, each file got its own read group. The command I used is shown below:

gatk AddOrReplaceReadGroups -I "$inputbam" -O "$outpath/$inputbam" -RGID "$rg" -RGLB "${sample}_lib" -RGPL illumina -RGPU "${rg}${sample}_lib" -RGSM "$sample"

Here $rg is the read group ID, which (if I understand correctly) should identify the flowcell and lane, so I built it by concatenating the flowcell ID and the lane number, i.e. HJWTGDSX3.3 for the first sample (Sample16). $sample is the sample name (Sample16 in this case). The BAM files belonging to the same sample were then merged into one.
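For comparison, one way to sidestep a shared ID altogether would be to fold the sample name into the RGID. The sketch below is only an illustration of that idea: the `<sample>.<flowcell>.<lane>` scheme and the helper name `make_rgid` are assumptions, not something the tools are known to require.

```shell
# Build a read group ID that stays distinct when two samples share a flowcell lane.
# Assumption: "<sample>.<flowcell>.<lane>" is an illustrative naming scheme only.
make_rgid() {
    # $1 = sample name, $2 = flowcell ID, $3 = lane number
    printf '%s.%s.%s\n' "$1" "$2" "$3"
}

make_rgid Sample16 HJWTGDSX3 3   # prints Sample16.HJWTGDSX3.3
make_rgid Sample17 HJWTGDSX3 3   # prints Sample17.HJWTGDSX3.3
make_rgid Sample17 HL5LKDSX3 3   # prints Sample17.HL5LKDSX3.3
```

The result could be passed as the -RGID value in place of $rg in the command above.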

Could this labeling cause conflicts when calling variants with bcftools or GATK? Does RGID have to be unique across all samples in order to produce a multi-sample VCF file? Note that with this labeling, the reads from Sample16 and some of the reads from Sample17 carry the same RGID (HJWTGDSX3.3). Can variant calling tools still tell these reads apart based on the -RGSM field?
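To make the overlap concrete, the following sketch lists RGIDs that are attached to more than one sample. It assumes tab-separated @RG header lines on standard input, such as those printed by `samtools view -H`; `shared_rgids` is a hypothetical helper name, and the demo input is hand-written from the labels described above.

```shell
# Print every read group ID that appears with more than one SM (sample) value.
# Input: tab-separated @RG header lines on stdin.
shared_rgids() {
    awk -F'\t' '
        /^@RG/ {
            id = ""; sm = ""
            # Pick the ID and SM tags out of the header fields.
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^ID:/) id = substr($i, 4)
                if ($i ~ /^SM:/) sm = substr($i, 4)
            }
            # Count each distinct (ID, sample) pair once.
            if (id != "" && sm != "" && !((id, sm) in seen)) {
                seen[id, sm] = 1
                count[id]++
            }
        }
        END { for (id in count) if (count[id] > 1) print id }
    '
}

# Demo with the three read groups from the question.
printf '@RG\tID:HJWTGDSX3.3\tSM:Sample16\n@RG\tID:HJWTGDSX3.3\tSM:Sample17\n@RG\tID:HL5LKDSX3.3\tSM:Sample17\n' | shared_rgids
# prints HJWTGDSX3.3
```

With real files, the @RG lines of several BAM headers could be concatenated and piped into the function in the same way.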

Thanks in advance.

bam variant-calling snp readgroup • 439 views