RealignerTargetCreator (GATK) contig issue
6.3 years ago
win ▴ 970

I downloaded a BAM file from the 1000 Genomes phase 3 data and I want to run RealignerTargetCreator on it using GATK 3.8.

The following command works fine:

sudo java -jar algorithms/gatk/gatk3.8.jar -T RealignerTargetCreator -R references/hg38.fasta -I data/HG100.reordered.bam -o data/HG100.realignertargetcreator.intervals

But when this command is run with -known, I get a contig mismatch error between my BAM and the VCF file. The known indels VCF I used is located here:

https://storage.googleapis.com/genomics-public-data/resources/broad/hg38/v0/Homo_sapiens_assembly38.known_indels.vcf.gz

In the preceding step I applied Picard ReorderSam using hg38.

Am I using the incorrect indels file?

Thanks in advance.

GATK • 2.5k views
6.3 years ago

Check that the names, the lengths and the order of the contigs in

$ wget -q -O - "https://storage.googleapis.com/genomics-public-data/resources/broad/hg38/v0/Homo_sapiens_assembly38.known_indels.vcf.gz" | gunzip -c | grep '##contig'

are the same as in

$ samtools view -H data/HG100.reordered.bam | grep '^@SQ'
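The two listings are long, so comparing them by eye is error-prone. A minimal sketch of one way to reduce both headers to "name, length" pairs and diff them is given here; the function names `vcf_contigs` and `bam_contigs` and the output file names are made up for illustration, and the local VCF file name in the usage comments assumes the file was downloaded first.

```shell
# Print "name<TAB>length" for every ##contig line of a VCF read from stdin.
# Stops at the #CHROM line so the variant records are never scanned.
vcf_contigs() {
  awk '
    /^#CHROM/ { exit }
    /^##contig=</ {
      id = ""; len = ""
      if (match($0, /ID=[^,>]+/))     id  = substr($0, RSTART + 3, RLENGTH - 3)
      if (match($0, /length=[0-9]+/)) len = substr($0, RSTART + 7, RLENGTH - 7)
      print id "\t" len
    }'
}

# Print "name<TAB>length" for every @SQ line of a SAM/BAM header read from stdin.
bam_contigs() {
  awk -F'\t' '
    /^@SQ/ {
      name = ""; len = ""
      for (i = 2; i <= NF; i++) {
        if ($i ~ /^SN:/) name = substr($i, 4)
        if ($i ~ /^LN:/) len  = substr($i, 4)
      }
      print name "\t" len
    }'
}

# Usage (file names are assumptions):
#   gunzip -c Homo_sapiens_assembly38.known_indels.vcf.gz | vcf_contigs > vcf.contigs.tsv
#   samtools view -H data/HG100.reordered.bam | bam_contigs > bam.contigs.tsv
#   diff vcf.contigs.tsv bam.contigs.tsv
```

Because both lists keep the header order, an empty diff means the names, lengths and order all agree; any output from diff points at the contigs that differ.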

Thanks. In the known_indels.vcf.gz I can see a lot of HLA-A, HLA-B, HLA-C contigs, which are not in the reordered BAM. Can this be an issue?


I can see a lot of HLA-A, HLA-B, HLA-C contigs, which are not in the reordered BAM. Can this be an issue?

Yes.


...and how do I take care of this issue?


Change the sequence dictionary in the BAM (insert the missing lines using samtools reheader), or remove the extra ##contig lines and the variants on those contigs from the VCF.
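For the second option (trimming the VCF), a rough sketch with plain awk is given here. It keeps only the ##contig lines and variant records whose contig name appears in a list file. The function name `filter_vcf_to_contigs` and the file names in the usage comments are hypothetical; the list file is assumed to have one contig name per line in its first tab-separated column. The sketch does not reorder anything, so it only helps when the remaining contigs are already in the same order in both files.

```shell
# Usage: filter_vcf_to_contigs keep.tsv < in.vcf > out.vcf
# keep.tsv (hypothetical): contig names to keep, first tab-separated column.
filter_vcf_to_contigs() {
  awk -F'\t' -v list="$1" '
    BEGIN {
      # Load the allowed contig names before reading the VCF.
      while ((getline line < list) > 0) {
        split(line, f, "\t")
        keep[f[1]] = 1
      }
      close(list)
    }
    /^##contig=</ {
      # Header contig line: print only if its ID is in the list.
      if (match($0, /ID=[^,>]+/)) {
        id = substr($0, RSTART + 3, RLENGTH - 3)
        if (id in keep) print
      }
      next
    }
    /^#/ { print; next }   # every other header line passes through unchanged
    $1 in keep             # variant record: keep if CHROM is in the list
  '
}

# Usage (file names are assumptions):
#   samtools view -H data/HG100.reordered.bam \
#     | awk -F'\t' '/^@SQ/ { for (i = 2; i <= NF; i++) if ($i ~ /^SN:/) print substr($i, 4) }' > keep.tsv
#   gunzip -c Homo_sapiens_assembly38.known_indels.vcf.gz \
#     | filter_vcf_to_contigs keep.tsv > known_indels.subset.vcf
```

It is worth checking the contig lists of the filtered VCF and the BAM against each other again before passing the result to -known.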


OK, will try. I have never done any of this before.


Consider @Devon's instructions as you try samtools reheader: A: Problems while reheadering BAM with Samtools 1.3.1
