Badly formed genome loc: Contig '20' does not match any contig in the GATK sequence dictionary derived from the reference
8.3 years ago
rubic ▴ 270

Hi,

I'm trying to call variants from human RNA-seq data with GATK.

I first run the SplitNCigarReads step using this command:

java -jar GenomeAnalysisTK.jar -T SplitNCigarReads -R GRCh37.fasta -I <bam_file> -o <splitN_bam_file> -rf ReassignOneMappingQuality -RMQF 255 -RMQT 60 -U ALLOW_N_CIGAR_READS

Then, I'm trying to realign around indels using this command:

java -jar GenomeAnalysisTK.jar -T RealignerTargetCreator -R GRCh37.fasta -I <splitN_bam_file> -L 20 -o <realignment_targets_list_file>

which exits with this error:

##### ERROR MESSAGE: Badly formed genome loc: Contig '20' does not match any contig in the GATK sequence dictionary derived from the reference; are you sure you are using the correct reference fasta file?

In the output of SplitNCigarReads, all chromosomes in the fasta file are reported, and a contig named '20' is clearly not among them. The same is true for the dictionary created from the fasta file with Picard's CreateSequenceDictionary and for the fai file created by samtools faidx, both of which are required for indexing the genome fasta file.

Any idea?

gatk RNA-Seq • 4.9k views
8.2 years ago
fusion.slope ▴ 250

I had the same problem. From the GATK forum: "That's because you didn't adapt the command line that you copied (which includes an argument to limit the run to chromosome 20) to your reference. Our example commands use the b37 reference, which has numbers for contig names, so when they use an interval it looks like -L 20. But with hg19, which has 'chr' prepended to the contig numbers, it should look like -L chr20."

Either remove the -L 20 parameter or replace it with the contig name exactly as it appears in your reference, and it should work.
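If you want to check which naming scheme your reference uses before picking a value for -L, here is a minimal sketch. It assumes the samtools faidx index sits next to the fasta (GRCh37.fasta.fai in your case) and relies only on the fact that the first tab-separated column of a .fai file is the contig name. The function names list_contigs and has_contig are my own, not part of GATK or samtools.

```shell
# Print the contig names recorded in a samtools faidx index.
# $1: path to the .fai file (first tab-separated column is the contig name).
list_contigs() {
    cut -f1 "$1"
}

# Succeed (exit status 0) only if the contig named "$2" appears in the
# .fai index "$1". The match is exact, so "20" does not match "chr20".
has_contig() {
    cut -f1 "$1" | grep -Fxq -- "$2"
}

# Example usage (file name taken from the question):
#   list_contigs GRCh37.fasta.fai | head
#   has_contig GRCh37.fasta.fai 20 || echo "no contig named 20 in this reference"
```

Whatever name list_contigs prints for chromosome 20 is the one to pass to -L.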
