bcftools fixref error
3.2 years ago

Hello,

I am trying to run the bcftools fixref plugin to correct snpflip errors for my TOPMed imputation. I found this code here: https://samtools.github.io/bcftools/howtos/plugin.fixref.html

for i in {1..22}
do
    bcftools norm --check-ref e -f "$OUTDIR/DAC14_send_to_topmed/Homo_sapiens_assembly38_withchr.fasta" "$OUTDIR/DAC14_send_to_topmed/DAC14_chr${i}_hg38_nonduplicates.vcf.gz" -Ou -o /dev/null
done

As my build is hg38 and I need to keep the chr prefix in my reference file, I decided to use the GATK hg38 build, Homo_sapiens_assembly38.fasta, found here: https://console.cloud.google.com/storage/browser/genomics-public-data/resources/broad/hg38/v0;tab=objects?prefix=&forceOnObjectsSortingFiltering=false

I keep receiving this error message:

Failed to load the fai index: /sc/arion/projects/psychgen2/MAP2_dac/data/imputation/DAC14_send_to_topmed/Homo_sapiens_assembly38_withchr.fasta
[E::fai_build_core] Format error, unexpected "<" at line 2

I cannot seem to find the solution to this error.

snpflip bcftools topmed fixref imputation • 1.3k views
3.2 years ago

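The message is quite clear to me: your FASTA file is NOT a FASTA file. A FASTA file starts with a ">" header line, so an unexpected "<" suggests the file contains something else, such as HTML or XML.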

Format error, unexpected "<" at line 2

Show us the output of

head -n 3 /sc/arion/projects/psychgen2/MAP2_dac/data/imputation/DAC14_send_to_topmed/Homo_sapiens_assembly38_withchr.fasta 
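The same check can be scripted so it runs before any bcftools call. Below is a minimal sketch, assuming that testing only the first character is enough to catch a mis-download like this one. The function name `looks_like_fasta` is hypothetical, and the check does not validate the rest of the file or handle compressed references.

```shell
# Hypothetical helper: succeed only if the file's first character is '>',
# which is how every FASTA header line begins.
# Returns 0 if it looks like FASTA, 1 if not, 2 if the file is unreadable.
looks_like_fasta() {
    file="$1"
    [ -r "$file" ] || return 2
    # Read just the first byte; an HTML page would start with '<'.
    first_char=$(head -c 1 "$file")
    [ "$first_char" = ">" ]
}
```

For example, `looks_like_fasta "$ref" || echo "$ref does not look like FASTA" >&2` could be placed ahead of the loop, with `ref` set to the reference path.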

Interesting... perhaps I downloaded it incorrectly?

 head -n 3 /sc/arion/projects/psychgen2/MAP2_dac/data/imputation/DAC14_send_to_topmed/Homo_sapiens_assembly38_withchr.fasta

<!DOCTYPE html>
<html lang="en">

I used this to download it. Perhaps this doesn't work from cloud storage?

wget https://storage.cloud.google.com/genomics-public-data/resources/broad/hg38/v0/Homo_sapiens_assembly38.fasta?_ga=2.192584233.-289538108.1613068749

Also, I used this because I simply need an hg38 reference genome with the chr prefix to run the fixref plugin. Supposedly this is the main one available from GATK? It was the only FASTA file in this bucket; the others were vcf.gz files, which wouldn't work for this fixref script.


You downloaded the web page...

From cloud.google.com, I think you need to download it from the browser.


Yes, you're right.

So that any other rookies don't make this mistake, use:

wget https://storage.googleapis.com/genomics-public-data/resources/broad/hg38/v0/Homo_sapiens_assembly38.fasta

^ This one worked for some reason...
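A likely explanation, as far as I understand it, is that storage.cloud.google.com is the browser endpoint and expects a signed-in Google session, so wget receives an HTML page, while storage.googleapis.com serves publicly readable objects directly. The sketch below rewrites the first kind of URL into the second and drops the tracking query string. The function name `to_public_gcs_url` is hypothetical, and the host mapping is an assumption based on the two URLs in this thread. It would only help for objects that are publicly readable.

```shell
# Hypothetical helper: turn a storage.cloud.google.com browser URL into the
# storage.googleapis.com form that worked above, dropping any "?..." query.
to_public_gcs_url() {
    # Strip everything from the first '?' onward (e.g. the _ga tracking parameter).
    url="${1%%\?*}"
    # Swap the host; any other URL passes through unchanged.
    printf '%s\n' "$url" |
        sed 's#^https://storage\.cloud\.google\.com/#https://storage.googleapis.com/#'
}
```

It could then be used as `wget "$(to_public_gcs_url "$url")"`, with `url` holding the link copied from the browser. Quoting the URL also keeps the shell from interpreting characters such as `?` and `&`.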
