add undefined annotations from vcf files
7.2 years ago
bioguy24 ▴ 230

Is there a way to find and add all undefined annotations in a vcf?

For example, with 3 vcf files in a directory, I would like to find all undefined annotations in each vcf, write them to a header file, then add that header file to the original vcf.

I guess something like bcftools reheader, but that seems to only work on individual vcf files.

 # dump the current header so the missing annotation lines can be added to it
 bcftools view -h file1.vcf > file1_header.txt

 # replace the header of the original vcf with the edited one
 bcftools reheader -h file1_header.txt file1.vcf > file1_fixed.vcf

I would then loop through the directory and do the same for file2 and file3. I have been trying to figure this out for a few weeks without success. Is there a better way? Afterwards I would run bgzip and tabix on each fixed vcf for further processing. Thank you :).

bcftools • 1.9k views

Thank you very much :).

7.2 years ago
 # find all annotations defined in file1.vcf

grep -E "^##(INFO|FORMAT|FILTER)=" file1.vcf | sort -u > file1_header.txt


# add those annotations to the other vcf files. This assumes no ID is reused with a different meaning (same header ID, not the same definition)

for F in other*.vcf
do
# find annotations already defined in F
grep -E "^##(INFO|FORMAT|FILTER)=" "${F}" | sort -u > tmp1.txt

# get annotations that are in file1.vcf but missing from F

comm -23 file1_header.txt tmp1.txt > tmp2.txt

# add those missing annotations before the #CHROM line

grep "^##" "${F}" > "${F}.new.vcf"
cat tmp2.txt >> "${F}.new.vcf"
grep -v "^##" "${F}" >> "${F}.new.vcf"

rm tmp1.txt tmp2.txt
done

rm file1_header.txt
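
Note that the loop above copies definitions from file1.vcf. It does not detect keys that are used in the records but declared in no header at all. If that is what "undefined" means here, a minimal sketch along these lines may help. It assumes uncompressed VCFs and only inspects the INFO column. `undeclared_info_keys` and `placeholder_info_lines` are made-up helper names, and the `Number=.`/`Type=String` placeholder is only a guess that should be replaced with the real definitions (flags, for instance, need `Number=0,Type=Flag`).

```shell
# Hypothetical helper: print INFO keys that occur in the records of a
# plain-text VCF ($1) but have no matching ##INFO header line.
undeclared_info_keys() {
    awk -F'\t' '
        # remember every ID declared as ##INFO=<ID=...,...>
        /^##INFO=<ID=/ {
            id = $0
            sub(/^##INFO=<ID=/, "", id)
            sub(/[,>].*/, "", id)
            declared[id] = 1
            next
        }
        /^#/ { next }                     # skip all other header lines
        $8 != "." && $8 != "" {
            n = split($8, fields, ";")
            for (i = 1; i <= n; i++) {
                key = fields[i]
                sub(/=.*/, "", key)       # keep the key, drop "=value"
                if (key != "" && !(key in declared)) missing[key] = 1
            }
        }
        END { for (k in missing) print k }
    ' "$1" | sort
}

# Hypothetical helper: turn keys read from stdin into placeholder header lines.
# Number=. and Type=String are assumptions, not the real definitions.
placeholder_info_lines() {
    while IFS= read -r key; do
        printf '##INFO=<ID=%s,Number=.,Type=String,Description="Placeholder, definition unknown">\n' "$key"
    done
}
```

Something like `undeclared_info_keys file1.vcf | placeholder_info_lines > file1_extra.txt` followed by `bcftools annotate -h file1_extra.txt file1.vcf > file1_fixed.vcf` should then append the lines to the header, but check the result on a small file first.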
Thank you very much for your help :).
