How can I filter out variants with R2!="." in the INFO column?
bcftools view -i 'INFO/R2!="."' -Oz $FILE -o $OUTFILTERED
The above generated a vcf.gz file with only the header.
I have used the same command line to filter a VCF file based on the R2 score and it worked fine. I'm just wondering what is missing here, since I don't get any output after filtering.
Here's more detailed information if you are interested. I filtered on the R2 score after imputation using bcftools, then used vcftools to remove specific individuals. However, vcftools split multiallelic sites, so there are rows without an R2 score. The downstream tool I am using is DosageConvertor, which requires the R2 information.
bcftools view -e '(R2<0 || R2>=0)' chr22_rsq_filtered.vcf_remove12.gz.recode.vcf.gz | head -19
##fileformat=VCFv4.1
##FILTER=<ID=PASS,Description="All filters passed">
##filedate=2018.11.9
##source=Minimac3
##contig=<ID=22>
##FILTER=<ID=GENOTYPED,Description="Marker was genotyped AND imputed">
##FILTER=<ID=GENOTYPED_ONLY,Description="Marker was genotyped but NOT imputed">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##FORMAT=<ID=DS,Number=1,Type=Float,Description="Estimated Alternate Allele Dosage : [P(0/1)+2*P(1/1)]">
##FORMAT=<ID=GP,Number=3,Type=Float,Description="Estimated Posterior Probabilities for Genotypes 0/0, 0/1 and 1/1">
##INFO=<ID=AF,Number=1,Type=Float,Description="Estimated Alternate Allele Frequency">
##INFO=<ID=MAF,Number=1,Type=Float,Description="Estimated Minor Allele Frequency">
##INFO=<ID=R2,Number=1,Type=Float,Description="Estimated Imputation Accuracy">
##INFO=<ID=ER2,Number=1,Type=Float,Description="Empirical (Leave-One-Out) R-square (available only for genotyped variants)">
##bcftools_viewVersion=1.9+htslib-1.9
##bcftools_viewCommand=view -i 'INFO/R2 > 0.4' -Oz -o /home/chr22_rsq_filtered.vcf.gz /home/chr22.dose.vcf.gz; Date=Tue Nov 13 11:51:58 2018
##bcftools_viewCommand=view -e '(R2<0 || R2>=0)' chr22_rsq_filtered.vcf_remove12.gz.recode.vcf.gz; Date=Wed Nov 28 13:29:28 2018
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT 11784 9448 10003420 15472 10004782 10008417 10013040 19706 10010631 10011283 10012586 21206 21331 10903 7245 26461 19614 29232 10009360 9769 10010684 11365 13165 15117 2220 10006632 5607 10013521 22310 21492 10012059 10013319 17865 32454 15736 36987 17374 18284 27381 28469 34892 38216 12431 15518 10013520 10005321 11712 10009688 10008150 10002649 5636 4016 2319 15598 12784 10008181 33183 17191 16638 10004678 9663 24877 40681 33313 10000724 12484 9694 977 9752 8875 11258 10010739 10005063 10009709 22920 11089 10000267 11114 27787 18302 34238 18528 18252 13698 22380 35748 35622 35901 10010241 10001457 10010552 10008477 10010189 14321 10011307 10008227 14149 13863 15355 15028 10003971 10001660 20541 27937 23084 21668 10006375 10013305 10007238 10012905 10012018 3561 3579 12362 100013910007005 10011320 10007334 10007731 10008955 10000728 10003319 10003342 10008459 10008003 10006200 10012133 10012276 10000531 10005851 10006611 10012678 22569 17213 24159 10003789 10001994 10007567 803 10000908 10011711 10001612 10001655 10001895 10009034 10009661 10007125 10001236 10001147 10011709 10010519 10010534 10010939 5623 10010473 10002558 10001012 10008536 10007207 100066310003886 10004177 10004145 15131 509 10003452 5648 6340 11710 10868 26833 27020 14488 2372 10803 16171 10000051 10013524 10009852 10012710 10009106 10009604 12280 11420 490 10000542 10009464 10006058 10012203 10011005 26821 10003499 10003867 10002130 10005764 10007018 10005870 10004760 10011645 10011492 10010801 16886 32744 10001309 6335 10003562 10006612 10005023 10007903 10000929 10008048 10608 10012425 10012597 10005211 10005601 10005216 23199 12970 39819 36561 10011522 38108 8102 5223 9469 36338 26681 23585 10000337 10007488 100083010006039 32018 10007070 10011635 31753 10009658 10002325 10008960 10002904 27978 28062 16717 20316 21109 20926 10006048 10004629 5580 13617 33114 10010797 10003512 10000426 10009060 10007611 10002006 10007869 11601 10010407 10010504 
10009190 10004334 10004872 14675 14121 12378 20865 20620 36557 10010674 10001912 27209 13447 24725 12201 10003216 10000244 10003199 10002793 10004249 10000013 10008067 31881 10012837 10003500 10010309 13741 10006974 10009990 39732 16789 16431 10010230 10004136 10007568 10010859 10012720 10011862 10003148 10003214 10003187 11476 7874 8521 10012881 8943 21359 7446 32606 21484 10003729 21241 19659 18649 18052 16976 339 22111 5021 22456 100087110009133 10012368 10012435 10003227 15108 10012818 10013497 13708 26469 10533 36149 10004212 10006972 21997 12807 28243 19275 7619 10001475 10007538 10002767 10006290 10007388 10011253 10010756 10000274 10006647 10005810 10006185 4992 2175 11404 11786 14915 13263 16445 16347 16813 10011693 13347 22291 18895 20495 20697 20842 10010541 25576 15932 11991 12063 12707 10003189 859 829 8292 10009906 10001547 10008374 10008437 10008986 10004717 10005306 10006195 10011881 10011760 10010392
22 16578328 22:16578328 A G . PASS . GT:DS:GP 0|0:0.033:0.968,0.032,0 1|0:0.812:0.201,0.786,0.013 0|0:0.033:0.968,0.032,0 0|0:0.032:0.968,0.031,0 0|0:0.032:0.968,0.032,0 0|0:0.033:0.968,0.032,0 0|0:0.033:0.967,0.032,0 0|0:0.032:0.968,0.031,0 0|0:0.033:0.968,0.032,0 0|0:0.032:0.968,0.031,0 0|0:0.033:0.968,0.032,0 0|0:0.032:0.968,0.032,0 0|0:0.032:0.968,0.032,0 0|0:0.033:0.968,0.032,0 0|0:0.033:0.968,0.032,0 0|0:0.032:0.968,0.032,0 0|0:0.033:0.968,0.032,0 0|0:0.032:0.968,0.032,0 0|0:0.032:0.968,0.031,0 0|0:0.032:0.968,0.032,0 0|0:0.032:0.968,0.032,0 0|0:0.032:0.968,0.032,0 0|0:0.033:0.968,0.032,0 0|0:0.032:0.968,0.032,0 0|0:0.032:0.968,0.031,0 0|0:0.033:0.968,0.032,0 0|0:0.032:0.968,0.032,0 0|0:0.033:0.968,0.032,0 0|0:0.032:0.968,0.032,0 0|0:0.033:0.968,0.032,0 0|0:0.032:0.968,0.032,0 0|0:0.032:0.968,0.032,0 0|0:0.033:0.968,0.032,0 0|0:0.032:0.968,0.031,0 0|0:0.032:0.968,0.031,0 0|0:0.033:0.968,0.032,0 0|0:0.032:0.968,0.031,0 0|0:0.033:0.967,0.032,0 0|0:0.032:0.968,0.031,0 0|0:0.033:0.967,0.032,0 0|0:0.032:0.968,0.031,0 0|0:0.032:0.968,0.031,0 0|0:0.033:0.968,0.032,0 0|0:0.031:0.969,0.031,0 0|0:0.031:0.969,0.031,0 0|0:0.033:0.968,0.032,0 0|0:0.032:0.968,0.031,0 0|0:0.033:0.968,0.032,0 0|0:0.033:0.968,0.032,0 0|0:0.033:0.968,0.032,0 0|0:0.032:0.968,0.032,0 0|0:0.033:0.968,0.032,0 0|0:0.033:0.967,0.032,0 0|0:0.032:0.968,0.032,0 0|0:0.033:0.968,0.032,0 0|0:0.032:0.968,0.031,0 0|0:0.032:0.968,0.031,0 0|0:0.032:0.968,0.032,0 0|0:0.032:0.968,0.032,0 0|0:0.033:0.968,0.032,0 0|0:0.032:0.968,0.031,0 0|0:0.033:0.968,0.032,0 0|0:0.033:0.968,0.032,0 0|0:0.032:0.968,0.032,0 0|0:0.032:0.968,0.031,0 0|0:0.033:0.968,0.032,0 0|0:0.032:0.968,0.032,0 0|0:0.032:0.968,0.031,0 0|0:0.032:0.968,0.031,0 0|0:0.032:0.969,0.031,0 0|0:0.032:0.968,0.032,0 0|0:0.032:0.968,0.031,0 0|0:0.033:0.968,0.032,0 0|0:0.032:0.968,0.032,0 0|0:0.033:0.968,0.032,0 0|0:0.033:0.968,0.032,0 0|0:0.033:0.968,0.032,0 0|0:0.032:0.968,0.031,0 0|0:0.033:0.968,0.032,0 0|0:0.032:0.968,0.032,0 0|0:0.033:0.968,0.032,0 
0|0:0.031:0.969,0.031,0 0|1:0.811:0.201,0.786,0.012 0|0:0.033:0.968,0.032,0 0|0:0.031:0.969,0.031,0 1|0:0.808:0.205,0.782,0.013 0|0:0.032:0.968,0.032,0 0|0:0.033:0.968,0.032,0 0|1:0.808:0.205,0.783,0.012 0|0:0.032:0.968,0.031,0 0|0:0.032:0.968,0.032,0 0|0:0.032:0.968,0.032,0 0|0:0.033:0.968,0.032,0 0|0:0.032:0.968,0.032,0 0|0:0.032:0.968,0.031,0 0|0:0.032:0.968,0.031,0 0|0:0.033:0.968,0.032,0 0|0:0.032:0.969,0.031,0 0|0:0.032:0.968,0.032,0 0|0:0.032:0.968,0.031,0 0|0:0.033:0.968,0.032,0 0|0:0.032:0.968,0.031,0 0|0:0.032:0.968,0.031,0 0|0:0.032:0.968,0.031,0 0|0:0.033:0.968,0.032,0 0|0:0.032:0.968,0.032,0 0|0:0.032:0.968,0.031,0 0|0:0.032:0.969,0.031,0 0|0:0.032:0.968,0.031,0 0|0:0.033:0.967,0.032,0 0|0:0.032:0.968,0.032,0 0|0:0.033:0.968,0.032,0 0|0:0.033:0.968,0.032,0 0|0:0.033:0.968,0.032,0 0|0:0.032:0.968,0.032,0 0|0:0.032:0.968,0.031,0 0|0:0.032:0.968,0.032,0 0|0:0.032:0.968,0.031,0 0|0:0.033:0.967,0.032,0 0|0:0.033:0.968,0.032,0 0|0:0.033:0.968,0.032,0 0|0:0.033:0.967,0.032,0 0|0:0.033:0.968,0.032,0 0|0:0.033:0.968,0.032,0 0|0:0.033:0.968,0.032,0 0|1:0.809:0.204,0.783,0.013 0|0:0.033:0.968,0.032,0 0|0:0.032:0.968,0.032,0 0|0:0.032:0.968,0.031,0 0|0:0.033:0.968,0.032,0 0|0:0.033:0.968,0.032,0 0|0:0.033:0.968,0.032,0 0|0:0.033:0.968,0.032,0 0|0:0.032:0.968,0.031,0 0|0:0.033:0.968,0.032,0 0|0:0.033:0.968,0.032,0 0|0:0.032:0.968,0.032,0 0|0:0.032:0.968,0.032,0 0|0:0.032:0.968,0.031,0 0|0:0.033:0.968,0.032,0 0|0:0.033:0.968,0.032,0 0|1:0.808:0.205,0.783,0.013 1|1:1.583:0.043,0.33,0.627 0|0:0.032:0.968,0.031,0 0|0:0.033:0.968,0.032,0 0|0:0.033:0.968,0.032,0 0|0:0.032:0.968,0.031,0 0|0:0.033:0.968,0.032,0 0|0:0.031:0.969,0.031,0 0|1:0.808:0.204,0.783,0.012 0|0:0.033:0.968,0.032,0 0|0:0.033:0.968,0.032,0 0|0:0.033:0.968,0.032,0 0|0:0.032:0.968,0.031,0 0|0:0.033:0.968,0.032,0 0|0:0.032:0.968,0.031,0 0|0:0.033:0.968,0.032,0 0|0:0.032:0.968,0.031,0 0|0:0.033:0.968,0.032,0 0|0:0.033:0.967,0.032,0 1|0:0.808:0.204,0.783,0.012 0|0:0.032:0.968,0.031,0 0|0:0.033:0.968,0.032,0 
0|0:0.032:0.968,0.031,00|0:0.032:0.968,0.032,0 0|0:0.033:0.968,0.032,0 0|0:0.033:0.968,0.032,0 0|0:0.032:0.968,0.031,0 0|0:0.032:0.968,0.032,0 0|0:0.032:0.968,0.031,0 0|0:0.032:0.968,0.032,0 0|0:0.033:0.968,0.032,0 0|0:0.032:0.968,0.032,00|0:0.033:0.968,0.032,0 0|0:0.032:0.968,0.032,0 0|0:0.032:0.968,0.031,0 0|0:0.033:0.968,0.032,0 0|0:0.033:0.968,0.032,0 0|0:0.031:0.969,0.031,0 0|1:0.808:0.204,0.783,0.013 0|0:0.032:0.968,0.032,0 0|0:0.032:0.968,0.032,0 0|0:0.032:0.968,0.031,0 0|0:0.032:0.968,0.031,0 0|0:0.033:0.968,0.032,0 0|0:0.031:0.969,0.031,0 0|0:0.033:0.968,0.032,0 0|0:0.032:0.968,0.031,0 0|0:0.032:0.968,0.032,0 0|0:0.031:0.969,0.031,0 0|0:0.032:0.968,0.032,0 0|0:0.032:0.968,0.031,0 0|0:0.033:0.968,0.032,0 0|0:0.032:0.968,0.031,0 0|0:0.032:0.968,0.031,0 0|0:0.032:0.968,0.032,0 0|0:0.033:0.968,0.032,0 0|0:0.033:0.968,0.032,0 0|0:0.032:0.968,0.031,0 0|0:0.032:0.969,0.031,0 0|0:0.031:0.969,0.031,0 0|0:0.032:0.968,0.032,0 0|0:0.032:0.968,0.031,0 0|0:0.033:0.968,0.032,0 0|0:0.032:0.968,0.031,0 0|0:0.032:0.968,0.031,0 0|0:0.031:0.969,0.031,0 0|0:0.032:0.968,0.031,0 0|0:0.033:0.968,0.032,0 0|0:0.032:0.968,0.032,0 0|0:0.033:0.968,0.032,0 0|0:0.033:0.968,0.032,0 0|0:0.032:0.969,0.031,0 0|0:0.032:0.968,0.031,0 0|0:0.033:0.968,0.032,0 0|0:0.033:0.968,0.032,0 0|0:0.032:0.969,0.031,0 0|0:0.031:0.969,0.031,0 0|0:0.033:0.968,0.032,0 0|1:0.809:0.204,0.783,0.013 0|0:0.033:0.968,0.032,0 0|0:0.032:0.968,0.032,0 0|0:0.032:0.968,0.032,0 0|0:0.032:0.968,0.031,0 1|0:0.81:0.203,0.784,0.013 0|0:0.033:0.968,0.032,00|0:0.033:0.968,0.032,0 0|0:0.031:0.969,0.031,0 0|0:0.031:0.969,0.031,0 0|0:0.033:0.968,0.032,0 0|0:0.032:0.968,0.031,0 0|0:0.032:0.968,0.032,0 0|0:0.033:0.968,0.032,0 0|0:0.033:0.968,0.032,0 0|0:0.033:0.968,0.032,00|0:0.031:0.969,0.031,0 0|0:0.031:0.969,0.031,0 0|0:0.032:0.968,0.032,0 0|0:0.033:0.968,0.032,0 0|0:0.032:0.968,0.032,0 0|0:0.032:0.968,0.031,0 0|0:0.032:0.968,0.031,0 0|0:0.033:0.968,0.032,0 0|0:0.031:0.969,0.031,00|0:0.032:0.968,0.032,0 0|0:0.032:0.968,0.031,0 
0|0:0.032:0.968,0.032,0 0|0:0.033:0.968,0.032,0 0|0:0.032:0.968,0.031,0 0|0:0.032:0.968,0.032,0 0|0:0.032:0.968,0.031,0 0|0:0.032:0.968,0.031,0 0|0:0.032:0.968,0.032,00|0:0.032:0.968,0.031,0 0|0:0.032:0.968,0.032,0 0|0:0.032:0.968,0.031,0 0|0:0.032:0.968,0.032,0 0|0:0.032:0.968,0.032,0 0|0:0.033:0.968,0.032,0 0|0:0.032:0.968,0.032,0 0|0:0.032:0.968,0.032,0 0|0:0.032:0.968,0.031,00|0:0.033:0.968,0.032,0 0|0:0.032:0.969,0.031,0 0|0:0.032:0.968,0.031,0 0|0:0.033:0.968,0.032,0 0|0:0.033:0.968,0.032,0 0|0:0.033:0.968,0.032,0 0|0:0.033:0.968,0.032,0 0|0:0.033:0.968,0.032,0 0|0:0.032:0.968,0.032,00|0:0.032:0.968,0.031,0 0|0:0.033:0.967,0.032,0 0|0:0.033:0.968,0.032,0 0|0:0.032:0.968,0.032,0 0|0:0.033:0.968,0.032,0 0|0:0.033:0.968,0.032,0 0|0:0.032:0.969,0.031,0 0|0:0.032:0.968,0.031,0 0|0:0.032:0.968,0.032,00|0:0.032:0.968,0.032,0 0|0:0.032:0.968,0.032,0 0|0:0.032:0.968,0.031,0 0|1:0.812:0.201,0.786,0.013 0|0:0.033:0.968,0.032,0 0|0:0.032:0.968,0.031,0 1|0:0.814:0.199,0.788,0.013 0|0:0.032:0.968,0.031,0 0|0:0.033:0.968,0.032,0 0|0:0.032:0.968,0.031,0 0|0:0.033:0.968,0.032,0 0|0:0.032:0.968,0.031,0 0|0:0.032:0.968,0.031,0 0|0:0.031:0.969,0.031,0 0|0:0.032:0.968,0.032,0 0|0:0.033:0.968,0.032,0 0|0:0.032:0.968,0.032,0 0|0:0.033:0.968,0.032,0 0|1:0.809:0.204,0.783,0.013 0|0:0.032:0.968,0.031,0 0|0:0.032:0.968,0.031,0 0|0:0.032:0.968,0.032,0 0|0:0.033:0.968,0.032,0 0|0:0.033:0.967,0.032,0 0|0:0.033:0.968,0.032,0 0|0:0.033:0.968,0.032,00|0:0.033:0.967,0.032,0 0|0:0.032:0.968,0.031,0 0|0:0.033:0.968,0.032,0 0|0:0.033:0.967,0.032,0 0|0:0.033:0.968,0.032,0 0|0:0.031:0.969,0.031,0 0|0:0.032:0.968,0.031,0 0|0:0.033:0.968,0.032,0 0|0:0.032:0.968,0.032,00|0:0.033:0.968,0.032,0 0|0:0.032:0.968,0.032,0 0|0:0.032:0.968,0.032,0 0|0:0.033:0.968,0.032,0 0|0:0.033:0.968,0.032,0 0|0:0.032:0.968,0.032,0 0|0:0.032:0.968,0.031,0 0|0:0.032:0.968,0.031,0 0|0:0.032:0.968,0.031,00|0:0.032:0.968,0.032,0 0|0:0.033:0.968,0.032,0 0|0:0.032:0.968,0.032,0 0|0:0.033:0.968,0.032,0 0|0:0.033:0.968,0.032,0 
0|0:0.033:0.968,0.032,0 0|0:0.032:0.968,0.032,0 0|0:0.032:0.968,0.032,0 0|0:0.031:0.969,0.031,00|0:0.033:0.968,0.032,0 0|0:0.033:0.968,0.032,0 0|0:0.032:0.968,0.031,0 0|0:0.032:0.968,0.031,0 0|0:0.032:0.968,0.031,0 0|0:0.032:0.969,0.031,0 0|0:0.032:0.968,0.031,0 0|0:0.031:0.969,0.031,0 0|0:0.032:0.968,0.032,00|0:0.032:0.968,0.032,0 0|0:0.032:0.968,0.032,0 0|0:0.033:0.968,0.032,0 0|0:0.033:0.968,0.032,0 0|0:0.032:0.968,0.031,0 0|0:0.032:0.968,0.031,0 0|0:0.033:0.968,0.032,0 0|0:0.032:0.968,0.031,0 0|0:0.032:0.968,0.031,00|0:0.032:0.968,0.031,0 0|0:0.032:0.968,0.031,0 0|0:0.032:0.969,0.031,0 0|0:0.032:0.968,0.032,0 0|0:0.032:0.968,0.031,0 0|0:0.033:0.968,0.032,0 0|0:0.032:0.968,0.031,0 0|0:0.033:0.967,0.032,0 0|0:0.033:0.968,0.032,00|0:0.032:0.969,0.031,0 0|0:0.032:0.968,0.031,0 0|0:0.032:0.968,0.032,0 0|0:0.031:0.969,0.031,0 0|0:0.032:0.968,0.032,0 0|0:0.033:0.968,0.032,0 0|0:0.032:0.968,0.032,0 0|0:0.032:0.968,0.031,0 1|1:1.587:0.043,0.328,0.63 0|0:0.032:0.968,0.031,0 0|0:0.032:0.968,0.031,0 0|0:0.032:0.968,0.032,0 0|0:0.032:0.968,0.031,0 0|0:0.032:0.968,0.031,0 0|0:0.032:0.968,0.031,0 0|0:0.031:0.969,0.031,0 0|0:0.033:0.968,0.032,0 0|0:0.033:0.968,0.032,0 0|0:0.033:0.968,0.032,0 0|0:0.033:0.968,0.032,0 0|0:0.032:0.968,0.031,0 0|0:0.033:0.968,0.032,0 0|0:0.032:0.968,0.032,0 0|0:0.033:0.968,0.032,0 0|0:0.031:0.969,0.031,0 0|0:0.032:0.968,0.031,0 0|0:0.032:0.968,0.031,0 0|0:0.032:0.969,0.031,0 0|0:0.033:0.968,0.032,0
Hello Molly_K,
could you please post a small example of your vcf file (including the header) with variants that should be kept and those that should be removed?
fin swimmer
Thanks for adding the example. I don't know if this is a copy & paste problem when posting to Biostars, but whatever bcftools command I try, I get the error message
[E::vcf_parse_format] Incorrect number of FORMAT fields at 22:16578328
Did you receive any messages like this? This would explain why you just get the header.
I've tried to find that wrong number of FORMAT fields with this:
The first column shows you the column number in the vcf file with the wrong number of fields, the second column shows you what the wrong column looks like.
fin swimmer
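The command used for that check is not shown in the thread. As a rough sketch of the same idea (not the original command), the following assumes an uncompressed, tab-delimited VCF and uses plain awk; the function name check_format_fields is made up for illustration. It prints the column number and the content of every sample column whose number of ':'-separated sub-fields differs from the FORMAT column.

```shell
# Hypothetical helper, not the command from the thread.
# For each data record, compare every sample column (column 10 onwards)
# against the FORMAT column (column 9) and report mismatches as:
#   <column number> <TAB> <column content>
check_format_fields() {
    awk -F'\t' '
        /^#/ { next }                          # skip header lines
        {
            nfmt = split($9, fmt, ":")         # sub-fields declared in FORMAT
            for (i = 10; i <= NF; i++) {
                if (split($i, parts, ":") != nfmt) {
                    print i "\t" $i            # column number and its content
                }
            }
        }
    ' "$1"
}

# Usage (file name is an assumption):
#   check_format_fields test2.vcf
```

Note that this flags any mismatch, which is stricter than the VCF specification: the specification allows trailing sub-fields to be dropped, so a column with fewer sub-fields than FORMAT is not necessarily an error. Two genotypes fused by a missing tab, as in the pasted example, show up as a column with more sub-fields than FORMAT declares.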
I apologize for the weird behaviour of the file. I think after copy paste from the terminal there are some tabs missing.
Here's a link to the first 100 rows of the vcf file that I extracted
https://www.dropbox.com/s/l4ic8bb1mwaf2qo/test2.vcf?dl=0
Hello again,
in this example file there are no variants that have an R2 value in the INFO column. So after filtering with bcftools view -i 'INFO/R2!="."' you will of course get just the header, as no variant fulfills the criteria.
fin swimmer
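One way to confirm this before filtering is to count how many records carry an R2 key in INFO at all. The sketch below is only an illustration under the same assumptions as before (uncompressed, tab-delimited VCF, plain awk); the function name count_r2 is hypothetical.

```shell
# Hypothetical helper: count data records with and without an R2 key
# in the INFO column (column 8).
count_r2() {
    awk -F'\t' '
        /^#/ { next }                          # skip header lines
        # Prefix INFO with ";" so that R2 is matched only as a whole key;
        # this keeps ER2=... from being counted as R2.
        index(";" $8, ";R2=") > 0 { has++; next }
        { missing++ }
        END { printf "has_R2=%d missing_R2=%d\n", has, missing }
    ' "$1"
}

# Usage (file name is an assumption):
#   count_r2 test2.vcf
```

If has_R2 is 0, as it would be when every record has "." in INFO like the one pasted above, a filter that keeps only records with an R2 value can only return the header.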
can you try
Hi Pierre, here's the output:
[filter.c:2278 filters_init1] Error: the tag "GQ_MEAN" is not defined in the VCF header
I meant 'R2', fixed
Updated the original post to fit in the first 19 rows. It's the header + the first snp. If I don't limit the row number, it would keep running till the end of the file. Thanks.
Hi Pierre, I used the following command line and noticed a drop in the number of rows