Question: Annotate genomic positions with dbSNP rsIDs

I have already found some ways to annotate genomic positions with rsIDs, e.g. using the UCSC Table Browser, but I'm not happy with them: I want an all-in-one Linux script that also takes strand issues into account (flipped alleles, e.g. A/T vs. T/A, or switched reference alleles).

What I have:

chr position ref alt
10  169560   G   T
10  171117   G   A
10  171126   G   A
10  172995   A   C
10  178499   C   T

What I want:

chr position ref alt rsID
10  169560   G   T   rsXXX
10  171117   G   A   rsXXX, rsXXX
10  171126   G   A   rsXXX
10  172995   A   C   rsXXX
10  178499   C   T   rsXXX

Thanks


I will write down my solution as an answer for documentation purposes. I started as Pierre recommended, but then I used bcftools instead of GATK.

First, I created a header file (header.txt) for the custom VCF file:

##fileformat=VCFv4.0
##fileDate=09052019
##source=allchr_allvsall_sex_adjusted
##reference=GRCh37.p13
##phasing=partial
##INFO=<ID=AA,Number=1,Type=String,Description="Ancestral Allele">
#CHROM  POS ID  REF ALT QUAL    FILTER  INFO
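
The #CHROM line has to be tab-separated, and tabs are easily lost when copying from a web page. As a minimal sketch, the same header can be written with printf so the tabs are explicit. The file name header.txt is the one used in the later cat step; the reference line is written here with a single "=", following the ##key=value form of VCF meta lines.

```shell
# Write the VCF header with explicit tab characters in the #CHROM line.
# header.txt is the file name used in the later `cat` step.
{
  printf '##fileformat=VCFv4.0\n'
  printf '##fileDate=09052019\n'
  printf '##source=allchr_allvsall_sex_adjusted\n'
  printf '##reference=GRCh37.p13\n'
  printf '##phasing=partial\n'
  printf '##INFO=<ID=AA,Number=1,Type=String,Description="Ancestral Allele">\n'
  printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n'
} > header.txt
```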

Then I used awk to generate the VCF data lines according to the specification (8 columns), setting ID to "." (missing), QUAL to 100, FILTER to PASS and AA to the reference allele for all positions. Note that my_chr_pos_alt_ref.out.gz contains only autosomal SNVs!

zcat my_chr_pos_alt_ref.out.gz | awk '{print $1, $2, ".", $3, $4, 100, "PASS", "AA="$3}' OFS='\t' > tmp.vcf
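
As a slightly more defensive sketch of the same conversion, the awk step can be wrapped in a function that skips a header row and any row whose alleles are not single bases, and that emits the columns in VCF order (CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO). The function name to_vcf_body is hypothetical, and the input is assumed to be whitespace-separated "chr position ref alt" rows as in the question.

```shell
# to_vcf_body is a hypothetical helper name; it reads "chr pos ref alt"
# rows on stdin and writes the eight tab-separated VCF columns.
to_vcf_body() {
  awk 'BEGIN { OFS = "\t" }
       # Skip a header row or any row whose position is not an integer.
       $2 !~ /^[0-9]+$/ { next }
       # Keep only single-nucleotide REF and ALT alleles.
       $3 !~ /^[ACGT]$/ || $4 !~ /^[ACGT]$/ { next }
       # CHROM POS ID REF ALT QUAL FILTER INFO
       { print $1, $2, ".", $3, $4, 100, "PASS", "AA=" $3 }'
}

# Usage: zcat my_chr_pos_alt_ref.out.gz | to_vcf_body > tmp.vcf
```

tabix expects the records to be sorted by chromosome and position, so the input should already be in that order before this step.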

Then I added the header:

cat header.txt tmp.vcf > mydata.vcf
rm tmp.vcf

Then I compressed and indexed the file:

bgzip mydata.vcf
tabix -p vcf mydata.vcf.gz

Finally, I annotated the rsIDs using:

bcftools annotate \
-a 00-common_all.vcf.gz \
-c ID \
--output-type z \
-o mydata_dbSNP151.vcf.gz \
mydata.vcf.gz
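
To get from the annotated VCF back to the five-column table asked for in the question, the ID column can be moved to the end. The sketch below does this with awk on the decompressed VCF text; the function name vcf_to_table and the output file name mydata_rsid.tsv are hypothetical.

```shell
# vcf_to_table is a hypothetical helper name; it reads VCF text on stdin
# and prints chr, position, ref, alt and the ID column as a tab-separated table.
vcf_to_table() {
  awk -F'\t' 'BEGIN { OFS = "\t"; print "chr", "position", "ref", "alt", "rsID" }
              /^#/ { next }                  # skip meta and header lines
              { print $1, $2, $4, $5, $3 }'  # CHROM POS REF ALT ID
}

# Usage (bgzip output is gzip-compatible, so zcat can read it):
#   zcat mydata_dbSNP151.vcf.gz | vcf_to_table > mydata_rsid.tsv
```

The same table body can also be produced with bcftools query -f '%CHROM\t%POS\t%REF\t%ALT\t%ID\n' mydata_dbSNP151.vcf.gz. Positions without a dbSNP match keep "." in the rsID column.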

The dbSNP files come from ftp://ftp.ncbi.nih.gov/snp/organisms/human_9606_b151_GRCh37p13/VCF/
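
For the strand and allele-swap concern raised in the question, one possible check is to compare each ref/alt pair with the dbSNP REF/ALT at the same position. The following is only a sketch under stated assumptions: the function name classify_alleles and the four-column input layout (own ref, own alt, dbSNP REF, dbSNP ALT) are hypothetical, and only single-base alleles are handled.

```shell
# classify_alleles is a hypothetical helper. Input: four columns per line,
# ref and alt from the user data followed by REF and ALT from dbSNP.
classify_alleles() {
  awk 'function comp(b) {            # complement of a single base
         if (b == "A") return "T"
         if (b == "T") return "A"
         if (b == "C") return "G"
         if (b == "G") return "C"
         return "N"
       }
       {
         r = $1; a = $2; R = $3; A = $4
         if      (r == R && a == A)             s = "match"
         else if (r == A && a == R)             s = "ref_alt_swapped"
         else if (comp(r) == R && comp(a) == A) s = "strand_flipped"
         else if (comp(r) == A && comp(a) == R) s = "flipped_and_swapped"
         else                                   s = "mismatch"
         # A/T and C/G SNPs look the same on both strands, so flag them.
         if (s != "mismatch" && comp(r) == a)   s = s ",palindromic"
         print r, a, R, A, s
       }'
}

# Usage: printf 'G T C A\n' | classify_alleles
```

Palindromic pairs (A/T, C/G) are flagged because a swap and a strand flip cannot be told apart from the alleles alone.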

8 months ago by Jimbou

Use awk to convert to VCF and then use GATK VariantAnnotator (https://software.broadinstitute.org/gatk/documentation/tooldocs/3.8-0/org_broadinstitute_gatk_tools_walkers_annotator_VariantAnnotator.php) with --dbsnp.

8 months ago by Pierre Lindenbaum

Thanks a lot. Started as you recommended, but switched to bcftools in the end.

8 months ago by Jimbou
