Tag SNPs are a group of SNPs whose genotypes are predictive of other SNPs in their surrounding haploblocks. However, in some tagging experiments, one does not necessarily have to refer to 'haploblocks' and can instead simply scan genome-wide for highly informative SNPs that define a particular group.
During my PhD, as a side project, I developed a method for identifying haplotype-tagging CNVs for the purpose of distinguishing the 4 populations among the 270 International HapMap Project samples, but this was before 1000 Genomes data was even released and before R packages became very popular. That said, in my tutorial here on Biostars, I am technically defining tag SNPs on the 1000 Genomes Phase III data, and these tag SNPs are highly informative of each respective population group: Produce PCA bi-plot for 1000 Genomes Phase III in VCF format (old)
In the tutorial, the tagging SNP method that I use is based on linkage disequilibrium and the calculation of the variance inflation factor (see the section entitled 'Prune variants from each chromosome'), whereby tagging SNPs are identified in SNP bins across the entire genome. In fact, you'll find that most tagging SNP methods are based on linkage disequilibrium metrics in some shape or form.
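To give a rough feel for what variance inflation factor (VIF) pruning does inside a single SNP bin, here is a minimal Python sketch. It is not the tutorial's code and not PLINK's exact procedure: the function name `vif_prune`, the threshold of 2.0, and the small ridge term are my own assumptions for illustration. It relies on the standard identity that the VIF of each variable is the corresponding diagonal entry of the inverse correlation matrix.

```python
import numpy as np

def vif_prune(genotypes, threshold=2.0, ridge=1e-6):
    """Greedy VIF pruning of one window (hypothetical helper, illustration only).

    genotypes: (n_samples, n_snps) array of allele counts (0/1/2), no missing values.
    Returns the column indices of the SNPs that are retained.
    """
    genotypes = np.asarray(genotypes, dtype=float)
    # Drop monomorphic SNPs first: their correlation is undefined.
    keep = [j for j in range(genotypes.shape[1]) if genotypes[:, j].std() > 0]
    while len(keep) > 1:
        corr = np.corrcoef(genotypes[:, keep], rowvar=False)
        # Small ridge (an assumption) keeps the matrix invertible when
        # two SNPs are in perfect LD; their VIF then becomes very large.
        corr = corr + ridge * np.eye(len(keep))
        # VIF of SNP i = i-th diagonal element of the inverse correlation matrix.
        vif = np.diag(np.linalg.inv(corr))
        worst = int(np.argmax(vif))
        if vif[worst] <= threshold:
            break  # every remaining SNP is below the threshold
        del keep[worst]  # remove the most redundant SNP and re-evaluate
    return keep

# Toy window: SNP 0 and SNP 1 are in perfect LD, SNP 2 is nearly independent.
window = np.array([
    [0, 0, 0], [1, 1, 0], [2, 2, 0], [0, 0, 1],
    [1, 1, 1], [2, 2, 1], [0, 0, 2], [1, 1, 2],
])
retained = vif_prune(window)
```

On real data you would slide a window like this along each chromosome and keep the union of retained SNPs; in practice PLINK does this for you, so treat the above only as an explanation of the idea.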
I am not aware of many implementations in R for tag SNPs. As mentioned in this previous answer, HaploView would be a good standalone choice: A: Measure Tag Snps, R Package, Tools
You could easily do both the method that I used and also export your data into HaploView for further interrogation. Hopefully you are familiar with how you can load data into these programs (be aware that plink has an export function for HaploView format).
Kevin
Thanks for your reply! I'm quite new to bioinformatics actually and am trying to familiarize myself with Haploview first. I have converted my binary (.bed, .bim, .fam) files to pedigree format (.ped, .map) via --recode in plink. I am trying to upload the .ped file to Haploview Tagger. Any idea why I get this error?
Thanks again!
You'll need to post your full command.
Yes, are you running this on a cluster environment?
I am running it from the tagger service available online at this link:
http://archive.broadinstitute.org/mpg/tagger/server.html
I have not downloaded Haploview and am trying to carry out the procedure online.
Once I select 'I want to upload my own genotype data as a PED file' I proceed to upload my .ped file clicking the button 'choose file' under the heading 'linkage format ("ped" file)'.
Thanks
If using the Broad Institute's online service, you should contact them.