Recalling SNPs count from Illumina SNP array raw data
8.2 years ago
nadne ▴ 40

I want to recall SNPs from an Illumina HumanOmni2.5-4v1 array. I have the raw data files (Grn.idat and Red.idat) and a matching FinalReport.csv, which includes the following columns:

SNP Name    Sample Name    GC Score    Allele1 - Forward    Allele2 - Forward    Allele1 - Top    Allele2 - Top    Allele1 - Design    Allele2 - Design    Allele1 - AB    Allele2 - AB    Theta    R    X    Y    X Raw    Y Raw    B Allele Freq    Log R Ratio

And I have thousands of such files. How can I get the number of calls for a specific SNP?

Should I use the raw idat files or the CSVs? If the CSVs, which column should I use, and how should I interpret it?
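For reference, one way to tally calls per SNP from reports with these columns can be sketched in Python. This is only a sketch under several assumptions: that each report has a preamble ending in a `[Data]` line, that a no-call is written as `-` in the allele columns, and that the column names passed in match the file header exactly. The function name `count_calls` and its parameters are hypothetical.

```python
import csv
from collections import Counter


def count_calls(report_text, snp_col="SNP Name",
                a1_col="Allele1 - AB", a2_col="Allele2 - AB",
                no_call="-"):
    """Return (called, total) Counters keyed by SNP name for one report.

    Assumptions: data rows follow a "[Data]" line (if present), and a
    no-call is marked with `no_call` in the allele columns.
    """
    lines = report_text.splitlines()
    stripped = [ln.strip() for ln in lines]
    if "[Data]" in stripped:
        # Drop the preamble; the next line is the column header.
        lines = lines[stripped.index("[Data]") + 1:]
    # Guess the delimiter from the column header line (tab or comma).
    delim = "\t" if "\t" in lines[0] else ","
    called, total = Counter(), Counter()
    for row in csv.DictReader(lines, delimiter=delim):
        snp = row[snp_col]
        total[snp] += 1
        if row[a1_col] != no_call and row[a2_col] != no_call:
            called[snp] += 1
    return called, total
```

With thousands of files, the two Counters returned for each file can be summed (`Counter` supports `+=`), so that `called[snp]` ends up as the number of samples with a call for that SNP.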

Thanks

SNP Illumina • 4.0k views
8.1 years ago
nadne ▴ 40

I ended up using the R package crlmm. It has a function named genotype.Illumina that does the recalling from the raw idat files.
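For reading the output, a minimal sketch in Python of how crlmm-style calls and confidences might be interpreted. It assumes the conventions I understand crlmm and oligoClasses to use: genotype calls coded 1 = AA, 2 = AB, 3 = BB, and confidences stored as integers that map to a probability via 1 - exp(-score/1000). Please check both against the package documentation. The function names and the 0.9 cutoff are hypothetical.

```python
import math

# Assumed crlmm genotype coding: 1 = AA, 2 = AB, 3 = BB.
GENOTYPE_LABELS = {1: "AA", 2: "AB", 3: "BB"}


def int_to_prob(score):
    """Convert an integer confidence score to a probability (assumed mapping)."""
    return 1.0 - math.exp(-score / 1000.0)


def count_confident_calls(calls, scores, min_prob=0.9):
    """Tally genotypes for one SNP, setting aside low-confidence calls.

    `calls` and `scores` are parallel sequences, one entry per sample.
    `min_prob` is an arbitrary illustrative cutoff, not a recommendation.
    """
    counts = {"AA": 0, "AB": 0, "BB": 0, "low_confidence": 0}
    for code, score in zip(calls, scores):
        if int_to_prob(score) < min_prob:
            counts["low_confidence"] += 1
        else:
            counts[GENOTYPE_LABELS[code]] += 1
    return counts
```

The number of calls for a SNP is then the sum of the AA, AB and BB counts, given the calls and scores exported from R for that SNP.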

8.2 years ago

If you know who did the genotyping or who pre-processed this data, I'd ask them for PLINK-friendly files (bim, bed, fam). The IDAT files can be used to call the genotypes through Illumina's GenomeStudio. There may also be an open-source option in the CRLMM package, but I can't tell you whether it would work with your chips; you'd have to do some reading.
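If PLINK files are available, the number of non-missing calls per SNP can be read straight from the .bed file. A minimal sketch in Python, assuming a PLINK 1 binary file in SNP-major mode (three magic bytes, then two bits per sample, with the bit pair `01` meaning missing). The function name `count_calls_per_snp` is hypothetical, `n_samples` is the number of lines in the .fam file, and the returned list follows the order of the .bim file.

```python
import math

BED_MAGIC = b"\x6c\x1b\x01"  # PLINK 1 binary header, SNP-major mode
MISSING = 0b01               # two-bit code for a missing genotype


def count_calls_per_snp(bed_bytes, n_samples):
    """Return a list with the number of non-missing genotypes per SNP."""
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")
    if bed_bytes[:3] != BED_MAGIC:
        raise ValueError("not a SNP-major PLINK 1 .bed file")
    # Each SNP occupies a whole number of bytes, four samples per byte.
    bytes_per_snp = math.ceil(n_samples / 4)
    body = bed_bytes[3:]
    if len(body) % bytes_per_snp != 0:
        raise ValueError("file size does not match n_samples")
    counts = []
    for start in range(0, len(body), bytes_per_snp):
        block = body[start:start + bytes_per_snp]
        called = 0
        for i in range(n_samples):
            # Samples are packed low-order bits first within each byte.
            code = (block[i // 4] >> (2 * (i % 4))) & 0b11
            if code != MISSING:
                called += 1
        counts.append(called)
    return counts
```

For a real dataset, `bed_bytes` would come from `open(path, "rb").read()`. For very large files, reading one SNP block at a time would be kinder to memory.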

I have the PLINK files, but I want to process the raw data because the bim/bed/fam files have many missing calls, and I'm trying to figure out why, or whether I can recall them.

I tried using CRLMM. It reads the idat files and gives two numbers for each SNP: calls and confidence. But what do I do with these two? How do I know what the genotype is when a SNP has a call of 10210 and a confidence of 0.4898242?

Chances are there are a lot of missing calls because they haven't been imputed. Check out IMPUTE2.

That's probably true, but I still want to see what the genotypes were in the raw calls, before any imputation was done.

In addition, it is not obvious how to use IMPUTE2 with these files.

4.3 years ago

To recall IDAT files using Illumina's proprietary GenCall algorithm, there are now two approaches:

(i) using the Illumina Array Analysis Platform

(ii) using the Illumina Beeline/AutoConvert software

I describe how to use either approach on Linux here
