Using publicly available Affymetrix SNP 6.0 data, I am trying to find the normalized signal for each probe. I've used both the "oligo" and "crlmm" packages from Bioconductor. These generate a SnpSuperSet object, on which I can use calls(x) to get the genotype (AA=1, AB=2, or BB=3) and confs(x) to get the p-value for each call. My code and output are below.
Rather than the calls themselves, I'd like to go one step back and extract a matrix of signal intensities so that I can make the genotype calls myself. The publication states that a sample can contain as little as 40% tumor DNA, so I'm concerned that crlmm is making erroneous calls because "normal" tissue is overrepresented in the sample.
Thank you for the help!
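library(oligo)
# celDirectory: folder containing the .CEL files; outDir: folder for the crlmm output
celFiles <- list.celfiles(celDirectory, full.names=TRUE)
rawData <- read.celfiles(celFiles)
# Genotype the samples; the results are written to outDir
crlmm(celFiles, outDir)
# Load the crlmm results as a SnpSuperSet
x <- getCrlmmSummaries(outDir)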
print(x)
SnpSuperSet (storageMode: lockedEnvironment)
assayData: 906600 features, 17 samples
  element names: alleleA, alleleB, call, callProbability, F
protocolData: none
phenoData
  rowNames: 1 2 ... 17 (17 total)
  varLabels: crlmmSNR
  varMetadata: labelDescription
featureData: none
experimentData: use 'experimentData(object)'
Annotation: pd.genomewidesnp.6
calls(x)
              A B C D E
SNP_A-1780270 3 3 2 2 3
SNP_A-1780271 1 1 2 1 1
SNP_A-1780272 3 3 2 3 3
SNP_A-1780274 2 2 2 2 2
SNP_A-1780277 1 1 2 1 2
SNP_A-1780278 3 3 3 3 3
confs(x)
                         A            B            C            D            E
SNP_A-1780270 0.0009991370 0.0009989767 0.0009974529 0.0009988142 0.0009991313
SNP_A-1780271 0.0009699487 0.0009970648 0.0009924274 0.0009945284 0.0009670124
SNP_A-1780272 0.0009991341 0.0009950610 0.0009926793 0.0009992551 0.0009990913
SNP_A-1780274 0.0009923162 0.0009886114 0.0009978304 0.0009943256 0.0009975597
SNP_A-1780277 0.0009973942 0.0009951006 0.0009981321 0.0008798894 0.0009976221
SNP_A-1780278 0.0009932014 0.0009901400 0.0009991727 0.0009992094 0.0009988690
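
To make the goal concrete, here is a rough sketch of the kind of manual calling I have in mind once I have per-allele intensity matrices. Everything in it is an assumption for illustration: the toy matrices A and B are made-up numbers, the cutoff is arbitrary, and I am only guessing that the alleleA and alleleB elements listed in the print(x) output hold summarized intensities on the log2 scale (if so, Biobase's assayDataElement() might be one way to pull them out, but I have not confirmed that this is the normalized signal).

```r
# Toy stand-ins for per-allele intensity matrices (SNPs x samples).
# With the real object these might come from something like
#   A <- Biobase::assayDataElement(x, "alleleA")
#   B <- Biobase::assayDataElement(x, "alleleB")
# ASSUMPTION: those elements hold summarized log2-scale intensities.
snps <- c("SNP_A-1780270", "SNP_A-1780271", "SNP_A-1780272")
samples <- c("A", "B")
A <- matrix(c(8.1, 11.2, 8.3, 8.0, 11.0, 9.9), nrow = 3,
            dimnames = list(snps, samples))
B <- matrix(c(11.3, 8.2, 11.1, 11.4, 8.1, 10.1), nrow = 3,
            dimnames = list(snps, samples))

# Allelic contrast (log-ratio) and average signal per SNP and sample.
M <- A - B
S <- (A + B) / 2

# Naive thresholded calls, using the same coding as calls(x):
# 1 = AA, 2 = AB, 3 = BB. The cutoff is hypothetical and would need
# tuning for samples with a low fraction of tumor DNA.
cutoff <- 1
myCalls <- ifelse(M > cutoff, 1L, ifelse(M < -cutoff, 3L, 2L))
print(myCalls)
```

With mixed tumor/normal samples the contrast M would presumably be pulled toward zero, which is why I would like to choose the thresholds myself rather than rely on the calls from crlmm.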