Hi everyone, I am trying to work with Copy Number Variations (CNV).
I have samples for aCGH analysis using the Agilent 244K Chip by Myriad Genetics. Every sample is structured in this way:
1 -0.055518725
2 0.085221382
3 -0.650189314
4 0.085176382
5 0.007979838
6 0.089254443
7 0.078943541
8 0.02681327
9 0.200645608
10 -0.306892726
................
................
243499 -0.034404618
243500 -0.052923749
243501 0
243502 0
243503 -0.014568005
243504 -0.040646156
I don't know how to work with these data: I couldn't find any documentation, and I don't know what the columns mean. So:
1) Can anyone help me, perhaps by pointing me to good documentation for the output of the Agilent 244K?
Until now I have always worked with data structured like this (an example of TCGA/GDC CNV data, produced with the Affymetrix Genome-Wide Human SNP Array 6.0):
Sample Chromosome Start End Num_Probes Segment_Mean
EMMER_p_8TCGA_Mx_242_238_N_GenomeWideSNP_6 1 61735 17112177 8814 -0.0283
EMMER_p_8TCGA_Mx_242_238_N_GenomeWideSNP_6 1 17114427 17262247 69 0.3902
My goal is to transform the data obtained from the Agilent 244K (like the first example above) into a more "friendly" structure, like the data that can be downloaded from TCGA/GDC.
2) Where can I find a tool/script/algorithm that can transform the data as I want?
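To make the target structure concrete, here is a minimal Python sketch of how per-probe values could be collapsed into TCGA-like segment rows once every probe has a chromosome, start and end. It assumes the second column of the Agilent file is a log ratio, which is not confirmed anywhere above. The function names and the 0.2 threshold are made up for illustration, and the grouping rule (merge neighbouring probes that fall in the same gain/neutral/loss band) is a naive stand-in, not a proper segmentation algorithm such as circular binary segmentation (implemented, for example, in the Bioconductor package DNAcopy).

```python
from statistics import mean


def call_state(value, threshold=0.2):
    """Classify a log ratio as gain (1), loss (-1) or neutral (0).

    The threshold is an arbitrary illustrative choice, not a validated cutoff.
    """
    if value >= threshold:
        return 1
    if value <= -threshold:
        return -1
    return 0


def naive_segments(sample, probes, threshold=0.2):
    """Merge consecutive probes with the same state on the same chromosome.

    probes: iterable of (chrom, start, end, log_ratio), already sorted by
    chromosome and start position.
    Returns rows shaped like the TCGA/GDC table:
    (Sample, Chromosome, Start, End, Num_Probes, Segment_Mean).
    """
    segments = []
    current = None
    for chrom, start, end, value in probes:
        state = call_state(value, threshold)
        if current and current["chrom"] == chrom and current["state"] == state:
            # Same chromosome and same state: extend the open segment.
            current["end"] = end
            current["values"].append(value)
        else:
            # Chromosome or state changed: close the old segment, open a new one.
            if current:
                segments.append(current)
            current = {"chrom": chrom, "start": start, "end": end,
                       "state": state, "values": [value]}
    if current:
        segments.append(current)
    return [
        (sample, s["chrom"], s["start"], s["end"],
         len(s["values"]), round(mean(s["values"]), 4))
        for s in segments
    ]
```

Each returned tuple can be written out as one tab-separated line under the header `Sample Chromosome Start End Num_Probes Segment_Mean`. For real analyses, the grouping step would be replaced by an established segmentation method.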
Thanks in advance.
Hi, but how can I relate these probes to coordinates such as "Chromosome", "Start" and "End"?
Don't you have the "map" of the chip with the corresponding probe coordinates? In our case, my colleagues use a software package named CytoGenomic; they give it a sort of BED file, but the probe names are not a single number. Maybe try asking Myriad Genetics; without column names, it could be anything :(
I tried to find information about the Agilent CGH array, but like you I wasted my time ...
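If a probe map does turn up (from Myriad Genetics or from the array design files), relating the numbers in the first column to coordinates is a plain lookup. Below is a minimal Python sketch under two assumptions that nobody in this thread has confirmed: that the first column is a probe index and the second a log ratio, and that the map is a tab-separated file with the columns `probe_index`, `chrom`, `start`, `end`. That layout and the function names are hypothetical, chosen only for illustration.

```python
import csv


def read_log_ratios(handle):
    """Parse 'index value' lines; skip lines that do not fit that shape."""
    ratios = {}
    for line in handle:
        parts = line.split()
        if len(parts) != 2:
            continue
        try:
            ratios[int(parts[0])] = float(parts[1])
        except ValueError:
            continue
    return ratios


def read_probe_map(handle):
    """Parse the hypothetical map file: probe_index, chrom, start, end."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {
        int(row["probe_index"]): (row["chrom"], int(row["start"]), int(row["end"]))
        for row in reader
    }


def annotate(ratios, probe_map):
    """Attach coordinates to each value; probes missing from the map are dropped.

    Rows are sorted by chromosome name (as a string, so "10" sorts before "2")
    and then by start position.
    """
    rows = []
    for index, value in ratios.items():
        if index in probe_map:
            chrom, start, end = probe_map[index]
            rows.append((chrom, start, end, value))
    rows.sort(key=lambda row: (row[0], row[1]))
    return rows
```

Usage would look like `annotate(read_log_ratios(open("sample.txt")), read_probe_map(open("probe_map.tsv")))`, where both file names are placeholders. The resulting per-probe rows are what a segmentation tool would then need in order to produce a Chromosome/Start/End/Num_Probes/Segment_Mean table.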