I am conducting CNV analysis on TCGA Level 3 SNP 6.0 data. For a few of the downloaded tumor samples, I found more than one seg file associated with the same TCGA submitter ID. For example:
TCGA-44-2656-01A  CUTCH_p_TCGAb_355_37_52_NSP_GenomeWideSNP_6_H10_1376764.nocnv_grch38.seg
TCGA-44-2656-01A  EGGAR_p_TCGAb33and37_SNP_N_GenomeWideSNP_6_H04_585228.nocnv_grch38.seg
TCGA-44-2656-01A  HILLY_p_TCGA_b90_wRedos_SNP_N_GenomeWideSNP_6_A04_748062.nocnv_grch38.seg
The first file has 415 rows, while the second and third have 197 and 271 rows, respectively. All three files report mean segment scores for chromosomes 1 to 22 and X.
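For reference, this is roughly how the per-file counts can be reproduced. It is only a minimal sketch: it assumes the seg files are tab-separated with a header containing `Chromosome`, `Start`, and `End` columns (my understanding of the usual layout, so please check against your own files), and `summarize_seg` and the toy data are made up for illustration.

```python
import csv
import io


def summarize_seg(handle):
    """Summarize one seg file: row count, chromosomes seen, total covered length."""
    reader = csv.DictReader(handle, delimiter="\t")
    n_rows = 0
    chromosomes = set()
    covered_bp = 0
    for row in reader:
        n_rows += 1
        chromosomes.add(row["Chromosome"])
        # Coordinates are parsed via float() in case they are written as 1.0e+06.
        start = int(float(row["Start"]))
        end = int(float(row["End"]))
        covered_bp += end - start + 1
    return {
        "rows": n_rows,
        "chromosomes": sorted(chromosomes),
        "covered_bp": covered_bp,
    }


# Toy data standing in for a real file (values are invented, not from TCGA).
TOY_SEG = (
    "GDC_Aliquot\tChromosome\tStart\tEnd\tNum_Probes\tSegment_Mean\n"
    "toy\t1\t100\t199\t10\t0.05\n"
    "toy\t1\t200\t499\t25\t-0.40\n"
    "toy\tX\t50\t149\t8\t0.10\n"
)

summary = summarize_seg(io.StringIO(TOY_SEG))
print(summary)
```

For a real file, the same function can be called with `open(path, newline="")` in place of the `io.StringIO` object.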
In this situation, what factors should I consider when selecting one of the three files for downstream analysis?
If I decide to combine the seg data from the three files, should I merge the chromosome regions that overlap fully or partially among them?
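To make clear what I mean by fully or partially overlapping regions, here is a minimal sketch that intersects the segments of two files. The function name `overlapping_regions` and the segment values are hypothetical, and I am not claiming that this is the right way to combine the data; it only illustrates the regions I am asking about.

```python
def overlapping_regions(segs_a, segs_b):
    """Return the intersections between two lists of (chrom, start, end, mean) segments."""
    shared = []
    for chrom_a, start_a, end_a, mean_a in segs_a:
        for chrom_b, start_b, end_b, mean_b in segs_b:
            if chrom_a != chrom_b:
                continue
            # The intersection of two closed intervals is [max of starts, min of ends].
            start = max(start_a, start_b)
            end = min(end_a, end_b)
            if start <= end:
                shared.append((chrom_a, start, end, mean_a, mean_b))
    return shared


# Invented segments: one partial overlap on chromosome 1, none on chromosome 2.
file_a = [("1", 100, 500, 0.10), ("2", 1, 50, -0.30)]
file_b = [("1", 300, 800, 0.20), ("2", 100, 200, 0.00)]

print(overlapping_regions(file_a, file_b))
```

Each returned tuple carries both segment means, so where the files disagree on a shared region, the disagreement is visible directly.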
The UUIDs of the three samples above, in the same order, are: 89327245-3da1-4a96-bee3-5b84ae43401a, f19650bb-8ead-490b-9f91-d7c4b06bfe6b, 0e6071db-c44c-4958-ab95-087d44620893
Is there a way to find out more about the three samples with these UUIDs to help with file selection?