Hi, I have two data sets containing mutations: one is from 1000 Genomes and the other is my own data. Below is an example row from each data set:
my_data (1 row and 13 columns):
3BHS2_HUMAN_A10E P26439 10 rs28934880 A E C A probably_damaging alignment neutral 0.328 0.145
1000G (1 row and 8 columns):
20 59132666 . G A 56.74 PASS AB=0.53;AC=6;AF=0.0102;AN=586;BaseQRankSum=3.459;BaseQRankSumZ=-0.123;DP=1115;Dels=0.00;HRun=1;HaplotypeScore=0.1646;MQ=95.36;MQ0=0;MQRankSum=1.213;MQRankSumZ=0..694;QD=4.19;ReadPosRankSum=1.963;ReadPosRankSumZ=0.349;SB=-0.49;VQSLOD=4.0533;set=ALL119
I'm writing a Python script in which, for each mutation (line) in my_data, I would like to find the corresponding mutation (line) in 1000G. The information above is all I have. My question is: how can I relate the information in my_data to the information in 1000G? What I want is the chromosome position, or at least to know that I'm looking at the same mutation (if it exists) in both files. Is this possible to achieve?
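To make the question concrete, here is a minimal sketch of one possible approach: joining the two files on the rsID. It assumes that both files are whitespace-separated, that the fourth column of my_data holds the rsID (rs28934880 in the example), and that the 1000G rows follow the standard VCF column order (CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO). The function names are made up for illustration. Note that the ID column of the example 1000G row is ".", so that particular row could not be matched this way.

```python
# Sketch only: column positions are assumptions based on the example rows.

def parse_my_data_line(line):
    """Split a my_data row on whitespace; column 4 is assumed to be the rsID."""
    fields = line.split()
    return {"name": fields[0], "rsid": fields[3], "fields": fields}


def parse_vcf_line(line):
    """Read the first 8 VCF columns (CHROM POS ID REF ALT QUAL FILTER INFO)."""
    chrom, pos, vid, ref, alt, qual, flt, info = line.split()[:8]
    return {"chrom": chrom, "pos": int(pos), "id": vid,
            "ref": ref, "alt": alt, "info": info}


def index_vcf_by_rsid(vcf_lines):
    """Build a dict rsID -> VCF record, skipping headers and rows with ID '.'."""
    index = {}
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        record = parse_vcf_line(line)
        # The VCF ID column may hold several IDs separated by ';'.
        for rsid in record["id"].split(";"):
            if rsid != ".":
                index[rsid] = record
    return index


def match_mutations(my_data_lines, vcf_lines):
    """Yield (my_data row, matching VCF record or None) for each my_data row."""
    index = index_vcf_by_rsid(vcf_lines)
    for line in my_data_lines:
        if not line.strip():
            continue
        row = parse_my_data_line(line)
        yield row, index.get(row["rsid"])
```

With real files, the two arguments could simply be open file handles; for each pair returned, the chromosome and position would be available as `record["chrom"]` and `record["pos"]` whenever the record is not `None`. Rows whose ID column is "." would need some other key, which is really the heart of my question.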