Hi. I've used Biostars before, but this is my first time posting. I'm a complete novice in genomics and programming, but I learn fast. I hope you can help me with the following problem.
My PI gave me a file with the following format:
*Chr POS REF ALT AF
3 401373 C T 0.0483870967741935
3 534104 A G 0.0591397849462366*
The file records single nucleotide variants on chromosome 3, where AF is the allele frequency in our study population (90 individuals). What I need is a similar file, but for 90 random individuals from the YRI population in the 1000 Genomes Project.
From the 1000 Genomes data portal I downloaded a list of 111 sample names that meet my needs, and I have already picked 90 of them at random. The list has this format:
*Sample name Sex Biosample ID Population code Population name ...
NA18853 male SAME124733 YRI Yoruba ...*
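For reproducibility, the random pick of 90 out of 111 samples can be sketched in Python as below. This is only an illustration under assumptions: the portal list is taken to be tab-separated with one header line and the sample name in the first column, and the function name `pick_samples`, the seed, and the placeholder sample names are hypothetical.

```python
import random


def pick_samples(portal_lines, n=90, seed=1):
    """Return n sample names drawn at random, without replacement.

    portal_lines: lines of the portal list, header line first
    (assumed tab-separated, sample name in the first column).
    """
    names = [line.split("\t")[0] for line in portal_lines[1:] if line.strip()]
    rng = random.Random(seed)  # fixed seed so the same 90 are picked every run
    return sorted(rng.sample(names, n))


# Placeholder input: 111 made-up names standing in for the real portal list.
HEADER = "Sample name\tSex\tBiosample ID\tPopulation code\tPopulation name"
PORTAL_LINES = [HEADER] + [f"SAMPLE{i:03d}\tmale\tID{i}\tYRI\tYoruba" for i in range(111)]

picked = pick_samples(PORTAL_LINES)
print(len(picked), picked[:3])
```

Writing the picked names one per line to a text file gives a sample list that can be reused in later steps.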
My question is: how do I create a file that reports Chr, POS, REF, ALT, and AF computed only from the samples in my list (that is, for this population slice)?
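For concreteness, the calculation being asked for can be sketched in plain Python as below. It is a minimal sketch, not a recommended pipeline, and it rests on assumptions: the genotypes come from a VCF with a `#CHROM` header line, sample columns starting at the tenth column, and GT as the first FORMAT key (as the VCF specification requires when GT is present). The function name `subset_allele_freqs`, the toy records, and the sample names other than NA18853 are hypothetical.

```python
def subset_allele_freqs(vcf_lines, wanted_samples):
    """Yield (chrom, pos, ref, alt, af), counting only the wanted samples."""
    wanted = set(wanted_samples)
    cols = None
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            # Sample columns start at index 9 in the VCF header line.
            cols = [i for i, name in enumerate(fields) if i >= 9 and name in wanted]
            continue
        if cols is None:
            raise ValueError("no #CHROM header line before the first record")
        chrom, pos, ref = fields[0], int(fields[1]), fields[3]
        alts = fields[4].split(",")
        counts = [0] * (len(alts) + 1)  # index 0 = REF, 1.. = ALT alleles
        for i in cols:
            gt = fields[i].split(":")[0]  # GT is the first FORMAT key
            for allele in gt.replace("|", "/").split("/"):
                if allele != ".":  # ignore missing calls
                    counts[int(allele)] += 1
        total = sum(counts)
        for k, alt in enumerate(alts, start=1):
            af = counts[k] / total if total else float("nan")
            yield chrom, pos, ref, alt, af


# Toy VCF with made-up genotypes, only to show the input shape.
TOY_VCF = "\n".join([
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tNA18853\tNA18854\tNA18855",
    "3\t401373\t.\tC\tT\t.\tPASS\t.\tGT\t0|1\t0|0\t1|1",
    "3\t534104\t.\tA\tG\t.\tPASS\t.\tGT\t0|0\t1|0\t./.",
])

print("Chr POS REF ALT AF")
for row in subset_allele_freqs(TOY_VCF.splitlines(), ["NA18853", "NA18855"]):
    print(*row)
```

Here AF is the ALT allele count divided by the number of non-missing alleles among the selected samples, so a site with missing calls uses a smaller denominator. For a real compressed VCF, the lines could be read with `gzip.open(path, "rt")` instead of the toy string.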
Thanks in advance.