How to design an analysis pipeline for genes with SNPs by samples? Build a matrix?
2.1 years ago
Pin.Bioinf • 240
I have called SNPs in tumour samples at time 1 and tumour samples at time 2 (targeted sequencing), and I would like to check whether there is any significant difference in the SNPs in my panel of genes between the two types of samples (many biological replicates per sample). How can I do this? Should I build a matrix with samples as columns and genes as rows, holding SNPs instead of counts as in RNA-seq analysis? But how would I encode every mutation found in a gene so that the samples can be compared? What kind of analysis is used for this comparison of mutations?

I am completely lost, as I have never done anything like this; I have only worked with expression data. Thanks for your help.

EDIT: I have something like this for each sample:

Chromosome  Position    Reference Allele    Variant Allele          Variant Type    Sequence Context    Consequence                             dbSNP   ID
chr1        27022992    A                   G                       SNV             Intergenic,Coding   upstream_gene_variant,missense_variant
chr1        27099846    G                   GTGCTGTCTCTATACACATC    Insertion       Coding              frameshift_variant
chr1        27101220    G                   A                       SNV             Coding              missense_variant                        COSM6604517 ARID1A
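
For the question of how to write down each mutation, one option I can think of is a `chrom:pos:ref>alt` key per variant, grouped by gene. This is only a sketch: it assumes the per-sample table is tab-delimited, the `Gene` column name is hypothetical, and the rows are toy data rather than my real calls.

```python
import csv
import io
from collections import defaultdict

# Hypothetical tab-delimited export for one sample (toy rows). The "Gene"
# column name is an assumption; adjust it to what the caller actually writes.
SAMPLE_TSV = (
    "Chromosome\tPosition\tReference Allele\tVariant Allele\tConsequence\tGene\n"
    "chr1\t27101220\tG\tA\tmissense_variant\tARID1A\n"
    "chr1\t27100000\tG\tGT\tframeshift_variant\tARID1A\n"
)


def variant_key(row):
    """Build a chrom:pos:ref>alt identifier for one table row."""
    return (
        f"{row['Chromosome']}:{row['Position']}:"
        f"{row['Reference Allele']}>{row['Variant Allele']}"
    )


def variants_by_gene(handle):
    """Group the variant keys of one sample by gene name."""
    by_gene = defaultdict(set)
    for row in csv.DictReader(handle, delimiter="\t"):
        gene = row["Gene"] or "NA"  # rows without a gene annotation go to "NA"
        by_gene[gene].add(variant_key(row))
    return dict(by_gene)


per_gene = variants_by_gene(io.StringIO(SAMPLE_TSV))
```

Running this once per sample would give, for each gene, the set of variants seen in that sample, which could then be compared between time 1 and time 2 either per variant or collapsed per gene.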

