Genome-wide variation between mutant and wild type: is it significant?
7.8 years ago
Jan Hapala • 0

My task is to compare a yeast wt and a mutant sample. The mutant is known to bear a mutation at a specific locus. The question is: Aside from this known mutation, are the samples significantly different?

(In the ideal case, there would be no other difference and we could say that the difference in phenotype is caused by the single locus alone. However, can we really expect a zero mutation rate? Hardly.)

This is my first variation analysis and I'm not sure where to go from here. I have produced a VCF file (code at the end of the post). But what would be a convincing result here? Statistics on the different kinds of variants between the samples? Compared to what?

The only reasonable step I can see is to focus on protein-coding genes and check for potential amino acid-changing mutations. But this would be a crude simplification.
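To make the "amino acid-changing" check concrete, here is a minimal sketch of classifying a single-base substitution inside a coding sequence as synonymous, missense, or nonsense. It assumes the standard nuclear genetic code (mitochondrial genes would need a different table), that the coding sequence is given in frame on the gene's own strand, and that the position is a 0-based offset into it. The function name `classify_snp` and the toy sequence are hypothetical, not taken from any annotation tool.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption: nuclear genes).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_snp(cds, offset, alt_base):
    """Classify a substitution at 0-based `offset` of an in-frame coding sequence.

    Returns (kind, ref_aa, alt_aa), where kind is one of
    "synonymous", "missense", "nonsense", or "stop-lost".
    """
    cds = cds.upper()
    pos_in_codon = offset % 3
    start = offset - pos_in_codon
    ref_codon = cds[start:start + 3]
    if len(ref_codon) != 3:
        raise ValueError("offset falls in an incomplete codon")

    # Swap in the alternative base at the affected codon position.
    alt_codon = (
        ref_codon[:pos_in_codon] + alt_base.upper() + ref_codon[pos_in_codon + 1:]
    )
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]

    if ref_aa == alt_aa:
        kind = "synonymous"
    elif alt_aa == "*":
        kind = "nonsense"
    elif ref_aa == "*":
        kind = "stop-lost"
    else:
        kind = "missense"
    return kind, ref_aa, alt_aa


# Toy coding sequence: ATG GCT TGG TAA (Met-Ala-Trp-stop).
toy_cds = "ATGGCTTGGTAA"
for offset, alt in [(5, "C"), (3, "A"), (8, "A")]:
    print(offset, alt, classify_snp(toy_cds, offset, alt))
```

In practice an annotation tool run against the yeast gene models would do this for every variant at once; the sketch only shows the logic behind the categories.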

VCF calculation:

# samtools mpileup -uf reference.fasta wildtype.sorted.bam mutant.sorted.bam | bcftools view -bvcg - > mutant-vs-wildtype.var.raw.bcf
# bcftools view mutant-vs-wildtype.var.raw.bcf | vcfutils.pl varFilter -D100 > mutant-vs-wildtype.var.flt.vcf
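Since both samples are called against the same reference, many records in the resulting VCF will be variants that the wild type and the mutant share. A possible next step, sketched below, is to keep only the sites where the two samples' genotype calls disagree. This assumes the two sample columns appear in the order the BAM files were given (wild type first, then mutant) and that the FORMAT column contains a `GT` field; the function names and the toy records are hypothetical.

```python
def genotype(sample_field, fmt):
    """Extract the GT value from one sample column, using the FORMAT column."""
    return dict(zip(fmt.split(":"), sample_field.split(":"))).get("GT", ".")


def normalize(gt):
    """Ignore phasing and allele order so that 0/1 and 1|0 compare equal."""
    return tuple(sorted(gt.replace("|", "/").split("/")))


def differing_sites(vcf_lines):
    """Yield (chrom, pos, ref, alt, gt_first, gt_second) where the calls differ."""
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 11:
            continue  # need 9 fixed columns plus two sample columns
        fmt = cols[8]
        gt_a = genotype(cols[9], fmt)   # assumed: wild type
        gt_b = genotype(cols[10], fmt)  # assumed: mutant
        a, b = normalize(gt_a), normalize(gt_b)
        if "." in a or "." in b:
            continue  # a missing call in either sample is not evidence of a difference
        if a != b:
            yield (cols[0], int(cols[1]), cols[3], cols[4], gt_a, gt_b)


# Toy records, not real data.
example_vcf = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\twildtype\tmutant",
    "chrI\t100\t.\tA\tG\t50\t.\t.\tGT:PL\t1/1:50,10,0\t1/1:60,12,0",
    "chrI\t200\t.\tC\tT\t60\t.\t.\tGT:PL\t0/0:0,10,50\t1/1:60,12,0",
    "chrII\t300\t.\tG\tA\t20\t.\t.\tGT:PL\t./.:0,0,0\t0/1:20,0,30",
]

for site in differing_sites(example_vcf):
    print(site)
```

Sites that survive this filter are the candidates worth inspecting by hand; low-coverage or low-quality calls would still need to be filtered separately.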
SNP genome-wide yeast variation • 1.9k views
7.8 years ago

I think the crude simplification is the best you can do. (Another thing you could try is to BLAST those other mutations against nr, or against some database of other yeast strains. If the mutations are present in those strains, and those strains are phenotypically normal, then the mutation likely doesn't do anything.)
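If variant calls for other, phenotypically normal strains happen to be available as VCF files against the same reference, a simpler stand-in for the BLAST comparison is a direct lookup by position and allele. Note that this is a different technique from BLAST, and it only works when the coordinates match exactly. The sketch below assumes plain-text VCF lines; the function names and toy records are hypothetical.

```python
def variant_key(line):
    """Identify a variant by chromosome, position, reference and alternative allele."""
    cols = line.rstrip("\n").split("\t")
    return (cols[0], int(cols[1]), cols[3], cols[4])


def load_keys(vcf_lines):
    """Collect variant keys from VCF lines, skipping header and blank lines."""
    return {
        variant_key(line)
        for line in vcf_lines
        if line.strip() and not line.startswith("#")
    }


def private_variants(candidate_lines, known_strain_lines):
    """Return candidate variants that are NOT seen in the known strains."""
    return sorted(load_keys(candidate_lines) - load_keys(known_strain_lines))


# Toy records, not real data.
candidates = [
    "chrI\t200\t.\tC\tT\t60\t.\t.",
    "chrIV\t1500\t.\tG\tA\t45\t.\t.",
]
known_strains = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chrI\t200\t.\tC\tT\t99\t.\t.",
]

print(private_variants(candidates, known_strains))
```

Variants that also occur in normal strains can be deprioritized; the ones that remain are the candidates to examine for coding changes.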

There is no objective metric by which you can say "This strain is significantly different", and there is no way to just look at a SNP and know how profound the consequences will be for the protein or the organism. You just can't answer this conclusively in silico. People at the bench could correct the mutation and see how the corrected yeast performs in whatever assays they care to try. That's the only way to be sure.

The best you could say is that you don't predict any significant impairment of this or that pathway, based on the lack of major amino acid-changing mutations in this or that set of genes. But you can't be sure based on sequence data alone.


Right, exactly how I see it. Thank you for the tip on blasting the other strains!
