Bacterial population reconstruction
3.8 years ago
SemiQuant ▴ 80

Hi,

I am doing amplicon sequencing (100 000x) on several targets of a bacterium. I expect most of the cultures to be clonal; however, there will be instances of heterogeneous populations. Using the paired-end short-read data, I want to determine the haplotypes of the bacteria present. I first want to determine the phase of the mutations. I've done extensive searching but can't seem to find a tool that incorporates the information from read 2.

==========Read 1========...==========Read 2========
-----C------------------...------------------------
-----C------------------...------------------------
-----C------------------...------------------------
-----A------------------...------------------------
-----A------------------...------------------------
-----A------------------...------------------------
-----A--------G---------...---------G--------------
-----A--------G---------...---------G--------------
-----A--------G---------...---------G--------------
-----A--------G---------...---------G--------------
                           ==========Read 1========...==========Read 2========
                           ---------G--------------...----------T-------------
                           ---------G--------------...----------T-------------
                           ---------G--------------...----------T-------------
                           ------------------------...----------T-------------
                           ------------------------...----------T-------------

So the phasing would use the information from both reads of each pair, and we would get:

-----C------------------------------------------????????????????????????
-----A------------------------------------------????????????????????????
-----A--------G------------------G------------------------T-------------
????????????????????????----------------------------------T-------------

Maybe I could iterate over the mutations with pysam and determine the phase?
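
A minimal sketch of that pysam idea, under some assumptions: the variant sites are already known and passed in as 0-based reference positions (`sites`), all targets map to one reference sequence (otherwise the keys would need to include the contig name), and the helper names (`fragments_from_bam`, `merge_mates`, `count_phased_patterns`) are made up for illustration. Read 1 and read 2 are joined by query name, so a mutation seen only in read 2 is still phased with those in read 1. pysam is imported lazily so the two pure-Python helpers can be tried without a BAM file.

```python
from collections import Counter, defaultdict


def fragments_from_bam(bam_path, sites):
    """Yield (fragment_name, {ref_pos: base}) for each primary alignment.

    `sites` holds 0-based reference positions of known variant sites.
    """
    import pysam  # lazy import: only this function needs pysam

    wanted = set(sites)
    with pysam.AlignmentFile(bam_path, "rb") as bam:
        for read in bam.fetch(until_eof=True):
            if read.is_unmapped or read.is_secondary or read.is_supplementary:
                continue
            seq = read.query_sequence
            if seq is None:
                continue
            # matches_only=True skips insertions, deletions and soft clips
            calls = {
                ref_pos: seq[query_pos]
                for query_pos, ref_pos in read.get_aligned_pairs(matches_only=True)
                if ref_pos in wanted
            }
            if calls:
                yield read.query_name, calls


def merge_mates(observations):
    """Combine the calls of read 1 and read 2 that share a fragment name."""
    fragments = defaultdict(dict)
    for name, calls in observations:
        merged = fragments[name]
        for pos, base in calls.items():
            if pos in merged and merged[pos] != base:
                merged[pos] = "N"  # overlapping mates disagree: ambiguous
            else:
                merged[pos] = base
    return fragments


def count_phased_patterns(fragments):
    """Count each distinct combination of (position, base) seen on one fragment."""
    counts = Counter()
    for calls in fragments.values():
        pattern = tuple(sorted((p, b) for p, b in calls.items() if b != "N"))
        if pattern:
            counts[pattern] += 1
    return counts
```

Something like `count_phased_patterns(merge_mates(fragments_from_bam("sample.bam", sites)))` (with a hypothetical `sample.bam`) would then give one count per phased pattern. Because reference bases at the listed sites are recorded too, a fragment with only the first mutation stays distinct from one carrying both.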

I'd also like the frequencies, and then to determine the most likely haplotypes of the distinct bacterial populations. I'm not sure how to do that either, but since I expect a clonal population, I suppose I would use the frequencies to find the minimal number of distinct haplotypes that explains the data.
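
To make the "frequencies, then minimal number of haplotypes" idea concrete, here is a rough sketch rather than an established method. It assumes the input is a mapping from phased patterns, written as sorted tuples of `(position, base)`, to fragment counts. The function names and the `min_fraction` error cutoff are hypothetical. The merging step is a simple greedy heuristic: a partial pattern is attached to the first haplotype that shares at least one site with it and agrees at every shared site, so the result depends on processing order and is not guaranteed to be the true minimum.

```python
from collections import Counter


def pattern_frequencies(counts, min_fraction=0.01):
    """Frequency of each pattern among fragments covering the same sites.

    Patterns below `min_fraction` are dropped as likely sequencing errors
    (the cutoff is an assumption and should be tuned to the error rate).
    """
    depth = Counter()
    for pattern, n in counts.items():
        depth[tuple(pos for pos, _ in pattern)] += n
    freqs = {}
    for pattern, n in counts.items():
        f = n / depth[tuple(pos for pos, _ in pattern)]
        if f >= min_fraction:
            freqs[pattern] = f
    return freqs


def greedy_haplotypes(freqs):
    """Greedily chain overlapping partial patterns into longer haplotypes."""
    haplotypes = []  # each haplotype is a dict {position: base}
    # Most frequent patterns first, so dominant clones seed the haplotypes.
    for pattern, _ in sorted(freqs.items(), key=lambda kv: -kv[1]):
        calls = dict(pattern)
        for hap in haplotypes:
            shared = calls.keys() & hap.keys()
            # Extend only when the pattern overlaps and fully agrees.
            if shared and all(calls[p] == hap[p] for p in shared):
                hap.update(calls)
                break
        else:
            haplotypes.append(calls)  # nothing compatible: new haplotype
    return haplotypes
```

Sites that no overlapping pattern reaches are simply absent from a haplotype, which corresponds to the `????` stretches in the diagram above. When one partial pattern is compatible with several haplotypes, the frequencies would have to break the tie, and this sketch does not attempt that.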

I feel there must be tools already developed, maybe for microbiome analyses, that I'm missing? Otherwise, any thoughts on how to tackle this?

haplotype phase phasing population • 682 views