Compare nucleotides across aligned sequences
5.9 years ago
brimaloney • 0

I have sequences of a gene of interest from multiple species, and I would like to compare them nucleotide by nucleotide for every pair of sequences: 1 vs 2, 2 vs 3, 1 vs 3, etc.

I would also like to exclude any regions where the alignment had to insert gaps or where the alignment is poor, marked by "-". Is there a command in Biopython I could use as a starting point for this?

alignment sequence identity similarity biopython • 1.3k views

Hello,

Can you please give an example of what your input looks like and what your output should look like?

fin swimmer


seq1:   aactgta--tc
seq2:   aaatgtat-cc
output: aaxtgta--xc

Here "x" marks a position where the two sequences differ, and "-" marks a position where at least one sequence of the pair has no information.


Hello,

You can just iterate over each position in both sequences and compare the characters:

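seq1 = "aactgta--tc"
seq2 = "aaatgtat-cc"

output = ''

# Walk through both aligned sequences column by column.
for a, b in zip(seq1, seq2):
    if '-' in (a, b):
        # A gap in either sequence: nothing to compare at this column.
        output += '-'
    elif a == b:
        # Same base in both sequences: keep it.
        output += a
    else:
        # Both sequences have a base, but the bases differ.
        output += 'x'

print(output)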
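
The same loop can be run over every pair when there are more than two aligned sequences. This is only a rough sketch, assuming the aligned sequences are already in a dict keyed by name and all have the same length. The function names `compare_pair` and `compare_all_pairs` and the third sequence `seq3` are made up for illustration and are not from the question.

```python
from itertools import combinations


def compare_pair(seq_a, seq_b):
    """Return the shared base, 'x' for a mismatch, or '-' for a gap, per column."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    out = []
    for a, b in zip(seq_a, seq_b):
        if '-' in (a, b):
            out.append('-')   # gap in at least one sequence
        elif a == b:
            out.append(a)     # identical base
        else:
            out.append('x')   # mismatch
    return ''.join(out)


def compare_all_pairs(aligned):
    """Compare every unordered pair of sequences: 1-2, 1-3, 2-3, ..."""
    return {
        (name_a, name_b): compare_pair(aligned[name_a], aligned[name_b])
        for name_a, name_b in combinations(aligned, 2)
    }


aligned = {
    "seq1": "aactgta--tc",
    "seq2": "aaatgtat-cc",
    "seq3": "aactgtat-tc",  # hypothetical third sequence, for illustration only
}

for (name_a, name_b), result in compare_all_pairs(aligned).items():
    print(name_a, name_b, result)
```

If the alignment is in a file, the dict could be filled from Biopython, for example from the records returned by `Bio.AlignIO.read`, but that part is left out here.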

fin swimmer


Do you mean multiple pairwise sequence alignment?

pairwise alignment for multiple sequences in a file


The sequences are already aligned. I would just like to pull out the positions where the bases differ at a given position, for each pair of sequences.
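
For what is described here, one possible sketch is to collect the alignment columns where both sequences have a base and the bases differ. It assumes "-" is the only gap character, that positions should be reported 1-based, and that upper and lower case count as the same base. The function name `mismatch_positions` is hypothetical.

```python
from itertools import combinations


def mismatch_positions(seq_a, seq_b, gap='-'):
    """Return 1-based alignment columns where both sequences have a base and the bases differ."""
    return [
        pos
        for pos, (a, b) in enumerate(zip(seq_a, seq_b), start=1)
        # Skip columns with a gap in either sequence; compare case-insensitively (assumption).
        if gap not in (a, b) and a.lower() != b.lower()
    ]


aligned = {
    "seq1": "aactgta--tc",
    "seq2": "aaatgtat-cc",
}

# Report the mismatch columns for every pair of sequences.
for name_a, name_b in combinations(aligned, 2):
    print(name_a, name_b, mismatch_positions(aligned[name_a], aligned[name_b]))
```

For the two example sequences this reports columns 3 and 10, the same places where the "x" marks appear in the expected output.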
