Identifying mutations in an alignment
6.1 years ago
Tonor ▴ 480

Does anyone know of any tools that will identify the mutations present in an alignment?

I have an alignment of hundreds of gene seqs (all closely related), the first sequence in the alignment is the "reference" sequence, and I want to identify and report mutations in the gene seqs with respect to the reference sequence.

The second part is to identify whether those mutations are synonymous or non-synonymous (the gene seqs are coding and in frame).

Ideally, this would all be command-line based.

alignment sequence • 5.3k views

First, I am not an expert on this, but shouldn't you use something like Bowtie and freebayes?

Here are some other answers:

Finding Snps From A Bam File

What Methods Do You Use For In/Del/Snp Calling?


Hi - this is not NGS data, so there is no SAM/BAM, just a standard FASTA alignment of gene sequences.


I don't think there is a tool for this specific purpose. I'm afraid you will need to write your own script (e.g. in Python, Ruby or Perl) to list the mutations.

6.1 years ago

I think you need to write your own script for this task. For example, look at the following Python code (it uses Biopython):

File: test.py

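import sys
from Bio import SeqIO

# Path to the FASTA alignment, given as the first command-line argument
f = sys.argv[1]

seq_records = SeqIO.parse(f, 'fasta')
# The first record in the alignment is the reference sequence
refseq_record = next(seq_records)

# Compare every other sequence to the reference, column by column
for seq_record in seq_records:
    for i in range(0, len(refseq_record)):
        nt1 = refseq_record[i]
        nt2 = seq_record[i]
        if nt1 != nt2:
            # Report the 1-based alignment position, then the query and reference characters
            print(seq_record.id, i+1, nt2, nt1)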

Alignment in FASTA file (input.fasta):

>seq1
ATGCTGATG
>seq2
ATGCTGACT
>seq3
ACGCT-ATG

Running the script:

python test.py input.fasta

Output:

seq2 8 C T
seq2 9 T G
seq3 2 C T
seq3 6 - G

Columns: (i) sequence identifier, (ii) position in the alignment, (iii) character in this sequence, (iv) character in the reference sequence.
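For the second part of the question (synonymous vs. non-synonymous), one possible approach is to compare the two sequences codon by codon instead of base by base. The following is only a rough sketch, not a tested tool: it assumes the standard genetic code, that the reference has no gaps, and that the alignment starts in frame. The names `CODON_TABLE` and `classify_codon_changes` are made up for this illustration, and codons containing gaps or ambiguous bases are simply reported as "unclassified".

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered over T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def classify_codon_changes(ref, seq):
    """Compare two aligned, in-frame coding sequences codon by codon.

    Assumes `ref` contains no gaps. Returns one tuple per differing codon:
    (codon number, ref codon, seq codon, ref aa, seq aa, classification).
    """
    results = []
    usable = len(ref) - len(ref) % 3  # ignore a trailing partial codon
    for start in range(0, usable, 3):
        ref_codon = ref[start:start + 3].upper()
        seq_codon = seq[start:start + 3].upper()
        if ref_codon == seq_codon:
            continue
        ref_aa = CODON_TABLE.get(ref_codon)
        seq_aa = CODON_TABLE.get(seq_codon)
        if ref_aa is None or seq_aa is None:
            kind = "unclassified"  # gap or ambiguous base in the codon
        elif ref_aa == seq_aa:
            kind = "synonymous"
        else:
            kind = "non-synonymous"
        results.append((start // 3 + 1, ref_codon, seq_codon, ref_aa, seq_aa, kind))
    return results


if __name__ == "__main__":
    # Reference and seq2 from the example alignment above
    for row in classify_codon_changes("ATGCTGATG", "ATGCTGACT"):
        print(*row)
```

Note that this classifies whole codons, so two substitutions in the same codon are reported as a single change.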


Thanks - I was going to write something - but thought there would be something already out there


What will you do in the case of insertions?
In that case the reference sequence itself contains gaps, so alignment positions no longer match reference positions, which is quite problematic to deal with.
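One way to handle gaps in the reference is to keep a separate counter that only advances on non-gap reference characters, and to report positions in reference coordinates rather than alignment columns. This is a minimal sketch under that assumption; the function name `list_differences` is hypothetical, and an insertion is reported at the position of the last reference base before it (0 if it precedes the first base).

```python
def list_differences(aligned_ref, aligned_seq, gap="-"):
    """List differences between two aligned sequences in reference coordinates.

    Returns tuples of (reference position, ref char, seq char, kind), where
    kind is "substitution", "deletion" or "insertion".
    """
    diffs = []
    ref_pos = 0  # 1-based position in the ungapped reference
    for ref_char, seq_char in zip(aligned_ref, aligned_seq):
        if ref_char != gap:
            ref_pos += 1
        if ref_char == seq_char:
            continue  # identical column (including gap aligned to gap)
        if ref_char == gap:
            kind = "insertion"  # extra base relative to the reference
        elif seq_char == gap:
            kind = "deletion"  # reference base missing from this sequence
        else:
            kind = "substitution"
        diffs.append((ref_pos, ref_char, seq_char, kind))
    return diffs


if __name__ == "__main__":
    # Made-up example: two inserted bases after reference position 3
    for row in list_differences("ATG--CTG", "ATGAACTA"):
        print(*row)
```

Insertions or deletions whose length is not a multiple of three shift the reading frame, so a codon-based synonymous/non-synonymous comparison would need to treat those sequences separately.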
