Compute an identity matrix from an alignment - command-line approach for batch processing
7.3 years ago
fhsantanna ▴ 610

I have a 16S rDNA alignment in FASTA format and I want to generate an identity matrix from it. BioEdit has a tool for this purpose, but I need a command-line one for batch processing. Do you know of a script capable of doing this?

alignment identity batch • 6.4k views

It might be helpful if you described the problem in more detail. For example: "I have hundreds of alignments and I want to generate an identity matrix for each one of them."

This does not have the necessary information. Do you mean that you have hundreds of files? One file with lots of alignments? What is the format of your alignments? ...etc. So, please clarify.

7.3 years ago
Michael 54k

DNA or AA? Either way, computing a distance matrix from an MSA is a basic task in phylogenetics, handled by long-established tools, for example:

EMBOSS: distmat

MEGA can also calculate distance matrices.


Actually, I need the opposite. EMBOSS and DNAdist calculate distances. I just need the identity values among the sequences. I am working with 16S rRNA sequences, and their identity values are important for bacterial species identification.


So, what do you want to calculate?

  1. For each pair, the number of identical positions
  2. The proportion of identical positions (% identity)
  3. The actual sequences that are identical
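
To make the difference between these options concrete, here is a minimal Python sketch for a single pair of aligned sequences. The function name `compare_pair` is hypothetical, and skipping columns where either sequence has a gap is an assumption; other tools may divide by the full alignment length instead, so their values can differ.

```python
def compare_pair(a, b, gap="-"):
    """Return (identical_count, percent_identity) for two aligned sequences.

    Assumption: columns where either sequence has a gap are not compared.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    identical = 0
    compared = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == gap or y == gap:
            continue  # skip gapped columns
        compared += 1
        if x == y:
            identical += 1
    percent = 100.0 * identical / compared if compared else 0.0
    return identical, percent


# Toy pair: 9 ungapped columns, 8 of them identical.
count, pct = compare_pair("ACGT-ACGTA", "ACGTTACGAA")
print(count, round(pct, 1))  # prints: 8 88.9
```

Option 1 is the first returned value, option 2 is the second, and option 3 would be the pairs for which the second value is 100.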

I would like to compute the % identity.
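
For what it is worth, a % identity matrix can be sketched in plain Python without extra dependencies. This is only an illustration, not a replacement for an established tool: the function names and the `.identity.tsv` output suffix are made up here, and it assumes that gapped columns are ignored and that all sequences in a file have the same aligned length.

```python
import sys
from itertools import combinations


def read_fasta(path):
    """Parse an aligned FASTA file into a list of (name, sequence) tuples."""
    records, name, chunks = [], None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                if name is not None:
                    records.append((name, "".join(chunks)))
                header = line[1:].split()
                name, chunks = (header[0] if header else ""), []
            else:
                chunks.append(line.upper())
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def percent_identity(a, b, gap="-"):
    """Percent identity over columns where neither sequence has a gap (assumption)."""
    if len(a) != len(b):
        raise ValueError("sequences are not aligned to the same length")
    pairs = [(x, y) for x, y in zip(a, b) if x != gap and y != gap]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)


def identity_matrix(records):
    """Build a symmetric matrix of pairwise percent identities."""
    n = len(records)
    matrix = [[100.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        value = percent_identity(records[i][1], records[j][1])
        matrix[i][j] = matrix[j][i] = value
    return matrix


def write_matrix(records, matrix, path):
    """Write the matrix as a tab-separated table with sequence names as labels."""
    names = [name for name, _ in records]
    with open(path, "w") as out:
        out.write("\t".join([""] + names) + "\n")
        for name, row in zip(names, matrix):
            out.write("\t".join([name] + [f"{v:.2f}" for v in row]) + "\n")


if __name__ == "__main__":
    # Batch mode: one output table per alignment file given on the command line.
    for fasta in sys.argv[1:]:
        recs = read_fasta(fasta)
        write_matrix(recs, identity_matrix(recs), fasta + ".identity.tsv")
```

Saved under a name of your choice (say `identity_matrix.py`), it could be run as `python identity_matrix.py *.fasta` to get one table per alignment. If you want identity relative to the full alignment length, change the denominator in `percent_identity`.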

4.6 years ago
gadget • 0

See the PIM (percent identity matrix) output from MUSCLE.
