Differences between parSeqSim and twoSeqSim results
5.7 years ago

Hi!

I am trying to compare 100 peptide sequences to each other using the default settings of twoSeqSim and parSeqSim from the protr package (local alignment and the BLOSUM62 substitution matrix). However, the two functions give different results. My CompareAll function, which calls twoSeqSim repeatedly to compare all peptides in a vector, produces integer scores in the similarity matrix. When I run parSeqSim on the same peptide set, the values all lie between 0 and 1, so it seems to normalize the scores somehow. How does this normalization work? Thanks!

# twoSeqSim
CompareAll <- function(pep) { # pairwise comparison of every peptide in the vector
  simmtx <- matrix(nrow = length(pep),
                   ncol = length(pep),
                   dimnames = list(pep, pep))
  for (i in seq_along(pep)) {
    for (j in i:length(pep)) {
      # only the upper triangle (j >= i) is filled; the rest stays NA
      simmtx[i, j] <- twoSeqSim(pep[i], pep[j])@score
    }
  }
  return(simmtx)
}

# parSeqSim
parSeqSim(peptides_tmp)
5.7 years ago
h.mon 35k

The normalization performed by parSeqSim() is:

if ( is.numeric(s12) == FALSE |
     is.numeric(s11) == FALSE |
     is.numeric(s22) == FALSE ) {
  sim = 0L
} else if ( abs(s11) < .Machine$double.eps |
            abs(s22) < .Machine$double.eps ) {
  sim = 0L
} else {
  sim = s12/sqrt(s11 * s22)
}

Where s11 is the score of sequence1 aligned to itself, s22 is the score of sequence2 aligned to itself, and s12 is the score of sequence1 aligned to sequence2.

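This means that if any of the three scores is non-numeric, or if the absolute value of either s11 or s22 is below .Machine$double.eps (about 2.2e-16, i.e. effectively zero), the sequence similarity is set to zero; otherwise, it is given by s12 / sqrt(s11 * s22). Because each raw score is divided by the geometric mean of the two self-alignment scores, a sequence compared with itself always gets a similarity of 1, and the integer scores returned by twoSeqSim are the unnormalized s12 values.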
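To make the relationship between the two outputs concrete, here is a minimal sketch in base R that applies the same rule to raw scores. The function names normalize_score and normalize_matrix are hypothetical helpers, not part of protr, and the scores are made-up numbers rather than real alignment results. It assumes the raw matrix has the self-alignment scores on its diagonal, as a matrix built by CompareAll would.

```r
# Hypothetical helper mirroring the quoted rule for a single pair of sequences.
normalize_score <- function(s12, s11, s22) {
  if (!is.numeric(s12) || !is.numeric(s11) || !is.numeric(s22)) {
    return(0)
  }
  if (abs(s11) < .Machine$double.eps || abs(s22) < .Machine$double.eps) {
    return(0)
  }
  s12 / sqrt(s11 * s22)
}

# Hypothetical helper: rescale a whole raw score matrix at once.
# Assumes diag(raw) holds each sequence's score against itself.
normalize_matrix <- function(raw) {
  self <- diag(raw)
  norm <- raw / sqrt(outer(self, self))
  # Same guard as above: a (near-)zero self score gives similarity 0.
  tiny <- abs(self) < .Machine$double.eps
  norm[tiny, ] <- 0
  norm[, tiny] <- 0
  norm
}

# Made-up raw scores for two sequences; the lower triangle is left as NA,
# as in a matrix where only j >= i was filled.
raw <- matrix(c(25, 10,
                NA, 16), nrow = 2, byrow = TRUE)

pair_sim <- normalize_score(10, 25, 16)  # 10 / sqrt(25 * 16) = 0.5
norm <- normalize_matrix(raw)
print(pair_sim)
print(norm)
```

If the raw matrix comes from CompareAll, applying normalize_matrix to it should give values on the same 0 to 1 scale as parSeqSim, although this has not been checked against protr here.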
