There's technically no single "right" answer to this question, but I wanted to get some opinions. I've started building a codon optimization tool in JavaScript. Writing the script that performs the optimization is the easy part; what I'd like suggestions on is how to run it at scale.
Our use case: we'd probably run a maximum of 10-20 amino acid sequences at any given time, but ideally the script should scale to 1000+ sequences.
What I was thinking: I'm currently testing Web Workers to see if I can process sequences on parallel threads on the user's machine. My only concern is how this will perform with 1000+ sequences (probably not that well).
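Roughly the shape I have in mind, for what it's worth. The worker file name (`optimizer-worker.js`) and the message format are just placeholders, not code I actually have yet:

```javascript
// Split an array of sequences into roughly equal batches, one per worker.
function chunk(sequences, numWorkers) {
  const batches = Array.from({ length: numWorkers }, () => []);
  sequences.forEach((seq, i) => batches[i % numWorkers].push(seq));
  return batches.filter((b) => b.length > 0);
}

// Browser-only wiring (won't run outside a browser): send each batch to a
// worker and gather the results. "optimizer-worker.js" is a hypothetical
// worker script that posts back an array of optimized DNA sequences.
function runInWorkers(sequences, numWorkers = navigator.hardwareConcurrency || 4) {
  const jobs = chunk(sequences, numWorkers).map(
    (batch) =>
      new Promise((resolve, reject) => {
        const worker = new Worker("optimizer-worker.js");
        worker.onmessage = (e) => { resolve(e.data); worker.terminate(); };
        worker.onerror = reject;
        worker.postMessage(batch);
      })
  );
  return Promise.all(jobs).then((results) => results.flat());
}
```

The batching matters because spawning one worker per sequence would be far worse than a small fixed pool at 1000+ sequences.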
Has anyone used AWS tools to achieve something like this? Our sequences are stored in MongoDB, so I've been looking into AWS Lambda to see if I can run the optimization on database insert.
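If that route works, the Lambda itself could be fairly thin. A sketch, assuming something like a MongoDB Atlas Trigger (or EventBridge) passes the inserted document to Lambda; the event shape (`fullDocument` with an `aminoAcids` field) and `optimizeSequence` are assumptions, not a real integration:

```javascript
// Hypothetical stand-in for the real optimizer; it just echoes its input
// so the handler's shape is clear.
function optimizeSequence(aaSequence) {
  return { input: aaSequence, dna: null }; // real optimization logic goes here
}

// Lambda handler sketch. The event shape is modeled loosely on a MongoDB
// Atlas Trigger payload and is an assumption.
async function handler(event) {
  const doc = event.fullDocument || {};
  const result = optimizeSequence(doc.aminoAcids || "");
  // In practice you'd write `result` back to MongoDB here.
  return { statusCode: 200, body: JSON.stringify(result) };
}

// module.exports = { handler };  // Lambda's CommonJS entry point
```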
I'm sorry, what is codon optimization?
https://www.idtdna.com/pages/education/decoded/article/benefits-of-codon-optimization
Thank you. So you're essentially reverse-translating AA sequences and picking the highest frequency codon for each AA?
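If I'm understanding right, the naive version is just a table lookup, something like this (toy table, only a few residues, and the "most frequent" picks are illustrative rather than a vetted codon-usage dataset):

```javascript
// Illustrative "most frequent codon" table. A real tool would load a full,
// organism-specific codon usage table covering all 20 amino acids.
const TOP_CODON = {
  M: "ATG", // Met has a single codon
  K: "AAA",
  L: "CTG",
  W: "TGG", // Trp has a single codon
};

// Reverse-translate an amino acid string by picking the top codon per residue.
function reverseTranslate(aaSequence) {
  return [...aaSequence]
    .map((aa) => {
      const codon = TOP_CODON[aa];
      if (!codon) throw new Error(`No codon entry for residue: ${aa}`);
      return codon;
    })
    .join("");
}
```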
While there's technically no set-in-stone way to optimize expression when going from AA to DNA, there is a set of rules that has been found to help yield high expression. My script makes the translation from AA to DNA while checking against these rules. Some sequences take many tries before finding a DNA sequence that falls within our specs, which can be very computationally heavy. I'd like suggestions on an optimal way to run these computations at scale.
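To make the "many tries" part concrete, this is the shape of the loop I mean. The GC-content window and the weighted-sampling table below are made-up stand-ins for our actual rule checks, not our real spec:

```javascript
// Synonymous codons with illustrative weights (not real usage frequencies).
const CODON_CHOICES = {
  M: [["ATG", 1.0]],
  K: [["AAA", 0.7], ["AAG", 0.3]],
  L: [["CTG", 0.5], ["CTC", 0.2], ["TTA", 0.3]],
};

// Fraction of G/C bases in a DNA string.
function gcContent(dna) {
  const gc = [...dna].filter((b) => b === "G" || b === "C").length;
  return dna.length ? gc / dna.length : 0;
}

// Sample one codon for an amino acid according to its weights.
function sampleCodon(aa) {
  const choices = CODON_CHOICES[aa];
  let r = Math.random();
  for (const [codon, weight] of choices) {
    r -= weight;
    if (r <= 0) return codon;
  }
  return choices[choices.length - 1][0];
}

// Retry until a candidate passes the (stand-in) spec, or give up.
function optimize(aaSequence, { minGC = 0.3, maxGC = 0.7, maxTries = 100 } = {}) {
  for (let i = 0; i < maxTries; i++) {
    const dna = [...aaSequence].map(sampleCodon).join("");
    const gc = gcContent(dna);
    if (gc >= minGC && gc <= maxGC) return dna;
  }
  return null; // caller decides how to handle a sequence that never passed
}
```

The retry count is what blows up the cost: a sequence whose constraints are hard to satisfy can burn hundreds of candidate generations before one passes, which is why I'm asking about running this at scale.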
I don’t think codon optimisation is a sufficiently difficult task that it would require specialist infrastructure. Your laptop alone could probably churn through thousands of sequences without breaking too much of a sweat.