How to estimate the memory requirements for Diamond given the database and query protein sizes?
17 months ago
O.rka ▴ 720

On our new servers we have to request the amount of memory and time needed for each job. We are charged per thread and per unit of memory requested, for the time the job actually takes to complete (not the time requested). I'm trying to minimize costs for a larger job.

I have a database that is 68GB and a query set of 48,170,345 protein sequences (11GB gzipped, ~19GB uncompressed).

I can either do the following:

  1. Run Diamond against all of the proteins at once (I suspect this would be the most expensive)
  2. Split the proteins into 100 files and run each one separately (each file is about 189MB)

Which method would use fewer resources?

How can I estimate how many resources would be required per job?
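
One rough way to frame the estimate: the DIAMOND manual describes `--block-size` (`-b`) as the main control on memory and suggests peak usage of roughly six times that value in GB (about 12 GB at the default of 2.0). If that holds, the memory to request depends mainly on the block size rather than on the size of each query file. The sketch below combines that rule of thumb with an assumed billing formula (threads × GB × hours). The function names, the rate, and all timings are hypothetical placeholders, not measurements; real runtimes would have to come from a pilot run on one of the 189MB chunks.

```python
def estimate_peak_memory_gb(block_size: float = 2.0, factor: float = 6.0) -> float:
    """Rule-of-thumb peak memory in GB: about `factor` times --block-size.

    The factor of 6 is taken from the DIAMOND manual's guidance and may
    differ between versions and sensitivity modes.
    """
    return block_size * factor


def job_cost(threads: int, memory_gb: float, hours: float, rate: float = 1.0) -> float:
    """Assumed billing model (hypothetical): rate * threads * GB * hours."""
    return rate * threads * memory_gb * hours


def compare_strategies(threads: int, memory_gb: float, hours_single: float,
                       hours_per_chunk: float, n_chunks: int,
                       rate: float = 1.0) -> dict:
    """Total cost of one large job versus n_chunks smaller jobs."""
    single = job_cost(threads, memory_gb, hours_single, rate)
    split = n_chunks * job_cost(threads, memory_gb, hours_per_chunk, rate)
    return {"single": single, "split": split}


if __name__ == "__main__":
    mem = estimate_peak_memory_gb(block_size=2.0)
    # Placeholder timings only; replace with values measured in a pilot run.
    print(compare_strategies(threads=8, memory_gb=mem, hours_single=10.0,
                             hours_per_chunk=0.25, n_chunks=100))
```

Under this model the split only pays off if the summed runtime of the chunks is lower than the runtime of the single job, since each chunk still has to scan the whole 68GB database.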

alignment diamond memory requirements