Biostar Beta. Not for public use.
Cloud Computer Cluster VS Local Compute Cluster for RNAseq analysis
2.9 years ago
Chen Mor • 0

Hey everyone!

What are your thoughts on using a local cluster vs. a cloud-based one for RNA-seq analysis? Any pros and cons you can share from your own experience?

Best, Chen


I can't think of anything significant that would make one better than the other. Accessibility usually gives the edge to cloud-based setups (not everyone has the luxury of a private server or cluster). If you use cloud-based VMs, you also have the bonus of the server being 'all yours' for a while, so you can abuse it somewhat. It really depends on what you need and what you already have.

12 months ago
EMBL Heidelberg, Germany

Cost-wise, the choice depends on how you get charged for using your local cluster and its associated storage. For one-offs and short-term projects a cloud-based solution may be cheaper, but for regular use in the long run a local cluster tends to be cheaper (especially when you take into account mistakes, bugs ...). Cloud-based solutions may also carry data transfer costs, and uploading or downloading large amounts of data can be very slow (and may only be practical with something like Amazon's Snowball or Snowmobile).
The main advantage of cloud-based storage would be for sharing data with people outside your institute.
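
To make the one-off vs. long-run argument concrete, here is a minimal break-even sketch. It assumes a very simple cost model that is not from this thread: the local cluster has a fixed charge plus a per-core-hour rate, the cloud has only a (higher) per-core-hour rate, and data transfer fees are ignored. All function names and rates are hypothetical placeholders; plug in what your institute and provider actually charge.

```python
def cloud_cost(core_hours, cloud_rate):
    """Pay-as-you-go cost: no fixed charge, only usage (transfer fees ignored)."""
    return core_hours * cloud_rate


def local_cost(core_hours, fixed_cost, local_rate):
    """Local cost: a fixed charge (e.g. hardware share, storage) plus usage."""
    return fixed_cost + core_hours * local_rate


def break_even_core_hours(fixed_cost, local_rate, cloud_rate):
    """Core-hours above which the local cluster becomes cheaper.

    Returns None when the cloud rate is not higher than the local rate,
    because under this model the cloud is then never more expensive.
    """
    if cloud_rate <= local_rate:
        return None
    return fixed_cost / (cloud_rate - local_rate)


# Hypothetical rates, for illustration only.
threshold = break_even_core_hours(fixed_cost=1000.0, local_rate=0.25, cloud_rate=0.5)
print(f"Local becomes cheaper above {threshold:.0f} core-hours")
```

Below the threshold (short projects) the cloud wins under this model, above it (regular use) the local cluster wins. Adding transfer fees to `cloud_cost` would push the threshold lower.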

14 months ago
h.mon 25k
Brazil

What kind of analyses do you need to run? For differential gene or transcript expression, the latest methods (such as Salmon or kallisto) are so fast and light on resources that a regular laptop can run them quickly, making cluster and cloud resources unnecessary. The constraint is the size of the fastq files: do they fit on your disk or not?

See some discussions and examples here, here, here and here.
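
The "do they fit on your disk" question can be checked up front. This is a rough sketch using only the Python standard library; the function name and the `headroom` factor (extra space for an index and output files) are my own assumptions, not a recommendation from Salmon or kallisto.

```python
import os
import shutil


def fastq_fits_on_disk(fastq_paths, target_dir=".", headroom=2.0):
    """Compare the total size of the fastq files with free space in target_dir.

    headroom is a hypothetical safety multiplier for indexes and outputs.
    Returns (fits, needed_bytes, free_bytes).
    """
    total = sum(os.path.getsize(path) for path in fastq_paths)
    needed = int(total * headroom)
    free = shutil.disk_usage(target_dir).free
    return needed <= free, needed, free


if __name__ == "__main__":
    import sys

    # Usage: python check_disk.py sample1.fastq.gz sample2.fastq.gz ...
    fits, needed, free = fastq_fits_on_disk(sys.argv[1:])
    print(f"needed: {needed / 1e9:.1f} GB, free: {free / 1e9:.1f} GB, fits: {fits}")
```

If the files are still on a remote server, the same comparison works with the sizes reported there instead of `os.path.getsize`.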

3 months ago
genomax 68k
United States

Take into account the security policies at your institution/company. If they do not allow you to use external/cloud-based resources, then your choice is limited to local resources. If you work with human data (or data subject to privacy restrictions), that adds another layer of complexity and may require specific agreements with the provider (e.g. if you use Amazon's cloud, you may have to ask them to keep your data in a certain geographical jurisdiction).

That said, if you needed to get ~5000 samples analyzed in a week, there is simply no substitute for a cloud-based provider like Google Compute Engine or Amazon Web Services (AWS). The cost is relatively low (considering time and infrastructure) when you can dial up thousands of cores on demand.
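
A back-of-the-envelope way to size such a burst is sketched below. The function and its `efficiency` parameter are hypothetical, and the core-hours per sample is something you would have to measure for your own pipeline on a few test samples; no figure for it is given in this thread.

```python
import math


def cores_needed(n_samples, core_hours_per_sample, deadline_hours, efficiency=1.0):
    """Estimate the concurrent cores required to finish before the deadline.

    efficiency (0 < efficiency <= 1) is a hypothetical fudge factor for
    queueing, failed jobs and I/O waits; 1.0 means perfect utilisation.
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    total_core_hours = n_samples * core_hours_per_sample
    return math.ceil(total_core_hours / (deadline_hours * efficiency))


# Placeholder per-sample cost; replace with a measured value for your pipeline.
print(cores_needed(n_samples=5000, core_hours_per_sample=8, deadline_hours=7 * 24))
```

Comparing the result with what your local scheduler will realistically give you over the same week shows quickly whether a local cluster is even an option.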
