Forum: R or Python, which one do you prefer for analysing scRNA-seq datasets?
5.9 years ago
wt215 • 0

Hi,

The number of cells in an scRNA-seq experiment can be very large. One recent 10X dataset, for example, contains around 1.3 million cells.

R seems to have trouble even loading the raw gene-cell expression count table. I am not very familiar with bioinformatics in Python. Can Python handle such a large dataset easily?
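For a rough sense of why loading such a table can fail, here is a back-of-the-envelope estimate in Python. Only the 1.3 million cell count comes from the post above; the gene count, the fraction of non-zero entries, and the byte sizes are illustrative assumptions, not measurements of any particular dataset.

```python
# Rough memory estimate for a gene-by-cell count matrix.
N_CELLS = 1_300_000       # from the post above
N_GENES = 20_000          # assumption: illustrative gene count
NONZERO_FRACTION = 0.05   # assumption: illustrative share of non-zero counts
BYTES_PER_VALUE = 4       # assumption: 32-bit integer counts
BYTES_PER_INDEX = 4       # assumption: 32-bit indices


def dense_bytes(n_genes, n_cells, bytes_per_value=BYTES_PER_VALUE):
    """Memory for a dense table: every gene-cell entry is stored."""
    return n_genes * n_cells * bytes_per_value


def sparse_bytes(n_genes, n_cells, nonzero_fraction,
                 bytes_per_value=BYTES_PER_VALUE,
                 bytes_per_index=BYTES_PER_INDEX):
    """Memory for a compressed sparse column layout: one value and one
    row index per non-zero entry, plus one pointer per column (cell)."""
    nnz = round(n_genes * n_cells * nonzero_fraction)
    return nnz * (bytes_per_value + bytes_per_index) + (n_cells + 1) * bytes_per_index


dense = dense_bytes(N_GENES, N_CELLS)
sparse = sparse_bytes(N_GENES, N_CELLS, NONZERO_FRACTION)
print(f"dense:  {dense / 1e9:.1f} GB")   # dense:  104.0 GB
print(f"sparse: {sparse / 1e9:.1f} GB")  # sparse: 10.4 GB
```

Under these assumed numbers, the dense table is far beyond typical laptop memory in either language, while a sparse representation is roughly ten times smaller. The actual ratio depends on how sparse the real data are.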

With datasets this large, many normalization methods that rely on Bayesian methods or optimization algorithms could be time-consuming. Which language do you think would win, R or Python?

Thanks in advance.

R RNA-Seq python • 5.1k views
4

Software is only as good as the underlying algorithm. If the algorithm is flawed, the fact that software using it runs faster in one particular language does not make that language or package a winner.

Good programmers will work around technical difficulties. Parts of a program can be coded in a different language (if that offers technical advantages) and then called from within a program.
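As one illustration of that point, here is a minimal sketch of Python calling a function that is written in C, using the standard `ctypes` module. It assumes a POSIX system (Linux or macOS), where the C standard library is already loaded into the Python process. The wrapper name `c_strlen` is made up for this example.

```python
import ctypes

# Load the symbols already linked into the running Python process, which on
# Linux and macOS include the C standard library.
# Assumption: POSIX system; on Windows a specific DLL would be loaded instead.
libc = ctypes.CDLL(None)

# Declare the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(text: str) -> int:
    """Return the byte length of `text` as computed by C's strlen."""
    return libc.strlen(text.encode("utf-8"))


print(c_strlen("scRNA-seq"))  # 9
```

The same idea applies in R, where compute-heavy parts of a package are often written in C or C++ and called from R code.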

0

Yes, I agree. I am a bit worried that hardware development cannot keep up with the development of scRNA-seq techniques.

The data keep getting bigger: the raw FASTQ sequencing files grow, and with them the number of cells stored in the count table.

I really hope that one day my laptop can easily handle both preprocessing of FASTQ files and downstream analysis.

3

"my laptop"

Who told you that was an acceptable platform?

0

Large datasets are always going to require access to appropriately sized hardware. Ideally you would have access through your company, institute, or university. If that is not an option, cloud-based providers already offer solutions that will fit, though they will be pricey to pay for out of pocket.

Your laptop (if it retains that form factor in future) may handle much larger data but we are sadly a ways away from that day.

2

Which software do you think that could win, R or python?

Neither of those is software; they are programming languages. Both can be completely shit when you don't use them right, and both can solve your issue with loading raw gene-cell expression data if you use them correctly.
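One example of "using them correctly" is to avoid building a dense table at all. The sketch below streams (gene, cell, count) triplets in Matrix Market coordinate format, a sparse format in which 10X count matrices are commonly distributed, and keeps only a running total per cell. The toy matrix and the function name `counts_per_cell` are made up for illustration. This is a sketch of the streaming idea, not a replacement for an established sparse-matrix reader.

```python
import io
from collections import defaultdict

# Tiny made-up matrix in Matrix Market coordinate format
# (rows = genes, columns = cells, 1-based indices). Not real data.
EXAMPLE_MTX = """\
%%MatrixMarket matrix coordinate integer general
% hypothetical toy matrix: 3 genes x 4 cells, 5 non-zero counts
3 4 5
1 1 2
3 1 1
2 2 7
1 4 3
3 4 4
"""


def counts_per_cell(handle):
    """Stream (gene, cell, count) triplets and sum the counts for each cell.

    Only one line is held in memory at a time, plus one total per cell."""
    totals = defaultdict(int)
    shape = None
    for line in handle:
        line = line.strip()
        if not line or line.startswith("%"):
            continue  # skip the header and comment lines
        fields = line.split()
        if shape is None:
            # First non-comment line: n_genes, n_cells, n_nonzero
            shape = tuple(int(x) for x in fields)
            continue
        _gene, cell, count = (int(x) for x in fields)
        totals[cell] += count
    return shape, dict(totals)


shape, totals = counts_per_cell(io.StringIO(EXAMPLE_MTX))
print(shape)   # (3, 4, 5)
print(totals)  # {1: 3, 2: 7, 4: 7}
```

For a real file, the same function can be given an open text handle instead of the `io.StringIO` object. In practice, sparse-matrix readers such as `scipy.io.mmread` in Python or `Matrix::readMM` in R are the more usual route.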

A lot of scRNA-seq packages are written in R.

0

Sorry, my mistake. It should be "language" rather than "software". Thanks for pointing it out.

0

Your bottleneck is likely not going to be the choice of language. It's going to be the availability of existing packages that do what you want to do. Python will likely be faster for loading large datasets, but if there aren't already packages for scRNA-seq analysis, are you going to spend the time to write your own? I guess it will come down to which is the better use of your time: writing something new in the faster language, or cobbling together existing tools in either language to accomplish your goal.

0

Languages are tools and if Python and R are my only choices, I pick Rython.
