Dichotomization of an RNA-seq dataset


4.4 years ago

JulianC ▴ 30

Hi!

I am working with an RNA-seq dataset containing FPKM values for many genes. I would like to dichotomize the values based on thresholds. For example, every value greater than 0.5 becomes 1 (expressed) and every value less than 0.5 becomes 0 (not expressed). I have determined a threshold for each gene and stored the thresholds in a Python list. My dataset looks like this:

GeneA  GeneB  GeneC GeneN
 x1A    x1B    x1C   x1N
 x2A    x2B    x2C   x2N
 xnA    xnB    xnC   xnN

I would like to perform this operation:

df.loc[df["GeneA"] < threshold, "GeneA"] = 0
df.loc[df["GeneA"] > 0, "GeneA"] = 1

Each gene has its own threshold, so I need to apply this operation to every column of the dataset, using a threshold that differs from column to column. Assume I have a list named "threshold" that holds all the values.

Could you suggest an efficient way to do this? Thanks!
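
One possible way to do the per-column comparison is sketched below with pandas. This is only an illustration under assumptions: the example values, the `thresholds` Series, and the `binary` result name are all hypothetical, and the "threshold" list is assumed to hold one value per gene in the same order as the columns.

```python
import pandas as pd

# Hypothetical example data: rows are samples, columns are genes (FPKM values).
df = pd.DataFrame({
    "GeneA": [0.2, 0.7, 1.5],
    "GeneB": [3.0, 0.1, 2.0],
    "GeneC": [0.0, 0.9, 0.4],
})

# Hypothetical per-gene thresholds, assumed to be in the same order as df.columns.
threshold = [0.5, 1.0, 0.8]

# Label the thresholds by gene name so pandas can align them with the columns.
thresholds = pd.Series(threshold, index=df.columns)

# Compare every column with its own threshold, then cast True/False to 1/0.
# axis="columns" matches the Series index against the column names.
binary = df.gt(thresholds, axis="columns").astype(int)

print(binary)
```

Because the comparison is strict, a value exactly equal to its threshold becomes 0 here. Using `df.ge(...)` instead would count it as expressed.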

python RNA-Seq • 604 views
