Dichotomization of an RNA-seq dataset
4.4 years ago
JulianC ▴ 30

Hi!

I am working with an RNA-seq dataset that contains FPKM values for many genes. I would like to dichotomize the values based on thresholds. For example, each value greater than 0.5 becomes 1 (expressed) and each value less than 0.5 becomes 0 (not expressed). I determined a threshold for each gene and stored these thresholds in a Python list. My dataset looks like this:

GeneA  GeneB  GeneC GeneN
 x1A    x1B    x1C   x1N
 x2A    x2B    x2C   x2N
 xnA    xnB    xnC   xnN

I would like to perform this operation:

df.loc[df["GeneA"] < threshold, "GeneA"] = 0
df.loc[df["GeneA"] > 0, "GeneA"] = 1

Each gene has its own threshold, so what I am trying to do is apply an operation to each column of the dataset, with each operation based on a value (threshold) that differs from column to column. Let's assume that I have a list named "threshold" that holds all the values.

Could you suggest an efficient way to do this? Thanks!
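
One possible approach, sketched here under assumptions, is to let pandas compare every column against its own threshold in a single vectorized step. The sample values, gene names, and the name "thresholds" below are made up for illustration. The sketch also assumes that the "threshold" list is in the same order as the columns of the data frame, and that a value exactly equal to its threshold counts as not expressed (swap gt for ge to change that).

```python
import pandas as pd

# Hypothetical FPKM table: rows are samples, columns are genes.
df = pd.DataFrame({
    "GeneA": [0.2, 0.7, 1.5],
    "GeneB": [3.0, 0.1, 0.9],
    "GeneC": [0.0, 0.4, 0.6],
})

# One threshold per gene, assumed to follow the order of df.columns.
threshold = [0.5, 1.0, 0.5]

# Label each threshold with its gene name so pandas can align by column.
thresholds = pd.Series(threshold, index=df.columns)

# Compare each column with its own threshold, then turn True/False into 1/0.
binary = df.gt(thresholds, axis="columns").astype(int)

print(binary)
```

Because the comparison aligns on column names, this avoids an explicit loop over genes and leaves the original df unchanged.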

python RNA-Seq • 604 views