Biostars beta testing.
Question: Sorting RNA-seq data

Hello,

I have some RNA-seq data in an Excel spreadsheet. My genes of interest follow a particular expression pattern. I would like to know whether it is possible to sort the data, or otherwise identify the genes, according to three conditions that I set. For example, I'd like to see all the genes that match this expression pattern:

Cell line A < Cell Line B < Cell line C > Cell line D
Posted 16 months ago by agrisimo2 • 0 • updated 16 months ago by zx8754 7.5k

Please provide example input and expected output.

Posted 16 months ago by zx8754 7.5k

For example:

          Cell A        Cell B        Cell C        Cell D
Gene A    0.115175459   3.484635909   6.571842857   4.349035833
Gene B    0.021664012   2.939972182   3.448264286   3.8535915
Gene C    0.014484529   3.347903818   5.250840143   4.148886458
Gene D    0.0749899     33.82436091   52.07118571   30.74083333

The command should exclude Gene B from the data set, since it does not follow the pattern A < B < C > D (its Cell C value, 3.448, is lower than its Cell D value, 3.854), and give me a list of the genes that do: Gene A, Gene C and Gene D.

Posted 16 months ago by agrisimo2 • 0

"I have some RNA-seq data on an Excel spreadsheet."

[image]

Posted 16 months ago by Pierre Lindenbaum 120k

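If you are using Microsoft Excel (this assumes the four values sit in columns A to D of row 1; adjust the cell references to your sheet, then fill the formula down):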

=IF(AND(A1<B1,B1<C1,C1>D1),"YES","NO")
Posted 16 months ago by EagleEye 6.4k • updated 16 months ago by zx8754 7.5k

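With awk that would be (assuming a tab-delimited export of the sheet, here given the placeholder name input.tsv, with gene names in column 1; the -F'\t' keeps names such as "Gene A" in a single field):

awk -F'\t' '($2 < $3 && $3 < $4 && $4 > $5)' input.tsv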
Posted 16 months ago by Pierre Lindenbaum 120k • updated 16 months ago by zx8754 7.5k

Using R:

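# read the file, something like this (assumes a tab-delimited file with a
# header row whose value columns are named CellA, CellB, CellC and CellD):
df1 <- read.table("myFile.txt", header = TRUE, sep = "\t")

# then keep only the rows where A < B < C > D
df1[ df1$CellA < df1$CellB & df1$CellB < df1$CellC & df1$CellC > df1$CellD, ]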
Posted 16 months ago by zx8754 7.5k
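
For anyone who prefers Python, the same A < B < C > D filter can be sketched with the standard library alone. This is only a minimal sketch under some assumptions: the sheet has been exported as tab-delimited text, the first column holds the gene name, the next four columns hold Cell A to Cell D in that order, and there is a single header row. The "Gene" header label and the names EXAMPLE_TSV, matches_pattern and filter_genes are made up for illustration; the numbers are the example values from the question.

```python
import csv
import io

# Example values from the question, written as tab-delimited text
# (assumed layout: gene name, then Cell A, Cell B, Cell C, Cell D).
EXAMPLE_TSV = (
    "Gene\tCell A\tCell B\tCell C\tCell D\n"
    "Gene A\t0.115175459\t3.484635909\t6.571842857\t4.349035833\n"
    "Gene B\t0.021664012\t2.939972182\t3.448264286\t3.8535915\n"
    "Gene C\t0.014484529\t3.347903818\t5.250840143\t4.148886458\n"
    "Gene D\t0.0749899\t33.82436091\t52.07118571\t30.74083333\n"
)


def matches_pattern(a, b, c, d):
    """Return True when A < B < C and C > D."""
    # Python chains comparisons, so this is (a < b) and (b < c) and (c > d).
    return a < b < c > d


def filter_genes(handle):
    """Return the names of genes whose four values follow A < B < C > D."""
    reader = csv.reader(handle, delimiter="\t")
    next(reader)  # skip the header row
    kept = []
    for row in reader:
        if not row:
            continue  # ignore blank lines
        gene = row[0]
        a, b, c, d = (float(x) for x in row[1:5])
        if matches_pattern(a, b, c, d):
            kept.append(gene)
    return kept


matching = filter_genes(io.StringIO(EXAMPLE_TSV))
print(matching)  # ['Gene A', 'Gene C', 'Gene D']
```

To run it on a real export, pass an open file instead of the in-memory example, e.g. `with open("myFile.txt", newline="") as fh: filter_genes(fh)`, where the file name is whatever your export is called.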
