Sorting RNA-seq data
5.5 years ago
agrisimo2 • 0

Hello,

I have some RNA-seq data in an Excel spreadsheet. My genes of interest follow a particular expression pattern. I would like to know whether it is possible to sort or filter the data to identify genes that match three conditions I set. For example, I'd like to see all the genes that match this expression pattern:

Cell line A < Cell Line B < Cell line C > Cell line D
RNA-Seq • 1.8k views

Please provide example input and expected output.


For example:

          Cell A        Cell B        Cell C        Cell D
Gene A    0.115175459   3.484635909   6.571842857   4.349035833
Gene B    0.021664012   2.939972182   3.448264286   3.8535915
Gene C    0.014484529   3.347903818   5.250840143   4.148886458
Gene D    0.0749899     33.82436091   52.07118571   30.74083333

The command should exclude Gene B from the data set, since it does not follow the pattern A < B < C > D, and give me a list of the genes that do: Gene A, Gene C and Gene D.


I have some RNA-seq data on an Excel spreadsheet.


5.5 years ago
EagleEye 7.5k

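If you are using Microsoft Excel, and assuming the gene names are in column A with a header in row 1 (so the four cell lines are in columns B to E), put this in F2 and fill it down:

=IF(AND(B2<C2,C2<D2,D2>E2),"YES","NO")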
5.5 years ago

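With awk, assuming the sheet is saved as tab-delimited text (here myFile.txt) with a header row and the gene name in the first column, that would be:

awk -F'\t' 'NR == 1 || ($2 < $3 && $3 < $4 && $4 > $5)' myFile.txt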
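
To check the filter against the example table posted above, here is a minimal sketch. It assumes the spreadsheet has been exported as tab-separated text with one header row and the gene name in the first column; the file name example.tsv and the "Gene" header label are made up for illustration.

```shell
# Recreate the example table from the thread as tab-separated text
# (example.tsv is a hypothetical file name).
printf 'Gene\tCell A\tCell B\tCell C\tCell D\n' > example.tsv
printf 'Gene A\t0.115175459\t3.484635909\t6.571842857\t4.349035833\n' >> example.tsv
printf 'Gene B\t0.021664012\t2.939972182\t3.448264286\t3.8535915\n' >> example.tsv
printf 'Gene C\t0.014484529\t3.347903818\t5.250840143\t4.148886458\n' >> example.tsv
printf 'Gene D\t0.0749899\t33.82436091\t52.07118571\t30.74083333\n' >> example.tsv

# -F'\t' splits on tabs only, so "Gene A" stays a single field ($1)
# and the four cell lines are $2..$5.
# NR == 1 passes the header through; the rest is the A < B < C > D test.
result=$(awk -F'\t' 'NR == 1 || ($2 < $3 && $3 < $4 && $4 > $5)' example.tsv)
printf '%s\n' "$result"
```

This keeps the header plus Gene A, Gene C and Gene D, and drops Gene B because its Cell C value is not greater than its Cell D value. Without -F'\t', awk would split "Gene A" into two fields and every comparison would be shifted by one column.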
5.5 years ago
zx8754 11k

Using R:

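# read the file, something like (assuming tab-delimited text with a header row
# and the gene names in the first column):
df1 <- read.table("myFile.txt", header = TRUE, sep = "\t", row.names = 1)

# then filter; read.table turns a header such as "Cell A" into the column name Cell.A
df1[ df1$Cell.A < df1$Cell.B & df1$Cell.B < df1$Cell.C & df1$Cell.C > df1$Cell.D, ]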