Question

I need some tips and suggestions for further analysis of NGS expression data (log2cpm)

5.1 years ago

augihol ▴ 20

I am a PhD student who inherited log2cpm expression data from bulk kidney tissue from a UUO (unilateral ureteral obstruction) experiment that tests a new drug. The sample material consists of:

6 x Ligated Kidney (Untreated)
6 x Ligated Kidney (Treated)
3 x Unligated Kidney (Untreated)
3 x Unligated Kidney (Treated)

The previous study showed that this drug was effective against fibrosis, so the aim of my study is to investigate how this drug affects inflammation and mitochondrial function in the diseased kidney.

So I have filtered my original dataset against the MitoCarta v2 database, to investigate how the drug affected mitochondrial function. Afterwards I ran a PCA on the log2cpm data, which revealed the following:

The following figure shows PC1 (84.51%) and PC2 (4.69%). Red (ligated, untreated), purple (ligated, treated), green (healthy kidney, untreated), orange (healthy kidney, treated).

I interpret this as showing only a very small difference between treated and untreated ligated kidneys, but a difference nonetheless.

I have just started doing bioinformatics and I am very unsure which approach to take next. I have thought about doing a highly variable genes (HVG) analysis (see link), isolating the 500 most variable genes and plotting a heatmap with hierarchical clustering. I was also thinking of looking at the loadings of PC2 and taking that further. The truth is that I am very unsure what the right approach is, and my research group no longer has a bioinformatician I can ask.
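To make the HVG idea concrete, here is a minimal sketch of the gene-selection step, assuming the log2cpm matrix is available as a mapping from gene name to per-sample values. It is written in plain Python purely for illustration; the function name `top_variable_genes` and the toy data are hypothetical, and ranking by raw variance is only one of several possible definitions of "highly variable".

```python
from statistics import variance


def top_variable_genes(log2cpm, n_top=500):
    """Return the names of the n_top genes with the highest variance.

    log2cpm: dict mapping gene name -> list of log2cpm values (one per sample).
    This is an illustrative helper, not part of any existing package.
    """
    # Sample variance of each gene across all samples
    gene_var = {gene: variance(values) for gene, values in log2cpm.items()}
    # Rank by decreasing variance; ties are broken by gene name for reproducibility
    ranked = sorted(gene_var, key=lambda g: (-gene_var[g], g))
    return ranked[:n_top]


# Toy example: 4 hypothetical genes measured in 4 samples
toy = {
    "GeneA": [5.0, 5.1, 4.9, 5.0],
    "GeneB": [2.0, 6.0, 2.5, 6.5],
    "GeneC": [7.0, 7.5, 6.5, 7.0],
    "GeneD": [1.0, 1.0, 9.0, 9.0],
}
top2 = top_variable_genes(toy, n_top=2)
print(top2)
```

The selected genes would then be the rows passed on to the heatmap and hierarchical clustering step. Asking for more genes than exist simply returns all of them, ranked.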

I have the following questions:

What kind of analysis would you suggest I use to investigate further (hierarchical clustering, HVG, etc.)?
Is there any alternative preprocessing I could use before I run the PCA, to make the analysis more accurate?
Are there any packages that you can suggest?
Any other tips?

RNA-Seq R genome gene next-gen • 1.3k views

Hello augihol!

It appears that your post has been cross-posted to another site: https://bioinformatics.stackexchange.com/questions/7192/i-need-some-tips-and-suggestions-for-further-analysis-of-ngs-expression-data-lo

This is typically not recommended as it runs the risk of annoying people in both communities.

ADD REPLY • link 5.1 years ago by WouterDeCoster 47k

That's true, someone else suggested it to me. I just want to maximize my chance of getting an answer. I can of course delete the other post if people find it annoying :) My impression was that this community is bigger, so the chances of getting an answer at all are better :)

ADD REPLY • link 5.1 years ago by augihol ▴ 20

Annoying is perhaps too strong a word here. We don't really like crossposting, as it leads to volunteers in both communities spending time on your question, where potentially one would suffice. If it is clear that it's crossposted (by providing links) then that's already less bad ;-)

ADD REPLY • link 5.1 years ago by WouterDeCoster 47k

Hey, I provided information about the crossposting at the other link, so I don't upset anyone :)

ADD REPLY • link 5.1 years ago by augihol ▴ 20