I need some tips and suggestions for further analysis of NGS expression data (log2cpm)
5.1 years ago
augihol ▴ 20

I am a PhD student who inherited log2cpm expression data from bulk kidney tissue from a UUO (unilateral ureteral obstruction) experiment that tests a new drug. The sample material consists of:

  • 6 x Ligated Kidney (Untreated)
  • 6 x Ligated Kidney (Treated)
  • 3 x Unligated Kidney (Untreated)
  • 3 x Unligated Kidney (Treated)

The previous study showed that this drug was effective against fibrosis, so the aim of my study is to investigate how the drug affects inflammation and mitochondrial function in the diseased kidney.

I filtered my original dataset to the genes listed in the MitoCarta v2 database, to investigate how the drug affected mitochondrial function. Afterwards I ran a PCA on the filtered log2cpm data, with the result described below.

The PCA plot shows PC1 (84.51%) and PC2 (4.69%). Red: ligated, untreated; purple: ligated, treated; green: unligated (healthy) kidney, untreated; orange: unligated (healthy) kidney, treated.
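For anyone who wants to reproduce the step before the PCA, here is a minimal sketch of what filtering to a gene list and centering each gene amounts to. It is only an illustration under assumptions: the function name `filter_and_center`, the dictionary layout, the gene symbols, and all values are made up, and the small set `mito_genes` merely stands in for the real MitoCarta list.

```python
from statistics import mean


def filter_and_center(expr, gene_set):
    """Keep only genes in gene_set and subtract each gene's mean across samples.

    expr maps gene name -> list of log2cpm values (one per sample).
    """
    centered = {}
    for gene, values in expr.items():
        if gene not in gene_set:
            continue  # drop genes that are not in the list of interest
        mu = mean(values)
        centered[gene] = [v - mu for v in values]
    return centered


# Toy log2cpm values (invented for illustration), four samples per gene.
log2cpm = {
    "Ndufa1": [5.0, 5.5, 6.0, 6.5],
    "Cox4i1": [8.0, 8.0, 9.0, 9.0],
    "Actb": [12.0, 12.0, 12.5, 12.5],
}
mito_genes = {"Ndufa1", "Cox4i1"}  # hypothetical stand-in for the MitoCarta list

mito_centered = filter_and_center(log2cpm, mito_genes)
print(mito_centered)
```

Centering each gene before PCA means the components describe differences between samples rather than differences in average expression level between genes.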

I interpret this as showing only a very small difference between treated and untreated ligated kidneys, but a difference nonetheless.

I have just started doing bioinformatics and I am very unsure which approach to take next. I have thought about doing a highly variable genes (HVG) analysis (see link): isolate the 500 most variable genes and plot a heatmap with hierarchical clustering. I was also thinking of looking at the loadings of PC2 and taking that further. The truth is that I am very unsure what the right approach is, and my research group no longer has a bioinformatician I can ask.
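To make the HVG idea concrete, here is a minimal sketch of ranking genes by their variance across samples and keeping the top n. It assumes "most variable" simply means largest sample variance of the log2cpm values, which is only one possible definition; the function name `top_variable_genes`, the gene names, and the values are hypothetical.

```python
from statistics import variance


def top_variable_genes(expr, n):
    """Return the n gene names with the largest sample variance, highest first.

    expr maps gene name -> list of log2cpm values (one per sample).
    """
    ranked = sorted(expr, key=lambda gene: variance(expr[gene]), reverse=True)
    return ranked[:n]


# Toy data (invented for illustration), four samples per gene.
log2cpm = {
    "geneA": [1.0, 1.0, 1.0, 1.0],  # constant, no variance
    "geneB": [1.0, 3.0, 1.0, 3.0],  # moderate variance
    "geneC": [0.0, 4.0, 0.0, 4.0],  # largest variance
    "geneD": [2.0, 2.5, 2.0, 2.5],  # small variance
}

hvg = top_variable_genes(log2cpm, 2)
print(hvg)  # ['geneC', 'geneB']
```

With the real data, n would be 500 and the selected genes would then go into the heatmap and clustering step.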

I have the following questions:

  • What kind of analysis would you suggest I use to investigate further (hierarchical clustering, HVG, etc.)?
  • Is there any alternative preprocessing I could use before I run the PCA, to make the analysis more accurate?
  • Are there any packages that you can suggest?
  • Any other tips?

RNA-Seq R genome gene next-gen • 1.3k views

Hello augihol!

It appears that your post has been cross-posted to another site: https://bioinformatics.stackexchange.com/questions/7192/i-need-some-tips-and-suggestions-for-further-analysis-of-ngs-expression-data-lo

This is typically not recommended as it runs the risk of annoying people in both communities.


That's true; someone else suggested it to me. I just want to maximize my chance of getting an answer. I can of course delete the other post if people find it annoying :) My impression was that this community is bigger, so the chances of getting an answer at all are better :)


Annoying is perhaps too strong a word here. We don't really like crossposting, as it leads to volunteers in both communities spending time on your question when one answer might suffice. If it is made clear that the question is crossposted (by providing links), that's already less bad ;-)


Hey, I added a note about the crossposting to the other post, so I don't upset anyone :)
