scRNA-seq and ChIP-seq data integration
5.9 years ago

Dear All,

I'm currently trying to integrate single-cell data from the Drosophila embryo with ChIP-seq experiments from the same embryo at the same stage. Which statistical framework would you suggest for establishing a clear relationship between ChIP-seq enrichment (of common epigenetic marks, e.g. H3K27me2) and how it could affect the determination of cell clusters? I'm looking for something beyond a simple correlation test.

Thanks

scRNA-seq Drop-seq ChIP-Seq RNA-Seq • 1.5k views
5.9 years ago

Hola Carlos,

Instead of doing simple correlation, you could model the relationship between each epigenetic signal and the expression of genes surrounding the signal. What do I mean by 'model'? I mean build a linear regression model, as follows:

lm(NearbyGene1 ~ mark1H3k27me2)
lm(NearbyGene2 ~ mark1H3k27me2)
lm(NearbyGene3 ~ mark1H3k27me2)
lm(NearbyGene4 ~ mark1H3k27me2)
...
lm(NearbyGene1 ~ mark2H3k27me2)
lm(NearbyGene2 ~ mark2H3k27me2)
...
lm(NearbyGene1 ~ mark3H3k27me2)
lm(NearbyGene2 ~ mark4H3k27me2)

You will have to set this up as a loop. To use model formulae in a loop, you can create the model equation with paste() and then coerce it into a formula acceptable to the lm() function with as.formula().
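A minimal sketch of such a loop is below. The data frame `dat` is simulated purely for illustration, and its layout (one row per sample, one column per mark or gene) is an assumption rather than something stated in the thread; the column names just mirror the placeholders above.

```r
# Simulated data, for illustration only: rows are samples,
# columns are mark signals and expression of nearby genes.
set.seed(1)
n <- 30
dat <- data.frame(
  mark1H3k27me2 = rnorm(n),
  mark2H3k27me2 = rnorm(n)
)
dat$NearbyGene1 <- 2 - 0.8 * dat$mark1H3k27me2 + rnorm(n, sd = 0.5)
dat$NearbyGene2 <- 1 + rnorm(n)

marks <- c("mark1H3k27me2", "mark2H3k27me2")
genes <- c("NearbyGene1", "NearbyGene2")

# Fit one model per (gene, mark) pair and keep them in a named list.
fits <- list()
for (mark in marks) {
  for (gene in genes) {
    # Build the formula as text, then coerce it for lm()
    f <- as.formula(paste(gene, "~", mark))
    fits[[paste(gene, mark, sep = "_vs_")]] <- lm(f, data = dat)
  }
}

names(fits)
```

Storing the fits in a named list keeps each model retrievable by its gene and mark, which makes it easier to summarise them all afterwards.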

To extract information from a model, use the summary() function. Individual values (coefficient estimates, standard errors, p-values, R-squared) can then be pulled out by subsetting the object that summary() returns.
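As a rough illustration of that subsetting, here is a sketch on simulated data; the data frame and the effect size are made up, and only the extraction pattern matters.

```r
# Simulated data, for illustration only.
set.seed(2)
dat <- data.frame(mark1H3k27me2 = rnorm(40))
dat$NearbyGene1 <- 1 - 0.7 * dat$mark1H3k27me2 + rnorm(40, sd = 0.5)

fit <- lm(NearbyGene1 ~ mark1H3k27me2, data = dat)
s <- summary(fit)

# Coefficient table: a matrix with one row per term and the columns
# "Estimate", "Std. Error", "t value" and "Pr(>|t|)".
coefs <- coef(s)

slope <- coefs["mark1H3k27me2", "Estimate"]
pval  <- coefs["mark1H3k27me2", "Pr(>|t|)"]
r2    <- s$r.squared

c(slope = slope, pval = pval, r2 = r2)
```

When many models are fitted in a loop, the p-values collected this way would normally be adjusted for multiple testing, for example with p.adjust().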

The benefit of using a model is that you can also adjust for other covariates / confounding factors, for example:

lm(NearbyGene1 ~ m2H3k27me2 + TissueType)
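The sketch below compares an unadjusted and an adjusted model on simulated data. The three tissue labels and the effect sizes are hypothetical, chosen only so the example runs.

```r
# Simulated data, for illustration only: TissueType is a made-up
# three-level factor that also shifts expression.
set.seed(3)
n <- 60
dat <- data.frame(
  m2H3k27me2 = rnorm(n),
  TissueType = factor(rep(c("tissueA", "tissueB", "tissueC"), each = n / 3))
)
dat$NearbyGene1 <- 1 - 0.5 * dat$m2H3k27me2 +
  ifelse(dat$TissueType == "tissueB", 1, 0) +
  rnorm(n, sd = 0.5)

unadjusted <- lm(NearbyGene1 ~ m2H3k27me2, data = dat)
adjusted   <- lm(NearbyGene1 ~ m2H3k27me2 + TissueType, data = dat)

# Effect of the mark after accounting for tissue type
coef(summary(adjusted))["m2H3k27me2", ]

# F-test: does adding TissueType improve the fit?
anova(unadjusted, adjusted)
```

Because TissueType is a factor, lm() expands it into indicator terms automatically, with the first level as the reference.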

For more on linear regression models, take a look at this answer (there's tonnes of information across the World Wide Web, too): Resources for gene signature creation

Kevin

Hi Kevin, this is a great answer, thank you very much. How many genes would you consider testing for the "Nearby Gene" comparisons?

You could just begin with, literally, each gene that is up- and down-stream of the H3K27 methylation site. If needed, you could extend it to include genes in a larger locus.

What do you think about using logistic regression?

Sure, but, what are your x and y variables going into the model?
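To make the question concrete: a logistic model needs a binary y. One possible setup, which is purely an assumption and not something specified in this thread, is one row per gene, with y indicating whether the gene is a marker of a given cell cluster and x the H3K27me2 signal near that gene. The column names `IsClusterMarker` and `H3k27me2Signal` are hypothetical, and the data are simulated.

```r
# Simulated data, for illustration only: one row per gene.
set.seed(4)
n <- 100
dat <- data.frame(H3k27me2Signal = rnorm(n))

# Hypothetical binary outcome: is the gene a marker of the cluster?
p <- plogis(-0.5 - 1.2 * dat$H3k27me2Signal)
dat$IsClusterMarker <- rbinom(n, size = 1, prob = p)

# Logistic regression: binary y, continuous x
fit <- glm(IsClusterMarker ~ H3k27me2Signal, data = dat, family = binomial)

coef(summary(fit))

# Odds ratio per unit increase in the mark signal
odds_ratio <- exp(coef(fit)["H3k27me2Signal"])
odds_ratio
```

Covariates can be added to the right-hand side of the glm() formula in the same way as for lm().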
