scATAC-seq analysis, data preprocessing
4.9 years ago
chipolino ▴ 150

Hi,

During scATAC-seq data preprocessing, does it make sense to filter the data matrix so that it contains only the most variable peaks (the same way we do for scRNA-seq) before any further dimensionality reduction or clustering analysis?

Thanks

scATAC-seq • 1.8k views

To better define cell types, it makes sense.


That depends on the type of analysis you're referring to. PCA, for example, will always focus on the most variable regions. I haven't looked at scATAC-seq data myself but given that it's basically binary, I'm not sure how well the typical variance measures even hold up.
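
To make the concern concrete: if the matrix is strictly 0/1, the variance of a peak is p(1 - p), where p is the fraction of cells in which the peak is open. Ranking peaks by variance is then the same as ranking them by how close p is to 0.5. A minimal sketch, assuming a small made-up cell-by-peak matrix (the function names are hypothetical, not from any scATAC-seq package):

```python
# Toy binarized cell-by-peak matrix (rows = cells, columns = peaks); values are made up.
matrix = [
    [1, 0, 1, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 0],
    [1, 0, 0, 0],
]

def peak_variances(matrix):
    """Return the population variance of each peak (column) of a 0/1 matrix."""
    n_cells = len(matrix)
    n_peaks = len(matrix[0])
    variances = []
    for j in range(n_peaks):
        # Fraction of cells in which peak j is open.
        p = sum(row[j] for row in matrix) / n_cells
        # Bernoulli variance: for binary data it depends on p only.
        variances.append(p * (1 - p))
    return variances

def top_variable_peaks(matrix, n_top):
    """Indices of the n_top peaks with the largest variance (ties broken by index)."""
    variances = peak_variances(matrix)
    order = sorted(range(len(variances)), key=lambda j: (-variances[j], j))
    return order[:n_top]

variances = peak_variances(matrix)
top = top_variable_peaks(matrix, 2)
```

Here the peak that is open in every cell and the peak that is open in none both get variance 0, and the peak open in half of the cells ranks first.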


Can I run sparse PCA on the scATAC-seq matrix and see which peaks contribute to, say, the first component, and then choose those as the most informative (variable) ones?
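
As a rough illustration of that idea, here is a sketch that approximates the first principal component by power iteration and ranks peaks by the absolute value of their loading. Note that this is ordinary PCA, not sparse PCA (a sparse variant would additionally penalize or threshold the loadings), and the toy matrix and function names are assumptions for illustration only:

```python
import math

def first_pc_loadings(matrix, n_iter=200):
    """Approximate the first principal component of a cell-by-peak matrix by power iteration."""
    n_cells, n_peaks = len(matrix), len(matrix[0])
    # Center each peak (column) so the component reflects variation, not mean accessibility.
    means = [sum(row[j] for row in matrix) / n_cells for j in range(n_peaks)]
    centered = [[row[j] - means[j] for j in range(n_peaks)] for row in matrix]
    # Deterministic, asymmetric start vector (a uniform one can be orthogonal to the answer).
    v = [float(j + 1) for j in range(n_peaks)]
    for _ in range(n_iter):
        # Multiply by X^T X without forming it: scores = X v, then w = X^T scores.
        scores = [sum(x * vj for x, vj in zip(row, v)) for row in centered]
        w = [sum(centered[i][j] * scores[i] for i in range(n_cells)) for j in range(n_peaks)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break  # no variance left in the data
        v = [x / norm for x in w]
    return v

def rank_peaks_by_loading(loadings):
    """Peak indices sorted by decreasing absolute loading (the sign of a PC is arbitrary)."""
    return sorted(range(len(loadings)), key=lambda j: -abs(loadings[j]))

# Made-up data: peaks 0-1 open in one group of cells, peaks 2-3 in the other,
# peak 4 open everywhere, peak 5 open in one cell of each group.
toy = [
    [1, 1, 0, 0, 1, 1],
    [1, 1, 0, 0, 1, 0],
    [1, 1, 0, 0, 1, 0],
    [0, 0, 1, 1, 1, 1],
    [0, 0, 1, 1, 1, 0],
    [0, 0, 1, 1, 1, 0],
]
loadings = first_pc_loadings(toy)
ranking = rank_peaks_by_loading(loadings)
```

In this toy case the four group-specific peaks carry essentially all of the first component, while the peak that is open in every cell gets a loading of zero.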


Well, I'm not sure how "peaks" would be defined in scATAC-seq as there's a maximum of 2 reads per open region per cell. Maybe you want to collapse the information from multiple cells at the same region? What exactly is the question you're trying to address?


Usually, dimensionality reduction is done on the top variable features (typically the top 500). So you can take the top variable peaks, run PCA on them, and see what the t-SNE clusters look like. If you want to overcome the sparsity of the data, you could use a KNN approach to merge data from the n most similar cells. Before doing that, I would check the t-SNE on the top 500 variable peaks.
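
The KNN merging step could look roughly like the sketch below. The choice of Jaccard similarity, the value of k, and the function names are assumptions made for illustration; other similarity measures would work in the same way:

```python
def jaccard(a, b):
    """Jaccard similarity between two 0/1 accessibility profiles."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 0.0

def knn_pool(matrix, k):
    """For each cell, sum its profile with those of its k most similar cells."""
    pooled = []
    for i, cell in enumerate(matrix):
        others = [j for j in range(len(matrix)) if j != i]
        # Sort neighbours by decreasing similarity; ties are broken by cell index.
        others.sort(key=lambda j: (-jaccard(cell, matrix[j]), j))
        group = [i] + others[:k]
        pooled.append([sum(matrix[g][p] for g in group) for p in range(len(cell))])
    return pooled

# Made-up binarized cell-by-peak matrix (rows = cells, columns = peaks).
cells = [
    [1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [1, 1, 1, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0],
    [0, 0, 1, 0, 1],
]
pooled = knn_pool(cells, k=2)
```

The pooled profiles are counts rather than 0/1 values, so a variance-based ranking of peaks has more to work with than on the raw binary matrix.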

I did not know that the data is binary; this paper seems to have a nice method for processing such data.


Thanks! But how do you find the most variable peaks if the data is binary?


Sorry, I was not aware that it's binary. I updated my answer and moved it to a comment, as it no longer qualifies as an answer.


Asked on BioC in the first place and then cross-posted here as suggested there.
