Hi all,
I'm a novice in Hi-C analysis.
I'm looking at the following paper: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5478386/
* A Compendium of Chromatin Contact Maps Reveals Spatially Active Regions in the Human Genome *
Here, we report the most comprehensive survey to date of chromatin organization in human tissues. Through integrative analysis of chromatin contact maps in 21 primary human tissues and cell types, we found topologically associating domains highly conserved in different tissues. We also discover genomic regions that exhibit unusually high levels of local chromatin interactions.(...)
The associated GEO data is https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE87112
I downloaded the archive (32 GB), GSE87112_file.tar.gz. The content of this file is here.
In my dreams, I expected to find a table like
interval1 interval2 score
but instead I found data that is obscure (to me):
$ tar xOvf all_data_contact_maps.tgz contact_maps/HiCNorm/primary_cohort/IMR90.nor.chr1.mat | fold -w 60 | head
contact_maps/HiCNorm/primary_cohort/IMR90.nor.chr1.mat
0.000000 0.000000 0.000000 0.000000
0.000000 0.000000 0.000000 0.00
0000 0.000000 0.000000 0.000000 0.00
0000 0.000000 0.000000 0.000000 0.00
0000 0.000000 0.000000 0.000000 0.00
0000 0.000000 0.000000 0.000000 0.00
0000 0.000000 0.000000 0.000000 0.00
0000 0.000000 0.000000 0.000000 0.00
0000 0.000000 0.000000 0.000000 0.00
0000 0.000000 0.000000 0.000000 0.00
$ tar xOvf primary_cohort_TAD_boundaries.tgz primary_cohort_TAD_boundaries/AD.IS.All_boundaries.bed | more
primary_cohort_TAD_boundaries/AD.IS.All_boundaries.bed
chr10 4880000 4920000
chr10 6000000 6040000
chr10 7760000 7800000
chr10 9360000 9400000
chr10 12000000 12040000
chr10 13320000 13360000
chr10 14520000 14560000
chr10 15400000 15440000
chr10 17680000 17720000
chr10 18520000 18560000
chr10 19520000 19560000
chr10 21240000 21280000
chr10 22200000 22240000
chr10 23440000 23480000
chr10 24160000 24200000
$ tar xOfz all_data_FIRE_calls.tgz all_data_FIRE_calls/PO.FIRE.bed | head
chrchr start end
chr1 5280000 5320000
chr1 8240000 8280000
chr1 8280000 8320000
chr1 8400000 8440000
chr1 8440000 8480000
chr1 8480000 8520000
chr1 8520000 8560000
chr1 8560000 8600000
chr1 8600000 8640000
Is it possible to find the significant interacting intervals (per tissue) in this dataset?
Thanks, Devon. Looking at https://github.com/ay-lab/fithic, it looks like it needs an 'interaction file',
which seems to be missing from the archives (?)
Yeah, it appears that they didn't upload the most useful files. I've talked to folks internally and the consensus is that it's easiest to see if they can simply provide them.
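In the meantime, if a plain "interval1 interval2 score" table is all you need, the dense .mat files can probably be flattened by hand. Below is a minimal sketch, not something taken from the paper or from FitHiC. It assumes that each line of the matrix is one row of whitespace-separated values, that the matrix is square and symmetric, that row i and column j correspond to the i-th and j-th bins counted from position 0, and that bins are 40 kb wide (a guess based on the 40,000 bp intervals in the BED files). The function name dense_to_pairs and its parameters are made up for illustration. Keep in mind that the scores are normalized contact values, so a high score is not the same thing as a statistically significant interaction.

```python
import io


def dense_to_pairs(lines, chrom, bin_size=40_000, min_score=0.0):
    """Yield (interval1, interval2, score) from a dense contact matrix.

    Assumptions (not confirmed by the GEO record): one matrix row per line,
    whitespace-separated values, bin 0 starts at position 0, 40 kb bins.
    Only the upper triangle (j >= i) is read, since the matrix is assumed
    to be symmetric.
    """
    rows = (line for line in lines if line.strip())  # ignore blank lines
    for i, line in enumerate(rows):
        fields = line.split()
        for j in range(i, len(fields)):
            score = float(fields[j])
            if score > min_score:
                yield (
                    f"{chrom}:{i * bin_size}-{(i + 1) * bin_size}",
                    f"{chrom}:{j * bin_size}-{(j + 1) * bin_size}",
                    score,
                )


# Toy 3x3 matrix standing in for a real file such as IMR90.nor.chr1.mat
toy = io.StringIO("0.0 2.5 0.0\n2.5 0.0 1.0\n0.0 1.0 4.0\n")
pairs = list(dense_to_pairs(toy, "chr1"))
for interval1, interval2, score in pairs:
    print(interval1, interval2, score, sep="\t")
```

On the real data you would pass an open file handle (or sys.stdin, fed by tar xOf) instead of the toy StringIO, and raise min_score to keep the output manageable. The result is only a reshaped view of the matrix; calling significant interactions would still need a statistical model on top of it.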
Hello everybody! I know this is a late reply, but I am currently running into the same problem you once did, Pierre :D. Were you able to solve this issue, and if so, how?
Best, Andreas
I think I gave up