How can I compute DNA methylation levels around splice junctions?
3.6 years ago
Germany

Hello everybody,

I am still new to the field of computational epigenetics, so I need some help with the following task(s):

I study applied bioinformatics, and for my master's thesis I need to compute methylation levels around splice junctions. I need to output them in a format that I have never seen before. I did some research on the format, but I couldn't find anything about it. It seems to be similar to FASTA, but instead of a sequence (after the header starting with ">"), it provides methylation levels in a tab-separated manner, and I honestly don't know what DSQ stands for. A small part of a methylation track is given below:


chr1:142346773:142346881:+@chr1:142380702:142380810:+@chr1:142404277:142404426:+_expu=400_expd=200_bsz=20_part=0
DSQ 18.5594 18.5594 18.5594 8.22605 18.5594 31.9349 36.4521 36.4521 33.8659 18.5594 8.22605 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
chr1:58852214:58852582:+@chr1:58878691:58878806:+@chr1:58880759:58881091:+_expu=400_expd=200_bsz=20_part=0
DSQ 0 0 0 0 0 0 0 0 4.50575 ...


This format is recognized by a newly developed flexible self-organizing map for DNA methylation analysis (or other digitized epigenetic signals). The paper describing the software is freely accessible here. Unfortunately, besides the paper, the authors provide only a 3-page quick-start manual, which doesn't say much about the format shown above. Maybe someone here has seen this format before and can explain its anatomy to me.
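To make the visible structure of the example concrete, here is a minimal Python sketch that splits one record into its parts. It only reflects what can be read off the two lines above: regions joined by "@", each written as chrom:start:end:strand, followed by "_key=value" parameters, and a value line starting with a label such as DSQ. The function names are hypothetical, the parameters (expu, expd, bsz, part) are kept as opaque keys because their meaning is not documented, and nothing here comes from the software's own code.

```python
import re

def parse_header(header):
    """Split a track header into a list of regions and a dict of parameters."""
    header = header.strip().lstrip(">")  # tolerate an optional leading ">"
    # Split only at underscores that are followed by "key=", so underscores
    # inside chromosome names are left alone.
    parts = re.split(r"_(?=[A-Za-z]+=)", header)
    regions = []
    for field in parts[0].split("@"):
        chrom, start, end, strand = field.split(":")
        regions.append((chrom, int(start), int(end), strand))
    params = {}
    for item in parts[1:]:
        key, value = item.split("=", 1)
        params[key] = int(value)
    return regions, params

def parse_values(line):
    """Split a value line into its label (e.g. 'DSQ') and a list of floats."""
    tokens = line.split()  # works for tabs and for spaces
    return tokens[0], [float(t) for t in tokens[1:]]
```

Splitting on any whitespace is deliberate: the description says tab-separated, but pasted examples often end up with spaces instead.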

What I have done so far:

  1. I downloaded RNA-Seq runs from a human spleen sample provided by the NIH Roadmap Epigenomics Project. The GEO accession is GSM1010976.
  2. I used the TopHat splice junction mapper to determine splice junctions, with hg19 as the reference genome.
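For the TopHat step, the junction coordinates can be pulled out of the junctions.bed file it writes. The sketch below assumes the usual TopHat layout, as far as I understand it: a 12-column BED line per junction with two blocks, where the block sizes are the read overhangs on either side, so the intron lies between the end of the first block and the start of the last one. The function names are hypothetical, and the layout should be checked against the TopHat manual for the version used.

```python
def junction_from_bed_line(line):
    """Return (chrom, intron_start, intron_end, strand, n_reads).

    Coordinates are 0-based, half-open, as in BED.
    """
    f = line.rstrip("\n").split("\t")
    chrom, chrom_start, chrom_end = f[0], int(f[1]), int(f[2])
    n_reads, strand = int(f[4]), f[5]
    block_sizes = [int(x) for x in f[10].rstrip(",").split(",")]
    # Assumption: the first and last block are the overhangs flanking the intron.
    intron_start = chrom_start + block_sizes[0]
    intron_end = chrom_end - block_sizes[-1]
    return chrom, intron_start, intron_end, strand, n_reads

def read_junctions(path):
    """Yield one junction tuple per data line of a junctions.bed file."""
    with open(path) as handle:
        for line in handle:
            if not line.strip() or line.startswith("track"):
                continue  # skip blank lines and the track header
            yield junction_from_bed_line(line)
```

Each junction then gives two positions of interest, intron_start and intron_end, around which methylation windows can be placed.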

I need to compute:

  • The methylation levels in the window from -200 nt to +200 nt around each of these splice junctions (200 nt to the left and 200 nt to the right)
  • I need them in 20 nt intervals. The DSQ values in the example above represent the (normalized?) methylation levels within a 20 nt bin
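The binning described in these two points can be sketched as follows. This is only an illustration under stated assumptions: `meth` is a hypothetical dict mapping genomic positions on one chromosome to a methylation level, each bin is summarized by the mean of the covered positions, and bins without any data are reported as 0.0. Whether the target format expects means, and how it treats empty bins or normalization, is not known from the example.

```python
def binned_levels(meth, center, flank=200, bin_size=20):
    """Mean methylation per bin over the window [center - flank, center + flank)."""
    bins = []
    for bin_start in range(center - flank, center + flank, bin_size):
        values = [meth[p] for p in range(bin_start, bin_start + bin_size) if p in meth]
        # Assumption: a bin without any measured position is written as 0.0.
        bins.append(sum(values) / len(values) if values else 0.0)
    return bins

# Toy input: three measured positions near a junction at position 100.
toy = {90: 50.0, 95: 100.0, 100: 20.0}
print(binned_levels(toy, center=100, flank=20, bin_size=10))
```

With the default arguments this yields 20 values per junction (400 nt / 20 nt). For junctions on the minus strand the list would probably need to be reversed so that it always reads 5' to 3', but that is a guess as well.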

I also found the data of a whole-genome BS-Seq experiment that was done for the same spleen sample. The GEO accession is GSM983652. I considered the following possibilities:

  1. If I understand correctly, the provided wig file already contains methylation data. If that is the case, I would like to use the existing methylation data. Is there a tool to extract methylation data from a wig file? As I said before, I need the cytosine methylation levels near splice junctions, and I need them exported in the format shown above.
  2. If option 1 doesn't work, which tool should I use to analyse the provided BS-Seq data? And again: how can I export the results in the format shown above?
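For option 1, a wig file is plain text and can be read without a dedicated tool. The following is a minimal sketch of a reader for the two standard wiggle layouts (variableStep and fixedStep). It assumes the GEO file really is in one of these layouts, which I have not verified; `read_wig` is a hypothetical name, and what the values mean (fraction, percentage, read counts) depends on how the track was produced.

```python
def read_wig(lines):
    """Yield (chrom, position, value) tuples; positions are 1-based as in wiggle."""
    mode = chrom = None
    pos = step = span = 1
    for line in lines:
        line = line.strip()
        if not line or line.startswith(("track", "browser", "#")):
            continue
        if line.startswith(("fixedStep", "variableStep")):
            fields = line.split()
            mode = fields[0]
            opts = dict(f.split("=", 1) for f in fields[1:])
            chrom = opts["chrom"]
            span = int(opts.get("span", 1))
            if mode == "fixedStep":
                pos = int(opts["start"])
                step = int(opts.get("step", 1))
            continue
        if mode is None:
            raise ValueError("data line before any fixedStep/variableStep header")
        if mode == "variableStep":
            start_str, value_str = line.split()
            start, value = int(start_str), float(value_str)
        else:  # fixedStep: one value per line, position advances by step
            start, value = pos, float(line)
            pos += step
        for p in range(start, start + span):
            yield chrom, p, value
```

It accepts any iterable of lines, so an open file handle works, including one from `gzip.open(path, "rt")` for a compressed download. Note that wiggle positions are 1-based while BED coordinates (such as those in a junctions file) are 0-based, so one of the two has to be shifted by one before they are combined.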

I hope that somebody can help me with these tasks.

Best regards

thefirstrealace
