Mapping methylome data with the GKNO Mosaik program
9.7 years ago
zhiguangli88 ▴ 10

Hi,

I found that Mosaik, a program in the GKNO tool set (Lee W-P, Stromberg MP, Ward A, Stewart C, Garrison EP, et al. (2014) MOSAIK: A Hash-Based Algorithm for Accurate Next-Generation Sequencing Short-Read Mapping. PLoS ONE 9(3): e90581. doi:10.1371/journal.pone.0090581), is a powerful mapping method. For paired-end reads, it uses the approximate fragment length known from library construction to rescue a read that is anchored ambiguously (at multiple locations) when its mate is uniquely aligned.

I am wondering whether this program can process methylome data generated by MethylC-seq, and whether the mate-rescue strategy described above is applied to such data. My datasets were sequenced on a HiSeq 2000 with PE100 (paired-end, 100 bp).

Does anyone have any experience with this problem?

Thanks,

Zhiguang Li

methylC-seq GKNO mosaik • 2.6k views
9.7 years ago

My guess is no, it can't handle BS-Seq data.

If you want to align BS-Seq reads from a directional protocol with an aligner not designed for BS-Seq data, you could follow these steps (but you would probably be reinventing the wheel...):

  • Convert every C in the reference genome to T. Append this converted genome to a copy in which every G is converted to A, so that each original chromosome is present twice: once C-to-T converted and once G-to-A converted. Make the chromosome names unique (e.g. by appending _CT or _GA to the original name). Index this reference with your favourite aligner.
  • Convert every C in mate 1 of your reads to T, and every G in mate 2 to A.
  • Align the converted reads to the converted reference.
  • Restore the original chromosome names in the output SAM/BAM (e.g. by removing _CT or _GA).

(Hope I got it right...)
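
A minimal Python sketch of the conversion steps listed above, in case it helps to see them spelled out. The function names are made up for illustration, sequences are plain strings rather than parsed FASTA/FASTQ records, and this is not how any particular BS-Seq aligner implements it:

```python
def c_to_t(seq):
    """In silico C-to-T conversion (case is preserved)."""
    return seq.replace("C", "T").replace("c", "t")


def g_to_a(seq):
    """In silico G-to-A conversion (case is preserved)."""
    return seq.replace("G", "A").replace("g", "a")


def convert_reference(records):
    """Build the doubled reference from (name, sequence) pairs.

    Every chromosome is emitted twice: C-to-T converted with the suffix
    _CT, then G-to-A converted with the suffix _GA.
    """
    records = list(records)
    for name, seq in records:
        yield name + "_CT", c_to_t(seq)
    for name, seq in records:
        yield name + "_GA", g_to_a(seq)


def convert_pair(mate1, mate2):
    """Directional protocol: C-to-T on mate 1, G-to-A on mate 2."""
    return c_to_t(mate1), g_to_a(mate2)


def original_chrom(name):
    """Strip the _CT or _GA suffix from a reference name in the SAM output."""
    for suffix in ("_CT", "_GA"):
        if name.endswith(suffix):
            return name[: -len(suffix)]
    return name
```

Keeping the two converted copies under distinct names means the reference name of each alignment also tells you which converted strand the read pair matched.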


Looks right. At the end, you also need to recover the original reads, without the in silico conversions, and report those instead of the converted ones.
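
A rough Python sketch of that recovery step, working on one SAM line at a time. It assumes you kept a lookup (here a hypothetical dict called originals, keyed by read name and mate number) of the unconverted sequences, and that the aligner did not hard-clip the read or write "*" as the sequence:

```python
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Reverse-complement a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def restore_sequence(sam_line, originals):
    """Put the unconverted read sequence back into one SAM alignment line.

    originals maps (read_name, mate_number) to the read as sequenced.
    Assumes no hard clipping and that SEQ is not "*".
    """
    fields = sam_line.rstrip("\n").split("\t")
    name, flag = fields[0], int(fields[1])
    mate = 2 if flag & 0x80 else 1  # 0x80: second read of the pair
    seq = originals[(name, mate)]
    if flag & 0x10:  # 0x10: SEQ is stored reverse-complemented
        seq = reverse_complement(seq)
    fields[9] = seq  # SEQ is the 10th SAM column
    return "\t".join(fields)
```

The check on flag 0x10 matters because SAM stores reads aligned to the reverse strand as their reverse complement, so the original read has to be flipped the same way before it is written back.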


Sure... Thanks for pointing it out. (And thanks for bwameth!)


sure, glad someone is using it.


Dear Dariober,

Thanks very much for your explanation, but it looks pretty complicated. Do you know of any publicly available program that can handle paired-end methylome data?

Thanks again.

Zhiguang Li


Bismark, which uses bowtie or bowtie2, is probably the most popular (and for good reason!). Recently I got pretty good results with bwameth.py, which is backed by bwa mem (see also the post "Bwa-Meth: Align And Tabulate Bs-Seq Reads"). Definitely, unless you have a good reason to use a specific aligner, don't reinvent the wheel!

PS: Your comment should have been a "comment" not an answer.

9.7 years ago
alistairnward ▴ 210

As mentioned above, Mosaik is not ideal for your task. I am not familiar with processing methylome data, but you would definitely want a tool developed specifically for this type of data, and Mosaik was not developed with this in mind. We would be happy to host tools that help piece together a pipeline and make this a more straightforward task, but the gkno toolset does not currently contain tools or pipelines for methylome processing.


Thanks. I will try Bismark to see what I can get.

8.0 years ago
Rohit ★ 1.5k

Why not try Segemehl, which also has options for methylation data? For downstream analysis, they released Metilene a few months back.

http://www.bioinf.uni-leipzig.de/Software/metilene/
