Biostar Beta. Not for public use.
Manual Assembly of genome portion
13 months ago
JulianC • 10

Hi guys!

I have whole-genome paired-end sequencing data, split into R1 and R2 files. I do not have a reference for the organism I am working on (only one for a closely related species), and I am trying to manually assemble one part of the genome, the centromere, from the paired-end reads. I know the starting point: beginning from a particular read, I want to extend it step by step to assemble that region. Could you give me some advice on how to do this? I could take part of the read and search for it in the whole read set with grep, but I am not sure this is a good approach. Automatic assemblers such as SPAdes do not work for me because this is a large eukaryotic genome. Thank you!

Assembly • 152 views

Sorry to say, but manual assembly is probably not feasible. My advice is to use an assembler such as Velvet, MIRA or Trinity and see how far you can get with these tools.

13 months ago
h.mon 25k
Brazil

If you are working on a large eukaryotic genome, the centromere is most likely composed of repetitive sequences and spans anywhere from several thousand to a few million base pairs. With current sequencing technology (even long reads such as PacBio or Nanopore), assembling centromeres is a very, very hard task. Having a starting point does not help much, because you will very quickly run into the repetitive sequences, which are virtually impossible to assemble correctly.

You can see how far you can get with Tadpole, a genome assembler that is part of the BBTools package; see the thread "Extending ends of sequences with the help of reads?" for details.
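
To make the repeat problem concrete, here is a minimal Python sketch of what read-by-read extension (the grep idea from the question) boils down to. This is not how Tadpole works internally, just a toy illustration: the function names and the `k`, `max_steps` and `min_fraction` parameters are my own assumptions, and it keeps all reads in memory, so it is only suitable for small test inputs.

```python
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def next_bases(seed_end, reads):
    """Count which base follows seed_end in the reads, on both strands."""
    counts = Counter()
    for read in reads:
        for seq in (read, reverse_complement(read)):
            pos = seq.find(seed_end)
            while pos != -1:
                nxt = pos + len(seed_end)
                if nxt < len(seq):
                    counts[seq[nxt]] += 1
                pos = seq.find(seed_end, pos + 1)
    return counts


def extend_right(seed, reads, k=8, max_steps=1000, min_fraction=0.9):
    """Greedily extend seed to the right, one base at a time.

    Stops when no read continues past the last k bases, or when the
    reads disagree about the next base (typical at a repeat boundary).
    All thresholds are illustrative, not tuned values.
    """
    contig = seed
    for _ in range(max_steps):
        counts = next_bases(contig[-k:], reads)
        total = sum(counts.values())
        if total == 0:
            return contig, "no extension found"
        base, n = counts.most_common(1)[0]
        if n / total < min_fraction:
            return contig, "ambiguous extension (possible repeat)"
        contig += base
    return contig, "reached max_steps"
```

You would call it as `extend_right(seed_read, reads, k=31)` with `reads` being a list of read sequences. In unique sequence this walks along happily; as soon as the last `k` bases occur in more than one genomic context, the reads vote for different next bases and the extension stops (or, in a tandem repeat, loops until `max_steps`), which is exactly what happens in a centromere.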


Thank you very much for your advice, h.mon. I will look into that.
