Analyzing whole genomic sequence.
6.9 years ago
clear.choi ▴ 30

I am new to analyzing whole-genome sequences. I have de novo assembly results (a FASTA file with around 4,400 indexes).

This is a huge file (around 3 GiB).

I tried to align this data to hg19 using BWA and BLASR, but both failed (BWA core dumped and BLASR never finished).

So I'd like some tips on how to handle these whole-genome sequence results.

Is there a recommended way to do this alignment and make a BAM file to view in IGV? Any other suggestions would be appreciated.

Thank you!

next-gen sequencing alignment • 1.6k views

What sequencing platform is used? Illumina?

I would align the reads (fastq) from your sequencer directly to hg19 with bwa mem, instead of using a de novo assembly first.
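A minimal sketch of what that read-to-reference route could look like, written as a small Python helper that only builds the command lines and does not run them. The file names are placeholders, and the use of samtools for sorting and indexing the BAM is an assumption (the thread only names bwa mem and IGV). The `preset` argument maps to bwa mem's `-x` read-type option and should be left unset for Illumina reads.

```python
import shlex
from typing import Dict, List, Optional


def build_bam_commands(reference: str, reads: str, out_bam: str,
                       threads: int = 4,
                       preset: Optional[str] = None) -> Dict[str, List[str]]:
    """Build (but do not run) the commands for reads -> sorted, indexed BAM.

    All paths are placeholders supplied by the caller.
    """
    align = ["bwa", "mem", "-t", str(threads)]
    if preset is not None:
        # e.g. preset="pacbio" selects bwa mem's PacBio read-type settings
        align += ["-x", preset]
    align += [reference, reads]
    return {
        # one-time index of the reference FASTA
        "index_reference": ["bwa", "index", reference],
        "align": align,
        # "-" tells samtools sort to read the alignments from stdin
        "sort": ["samtools", "sort", "-@", str(threads), "-o", out_bam, "-"],
        # IGV needs the .bai index next to the BAM
        "index_bam": ["samtools", "index", out_bam],
    }


def render_shell(cmds: Dict[str, List[str]]) -> str:
    """Render the commands as shell lines, piping the aligner into the sorter."""
    def quote(cmd: List[str]) -> str:
        return " ".join(shlex.quote(arg) for arg in cmd)

    return "\n".join([
        quote(cmds["index_reference"]),
        quote(cmds["align"]) + " | " + quote(cmds["sort"]),
        quote(cmds["index_bam"]),
    ])
```

Calling `print(render_shell(build_bam_commands("hg19.fa", "reads.fastq", "aln.bam")))` prints three shell lines that can be reviewed before running them by hand.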


It is the PacBio platform. I'd like to learn how de novo assembly results are analyzed using a visualization tool.


You could use mapPacBio.sh from the BBMap suite to do the mapping and create BAM files if you wish.

Assuming your assembly has been checked and is reasonably good (did the sequence provider do that for you?), you could try using one of the tools above to see what sort of contiguity the assembly has with the reference. I would suggest using GRCh38 at this point, since hg19 is getting a bit long in the tooth.


Thank you so much for the information! Yes, the sequence provider has checked the sequence quality. I am running it right now and will see how it looks. Also, could you share how an informatics team normally analyzes de novo assembly results?


So this is human genome sequence that has been de novo assembled? When you say there are 4400 indexes does that mean there are that many contigs/sequences in your fasta file?

For long sequences like that your best bet is to use BLAST+, blat or LASTZ for doing the alignments. I am not sure why you want to make BAM files at this point.
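On the question of how many contigs/sequences the FASTA file holds, a quick way to check is to count the header lines and sum the sequence lengths. The sketch below is a plain-Python illustration with no external libraries; the function names are made up for this example, and the N50 it reports is only a rough contiguity summary, not a substitute for a proper assembly QC tool.

```python
import sys
from typing import Dict, Iterator, List, Tuple


def read_fasta_lengths(path: str) -> Iterator[Tuple[str, int]]:
    """Yield (record name, sequence length) for each record in a FASTA file."""
    name = None
    length = 0
    with open(path) as handle:
        for raw in handle:
            line = raw.strip()
            if not line:
                continue
            if line.startswith(">"):
                if name is not None:
                    yield name, length
                # keep only the first word of the header as the record name
                parts = line[1:].split()
                name = parts[0] if parts else ""
                length = 0
            elif name is not None:
                length += len(line)
    if name is not None:
        yield name, length


def n50(lengths: List[int]) -> int:
    """Length L such that records of length >= L cover at least half the total."""
    half = sum(lengths) / 2
    running = 0
    for value in sorted(lengths, reverse=True):
        running += value
        if running >= half:
            return value
    return 0


def summarize_fasta(path: str) -> Dict[str, int]:
    """Count records and report total length, longest record, and N50."""
    lengths = [length for _, length in read_fasta_lengths(path)]
    return {
        "records": len(lengths),
        "total_bp": sum(lengths),
        "longest": max(lengths, default=0),
        "n50": n50(lengths),
    }


if __name__ == "__main__" and len(sys.argv) > 1:
    # usage: python fasta_summary.py assembly.fasta  (script name is a placeholder)
    for key, value in summarize_fasta(sys.argv[1]).items():
        print(f"{key}\t{value}")
```

The file is read line by line, so a 3 GiB assembly does not need to fit in memory; only the list of record lengths is kept.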


Yes, it is a human genome sequence that has been de novo assembled, and it looks like there are many contigs/sequences in the sample FASTA file.

Thank you for suggesting the tools! I'd like to view the sequence in a visualization tool like IGV to see how it looks, as with resequencing data.
