Any methods available for QC analysis of PacBio raw data?
6.6 years ago
karthic ▴ 130

Hi,

I am very new to NGS and will be receiving PacBio raw data in a few months. I am looking for methods to verify and validate the raw data, so that I can find out how good my data is.

Please suggest approaches/methods/tools to do so.

TIA,

KK

Assembly genome Pacbio next-gen sequencing • 8.9k views

Once the data is in FASTQ format you could use FastQC. That said, you should use the SMRT Link tools that PacBio makes available to get the full benefit of the information contained in the raw data. If this is going to be a small number of runs, you could just ask your sequencing provider to give you the QC results from SMRT Link. If you are going to do this regularly, it may be useful to install the software locally.


Thanks genomax. Yes, we are trying to install the SMRT Link package. However, my team wants to cross-validate the data with third-party tools, particularly the maximum and average read length obtained, library complexity, and probably coverage. We have paid for CCS reads. As per your suggestion, FastQC should help in covering some of these.
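
For a quick independent cross-check of maximum and average read length, a few lines of Python can be enough. The following is a minimal sketch, assuming a well-formed FASTQ with exactly four lines per record (optionally gzipped). The function names, the file name `ccs.fastq.gz` and the genome size in the usage note are hypothetical placeholders. The coverage figure is only the crude estimate total bases / genome size, not an alignment-based depth, and library complexity is not addressed here.

```python
import gzip
from statistics import mean


def read_lengths(path):
    """Yield the length of each read in a 4-line-per-record FASTQ file."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        for line_number, line in enumerate(handle):
            if line_number % 4 == 1:  # second line of each record is the sequence
                yield len(line.strip())


def summarize(lengths, genome_size=None):
    """Return read count, total bases, max and mean read length.

    If genome_size is given, also return a crude coverage estimate.
    """
    lengths = list(lengths)
    if not lengths:
        raise ValueError("no reads found")
    summary = {
        "reads": len(lengths),
        "total_bases": sum(lengths),
        "max_length": max(lengths),
        "mean_length": mean(lengths),
    }
    if genome_size:
        # Assumption: every sequenced base belongs to the target genome.
        summary["estimated_coverage"] = summary["total_bases"] / genome_size
    return summary
```

Something like `summarize(read_lengths("ccs.fastq.gz"), genome_size=5_000_000)` would then give numbers to compare against the provider's report.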


I like the following for CCS reads:

  • alignment, then visualization
  • assembly, if appropriate, with Canu, which gives some great readouts on the lengths, types and usefulness of the reads / library. The corrected reads it outputs are also highly useful
  • stats.sh from the bbmap package to get read length statistics.

As always, conda is very useful for installation.
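
Length summaries for long reads often include an N50 value alongside the mean and maximum. As a minimal sketch of that one metric, assuming the read lengths are already available as a list of integers: here N50 is taken to be the largest length L such that reads of length L or longer contain at least half of all bases. The function name `n50` is a hypothetical helper, not part of any of the tools above, and naming conventions for this metric differ between tools.

```python
def n50(lengths):
    """Return the read-length N50 of a collection of read lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("no read lengths given")
    half_total = sum(ordered) / 2
    running = 0
    # Walk from the longest read down until half of all bases are covered.
    for length in ordered:
        running += length
        if running >= half_total:
            return length
```

For example, `n50([2, 3, 4, 5, 6])` returns 5, because the reads of length 6 and 5 together hold 11 of the 20 bases.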


Thanks colindaven. Sorry for the delayed response; I was in hospital. I will try your suggestions, and I may pester you with more questions.

6.5 years ago

If the name of the package doesn't offend your sequencer, you could use my tool NanoPlot, which extracts various metrics from FASTQ and BAM files and plots them. It was, as you might have guessed, written for Oxford Nanopore sequencing data, but I can't think of a reason it wouldn't work for PacBio.
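
One per-read metric commonly looked at for long reads is mean base quality. The sketch below shows how such a value can be computed by hand, assuming Phred+33 encoded quality strings as found in standard FASTQ files. It averages the error probabilities rather than the Phred scores themselves, since a plain arithmetic mean of Phred scores overstates the quality of a read with a few poor bases. The function name is hypothetical, and this is not claimed to be how NanoPlot itself is implemented.

```python
import math


def mean_read_quality(quality_string, offset=33):
    """Mean Phred quality of one read, averaged in error-probability space."""
    if not quality_string:
        raise ValueError("empty quality string")
    # Convert each quality character to its error probability: p = 10^(-Q/10).
    error_probs = [10 ** (-(ord(ch) - offset) / 10) for ch in quality_string]
    mean_error = sum(error_probs) / len(error_probs)
    # Convert the mean error probability back to a Phred score.
    return -10 * math.log10(mean_error)
```

A read with one Q20 base and one Q40 base scores just under Q23 this way, not the Q30 that averaging the two Phred values would suggest.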


Thanks WouterDeCoster. I will use your tool and share my experiences at the earliest.

6.5 years ago
JstRoRR ▴ 60

I think stsPlots is an R package dedicated to exactly this purpose. It plots primary analysis quality control metrics: https://github.com/PacificBiosciences/stsPlots


PacBio has since taken it down, but I had a clone of it and published it to keep it available: https://github.com/0xaf1f/stsPlots

2.9 years ago
Rox ★ 1.4k

A bit of an old post, but as I was still wondering about this today, I found this tool: LongQC. I haven't tried it yet, but I am going to!
