Ideal data amounts and read depth generated by MinION sequencer
5.1 years ago
xusijiamed • 0

Hey guys, I'm new to nanopore sequencing and have many silly questions, so I really need your knowledge. I used a MinION to sequence a library containing DNA molecules mainly about 10 kbp long. The flow cell I used is an old one, with about 700 active pores according to the platform check. I ran the device for about 4 hrs and got 15 Gb of fast5 and 1.5 Gb of fastq data. I read that a flow cell can be used for 48 hrs or more and ideally generates 10~30 Gb of data (does that mean the fast5 data?). This time, with half the number of active pores and a 4-hr run, I got 15 Gb of data?? BTW, quality control with MinIONQC shows that the quality scores of most reads are >7, which looks acceptable...

1. Is this normal?
2. If my purpose is to confirm some pathogenic variants in one target gene, how much read depth is usually needed? I think that affects the sequencing time. This time I chose to run for 4 hrs, but without any real basis for it.

Could anyone give some suggestions? Thank you very much!

nanopore sequencing MinION fast5 data amount • 4.0k views
5.1 years ago

A better place for these questions might be the nanopore community forum, assuming you have access. This is also not exactly bioinformatics, and in that sense SeqAnswers could be more appropriate, but I don't know for sure if you'll get an answer there.

Anyway.

I used a MinION to sequence a library containing DNA molecules mainly about 10 kbp long

Did you use enrichment, or is this just genome sequencing? If the latter, what's the genome size?

The flowcell I used is an old one, with about 700 active pores according to the platform check.

That's indeed a suboptimal flow cell, but it might be enough to generate data for your experiment.

I ran the device for about 4hrs

Any reason why you stopped then? Unless you use barcodes or wash the flow cell, you're better off keeping one sample per flow cell (to avoid contamination).

and got 15 Gb of fast5 and 1.5 Gb of fastq data. I read that a flow cell can be used for 48 hrs or more and ideally generates 10~30 Gb of data (does that mean the fast5 data?).

No, now you are confusing gigabytes with gigabases. The 10~30 Gb figure refers to gigabases of sequence, not to file size. The size of your files (in bytes) is not important. The amount of data, expressed in (giga)bases, is what you should look at.
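To make the distinction concrete, here is a minimal Python sketch that counts the bases in a FASTQ file instead of looking at its size on disk. It assumes plain four-line-per-record FASTQ (optionally gzipped); the function names `count_bases` and `count_bases_in_file` are made up for illustration and are not part of any ONT tool.

```python
import gzip
import io


def count_bases(handle):
    """Return (number of reads, total bases) for a 4-line-per-record FASTQ stream."""
    reads = 0
    bases = 0
    for i, line in enumerate(handle):
        if i % 4 == 1:  # the second line of each record holds the sequence
            reads += 1
            bases += len(line.strip())
    return reads, bases


def count_bases_in_file(path):
    """Open a FASTQ file (gzipped if the name ends in .gz) and count its bases."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return count_bases(handle)


# Tiny in-memory example: two reads of 8 and 4 bases.
example = io.StringIO("@read1\nACGTACGT\n+\n!!!!!!!!\n@read2\nACGT\n+\n!!!!\n")
reads, bases = count_bases(example)
print(f"{reads} reads, {bases} bases ({bases / 1e9:.9f} gigabases)")
```

The total number of bases, not the 1.5 Gb of fastq on disk, is the number to compare against the advertised yield.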

the data quality control by MinIONQC shows the quality score of most data are >7, looks acceptable... 1. Is this normal?

7 is the standard cut-off from ONT to consider reads "good" or "not good", but depending on your application, low-quality reads may also be valuable.
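As a rough illustration of what such a per-read score is, the sketch below computes a mean quality from a FASTQ quality string. It assumes Phred+33 encoding and assumes the mean is taken over error probabilities rather than over the Q values themselves, which is my understanding of how ONT tools report it; check your basecaller's documentation. The function name and the cut-off variable are hypothetical.

```python
import math


def mean_read_quality(quality_string, offset=33):
    """Mean Phred quality of a read, averaged over per-base error probabilities.

    Assumes Phred+33 encoding; averaging probabilities (not Q values) is an
    assumption about the convention used, not a documented API.
    """
    if not quality_string:
        raise ValueError("empty quality string")
    probs = [10 ** (-(ord(c) - offset) / 10) for c in quality_string]
    mean_p = sum(probs) / len(probs)
    return -10 * math.log10(mean_p)


MIN_Q = 7  # the cut-off mentioned above

# '+' encodes Q10 and '5' encodes Q20 in Phred+33.
q = mean_read_quality("+5")
print(f"mean Q = {q:.2f}, passes: {q >= MIN_Q}")
```

Note that this mean is pulled towards the worst bases, so it is lower than the plain average of the Q values.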

  2. If my purpose is to confirm some pathogenic variants in one target gene, how much read depth is usually needed?

If you want to confirm variants which you expect to be there, then I would consider ~20x coverage to be enough. If you are doing de novo variant calling without knowing what to expect, you would need more.


Thank you so much for your detailed answers. Very helpful. I'll post my questions in the right places next time; I'm still learning about the forum. To add some detail: I used enrichment for a single gene. The whole gene is > 40 kb, but I had difficulties with primer design, so I split it into several fragments. I stopped sequencing because I was not sure how much data I need for the analysis, and the number of active pores started to drop. Which brings up another question: if the number of active pores drops to, say, 50% or lower, is it better to stop and load a new library (from the same sample) or just keep going? Thank you again.


if the number of active pores drops to, say, 50% or lower, is it better to stop and load a new library (from the same sample) or just keep going?

It is hard to comment on that. The price of a flow cell depends on the number you buy simultaneously, while a single library always costs you $100. You can reload with a new library, and that's what some people do (for example on PromethION). What can also help is to reload fuel mix, without necessarily adding a new library. I'm not sure if I'm allowed to elaborate on this, so the nanopore community forum might be the best place for questions about it. It is useful to monitor the yield of your run while it's ongoing to determine if you have enough. If you are targeting a 40 kb region and want 100x coverage, you only need 4 megabases of sequencing data (don't look at the file size). The MinKNOW interface should contain all the information you need.

We generally run a flow cell until it's entirely dead, as we want to exclude cross-contamination and don't commonly use barcodes.
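The yield arithmetic for a 40 kb target can be sketched in a few lines of Python. The function name is made up, and the `on_target_fraction` parameter is my own addition to allow for reads that miss the target; with the default of 1.0 it is simply target length times coverage.

```python
def required_yield_bases(target_length_bp, coverage, on_target_fraction=1.0):
    """Bases of sequencing needed to reach `coverage` over a target region.

    on_target_fraction is a hypothetical knob: the share of sequenced bases
    that actually map to the target (1.0 means every base is on target).
    """
    if not 0 < on_target_fraction <= 1:
        raise ValueError("on_target_fraction must be in (0, 1]")
    return target_length_bp * coverage / on_target_fraction


# 40 kb target at 100x: 4,000,000 bases, i.e. 4 megabases.
print(required_yield_bases(40_000, 100) / 1e6, "Mb")

# The same target at 20x, if only half of the bases were on target.
print(required_yield_bases(40_000, 20, on_target_fraction=0.5) / 1e6, "Mb")
```

Comparing this number with the estimated bases shown during the run gives a reasonable point at which to stop.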
