Pooled sequencing DNA barcode space and deconvolution
3.9 years ago
vinaykusuma ▴ 10

Hello,

I was reading through https://www.biorxiv.org/content/10.1101/2020.04.06.025635v1.full.pdf, which describes a pooled sequencing method that uses barcodes to test about 10,000 COVID-19 samples in one go.

I came across compressed DNA barcoding space and DNA barcoding deconvolution for the first time and need some help understanding it.

Although I searched the internet, I couldn't find any information on it.

I would highly appreciate it if someone could help me with an explanation.

Thanks.

genome sequence next-gen gene barcode • 1.4k views
3.9 years ago
sysboolean ▴ 90

The general idea behind pooled sequencing is that we sequence N samples with X barcodes where X << N.

A major cost in NGS is uniquely barcoding each sample. Each unique barcode requires a primer of ~60-90 bases (depending on design) to be synthesized and purified, which adds roughly $1-2 per sample. To make full use of the sequencing capacity with a large number of samples at a time, say 10,000 samples per day as in Covid-19 testing, we need to uniquely barcode each sample so that we can identify it after sequencing. So now you can see the problem in terms of cost: ordering 10,000 barcoded primers is going to cost several hundred thousand dollars, and managing the workflow is going to be non-trivial.

However, if you add multiple barcodes to each sample, you can uniquely tag each sample with a much smaller set of barcodes. For example, with 10 barcodes and a unique series of 5 barcodes added to each sample, you can individually barcode ~30,000 samples. Because barcode order also matters here, use the permutation formula n! / (n - r)! with n = 10 and r = 5, which gives 30,240. This drastically reduces the cost of ordering barcode primers. Sure, you use more of each barcode primer, but ordering a few barcode oligos in bulk is cheaper and makes managing the workflow easier.

Sample 1 gets barcodes B1,B2,B3,B4,B5

Sample 2 gets barcodes B1,B2,B3,B4,B6 and so on.
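To make the arithmetic concrete, here is a minimal Python sketch of the counting and of handing out barcode series in the order shown above. The sample IDs (S1, S2, ...) and the function name assign_barcodes are made up for illustration. It also prints the count for the case where barcode order cannot be read back after sequencing (plain combinations); that case is my assumption for comparison, not something taken from the preprint.

```python
from itertools import combinations, permutations
from math import comb, perm

N_BARCODES = 10   # size of the barcode set (n)
PER_SAMPLE = 5    # barcodes attached to each sample (r)

barcodes = [f"B{i}" for i in range(1, N_BARCODES + 1)]

# Ordered series of barcodes: n! / (n - r)!
ordered_capacity = perm(N_BARCODES, PER_SAMPLE)
# Unordered sets of barcodes: n! / (r! * (n - r)!)
unordered_capacity = comb(N_BARCODES, PER_SAMPLE)


def assign_barcodes(n_samples, ordered=True):
    """Map hypothetical sample IDs to unique barcode tuples."""
    capacity = ordered_capacity if ordered else unordered_capacity
    if n_samples > capacity:
        raise ValueError("more samples than unique barcode tuples")
    source = permutations if ordered else combinations
    tuples = source(barcodes, PER_SAMPLE)
    # zip stops once every sample has received a tuple
    return {f"S{i}": bc for i, bc in zip(range(1, n_samples + 1), tuples)}


assignment = assign_barcodes(3, ordered=True)
print(ordered_capacity)    # 30240
print(unordered_capacity)  # 252
print(assignment["S1"])    # ('B1', 'B2', 'B3', 'B4', 'B5')
print(assignment["S2"])    # ('B1', 'B2', 'B3', 'B4', 'B6')
```

The gap between 30,240 and 252 shows how much the capacity depends on whether the order of the barcodes can actually be recovered from the reads.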

There are other ways to find the identities of N samples using X barcodes/NGS libraries where X << N. Here, we make pools of samples so that each sample is distributed over a unique set of pools. After sequencing the pools, we work out the sample IDs from the pools in which each sample occurred. See the following papers for some simple examples.

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6134198/

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5109470/
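As a rough illustration of the pooling idea (not the exact design used in either paper), the Python sketch below gives each of 20 hypothetical samples its own set of 3 out of 6 pools, simulates which pools come back positive, and looks up which samples are consistent with that readout. All names (design_pools, positive_pools, candidates) and the numbers 20, 6 and 3 are assumptions chosen to keep the example small.

```python
from itertools import combinations


def design_pools(sample_ids, n_pools, pools_per_sample):
    """Give every sample its own set of pools (its 'signature')."""
    signatures = combinations(range(n_pools), pools_per_sample)
    design = {s: frozenset(sig) for s, sig in zip(sample_ids, signatures)}
    if len(design) < len(sample_ids):
        raise ValueError("not enough distinct pool sets for all samples")
    return design


def positive_pools(design, positive_samples):
    """Simulate the readout: a pool is positive if it holds a positive sample."""
    pools = set()
    for s in positive_samples:
        pools |= design[s]
    return pools


def candidates(design, pools):
    """Samples whose whole signature lies inside the positive pools."""
    return sorted(s for s, sig in design.items() if sig <= pools)


samples = [f"S{i}" for i in range(1, 21)]   # 20 samples, only 6 pools
design = design_pools(samples, n_pools=6, pools_per_sample=3)

observed = positive_pools(design, {"S7"})
print(candidates(design, observed))         # ['S7']
```

With a single positive sample the lookup is exact. With several positives at once, the union of their pools can also cover the signatures of negative samples, so the candidate list may contain false positives that need a follow-up test or a more careful design.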


Thanks a lot! That makes sense. I understand compressed barcode space now. Do you know what barcode deconvolution is? Is it another word for demultiplexing using barcode information?


Barcode deconvolution in this context is another way of saying demultiplexing. In examples like the papers I linked, we need to solve a system of equations to get the sample identity. I don't think deconvolution is a good term to use here, as deconvolution has a specific meaning in mathematics.
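To show what "solving a system of equations" can look like in the simplest case, here is a toy Python sketch. It assumes a yes/no readout per pool and a hypothetical design matrix A with 4 pools and 6 samples, and it brute-forces the smallest set of positive samples that reproduces the observed pools. The matrix, the function names and the brute-force search are all my own illustration; the linked papers use their own designs and solvers.

```python
from itertools import combinations

# Hypothetical design: 6 samples spread over 4 pools, 2 pools each.
# A[p][s] == 1 means sample s was added to pool p.
A = [
    [1, 1, 1, 0, 0, 0],
    [1, 0, 0, 1, 1, 0],
    [0, 1, 0, 1, 0, 1],
    [0, 0, 1, 0, 1, 1],
]


def predict(A, x):
    """Boolean readout: a pool is positive if it holds any positive sample."""
    return [int(any(a and xi for a, xi in zip(row, x))) for row in A]


def solve(A, y, max_positives=2):
    """Return every sample set of the smallest size that reproduces y."""
    n = len(A[0])
    for k in range(max_positives + 1):
        hits = [c for c in combinations(range(n), k)
                if predict(A, [int(i in c) for i in range(n)]) == y]
        if hits:
            return hits
    return []


y = predict(A, [0, 0, 0, 1, 0, 0])   # only sample index 3 is positive
print(solve(A, y))                   # [(3,)]
```

If two samples are positive at the same time, for example indices 0 and 5, every pool lights up and this small design has three equally valid answers. That is why real designs put each sample in more pools or rely on the positives being rare.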
