In-silico downsizing to estimate the DNA input
3.1 years ago
APJ ▴ 40

Hi,

Given a FASTQ file sequenced from 50 ng of input DNA, I could find all the reference variants in the variant calling results. Is it possible to downsample the FASTQ data in silico to estimate the minimal DNA amount at which we would still not lose our reference variants?

Any thoughts on this?

Thank you!

sequencing snp next-gen • 563 views

I guess all you can do is check how differences in coverage change the variant calls. I doubt that you can meaningfully simulate different DNA amounts, as the outcome depends on the kit and the number of PCR cycles. For that you would need data from different starting amounts and then build a model based on those data.
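
To make the coverage comparison concrete, the read count for a given mean coverage follows from coverage ≈ reads × read length / target size. A minimal sketch, where the function name reads_for_coverage and the example numbers are illustrative assumptions; it models sequencing depth only, not library complexity or PCR duplication:

```shell
# reads_for_coverage TARGET_COVERAGE TARGET_SIZE_BP READ_LENGTH_BP
# Prints the number of reads needed to reach the target mean coverage,
# rounded up to the next whole read. Integer arithmetic only.
# (Hypothetical helper, not part of any existing tool.)
reads_for_coverage() {
    echo $(( ($1 * $2 + $3 - 1) / $3 ))
}

# Example: read counts for several coverage levels over a 1 Mb target
# with 150 bp reads (both values are made up for illustration).
for cov in 10 20 30; do
    echo "${cov}x: $(reads_for_coverage "$cov" 1000000 150) reads"
done
```

For paired-end data, count each mate as one read or halve the result to get read pairs.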

3.1 years ago
5heikki 11k

Why not?

You can do e.g. this:

paste -d $'\t' - - - - <file.fq | shuf -n "$NUMBER" | awk 'BEGIN{FS="\t";OFS="\n"}{print $1,$2,$3,$4}' > out.fq

Here "$NUMBER" is the number of reads you want in your output. If you want shuf to be deterministic, or want to allow the same read to be drawn more than once, see man shuf (the --random-source and --repeat options in GNU shuf).
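
If you would rather get a reproducible subsample without relying on shuf options, one rough alternative is to give every record a seeded random key in awk, sort by that key and keep the first N records. This is only a sketch: the function name subsample_fastq is made up, it assumes strict 4-line FASTQ records with no tab characters in them, and the exact reads selected for a given seed can differ between awk implementations.

```shell
# subsample_fastq INPUT.fq N_READS SEED
# Writes N_READS randomly chosen FASTQ records to stdout.
# Assumes 4 lines per record and no tabs inside any line.
# (Hypothetical helper; the name and argument order are illustrative.)
subsample_fastq() {
    paste - - - - < "$1" |    # join each 4-line record into one tab-separated line
        awk -v seed="$3" 'BEGIN { srand(seed) }
            { printf "%.10f\t%s\n", rand(), $0 }' |    # prepend a seeded random key
        LC_ALL=C sort -k1,1 |    # order records by the random key
        awk -F '\t' -v n="$2" 'NR <= n { print $2 "\n" $3 "\n" $4 "\n" $5 }'
        # the last awk keeps the first n records and restores 4 lines per record
}
```

Usage would be something like: subsample_fastq file.fq 1000000 42 > out.fq. Running it again with the same seed and the same awk should give the same reads.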
