Fasta to FastQ with known qualities
12 months ago

Hi everyone,

I am working on ancient DNA (aDNA) and trying to simulate aDNA reads. At some point I have reads in a FastQ file, with qualities, and I run them through Gargammel (software that simulates aDNA damage). However, this software takes a Fasta file as input and outputs another Fasta file with the updated reads.

My question is: I want my FastQ back. So far, I have been using BBMap to convert the Fasta back to FastQ, BUT that assigns dummy qualities to every nucleotide of every read. Is there an easy way to recover the qualities from my original FastQ and attach them to the newly created Fasta sequences, producing another FastQ file with those sequences and qualities?

I can do it by writing my own script, but I am pretty sure there is an easier/quicker way to do it.

Thanks a lot,

ADDNOTHIING


Are you going to change the Q-score at positions where a nucleotide was updated, or do you not care about that? @Bastien's solution should work if you don't.


For now I don't really want to change the quality scores. That might (or might not) be the next step, though! Thank you, I will try his solution.
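
If adjusting qualities at updated positions does become the next step, a minimal Python sketch could look like the one below. The function name `mark_changed_bases`, the case-insensitive comparison, and the default replacement character `!` (Phred 0 in Phred+33 encoding) are illustrative assumptions, not something Gargammel prescribes.

```python
def mark_changed_bases(original_seq, damaged_seq, qual, new_qual_char="!"):
    """Return a copy of qual with new_qual_char wherever the base changed.

    Hypothetical helper: bases are compared case-insensitively, and all three
    strings must have the same length (substitution-only damage is assumed).
    """
    if not (len(original_seq) == len(damaged_seq) == len(qual)):
        raise ValueError("sequences and quality string must have equal length")
    return "".join(
        new_qual_char if old.upper() != new.upper() else q
        for old, new, q in zip(original_seq, damaged_seq, qual)
    )
```

Which quality value to assign to a damaged base is a modelling decision; the `!` default is only a placeholder.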


With seqkit and join:

$ join -t $'\t' -1 1 -2 1 -o 2.1,1.2,2.3 <(seqkit fx2tab test.fa) <(seqkit fx2tab test.fq) | seqkit tab2fx

output:

@SRR001666.1 071112_SLXA-EAS1_s_7:5:1:817:345 length=36
GGGTGATGGCCGCTGCCGATGGCGaaaaaaaaaaaa
+
IIIIIIIIIIIIIIIIIIIIIIIIIIIIII9IG9IC

input:

$ cat test.fa
>SRR001666.1 071112_SLXA-EAS1_s_7:5:1:817:345 length=36
GGGTGATGGCCGCTGCCGATGGCGaaaaaaaaaaaa 

$ cat test.fq
@SRR001666.1 071112_SLXA-EAS1_s_7:5:1:817:345 length=36
GGGTGATGGCCGCTGCCGATGGCGTCAAATCCCACC
+
IIIIIIIIIIIIIIIIIIIIIIIIIIIIII9IG9IC
13 months ago
Limoges, CBRS, France

Hello ADDNOTHIING

Is this what you are looking for? If your Fasta is single-line (one sequence line per record) and the reads are in the same order as in the FastQ, this should work; otherwise a Python solution may be more suitable.

sed 's/^>/@/' reads.fa | paste - - <(seq -w 1 $(grep -c '^>' reads.fa) | xargs printf '+\n%.s') <(awk 'NR % 4 == 0' reads.fq) | sed 's/\t/\n/g' > new.fq
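
For multi-line Fasta, or when read order is not guaranteed, the Python route mentioned above could be sketched roughly as follows. This is only a sketch under stated assumptions: the read ID (first word of the header) is assumed to be unchanged between the FastQ and the Fasta, the FastQ is assumed to use four lines per record, and reads whose length changed are skipped. The function names are hypothetical, and all qualities are held in memory, which may not suit very large files.

```python
import sys


def read_id(header):
    """Return the first whitespace-separated word of a header line."""
    parts = header.split()
    return parts[0] if parts else ""


def read_fasta(path):
    """Yield (header, sequence) pairs; sequences may span several lines."""
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            elif line:
                chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def read_fastq_qualities(path):
    """Return {read_id: quality_string}, assuming 4 lines per FastQ record."""
    qualities = {}
    with open(path) as handle:
        while True:
            header = handle.readline()
            if not header:
                break  # end of file
            if not header.strip():
                continue  # skip stray blank lines
            handle.readline()  # sequence line (not needed here)
            handle.readline()  # '+' separator line
            qual = handle.readline().rstrip("\n")
            qualities[read_id(header[1:])] = qual
    return qualities


def restore_qualities(fasta_path, fastq_path, out_path):
    """Write a FastQ with Fasta sequences and the original qualities.

    Returns the number of records written. Reads without a matching ID,
    or whose length differs from the stored quality string, are skipped.
    """
    qualities = read_fastq_qualities(fastq_path)
    written = 0
    with open(out_path, "w") as out:
        for header, seq in read_fasta(fasta_path):
            qual = qualities.get(read_id(header))
            if qual is None or len(qual) != len(seq):
                print(f"skipping {read_id(header)}: no usable quality string",
                      file=sys.stderr)
                continue
            out.write(f"@{header}\n{seq}\n+\n{qual}\n")
            written += 1
    return written
```

With the file names from the one-liner, the call would be `restore_qualities("reads.fa", "reads.fq", "new.fq")`. Matching by read ID rather than by position is a deliberate choice here, so the result does not depend on both files being in the same order.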