Question

Common reads between two fastq files

5.3 years ago

Inquisitive8995 ▴ 270

Hello,

I have two fastq files which look something like this:

File 1:

>@SRR596683.96/1
TTGGGGGCTGTGACTGAAGAGAGTGACAGATCAATGAGCGAGTGGATGGCTAGCAGGAAGAACACGGGAGAGAGAA
+
:=<;1?)07?<A7AA#############################################################

> @SRR596683.238/1
CGAAAGCATCATAATCAGGAGTAAGACGAACATATGCCTTCTCTTTATTAGGTCAAATCATGGTGATGATCATTGC
+
1++?AA+<=?+?7=<,2++<+3<<=+?C0=4ABBB<=ABBA9?ABBBA############################

File 2:

> @BADLQCSRR596683.54 54 length=76
TTCAGCGTGTTAACATATTTGAAGTGCTTAAAAATGAGGCTTTTGTCCAGGGATTAATGAGTGAATACAAAAATTG
+SRR596683.54 54 length=76
############################################################################
> @BADLQCSRR596683.96 96 length=76
TTGGGGGCTGTGACTGAAGAGAGTGACAGATCAATGAGCGAGTGGATGGCTAGCAGGAAGAACACGGGAGAGAGAA
+SRR596683.96 96 length=76

I want to extract the reads that are common to both files. For example, SRR596683.96 appears in both. I tried grep -Fwf and grep -Fxf, but neither gave the expected result.

I want the output file to look like this:

@SRR596683.96/1
TTGGGGGCTGTGACTGAAGAGAGTGACAGATCAATGAGCGAGTGGATGGCTAGCAGGAAGAACACGGGAGAGAGAA

Thanks in advance. Any help would be appreciated.

exome sequence Assembly • 2.0k views

Updated 5 weeks ago by Ram 43k • written 5.3 years ago by Inquisitive8995 ▴ 270

score 4 · Answer 1 · 2019-01-04

5.3 years ago

finswimmer 16k

Hello,

I hope the > characters in front of the sequence IDs aren't actually in your files; otherwise these are not valid FASTQ files.

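Assuming you have valid FASTQ files, you can use seqkit common for this task. In the command below, -s matches reads by sequence instead of by ID (the IDs in your two files are not identical), -i ignores case, and seqkit fq2fa converts the matching records to FASTA, which drops the quality lines.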

$ seqkit common file1.fastq file2.fastq -s -i | seqkit fq2fa
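
If seqkit is not installed, the same sequence-based matching can be roughly sketched in plain awk. This is only an illustration, not what seqkit does internally. The function name common_reads is made up for this example, and it assumes both files are strict four-line FASTQ records (no wrapped sequences, no leading > characters), and that file 2 is non-empty and fits in memory.

```shell
# common_reads FILE1 FILE2
# Hypothetical helper: print header + sequence for every read in FILE1
# whose sequence also occurs in FILE2 (case-insensitive comparison).
# Assumes strict 4-line FASTQ records and a non-empty FILE2.
common_reads() {
    awk '
        # FILE2 is read first: remember every sequence line (line 2 of each record).
        NR == FNR { if (FNR % 4 == 2) seen[toupper($0)] = 1; next }
        # FILE1 is read second: keep the current header line.
        FNR % 4 == 1 { header = $0 }
        # Print header and sequence when the sequence was seen in FILE2.
        FNR % 4 == 2 && (toupper($0) in seen) { print header; print $0 }
    ' "$2" "$1"
}
```

It would be called as common_reads file1.fastq file2.fastq > common.txt. Unlike the seqkit pipeline, the header keeps its leading @, which matches the output format asked for in the question.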

fin swimmer
