Common reads between two fastq files
1
0
Entering edit mode
5.3 years ago

Hello,

I have two fastq files which look something like this:

File 1:

>@SRR596683.96/1
TTGGGGGCTGTGACTGAAGAGAGTGACAGATCAATGAGCGAGTGGATGGCTAGCAGGAAGAACACGGGAGAGAGAA
+
:=<;1?)07?<A7AA#############################################################

> @SRR596683.238/1
CGAAAGCATCATAATCAGGAGTAAGACGAACATATGCCTTCTCTTTATTAGGTCAAATCATGGTGATGATCATTGC
+
1++?AA+<=?+?7=<,2++<+3<<=+?C0=4ABBB<=ABBA9?ABBBA############################

File 2:

> @BADLQCSRR596683.54 54 length=76
TTCAGCGTGTTAACATATTTGAAGTGCTTAAAAATGAGGCTTTTGTCCAGGGATTAATGAGTGAATACAAAAATTG
+SRR596683.54 54 length=76
############################################################################
> @BADLQCSRR596683.96 96 length=76
TTGGGGGCTGTGACTGAAGAGAGTGACAGATCAATGAGCGAGTGGATGGCTAGCAGGAAGAACACGGGAGAGAGAA
+SRR596683.96 96 length=76

I want to extract the reads that are common to both files. For example, SRR596683.96 appears in both. I tried grep -Fwf and grep -Fxf, but neither gave the expected results.

I want the output file to look like this:

@SRR596683.96/1
TTGGGGGCTGTGACTGAAGAGAGTGACAGATCAATGAGCGAGTGGATGGCTAGCAGGAAGAACACGGGAGAGAGAA

Thanks in advance. Any help would be appreciated.

5.3 years ago

Hello,

I hope the > characters in front of the sequence IDs aren't actually in your files; otherwise these are not valid FASTQ files.

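Assuming you have valid FASTQ files, you can use seqkit common for this task. Because the read IDs differ between your two files, match on the sequence (-s) rather than the ID; -i makes the comparison case-insensitive, and seqkit fq2fa then drops the quality lines: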

$ seqkit common -s -i file1.fastq file2.fastq | seqkit fq2fa

fin swimmer
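
If seqkit is not available, the same idea (keep the reads from the first file whose sequence also occurs in the second) can be sketched in plain Python. This is only a minimal illustration, not what seqkit does internally. It assumes uncompressed FASTQ files with exactly four lines per record and no blank lines between records, and a second file small enough to hold its sequences in memory. The function names `read_fastq` and `common_reads` are made up for this example.

```python
from itertools import islice


def read_fastq(path):
    """Yield (header, sequence) pairs, assuming strict 4-line FASTQ records."""
    with open(path) as handle:
        while True:
            # Take the next four lines: header, sequence, '+', quality.
            record = [line.rstrip("\n") for line in islice(handle, 4)]
            if len(record) < 4:
                break  # end of file (or a truncated trailing record)
            yield record[0], record[1]


def common_reads(path_a, path_b):
    """Return (header, sequence) pairs from path_a whose sequence is also in path_b.

    Sequences are compared case-insensitively; read IDs are ignored.
    """
    seqs_b = {seq.upper() for _, seq in read_fastq(path_b)}
    return [(hdr, seq) for hdr, seq in read_fastq(path_a) if seq.upper() in seqs_b]
```

Calling `common_reads("file1.fastq", "file2.fastq")` and printing the header and sequence of each returned pair gives output in the two-line layout requested in the question, with the headers taken from the first file. Unlike a dedicated tool, this sketch does not validate the input or remove duplicate sequences.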
