Question: How to extract reads that contain a specific sequence from a FASTQ file

Hi, I want to extract the reads that contain a sequence of interest from a FASTQ file. The sequence can occur either at the start of a read or in the middle of it. How can I extract these reads?

Thank you!

7 months ago by hsu • 0 • updated 7 months ago by WouterDeCoster 39k

Use bbduk.sh from the BBMap suite in filter mode (see the BBDuk user guide). Add literal=sequence_you_are_looking_for.

7 months ago by genomax 68k

If you are not only looking for exact matches, you can try seqkit grep, which allows mismatches: seqkit grep --by-seq --max-mismatch 1 --pattern "ATCGAAG" test.fq

7 months ago by SMK ♦ 1.3k

What are these sequences, and why not extract them with grep?
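
For reference, a minimal sketch of a grep-style exact match. It uses awk rather than plain grep, because grep would also test header and quality lines and would return only the matching line, not the whole record. It assumes an uncompressed FASTQ with strict 4-line records (no wrapped sequences), searches the forward strand only, and allows no mismatches. The file names demo.fq and matched.fq and the pattern ATCGAAG are placeholders for illustration.

```shell
# Hypothetical demo input; replace demo.fq with your own FASTQ file.
cat > demo.fq <<'EOF'
@read1
ATCGAAGTTTT
+
IIIIIIIIIII
@read2
GGATCGAAGCC
+
IIIIIIIIIII
@read3
CCCCCCCCCCC
+
IIIIIIIIIII
EOF

# Buffer each 4-line record and print it only if the sequence line
# (line 2 of the record) contains the pattern at any position.
awk -v pat="ATCGAAG" '
    NR % 4 == 1 { header = $0 }
    NR % 4 == 2 { seq = $0 }
    NR % 4 == 3 { plus = $0 }
    NR % 4 == 0 { if (index(seq, pat) > 0) print header "\n" seq "\n" plus "\n" $0 }
' demo.fq > matched.fq
```

Here read1 (pattern at the start) and read2 (pattern in the middle) are written to matched.fq, while read3 is dropped. To also catch reads from the opposite strand, the same command could be run a second time with the reverse complement of the pattern.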

7 months ago by darbinator • 180

Are you looking for an exact match of your specific sequence? What have you tried?

7 months ago by WouterDeCoster 39k
