Hi, I want to extract the reads that contain my sequence of interest from a FASTQ file. The sequence may be at the start of a read or in the middle of it. How can I extract these reads?
Use bbduk.sh from the BBMap suite in filter mode. User guide here. Add literal=sequence_you_are_looking_for.
If you are not only looking for exact matches, you can try seqkit grep:
seqkit grep --by-seq --max-mismatch 1 --pattern "ATCGAAG" test.fq
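To make the idea of a mismatch-tolerant search concrete, here is a minimal Python sketch. It is not how seqkit is implemented; it only illustrates the concept, assuming mismatches are substitutions only (no indels) and that only the forward strand is checked. The function name `hamming_match` is made up for this example.

```python
def hamming_match(seq, pattern, max_mismatch=1):
    """Return True if `pattern` occurs anywhere in `seq` (start, middle or end)
    with at most `max_mismatch` substitutions. Hypothetical helper, forward
    strand only, no indels."""
    k = len(pattern)
    # Slide a window of the pattern's length over every position in the read.
    for start in range(len(seq) - k + 1):
        mismatches = 0
        for a, b in zip(seq[start:start + k], pattern):
            if a != b:
                mismatches += 1
                if mismatches > max_mismatch:
                    break  # too many mismatches in this window, try the next one
        else:
            # Inner loop finished without breaking: this window is a hit.
            return True
    return False
```

For example, `hamming_match("GGGATCGTAGTT", "ATCGAAG", 1)` accepts the read because the window `ATCGTAG` differs from the pattern at one position, while the same call with `max_mismatch=0` rejects it.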
What are these sequences, and why not extract them with the grep function?
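One caveat with plain grep on a FASTQ file: it prints only the matching line, so the header and quality lines are lost, and it can also hit quality strings that happen to contain A, C, G or T. A grep-like exact search that keeps whole records could be sketched in Python as below. This assumes the common four-lines-per-record FASTQ layout (no wrapped sequences), and `filter_fastq` and the demo reads are invented for illustration.

```python
import io

def filter_fastq(handle, pattern):
    """Yield whole FASTQ records whose sequence line contains `pattern`.
    Assumes exactly four lines per record (hypothetical helper)."""
    while True:
        header = handle.readline()
        if not header:
            break  # end of file
        seq = handle.readline()
        plus = handle.readline()
        qual = handle.readline()
        # Only the sequence line is searched, never the quality string.
        if pattern in seq:
            yield header + seq + plus + qual

# Tiny made-up input: r1 has the pattern at the start, r3 further in.
demo = io.StringIO(
    "@r1\nATCGAAGTTT\n+\nIIIIIIIIII\n"
    "@r2\nCCCCCCCCCC\n+\nIIIIIIIIII\n"
    "@r3\nGGGATCGAAG\n+\nIIIIIIIIII\n"
)
kept = list(filter_fastq(demo, "ATCGAAG"))
```

With a real file you would pass `open("test.fq")` instead of the `io.StringIO` demo and write each kept record to an output file.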
Are you looking for an exact match to your specific sequence? What have you tried?