6.9 years ago
akbioinfo14
I want to extract reads from a FASTQ file based on a list that contains the FASTQ headers of the reads I need.
Fastq.fq
@E00502:107:ZA20170508213:6:1101:4442:1836
GNGCCGCCGCAGGCTGTAACCCTTGAACATTTGGTTAAAGGTGAATGTGTCGTAGCAGTTTTAAGGATTCTTGGACGGACCGAGTATGCTCACCAACCGCGGAGACATACAATATTGTGCTATGCTAAACCGGCTCAAGGGATTGGTCCGC
+
A#A<FF<FJ<FFFJJF<7FF7<J7A<------<7-<--<<AJAJJJ7F<J-7--F7FAJJ7AF<FAA---F<A<-77-7A-A7F--AF7-7AJ-7-FJAA<F7---7-77A-----7AF-7<JF--7-A---------7-)7-7----77)
@E00502:107:ZA20170508213:6:1101:4827:1836
CNGCTGGGGGAATCTCGGTTGATTTCTTTTAATAGGGGTACTTAGATGTTTAAATTCCCCCGGTTCGCCTCATTACCCTATGGAATCAGTTAATGATAGTGTATCGAAACAAACTGGGTTTCCACATTAAAACAACGTCGGCAATAACGGT
+
A#AA<A--AA<J-FJ-77-FA<<FJ7A-F7--<-77AAA<JJ<-<JFF7F<-<--F<FFA7AAA-<-AJA77-<--<-AA7AAF-F-AFFAAJFJJ--A-<F-7<F7FF<F-7-7---77--7--7----7--A-A--A)F)))7
Header.lst
@E00502:107:ZA20170508213:6:1101:4442:1836
Expected Output
@E00502:107:ZA20170508213:6:1101:4442:1836
GNGCCGCCGCAGGCTGTAACCCTTGAACATTTGGTTAAAGGTGAATGTGTCGTAGCAGTTTTAAGGATTCTTGGACGGACCGAGTATGCTCACCAACCGCGGAGACATACAATATTGTGCTATGCTAAACCGGCTCAAGGGATTGGTCCGC
+
A#A<FF<FJ<FFFJJF<7FF7<J7A<------<7-<--<<AJAJJJ7F<J-7--F7FAJJ7AF<FAA---F<A<-77-7A-A7F--AF7-7AJ-7-FJAA<F7---7-77A-----7AF-7<JF--7-A---------7-)7-7----77)
Command used:
./seqtk subseq Fastq.fq Header.lst > test.fq
I used the seqtk tool, but it does not work for this. Please suggest another method or tool that produces the expected output.
duplicate of:
How To Efficiently Parse A Huge Fastq File?
How To Extract Set Of Reads From Fastq (Or Eventually Fasta And Qual) Based On List Of Ids?
How To Extracting Fastq Sequence For Given Fastq Ids And Fastq File
Extracting A Subset Of Sequences From A Fastq File (Biopython Speed)
seqtk is not working, so I am asking this question again here. Is there a Perl one-liner for this?
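One thing that may be worth checking first: as far as I know, seqtk subseq expects read names in the list without the leading `@`, while Header.lst above includes it, so removing the `@` from the list might be enough. If a standalone script is preferred, below is a minimal Python sketch (not a Perl one-liner) of the same filtering. It assumes strict 4-line FASTQ records and matches on the read name up to the first whitespace; the function names `load_ids` and `filter_fastq` are made up for illustration.

```python
def load_ids(lines):
    """Collect read names from an iterable of header lines.

    A leading '@' is dropped and only the text up to the first
    whitespace is kept, so '@read1 extra' and 'read1' both give 'read1'.
    """
    ids = set()
    for line in lines:
        name = line.strip()
        if not name:
            continue  # skip blank lines in the list
        if name.startswith("@"):
            name = name[1:]
        fields = name.split()
        if fields:
            ids.add(fields[0])
    return ids


def filter_fastq(fastq, ids, out):
    """Copy records whose name is in `ids` from `fastq` to `out`.

    Assumption: every record is exactly 4 lines (header, sequence,
    '+', quality). Returns the number of records written.
    """
    kept = 0
    while True:
        header = fastq.readline()
        if not header:
            break  # end of file
        if not header.strip():
            continue  # tolerate stray blank lines between records
        seq = fastq.readline()
        plus = fastq.readline()
        qual = fastq.readline()
        fields = header[1:].split()
        if fields and fields[0] in ids:
            out.write(header + seq + plus + qual)
            kept += 1
    return kept


# Example use with the files from the question:
# with open("Header.lst") as lst:
#     wanted = load_ids(lst)
# with open("Fastq.fq") as fq, open("test.fq", "w") as out:
#     filter_fastq(fq, wanted, out)
```

The IDs are held in a set, so each lookup is constant time and the FASTQ file is read only once, which should keep this usable on large files as long as the ID list fits in memory.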