filter_fasta.py not removing sequences from fastq based on read IDs
5.8 years ago
fjs5035 • 0

I'm attempting to use filter_fasta.py in macqiime to remove sequences from a fastq based on a .txt file of read IDs.

filter_fasta.py -f my_reads.fastq -o filtered_reads.fq -s read_ids.txt -n

The input file has 366000 reads, and the output file also has 366000 reads, so nothing is being removed. I used grep to confirm that the read IDs are actually present in the fastq, and my read ID .txt file has only one ID per line. Any ideas what could be wrong?

sequencing sequence fastq qiime • 1.7k views

Hi fjs5035

I just added a hyperlink to the script you are referring to.

Additionally, could you add the output of the following commands? It will help you get quick answers.

  1. Output a few lines from your ID file

    head -n 10 read_ids.txt

  2. Output a few read headers

    grep "^@" my_reads.fastq | head -n 10
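
One caveat on the second command: FASTQ quality lines can also begin with "@", so grep "^@" may occasionally return lines that are not headers. A minimal sketch of an alternative, assuming standard 4-line FASTQ records (no wrapped sequence lines); example.fastq and its reads are hypothetical and exist only for illustration:

```shell
# Tiny example FASTQ (hypothetical reads); the second record's
# quality line deliberately starts with "@"
printf '@read1\nACGT\n+\nIIII\n@read2\nACGT\n+\n@III\n' > example.fastq

# Print only header lines: line 1 of every 4-line record
awk 'NR % 4 == 1' example.fastq | head -n 10
```

On real data the same awk filter would be pointed at my_reads.fastq instead of the example file.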

There is also a very useful tool in BBMap for this. If your issue persists, try filterbyname.sh from the BBMap suite of tools, available at link.

Just a hunch: does your read_ids.txt contain the "@" at the start of the fastq identifiers?

It does. I take it from your comment that that shouldn't happen.
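
If the leading "@" is indeed the cause, as the hunch above suggests, one possible fix is to strip it from the ID file before filtering. This is only a sketch under that assumption; example_ids.txt, example_ids_clean.txt, and the IDs in them are hypothetical names used for illustration:

```shell
# Small example ID file (hypothetical IDs), some with a leading "@"
printf '@read1\n@read2\nread3\n' > example_ids.txt

# Remove a single leading "@" from each line; lines without one are unchanged
sed 's/^@//' example_ids.txt > example_ids_clean.txt

cat example_ids_clean.txt
```

Applied to the real read_ids.txt, the cleaned file could then be passed to -s in the original filter_fasta.py command in place of the original ID list.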
