How do you get rid of reads with matching pattern in fastq
5.7 years ago
MAPK ★ 2.1k

I need to remove all reads that have an A at the 3' end from a FASTQ file and write a new FASTQ without them. How could you achieve this?

fastq • 1.5k views

Just to be clear, you are only talking about discarding reads where the last base is A? Can you modify the answers from your question yesterday to start thinking about how you can do this: Command to count reads in fastq file with last bases

Curious as to why you want to do this.

  1. Why do you want that? I am really curious!

  2. By that, do you mean you don't want reads like these

    AAAGTACGATCACTACTACATC

    AAGTACGATTAACTACTACATC

    AGTGTACGGGGATCACTACTAC

But these will be okay?

AAAGTACGATCACTACTACATC
AAGTACGATTAACTACTACATC
AGTGTACGGGGATCACTACTAC

I want reads like: AATTTATATGGGAGCCAC, but not: GATTAGGGCCGCGGGATA

I need to analyze small RNA structures, so I need to do this with reads that do not end in A.

5.7 years ago
GenoMax 141k

See if this does the trick:

cat your.fastq | paste - - - - | awk -F '\t' '{if ($2 !~/A$/){ print $0}}'| tr "\t" "\n" > filtered.fastq
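
To check the command before running it on real data, here is a small sketch on a toy file. The reads and the file names toy.fastq and toy.filtered.fastq are made up for illustration, and it assumes a plain 4-line-per-record FASTQ (no wrapped sequence lines). It uses the same paste/awk/tr chain, only reading the file with a redirect instead of cat.

```shell
# Build a 3-read toy FASTQ (made-up reads; file names are hypothetical).
printf '%s\n' \
  '@r1' 'AATTTATATGGGAGCCAC' '+' 'IIIIIIIIIIIIIIIIII' \
  '@r2' 'GATTAGGGCCGCGGGATA' '+' 'IIIIIIIIIIIIIIIIII' \
  '@r3' 'ACGTACGTAC' '+' 'IIIIIIIIII' > toy.fastq

# Fold each record onto one tab-separated line, keep records whose
# sequence (field 2) does not end in A, then unfold back to 4 lines.
paste - - - - < toy.fastq | awk -F '\t' '$2 !~ /A$/' | tr '\t' '\n' > toy.filtered.fastq

# Print the names of the reads that survived the filter.
awk 'NR % 4 == 1' toy.filtered.fastq
```

Only r2 ends in A, so r1 and r3 should remain in toy.filtered.fastq.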

This is lovely, wish I had known this construct much earlier: cat your.fastq | paste - - - -


@Pierre uses it often. I would not be surprised if he is the creator.


Do you mean paste, GenoMax?


@cshu181 was asking about the cat | paste construct.


Ending it with | tr "\t" "\n" is kind of an important part too.
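
To see what the two ends of the pipeline do, here is a minimal sketch on made-up data (the demo.* file names are hypothetical). paste - - - - folds every 4 input lines into one tab-separated line, and tr turns each tab back into a newline, which restores the original FASTQ layout.

```shell
# Two made-up records, 4 lines each.
printf '%s\n' '@r1' 'ACGT' '+' 'IIII' '@r2' 'TTGA' '+' 'IIII' > demo.fastq

# One line per record with four fields: name, sequence, '+', quality.
paste - - - - < demo.fastq > demo.tsv
wc -l < demo.tsv

# Without this step the output would stay one record per line,
# which is no longer valid FASTQ.
tr '\t' '\n' < demo.tsv > demo.roundtrip.fastq
cmp demo.fastq demo.roundtrip.fastq && echo "round trip OK"
```

This works because tabs do not normally occur inside FASTQ sequence or quality lines; a header containing a literal tab would break the round trip.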

5.7 years ago

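@MAPK Posting example data and expected output would help people answer your query better.

Try seqkit grep -vsrip "a$" example.fastq (-v inverts the match, -s searches the sequence, -r treats the pattern as a regular expression, -i ignores case, -p gives the pattern). SeqKit can be downloaded from here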
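
The -i flag in the seqkit command makes the match case-insensitive, so reads ending in a lowercase a are dropped as well. If seqkit is not available, a rough awk equivalent is sketched below; it is an assumption-laden stand-in rather than a replacement for seqkit, it expects 4-line FASTQ records, and the reads and mixed.* file names are made up.

```shell
# Made-up reads ending in c, a and A (file names are hypothetical).
printf '%s\n' '@r1' 'ACGTc' '+' 'IIIII' '@r2' 'ACGTa' '+' 'IIIII' '@r3' 'ACGTA' '+' 'IIIII' > mixed.fastq

# Uppercase the sequence only for the comparison, so both 'a' and 'A'
# endings are rejected; the record itself is printed unchanged.
paste - - - - < mixed.fastq | awk -F '\t' 'toupper($2) !~ /A$/' | tr '\t' '\n' > mixed.filtered.fastq

cat mixed.filtered.fastq
```

Only r1 should be left, with its original lowercase c preserved.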
