Removing small contigs from fasta files
4.4 years ago
Nyksubuz ▴ 10
## removesmalls.pl
#!/usr/bin/perl
use strict;
use warnings;

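my $minlen = shift or die "Error: `minlen` parameter not provided\n";
{
    local $/ = ">";                 # read one FASTA record at a time
    while (<>) {
        chomp;                      # strip the trailing ">" record separator
        next unless /\w/;           # skip the empty chunk before the first header
        my @chunk  = split /\n/;
        my $header = shift @chunk;  # first line is the header, the rest is sequence
        my $seqlen = length join "", @chunk;
        print ">$_" if $seqlen >= $minlen;
    }
}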

Executing the script as follows:

perl removesmalls.pl 1000 contigs.fasta > contigs-1000.fasta

The above script works for me, but there is a problem: I have 109 different FASTA files with different file names. I can run the script on each file individually, but I want to run it on all files at once, with a separate result file for each.

The file names are like SRR8224532.fasta, SRR8224533.fasta, SRR8224534.fasta, and so on. After removing the short contigs (in my case, those shorter than 1000), I want the result files to be named something like SRR8224532-out.fasta, SRR8224533-out.fasta, and so on.

Any help or suggestion would be helpful.

Assembly perl fasta contigs • 2.3k views

Write a loop function!

4.4 years ago
liorglic ★ 1.4k

You have two options:
1. Change the script so it can loop on your list of files - shouldn't be too hard.
2. Use some bash tricks to run your script as is on all your files. This assumes you're on some kind of Unix OS (Linux, macOS, etc.). Then you can do something like this from your command line:

files=("file1.fasta" "file2.fasta" "file3.fasta")
for f in "${files[@]}"; do echo "$f"; perl removesmalls.pl 1000 "$f" > "${f%.fasta}-gt1000.fasta"; done
I think this will work too:

for i in *.fasta; do
    perl removesmalls.pl 1000 "$i" > "${i%.fasta}-out.fasta"
done
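One caveat with a bare *.fasta glob: on a second run it would also match the earlier -out.fasta results. Below is a minimal sketch of a more defensive loop. It assumes removesmalls.pl sits in the current directory, and the function name filter_all_fasta and its two arguments are hypothetical, chosen only for illustration.

```shell
# Sketch: run removesmalls.pl on every .fasta file in a directory.
# filter_all_fasta is a hypothetical helper name, not part of any package.
# Usage: filter_all_fasta MINLEN [DIR]
filter_all_fasta() {
    local minlen="$1" dir="${2:-.}"
    local f out
    for f in "$dir"/*.fasta; do
        [ -e "$f" ] || continue                      # glob matched nothing
        case "$f" in *-out.fasta) continue ;; esac   # skip results of earlier runs
        out="${f%.fasta}-out.fasta"                  # SRR8224532.fasta -> SRR8224532-out.fasta
        # Assumes removesmalls.pl is in the current working directory.
        perl removesmalls.pl "$minlen" "$f" > "$out" || echo "filter failed for $f" >&2
    done
}
```

Calling filter_all_fasta 1000 in the directory that holds the 109 files should leave one -out.fasta file next to each input.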

4.1 years ago
Bioinfo ▴ 20

Hello, one option is to use reformat.sh from the BBMap package:

reformat.sh in=contigs.fasta out=filtered.fasta minlength=200

Good luck!
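If neither Perl nor BBMap is at hand, the same length filter can be sketched with awk from the shell. This is a swapped-in technique, not the script from the question. The helper name filter_min_len is hypothetical, and the sketch assumes plain Unix line endings (a trailing carriage return would be counted as one base).

```shell
# Sketch: keep only FASTA records whose total sequence length is >= MINLEN.
# filter_min_len is a hypothetical helper name used for illustration.
# Usage: filter_min_len MINLEN FILE > OUTPUT
filter_min_len() {
    awk -v minlen="$1" '
        # Print the buffered record if its sequence is long enough.
        function emit_record() {
            if (header != "" && seqlen >= minlen) printf "%s\n%s", header, body
        }
        # A header line closes the previous record and starts a new one.
        /^>/ { emit_record(); header = $0; body = ""; seqlen = 0; next }
        # Any other line is sequence: buffer it and add to the running length.
        { body = body $0 "\n"; seqlen += length($0) }
        END { emit_record() }
    ' "$2"
}
```

For example, filter_min_len 1000 contigs.fasta > contigs-1000.fasta would mirror the single-file command from the question.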
