Any packages to validate a FASTA file?
5.4 years ago
khaeuk ▴ 100

I am trying to create a function that takes a file and checks whether it is a valid FASTA file (for example, making sure there are no leading tabs or spaces, the first character is >, there are no empty lines between sequences, etc.).

I have tried using SeqIO.parse(filename, "fasta"), but it still parsed without complaint in cases where a record had only the description line starting with > and no sequence.

I was trying to code this myself, but I was wondering whether there are other packages that check the validity of the FASTA format?

Thanks -

python fasta • 8.1k views

If you need empty records to be considered invalid, maybe you could submit a pull request to Biopython.


You could subclass the SeqIO parsing classes and extend the sequence checks to cover empty seqs, etc.
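
As a rough illustration of the checks listed in the question (no leading whitespace, a > header first, no empty lines between records, no header without a sequence), here is a minimal sketch. It does not subclass SeqIO; it is a plain line scan using only the standard library, and the function name `validate_fasta` and the problem messages are made up for this example. It tolerates trailing blank lines at the end of the file, which is an assumption you may want to change.

```python
import io


def validate_fasta(handle):
    """Scan a text handle and return a list of (line_number, message) problems.

    An empty list means none of the checks below failed. This is a sketch of
    the rules from the question, not a full FASTA specification.
    """
    problems = []
    header_line = None      # line number of the most recent '>' header
    has_sequence = False    # whether the current record has any sequence line
    pending_blank = None    # first blank line seen since the last content line

    for lineno, raw in enumerate(handle, start=1):
        line = raw.rstrip("\r\n")

        if line.strip() == "":
            # Only an error if more content follows; trailing blanks are allowed.
            if pending_blank is None:
                pending_blank = lineno
            continue

        if pending_blank is not None:
            problems.append(
                (pending_blank, "empty line between records or sequence lines")
            )
            pending_blank = None

        if line[0] in " \t":
            problems.append((lineno, "leading whitespace"))
            line = line.lstrip()

        if line.startswith(">"):
            # A new header closes the previous record; it must have had sequence.
            if header_line is not None and not has_sequence:
                problems.append((header_line, "record has no sequence"))
            header_line = lineno
            has_sequence = False
        else:
            if header_line is None and not has_sequence:
                problems.append(
                    (lineno, "sequence data before the first '>' header")
                )
            has_sequence = True

    if header_line is None:
        if not problems:
            problems.append((0, "no records found"))  # line 0: whole-file issue
    elif not has_sequence:
        problems.append((header_line, "record has no sequence"))

    return problems


# Example: the second header follows the first with no sequence in between.
example = io.StringIO(">seq1\n>seq2\nACGT\n")
print(validate_fasta(example))
```

For a real file you would call it as `with open(path) as fh: validate_fasta(fh)`. It does not check which characters appear in sequence lines or whether identifiers are unique; those would be further checks to add.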

8 months ago
Ankit • 0

There are two options; use whichever suits you:

  1. fasta_validator (https://github.com/linsalrob/fasta_validator): simple C code to validate a FASTA file. It only checks a few things, and by default it reports its result only via the return code, so you will need to check that!

  2. Install a FASTA file validator from pip or conda.
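
Since the C validator reports its result through the return code, one way to check that from Python is sketched below. The helper name `passes_validator` is hypothetical, and the sketch assumes the usual convention that exit status 0 means "valid" and anything else means a problem; check the tool's README for the executable name and the meaning of each non-zero code.

```python
import subprocess
import sys


def passes_validator(command, fasta_path):
    """Run an external validator on fasta_path and inspect its exit status.

    `command` is the executable plus any options, given as a list. Returns
    (ok, returncode), where ok is True only for exit status 0 (assumed to
    mean "valid"; this is a convention, not something verified here).
    """
    result = subprocess.run(
        [*command, fasta_path],
        stdout=subprocess.DEVNULL,   # the result is read from the exit status
        stderr=subprocess.DEVNULL,
        check=False,                 # do not raise on a non-zero exit status
    )
    return result.returncode == 0, result.returncode


# Usage sketch (the executable name here is an assumption, see the README):
#   ok, code = passes_validator(["fasta_validate"], "sequences.fasta")
#   if not ok:
#       print(f"validator exited with status {code}", file=sys.stderr)
```

Passing the command as a list avoids shell quoting issues with unusual file names.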