Hi All,
Thanks for the suggestions. I used the code from these posts to convert the fasta file into fastq.
- convert FASTA into FASTQ using linux
- BioPython: convert fasta to fastq without quality score input file
Here is a part of the original fasta file:
>cel1_count=3
TGCCTTGTCTGTCCTAAAAATC
>cel2_count=9
GTTAAGTGGGAAACGATGT
>cel3_count=7
CCGACCTTGAAATACCAC
>cel4_count=7
TAGAAATCCACTATGCTTTGG
>cel5_count=5
CGCGGGTGAGCAGCCTGGTAGCTCGTC
And the resulting fastq file:
@cel1_count=3
TGCCTTGTCTGTCCTAAAAATC
+
IIIIIIIIIIIIIIIIIIIIII
@cel2_count=9
GTTAAGTGGGAAACGATGT
+
IIIIIIIIIIIIIIIIIII
@cel3_count=7
CCGACCTTGAAATACCAC
+
IIIIIIIIIIIIIIIIII
@cel4_count=7
TAGAAATCCACTATGCTTTGG
+
IIIIIIIIIIIIIIIIIIIII
@cel5_count=5
CGCGGGTGAGCAGCCTGGTAGCTCGTC
+
IIIIIIIIIIIIIIIIIIIIIIIIIII
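For reference, a conversion of this kind can be sketched in plain Python as below. This is only a minimal sketch, not the exact code from the linked posts: the function name `fasta_to_fastq` and the `qual_char` parameter are made up for illustration, and `I` is simply a placeholder quality character repeated once per base.

```python
def fasta_to_fastq(fasta_text, qual_char="I"):
    """Convert FASTA text to FASTQ text with one placeholder quality per base.

    Hypothetical helper for illustration; the qualities are fake, because
    a FASTA file carries no quality information.
    """
    records = []
    header, seq_parts = None, []
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(seq_parts)))
            header, seq_parts = line[1:], []
        else:
            # Sequences may be wrapped over several lines; join them later.
            seq_parts.append(line)
    if header is not None:
        records.append((header, "".join(seq_parts)))

    out = []
    for name, seq in records:
        # Four lines per record: @name, sequence, '+', quality string.
        out.append(f"@{name}\n{seq}\n+\n{qual_char * len(seq)}\n")
    return "".join(out)


if __name__ == "__main__":
    example = ">cel1_count=3\nTGCCTTGTCTGTCCTAAAAATC\n"
    print(fasta_to_fastq(example), end="")
```

The quality string has to be exactly as long as the sequence, which is why it is built from `len(seq)` rather than a fixed width.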
When I ran miRExpress on the fastq file, it returned an error: Illegal Character: IIIIIIIIII
When I ran mirdeep2 on the original fasta file, it also returned an error: the first line of the file is not in accordance with fasta specification. Make sure that the file is according to specifications and does not contain whitespaces.
Is there a way to check the original fasta file against the fasta specification and fix whatever does not conform? I am hoping that may also solve the Illegal Character issue.
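To make the question concrete, the kind of check I have in mind is sketched below. It is only a rough sketch under my own assumptions, not mirdeep2's actual validation: the function name `check_fasta` is made up, and the rules it applies (first line starts with `>`, no whitespace in header lines, only A/C/G/T/U/N in sequence lines, no byte-order mark or carriage returns) are my guess at what the tools might object to.

```python
ALLOWED_BASES = set("ACGTUNacgtun")  # assumption: the alphabet the tool accepts


def check_fasta(raw: bytes):
    """Return a list of human-readable problems found in raw FASTA bytes.

    Hypothetical helper; an empty list means none of the checks fired.
    """
    problems = []
    # Invisible bytes that editors and Windows tools commonly add.
    if raw.startswith(b"\xef\xbb\xbf"):
        problems.append("file starts with a UTF-8 byte-order mark")
    if b"\r" in raw:
        problems.append("file contains carriage returns (non-Unix line endings)")

    lines = raw.decode("utf-8", errors="replace").split("\n")
    if not lines[0].startswith(">"):
        problems.append("line 1 does not start with '>'")

    for n, line in enumerate(lines, start=1):
        if line == "":
            # The empty string after the final newline is expected.
            if n < len(lines):
                problems.append(f"line {n} is blank")
            continue
        if line.startswith(">"):
            if any(ch.isspace() for ch in line):
                problems.append(f"line {n}: header contains whitespace")
        else:
            bad = set(line) - ALLOWED_BASES
            if bad:
                problems.append(f"line {n}: unexpected characters {sorted(bad)!r}")
    return problems


if __name__ == "__main__":
    sample = b">cel1_count=3\nTGCCTTGTCTGTCCTAAAAATC\n"
    for problem in check_fasta(sample) or ["no problems found"]:
        print(problem)
```

Reading the file in binary mode (`open(path, "rb").read()`) matters here, since text mode can silently translate line endings and hide exactly the characters being looked for.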
Thanks!