I would like help parsing FASTA headers that are too long. I want to load the sequences into a database and would like to shorten the headers, some of which are as long as three lines. Below is an example of the sequence headers I would like to parse. I am looking for a Perl script that can handle about 1,000 sequences.
>NCLIV_004380 | Neospora caninum | Cathepsin L, related | genomic | NCLIV_chrIb reverse | (geneStart+0 to geneEnd+0) | length=2793
ATGGACAACAGTGAGACGCACTACGTCTCCTTCCTCAACGGCGAGGGCGACGACGGATTG
GAGAACGGCGAGCTCCACCAGCGACGAGGCGTCCGAGCCGGCGGCGTGGCTGCAACTCCC
TACGTAGTAACGACTCGGACGTACTTTTGGAAGAAATTCCTGCGTCAGCGCAACTTTAAA
ACTCGGGCCTGGATCGCACTCGTAGCAGCGGCTGTGTCTCTCCTTGTCTTTGCCTCCTTC
CTCATTCAGTGGCAGGGAGATGACGATCGGGGTGTTTTCCCGCCGTCACCAGTCGAGGAC
CACAAAACCCCGGTGAACATCTGGGAGTGGAAAGAAGAACACTTCCAGAACGCCTTCGGC
AGCTTCCGAGCCACCTACGGCAAGAGCTACGCGACCGAGGAGGAGACACAAAAGCGATAT
GCCATCTTCAAAAACAACCTCGCTTACATCCACACGCACAACCAACAAGGTGAGCAGGCA
GCCCGTACGTGTGTGTGAAGACGTGCAAGAGAGTAAAAGACGTGAAGCGTCTCTTGAAGA
ATCACAGAGATGTGCCTTGCCGGCACGTGGCAGGCTTCTAGCCCACTTCATGGTACTGGG
AGAGCGAAACCGAGGCGTACGTACACCCGCCTCGCTGAACGGCACATGTCAACCGCTACG
TTTTTTTCCCGTACCACCCAGCAAACACTCTTTTCGGCGAGTCCTGGACTGCGCGTGAGG
TCAGCGCCTGTAAAAACTTGGGACTGTCTGCGTGGAGAGGATGCCCGGGTTTTCCTCGTT
GCTGCTGGGTCTTTGCTTTTTTTCTAATCGAAGCGCGCTGTTTGTCGGTGCAACGTCACT
TGCAGGGTATTCGTACTCGCTCAAGATGAACCATTTCGGGGACTTGTCTCGCGAAGAGTT
TCGCCGCAAGTACCTCGGCTACAACAAGTCTCGCAATCTCAAGTCCAACAACCTGGGGGT
ACGTGTTTCAACTCACTCTTTCGTGGCAGGACGCACGACGGAAAAAATGTGCGGTTCGGC
TTCACCATTTGGGGAAGGCATCAGCTGGCCTGACACGCGCGTCTCACGGGGGAGACACTT
GTCTTGGAACTGCCGATCACATTCCTCTTTCCACTGCGGATTCACATACACATGTCCCAA
CATCTAGATCTTCTTTGATCGTGGATATATGTCTATCCCTCTCGCTGTAAAACCGTGGAA
GCGTTGGTGCTCATGCGTGCGAGCGTGCGCATCTCCGCGTTTTTGTGTTTGCATGGTCGT
GTTGAGTTTTTAGCTCTTCTTCTCTCGCTCGCCGTTTCGTTCTCCCCGTGCACACACACT
TCCTTCTCTCTCTTTTCTCTCTCAGCATGCATGTGGAGAGTCAGTAGACGCGTCGATACC
TTCTCTTCCTTTTCCGTCACCAAAAAGCCACCAACACACGTGCACTCGCGTTCCGGGGCG
TCTTCGTGTCTGTCTCTCTCTGTGTCCCCTCTGAGCGTTTGCGTGTGTGTTCTCGTCCTC
TCTCTTCTTCTTGGTGGAAGCGTTTCTTTTCGGTTGCTTTCTTGTGTGTAAACGCGTTGC
TCTTTTGTGTCGCTCTCTCGCTCTTCGTTCCGACGCTTTGGCGTTTCCCAGGTGGCGACG
GAGCTGTTGAAGGTTTCGCCGAGCGACGTGCCGTCTGCGGTGGACTGGCGAGAGAAGGGG
TGTGTGACGCCTGTGAAGGATCAGCGCGATTGCGGCTCGTGCTGGGCGTTCTCCGCCACG
GGAGCGTTGGAGGGAGCGCACTGCGCGAAGACTGGCGAGTTGCTGAGTTTGAGCGAACAG
GAGTTGGTGGATTGCTCACTTGCAGAAGGCAACCAAGGCTGCAGCGGCGGAGAGATGAAC
GACGCGTTCCAGTACGTCGTGGACAGCGGCGGCCTGTGCTCGGAGGAGGGGTATCCGTAC
CTCGCTCGAGACGGAGAGTGCAAGAGGGCGTGCAAGAAAGTGGTGACCATCTCGGGGTTC
AAGGACGTGCCGAGGAAGAGCGAGACGGCGATGAAGGCCGCGCTGGCTCACAGCCCCGTG
AGCATCGCGATCGAGGCCGATCAGCTGCCTTTCCAGTTCTACCACGAAGGCGTTTTCGAC
GCGTCCTGCGGAACCGACCTCGACCACGGCGTCCTCCTCGTGGGCTACGGAACCGACAAA
GAGACAAAGAAAGACTTCTGGATCATGAAAAACTCCTGGGGATCAGGATGGGGCCGAGAC
GGGTACATGTACATGGCCATGCACAAGGGCGAAGAGGTAAGCCAGTCGAGAGGAGACGCG
CGCGACAGGAGGGGGGAAGAAGGGAGGGGGGCGGTATATAGAGGGGAGGGGAGATGGAGA
GGAGGGAAACGTAGAGAAACAAAGAGGCAGAGATTTGAGGCGGCACACAATCAGAGGAGG
GGCGCGTGACATCTGGAGAGAGCCGTTTGTGCGCTTGTGTTTTCAGGGACAGTGCGGCCT
TCTTTTGGATGCGTCTTTCCCCGTGGTGTAGAAGCGCCCAAGACGCGATCTCGAAAAAGG
GTTCGAGAAGCGAGACGCGGAGGACGTTGGTGCTCCGAGAAGAAAGAGGGACGTGGCCGG
GAATTGAAAGGGAGGAAAAAATGCAGGGGAGGCAAGAGAAGCGAGAGAGCGAAGTGGAGC
GGCCGGCTCACCCAGGAAGAGAGGAAAAAGGGAGAACCGATTTGAAAGGGAGAGAAAGGA
GAAAAATAGAGAGAGGCGTGCCACGCGATGTCCTCGTTTCGTGGTTTTTGTTCATGTTCT
GTCCCGCCGCAGACTGTGCCTTTGGACTTTTGA
>TGGT1_064570 | Toxoplasma gondii GT1 | cysteine protease, putative | genomic | TGGT1_chrIb reverse | (geneStart+0 to geneEnd+0) | length=3478
GAACATTGGCTTTTTTCATCCTTACCCCGGAACCTGGACGTTCTTTTTGGCCCTCCCCGG
CTCCCGAAAAAGAGCCTGCCCCCCAGCGGCGCGTCGGTTCGCGGGTTTCCCCCGTACTTT
TCTTATTTCGCAGAGTTTTGACCGTTTGTCCCTCTTACCCCGCGGCCTCGCTCTCGAACA
CCATGGACAGCAGCGAGACGCACTACGTCTCCTTCCTCAACGGCGAGGACGACGGGCTGG
AGAACGGCGAGCTTCACCAGCGACGAGGCGTCCGCGCTGGCCGGCCGAGTCCTCCGTTCG
TTGTTACGACCCGCACGTACTTCTGGAAGAAATTTCTCCGTCAACGCAACTTCACAGCGC
GGGCGTGGATCGCACTCGTGGCTGCGGCGGTGTCTCTCCTTGTCTTTGCCTCGTTCCTCA
TTCAGTGGCAGGGCGAAGACGACCGCGCGGTCTTCCCGCCGTCGCCAGTTGAGGACCACC
AACCCCCCGCCAACATCTGGGAATGGAAAGAGGCGCACTTTCAGGACGCCTTCAGCAGCT
TCCAAGCCATGTACGCCAAGAGCTACGCGACTGAGGAAGAAAAACAGAGGCGATACGCCA
TCTTCAAAAACAACCTCGTCTACATTCACACACACAACCAACAAGGTAGGCACTCGCGCA
GTGTGCGTGGTGGGAGGGGCCGCAAACTTTCTTCTTCTGCGCTCGTTAATGTGCCGACAG
AACTCTCAGTGAACAAAAGTATTCCATAGTTCGAGGACACACAAAACTCCTCGAAACTGC
GGGCTGGACGTCGTGTCTGGAGTCCATCTTTGCATCAAAAAGTTGAGGCTCGGTGCATCT
GAAGAGGCGTTCTCATGCAATGCGGAGCCTAGAACATATACTCGATGAGGAGAAACCGCG
GAGTCTCAACGTCAGCTCTGCAGTGCAGCGGATAATAAGTATCCATTTTCAATGCGATTT
CGTCCAAGACACGGTGTATGATCTACATGTTCCGAATGCCGGGGGGGGCACTGGAGGATA
CGGCAACTTCTTTAGACCAATCTGCGCTGGGGGTCGGAAGTCGGGCTCCCGAGAATCAAC
GCAGTCTCCTGTTTCCGACACCCTTGCAGTGAAGTCGCTTGGTCTGATCTGTGAACTTCG
ACAAACATGGCGTGGATTTTTTTTCCGCTGTCCCCCGACACGAACGGTGCTCCTTTCGCA
CTCTTCTGACTGAACGCGTCAACCGGCAGGTTGAAGACGAGCGCGGCTTCCGTTTTTCCG
CGCCTTTTCTCATTCGTTTCACTTCCTCCGAGGCATCTTTAGGTGAAGCGAACGCGTCTC
TTTCTGTTTCGAGGGAGACGTTTGGACGCGCGACTTTTTTCGTGAGTTCTGCCGGGTTGT
GTGTGGCTACTGAACACATTCCCTGCGTTGAGAGAAGGTGCTCGGTTTTCGTGGTTGTTA
GATCGGCCTTGCATCGAAAGACTCGGTTTCCCTCCGCCTGCATGCAGGATATTCCTACTC
CCTCAAGATGAATCACTTTGGAGACTTGTCGCGCGACGAGTTCCGCCGGAAGTACCTCGG
CTTCAAGAAGTCCCGCAACCTGAAGTCGCACCACCTTGGGGTAAGTGGCTTGGGTTTGTT
CTCGATCAGCCGTGCGACGGAAATGGTTTCTGCTCTGTACTTCTTGATCAGCAGTGCGAC
GGAAATGGTTTCTGCTCTGTACGGTCCGAAGCGGTCGAGGAAAGAGATGGTGGATTCAGT
GTCTGCAGGTGTCGTCGCGGGGAAGAGCTTCCCAGACATATCTGCGTTAGTGGAAGCGTG
GACTCGGCGACGAGAGGGAGATGTTGATTCACTGGAGAAGAAGATGGCAGTCAAACTTGG
GGTGTGACCTGCCGCTGAAGTGGCGACGGTCTCCAACTCGTGGAGTCCGACTTCGTACGA
CTGAGATGAGATCCACTTTGACAGCTTTTCTTTGGTGTCTTTGTCTTTGTTTTGGTCCGT
GGTTTTTGTTTCCAGCTGCGCGTCAAACACACTGAGTCGAGAGGGGTGTTTCTGTTCCCG
GCACACCTTGTTTTTGATCCACTTCACCTTTTGGTACCCTTTCGTCTTTTTTTTTCTGTC
TTGTTCTCTGCGTTCCTCTTTCTCCTTCTCTATTTCTTCTCTCCCCTCTACGTTTTTTTC
TCTCAGCCCCTCTCTGCCTTTCTCCCGCTTCCCCTGCGCACTTTCCATCCGCTCTGCTTC
CCAGTCTCGTTCCCCCGCGTTTCCTTCTTCTCCCGTTTTTTGTCCGTGTTCCGGTTCCGG
TTCCTGCTCCTGTTCCTGTTCCGGTTCGTTTTCTTCCTTCCGTCTCGCTCTCCTCTTACC
GGGCTCCTGCCGCGCGACGCGCTCCGCTGTCTCCCGCGCTGCTTCCTGTGTCAGGTGGCG
ACGGAGTTGCTGAATGTACTGCCAAGTGAACTGCCTGCTGGAGTGGACTGGCGCTCGCGC
GGATGCGTAACGCCGGTGAAGGACCAGCGAGACTGCGGCTCTTGCTGGGCGTTCTCGACC
ACAGGGGCTCTCGAGGGCGCGCACTGCGCAAAGACGGGCAAGCTGGTGAGTTTGAGTGAG
CAGGAGCTGATGGACTGCTCGCGAGCAGAGGGCAACCAGAGTTGCAGTGGCGGCGAGATG
AACGACGCGTTTCAGTACGTCTTGGACAGCGGCGGGATCTGCTCGGAGGATGCGTATCCG
TACCTTGCCCGAGACGAGGAGTGTCGAGCGCAGAGCTGTGAGAAAGTCGTGAAGATCTTG
GGCTTCAAGGACGTACCACGGAGAAGCGAGGCTGCGATGAAGGCCGCTCTCGCGAAGAGT
CCAGTGAGCATCGCCATCGAAGCCGACCAGATGCCTTTCCAGTTCTACCACGAGGGAGTC
TTTGACGCGTCTTGTGGCACAGACCTCGACCATGGCGTCCTCCTCGTCGGATACGGAACG
GACAAAGAGTCGAAGAAGGACTTCTGGATCATGAAAAACTCCTGGGGCACCGGCTGGGGC
AGAGACGGATACATGTACATGGCCATGCACAAAGGCGAAGAGGTGAGCTGAGAACGAATG
AAGGAGAACGAGGAGCGAAGGCGTGGAGAACTCAGGACAAGATCTCCAGGAGGCAAAAGA
GAGACGCAAACGAAGGAAGAGGAACGGGTGGGAGAAGACGAAGGGAAAAAAGTCAGGTGG
AAGAAAGAACGCATGCGGGAGAAGAAGGAGGGGACCAGGACGAGGGGAAGACATGTTTTG
AAAAATCCGAGGAAGAGCAGGGGAACAAGGAGAGAGGAGGGCGGAACTGCGGTCCGAGAG
GATGCGCGCGCGGGCGTGTGATGCGCGTGCTTGTGGTACGAAAAACGATTGGTCTGGAGA
GAAAGGAACACAGGTATCGGACTACATTTGTGTCTGGTATCAAACGCGTTCTCCTTTTTT
TGTGTGTCTGCAGGGGCAGTGCGGCCTTCTCTTAGATGCGTCTTTCCCCGTGATGTGA
>TGVEG_059000 | Toxoplasma gondii VEG | cysteine protease, putative | genomic | TGVEG_chrIb reverse | (geneStart+0 to geneEnd+0) | length=3434
GAACATTGGCTTTTTTCATCCTTACCCCGGAACCTGGACGTTCTTTTTGGCCCTCCCCGG
CTCCCGAAAAAGAGCCTGCCCCCCAGCGGCGCGTCGGTTCGCGGGTTTCCCCCGTACTTT
TCTTATTTCGCAGAGTTTTGACCGTTTGTCCCTCTTACCCCGCGGCCTCGCTCTCGAACA
CCATGGACAGCAGCGAGACGCACTACGTCTCCTTCCTCAACGGCGAGGACGACGGGCTAG
AGAACGGCGAGCTTCACCAGCGACGAGGCGTCCGCGCTGGCCGGCCGAGTCCTCCGTTCG
TTGTTACGACCCGCACGTACTTCTGGAAGAAATTTCTCCGTCAACGCAACTTCACAGCGC
GGGCGTGGATCGCACTCGTGGCTGCGGCGGTGTCTCTCCTTGTCTTTGCCTCGTTCCTCA
TTCAGTGGCAGGGCGAAGACGACCGCGCGGTCTTCCCGCCGTCGCCAGTTGAGGACCACC
AACCCCCCGCCAACATCTGGGAATGGAAAGAGGCGCACTTTCAGGACGCCTTCAGCAGCT
TCCAAGCCATGTACGCCAAGAGCTACGCGACTGAGGAAGAAAAACAGAGGCGATACGCCA
TCTTCAAAAACAACCTCGTCTACATTCACACACACAACCAACAAGGTAGGCACTCGCGCA
GTGTGCGTGGTGGGAGGGGCCGCAAACTTTCTTCTTCTGCGCTCGTTAATGCGCCGACAG
AACTCTCAGTGAACAAAAGTATTCCATAGTTCGAGGACACACAAAACTCCTCGAAACTGC
GGGCTGGACGTCGTGTCTGGAGTCCATCTTTGCATCAAAAAGTTGAAGCTCGGTGCATCT
GAAGAGGCGTTCTCATGCAATGCGGAGCCTAGAACATATACTCGATGAGGAGAAACCGCG
GAGTCTCAACGTCAGCTCTGCTGTGGAGCGGATAATAAGTATCCATTTTCAATGCGATTT
CGTCCAAGACACGGTGTATGATCTACATGTTCCGAATGCCGGGGGGGGGGGGGCACTGGA
GGATACGGCAACTTCTTTAGACCAATCTGCGCTGGGGGTCGGAAGTCGGGCTCCCGAGAA
TCAACGCAGTCTCCTGTTTCCGACACCCTTGCAGTGAAGTCGCTTGGTCTGATCTGTGAA
CTTCGACAAACATGGCGTGGATTTTTTTTCCGCTGTCCCCTGACACGAACGGTGCTCCTT
TCGCACTCTTCTGACTGAACGCGTCAACCGGCAGGTTGAAGACGAGCGCGGCTTCCGTTT
TTCCGCGCCTTTTCTCATTCGTTTCACTTCCTCCGAGGCATCTTTAGGTGAAGCGAACGC
GTCTCTTTCTGTTTCGAGGGAGACGTTTGGACGCGCGACTTTTTTCGTGAGTTCTTCCGG
GTTGTGTGTGGCTACTGAACACATTCCCTGCGTTGAGAGAAGGTGCTCGGTTTTCGTGGT
TGTTAGATCGGCCTTGCATCGAAATACTCGGTTGTCCTCCGCCTGCATGCAGGATATTCC
TACTCCCTCAAGATGAATCACTTTGGAGACTTGTCGCGCGACGAGTTCCGCCGGAAGTAC
CTCGGCTTCAAGAAGTCCCGCAACCTGAAGTCGCACCACCTTGGGGTAAGTGGCTTGGGT
TTGTTCTCGATCAGCCGTGCGACGGAAATGGTTTCTGCTCTGTACGGTCCGAAGCGGTCG
AGGAAAGAGATGGTGGATTCAGTGTCTGCAGGTGTCGTCGCGGGGAAGAGCTTCCCAGAC
ATATCTGCGTTAGTGGAAGCGTGGACTCGGCGACGAGAGGGAGATGTTGATTCACTGGAG
AAGAAGATGGCAGTCAAACTTGGGGTGTGACCTGCCGCTGAAGTGGCGACGGCCTCCAAC
TCGTGGAGTCCGACTTCGTACGACTGAGATGAGATCCACTTTGACAGCTTTTCTTTGGTG
TCTTTGTCTTTGTTTTGGTCCGTGGTTTTTGTTTCCAGCTGCGCGTCAAACACACTGAGT
CGAGAAGGGTGTTTCTGTTCCCGGCACACCTTGTTTTTGATCCACTTCACCTTTTGGTGC
CCTTTCGTCTTTTTTTTCTGTCTTGCTCTCTGCGTTCCGCTTTCTCCTTCTCTATTTCTT
CTCTCCCCTCTACGTTTTTTTCTCTCAGCCCCTCTCTGCCTTTCTCCCGCTTCCCCTGTG
CACTTTCCATCCGCTCTGCTTCCCAGTCTCGTTCCCCCGCGTTTCCTTCTTCTCCCGTTT
TTTGTCCGTGTTCCTGTTCCGGTTCCTGTTCCGGTTCCGGTTCGTTTTCTTCCTTCCGTC
TCGCTCTCCTCTCACCGGGCTCCTGCCGCGCGACGCGCTCCGCTGTCTCCCGCGCTGCTT
CCTGTGTCAGGTGGCGACGGAGTTGCTGAATGTACTGCCAAGTGAACTGCCTGCTGGAGT
GGACTGGCGCTCGCGCGGATGCGTAACGCCGGTGAAGGACCAGCGAGACTGCGGCTCTTG
CTGGGCGTTCTCGACCACAGGGGCTCTCGAGGGCGCGCACTGCGCAAAGACGGGCAAGCT
GGTGAGTTTGAGTGAGCAGGAGCTGATGGACTGCTCGCGAGCAGAGGGCAACCAGAGTTG
CAGTGGCGGCGAGATGAACGACGCGTTTCAGTACGTCTTGGACAGCGGCGGGATCTGCTC
GGAGGATGCGTATCCGTACCTTGCCCGAGACGAGGAGTGTCGAGCGCAGAGCTGTGAGAA
AGTCGTGAAGATCTTGGGCTTCAAGGACGTACCACGGAGAAGCGAGGCTGCGATGAAGGC
CGCTCTCGCGAAGAGTCCAGTGAGCATCGCCATCGAAGCCGACCAGATGCCTTTCCAGTT
CTACCACGAGGGAGTCTTTGACGCGTCTTGTGGCACAGACCTCGACCATGGCGTCCTCCT
CGTCGGATACGGAACGGACAAAGAGTCGAAGAAGGACTTCTGGATCATGAAAAACTCCTG
GGGCACCGGCTGGGGCAGAGACGGATACATGTACATGGCCATGCACAAAGGCGAAGAGGT
GAGCTGAGAACGAACGAAGGAGAAAGCGGAGCGAAGGCGTGGAGAACTCAGGACAAGATC
TCCAGGAGGCAAAAGAGAGACGCAAACGAAGGAACAGGAACGGGTGGGAGAAGACGAAGG
AAAAAAAGTCAGGTGGAAGGAAGAACGCATGCGGGAGAAGAAGGAGGGGACCAGGACGAG
GGGAAGACATGTTTTGAAAAATCCGAGGAAGAGCAGGGGAACAAGGAGAGAGGAGGGCGG
AACTGCGGTCCGAGAGGATGCGCGCGCGGGCGTGTGATGCGCGTGCTTGTGGTACGAAAA
ACGATTGGTCTGGAGAGAAAGGAACACAGGTATCGGACGACATTTGTGTCTGGTATCAAA
CGCGTTCTCCTTTTTTTGTGTGTCTGCAGGGGCAGTGCGGCCTTCTCTTAGATGCGTCTT
TCCCCGTGATGTGA
>TGME49_121530 | Toxoplasma gondii ME49 | cathepsin L-like thiolproteinase, putative | genomic | TGME49_chrIb reverse | (geneStart+0 to geneEnd+0) | length=3430
GAACATTGGCTTTTTTCATCCTTACCCCGGAACCTGGACGTTCTTTTTGGCCCTCCCCGG
CTCCCGAAAAAGAGCCTGCCCCCCAGCGGCGCGTCGGTTCGCGGGTTTCCCCCGTACTTT
TCTTATTTCGCAGAGTTTTGACCGTTTGTCCCTCTTACCCCGCGGCCTCGCTCTCGAACA
CCATGGACAGCAGCGAGACGCACTACGTCTCCTTCCTCAACGGCGAGGACGACGGGCTGG
AGAACGGCGAGCTTCACCAGCGACGAGGCGTCCGCGCTGGCCGGCCGAGTCCTCCGTTCG
TTGTTACGACCCGCACGTACTTCTGGAAGAAATTTCTCCGTCAACGCAACTTCACAGCGC
GGGCGTGGATCGCACTCGTGGCTGCGGCGGTGTCTCTCCTTGTCTTTGCCTCGTTCCTCA
TTCAGTGGCAGGGCGAAGACGACCGCGCGGTCTTCCCGCCGTCGCCAGTTGAGGACCACC
AACCCCCCGCCAACATCTGGGAATGGAAAGAGGCGCACTTTCAGGACGCCTTCAGCAGCT
TCCAAGCCATGTACGCCAAGAGCTACGCGACTGAGGAAGAAAAACAGAGGCGATACGCCA
TCTTCAAAAACAACCTCGTCTACATTCACACACACAACCAACAAGGTAGGCACTCGCGCA
GTGTGCGTGGTGGGAGGGGCCGCAAACTTTCTTCTTCTGCGCTCGTTAATGCGTCGACAG
AACTCTCAGTGAACAAAAGTATTCCATAGTTCGAGGACACACAAAACTCCTCGAAACTGC
GGGCTGGACGTCGTGTCTGGAGTCCATCTTTGCATCAAAAAGTTGAGGCTCGGTGCATCT
GAAGAGGCGTTCTCATGCAATGCGGAGCCTAGAACATATACTCGATGAGGAGAAACCGCG
GAGTCTCAACGTCAGCTCTGCTGTGGAGCGGATAATAAGTATCCATTTTCAATGCGACTT
CGTCCAAGACACGGTGTATGATCTACATGTTCCGAATGGCGGGGGGGGGCACTGGAGGAT
ACGGCAACTTCTTTAGACCAATCTGCGCTGGGGGTCGGAAGTCGGGCTCCCGAGAATCAA
CGCAGTCTCCTGTTTCCGACACCCTTGCAGTGAAGTCGCTTGGTCTGATCCGTGAACTTC
GACAAACATGGCGTGGATTTTTTTTCCGCTGTCCCCCGACACGAACGGTGCTCCTTTCGC
ACTCTTCTGACTGAACGCGTCAACCGGCAGGTTGAAGACGAGCGCGGCTTCCGTTTTTCC
GCGCCTTTTCTCATTCGTTTCACTTCCTCCGAGGCATCTTTAGGTGAAGCGAACGCATCT
CTCTCTGTTTCGAGGGAAACGTTTGGACGCGTGACTTTTTTCGTGAGTTCTGCCGGGTTG
TGTGTGGCTACTGAACACATTCCCTGCGTTGAGAGAAGGTGCTCGGTTTTCGTGGTTGTT
AGATCGGCCTTGCATCGAAAGACTCGGTTTTCCTCCGCCTGCATGCAGGATATTCCTACT
CCCTCAAGATGAATCACTTTGGAGACTTGTCGCGCGACGAGTTCCGCCGGAAGTACCTCG
GCTTCAAGAAGTCCCGCAACCTGAAGTCGCACCACCTTGGGGTAAGTGGCTTGGGTTTGT
TCTCGATCAGCCGTGCGACGGAAATGGTTTCTGCTCTGTACGGTCCGAAGCGGTCGAGGA
AAGAGATGGTGGATTCAGTGTCTGCAGGCGTCGTCGCGGGGAAGAGCTTCCCAGACATAT
CTGCGTTAGTGGAAGCGTGGACTCGGCGACGAGAGGGAGATGTTGATTCACTGGAGAAGA
AGATGGCAGTCAAACTTGGGGTGCAACCTGTCGCTGAAGTGGCGACGGGCTCCAACTCGT
GGAGTCCGACTTCGTACGACTGAGATGAGATCCACTTTGACAGCTTTTCTTTGGTGTCTT
TGTCTTTGTTTTGGTCCGTGGTTTTTGTTTCCAGCTGCGCGTCAAACACACTGAGTCGAG
AAGGGTGTTTCTGTTCCCGGCACACCTTGTTTTTGATCCACTTCACCTTTTGGTGCCCTT
TCGTCTTTTTTTTCTGTCTTGCTCTCTGCGTTCCTCTTTCTCCTTCTCTATTTCTTCTCT
CCCCTCTACGTTTTTTTCTCTCAGCCCCTCTCTGCCTTTCTCCCGCTTCCCCTGCGCACT
TTCCATCCGCTCTGCTTCCCAGTCTCGTTCCCCCGCGTTTCCTTCTTCTCCCGTTTTTTG
TCCGTGTTCCGGTTCCTGCTCCTGTTCCTGTTCCTGTTCGTTTTCTTCCTTCCGTCTCGC
TCTCCTCTCACCGGGCTCCTGCCGCGCGACGCGCTCCGCTGTCTCCCGCGCTGCTTCCTG
TGTCAGGTGGCGACGGAGTTGCTGAATGTACTGCCAAGTGAACTGCCTGCTGGAGTGGAC
TGGCGCTCGCGCGGATGCGTGACGCCGGTGAAGGACCAGCGAGACTGCGGCTCTTGCTGG
GCGTTCTCGACCACAGGGGCTCTCGAGGGCGCGCACTGCGCAAAGACGGGCAAGCTGGTG
AGTTTGAGTGAGCAGGAGCTGATGGACTGCTCGCGAGCAGAGGGCAACCAGAGTTGCAGT
GGCGGCGAGATGAACGACGCGTTTCAGTACGTCTTGGACAGCGGCGGGATCTGCTCGGAG
GATGCGTATCCGTACCTTGCCCGAGACGAGGAGTGTCGAGCGCAGAGCTGTGAGAAAGTC
GTGAAGATCTTGGGCTTCAAGGACGTACCACGGAGAAGCGAGGCTGCGATGAAGGCCGCT
CTCGCGAAGAGTCCAGTGAGCATCGCCATCGAAGCCGACCAGATGCCTTTCCAGTTCTAC
CACGAGGGAGTCTTTGACGCGTCTTGTGGCACAGACCTCGACCACGGCGTCCTCCTCGTC
GGATACGGAACGGACAAAGAGTCGAAGAAGGACTTCTGGATCATGAAAAACTCCTGGGGC
ACCGGCTGGGGCAGAGACGGATACATGTACATGGCCATGCACAAAGGCGAAGAGGTGAGC
AGAGAACGAACGAAGGAGAAAGCGGAGCGAAGGCGTGGAGAACTCAGGACAAGATCTCCA
GGAGGCAAAAGAGAGACGCAAACGAAGGAAGAGGAACGGGTGGGAGAAGACGAAGGAAAA
AAAGTCAGGTGGAAGAAAGAACGCATGCGGGAGAAGAAGGAGGGGACCAGGACGAGGGGA
AGACATGTTTTGAAAAATCCGAGGAAGAGCAGGGGAACAAGGAGAGAGGAGGGCGGAACT
GCGGTCCGAGAGGATGCGCGCGCGGGCGTGTGATGCGCGTGATTGTGATACGAAAAACGA
TTGGTCTGGAGAGAAAGGAACACAGGTATCAGACGACATTTGTGTCTGGTATCAAACGCG
TTCTCCTTTTTTTGTGTGTCTGCAGGGGCAGTGCGGCCTTCTCTTAGATGCGTCTTTCCC
CGTGATGTGA
>TVAG_003650 | Trichomonas vaginalis G3 | Clan CA, family C1, papain-like cysteine peptidase | genomic | DS113306 reverse | (geneStart+0 to geneEnd+0) | length=933
ATGATTGCTGCTCACGAGGAAAAAAGTTTCGTCGGCTGGATGCGTGAAAACAGCATGATG
TTCACTGGTGATGAATATCATTTCAGGTTAGGTGTTTGGTTATCAAATAAAAGATATGTT
CAACAACATAACGCCGGAAATTCCCATTTCATTCTTGCAATGAACAGATTATCACATTTA
ACACCAAGTGAATACCATTCGTTGCTTGGTTACAAGAATTCTGGAAGGAATCATAAGTAT
TCCATTATTAACGAGAAAAACGCAGGCTCACATCCAGATTCATTCGATTGGAGAGATCAC
CCAGGAGTTATTGGTCCAGTCAAAGACCAAAATGACTGTGGTTCTTGCTGGGCTTTCTCT
ACAATCTTTGGATTAGAGTCAAATTGGGCTGTTAAACATAATGCCGCCTATATATTATCA
GAACAAAACCTTGTTGACTGCTGCAGTAGTGCAGCTGGATGCAATGGTGGTTTTCCAGCT
GATGCATGGGATTGGATGATTGATGAACAAGGAGGCAAGACAATGCTAGAAGTAGATTAC
CCATATACATCTCAGGAAGGAACTTGCAAATGGAACAAGAAGAAGGCCGCTCCACCACAA
GTTAAAGGATACGTCGAAGTTGCCGATTGTGATGAAAACGATTTAGCAGAGAAGATTGCA
AACTTAGGTCCAGCAAGTATTGCAATTGATGCTTCACTTTATTCCTTCATGATGTACCAG
TCAGGAATTTATGATGATCCAAAGTGCAGCTCAATGAACTTAGATCATGCTGTTGGTTGC
GTTGGTTATGGTGTTGAGAATGGCGCTAAATATTGGATTGTTCGCAATTCATGGGGCGAA
ATGTGGGGCGAAAAAGGTTACATTCGTATGGCAAGAGACAAACATAACCAATGCGGTGTT
GCAACAGAAGCATTCATCGCTCAGGTTAATTAA
>TVAG_034140 | Trichomonas vaginalis G3 | Clan CA, family C1, papain-like cysteine peptidase | genomic | DS113428 reverse | (geneStart+0 to geneEnd+0) | length=982
ATGTTTGCTTTCGCTGCACTTTCCCGCTCAGTGTTGACAACACAGGCAGAAGAGAAGGCC
TTCCTTGACTGGATGCGCCAGACAAACAACATTTTTGTTGGCGAAGAATTCCATTTCAGA
AAGGGAATTTTCCTCACACACAAGCGTTTCGTTGAAGAGCACAACAGAAAGTCATCTTTC
CGCTGCGGCCTTAACCAGTTCGCTCACCTTACACCATCCGAATACCAAGAACTCCTCGGC
TACAAGCAAATGAAGCAGAAAGAAGAAGTTGAGTTCGCTCCACTTAAGAACTTCAACGCT
CCAGACAGCTTCGACTGGCGTGAGAAGGGCGTTGTCAACGCCATCAAGGATCAGGGCCAG
TGCGGTTCCTGCTGGGCTTTCTCATCCATTCAGGCTCAAGAATCACAGTGGGCTATCCAC
CACCCAGGAGAGCTCTACGACCTTGCTGAGCAGCAGCTCGTTGACTGCGTTCACGACTGC
TTCGGCTGCAACGGTGGAAACGTTGGCTGGGCTTACACATGGGTAAAGCTTTTCGAGCAC
GGCATGTTCATGTTACAGAAGGACTACCCATACACAGCTAAGGATGGCAAGTGCGCTTTC
GACAAGTCTAAGGGCATCACAAAGATCACAACACACAAGAAGGCTTCACACGATGAAGAG
GCTCTCAAGACATCCGTTGCAGAAAACGGACCACACGCTATCGCTATCGATGCTGGCCAC
GATTCCTTCATGATGTACGAATCTGGTGTTTACGAGGATGCTTCCTGCTCTTCTTCTACA
CTCGACCACGCTGTCGGCCTTGTTGGCTACGGTGTTGACGGAGACAAGGACTTCTGGCTC
GTCCGTAACTCATGGTCCACAACATGGGGTGAACAGGGCTACGTCAGAATCCGCCGTAAC
TACCACAACATGTGCGGCGTCGCTTCTGAACCAATGTTCCCAGTCGTTGAATAAAAAAAT
TAAAATTTAGGTTTCTTGCGAA
>TVAG_043620 | Trichomonas vaginalis G3 | Clan CA, family C1, papain-like cysteine peptidase | genomic | DS113505 forward | (geneStart+0 to geneEnd+0) | length=918
ATGCTTGCTCAACAGTACGAATATAAAGCATTCCTTGGATGGATGAGAGAAACAGGAAAC
ATGTTTACAGGCGATGAATATCATTCGCGCTTTGGAATATGGCTTTCAAATAAGCGTTTC
GTTCAAAATCACAATCGTGCAAATCTCGGATTCACACTTGCACTTAACAAGCTTGCTTAT
TTATCACCAACTGAGTACAAAGCCATGCTCGGATTCCATAATAAAGGAGTTCATAACATT
GCCATCAAATCCAATACAATTGCTAATGATGAAATCGACTGGCGTTCCAAGGGAGTCGTC
AATCCTATCCAGCATCAACAACATTGCGGATCATGCTGGGCATTTTCTGCAATCCAAGCT
CAAGAATCCCAATATGCAATCACATATGGCAAACTCCAAAAGCTTTCAGAGCAGAACCTT
GTTGATTGCGTTACCTCATGCAATGGATGTCACAATGGTTTAATGTCAGCCGCATACGAT
TACGTAATTCAATACCAGGGTGGAAAGTTCATGCTGGAAACGGATTATCCATACACAGCA
GTTGAAGGAACATGCAAATTCAACCAAGCAAAGGCTACTTCGAAGATTGTTTCATACATT
AATGTTGTTGAAGGGGATGAAAAAGATCTTGCTGCAAAGGTTTCCGCTTACGGTCCATCA
ACTGTTGGCATTGATGCATCCCATTATACATTCCAACTTTACTCACATGGCATTTACGAC
GAACCACACTGCTCATCTTTCTCTCTAAATCACGGTGTTGGCTGCGTTGGCTATGGCACA
GAAGGCACAAAGAATTACTGGATTGTCCGTAACTCCTGGGGTCTTGAATGGGGCGAGCAA
GGCTATGTCCGCATGATCAAGGACAAGAACAATAATTGTGGTATTGCTACAGTTGCATGC
ATCCCCATTGATAAGTAA
This sounds like homework. I'm sure you'll find plenty of examples on the web that solve this problem. What have you found so far?
First, phrase your question more carefully. "in some sequences that are too long" makes it sound like the sequence length is the problem. Second, if your headers are as long as 3 lines, then you don't have valid FASTA, since the header has to be on the first line. Third, I don't really see why long headers are a problem either for parsers or for database storage - you won't save that much space by trimming the header.
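If you still want to trim the headers, here is a minimal sketch of one way to do it. It is in Python rather than the Perl that was asked for, and it assumes that every header sits on a single line starting with `>` and that the fields are separated by `|`, as in the pasted example. The function names `shorten_header` and `shorten_fasta`, the `keep` parameter, and the script name in the usage comment are made up for illustration. It keeps only the first field (the gene ID) by default and passes sequence lines through unchanged.

```python
import sys


def shorten_header(line, keep=1, sep="|"):
    """Keep only the first `keep` separator-delimited fields of a FASTA header line."""
    # Drop the leading ">" and split the rest into whitespace-trimmed fields.
    fields = [field.strip() for field in line[1:].split(sep)]
    return ">" + " | ".join(fields[:keep])


def shorten_fasta(lines, keep=1):
    """Yield lines of a FASTA file with shortened headers; sequence lines are untouched."""
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            yield shorten_header(line, keep)
        else:
            yield line


# Hypothetical usage: python shorten_headers.py input.fasta > output.fasta
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1]) as handle:
        for out_line in shorten_fasta(handle):
            print(out_line)
```

With `keep=2` the organism name is retained as well, which may be worth it if the IDs alone are not informative enough for the database. Reading the file line by line means 1,000 sequences (or far more) are not a problem. If a header really does wrap over several lines, the file would need to be repaired first, since this sketch treats any line that does not start with `>` as sequence.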