Your biggest mistake is probably that every line of your records contains '=', while only '\n=\n' (a line consisting of a single '=') is a record separator. Running 'wc' or '--files cat' as the command is a great way to debug this kind of problem, because it shows how the input is actually being split.
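To see the difference between the two separators without involving GNU Parallel at all, a rough sanity check along these lines may help. It is only a sketch: the two-record sample is made up for illustration (it is not your fasta.p3in), and the variable names are arbitrary.

```shell
# Build a tiny two-record sample in the key=value style described above.
# The contents are invented for illustration only.
sample=$(mktemp)
printf 'SEQUENCE_ID=seq1\nSEQUENCE_TEMPLATE=ACGT\n=\nSEQUENCE_ID=seq2\nSEQUENCE_TEMPLATE=TTGA\n=\n' > "$sample"

# Lines that contain '=' anywhere: every one of these would end a
# "record" if a bare '=' were used as the separator.
any_eq=$(grep -c '=' "$sample")

# Lines that consist of '=' and nothing else: the real record terminators,
# which is what '\n=\n' matches.
records=$(grep -c -x '=' "$sample")

echo "lines containing '=': $any_eq"
echo "real records:         $records"

rm -f "$sample"
```

Running the same two grep commands on the real input file should tell you how many records to expect, which you can then compare with the number of chunks that 'wc' reports when it is run under parallel.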
Your second mistake is overlooking that --block-size defaults to 1M, so the first instance may simply gobble up all of the input.
This ought to work (untested, as I have access to neither fasta.p3in nor primer3):
cat fasta.p3in | parallel -N1 --round-robin --pipe --recend "\n=\n" --cat /Tools/primer3/primer3-2.3.6/src/primer3_core > fasta.p3out
You can probably leave out --cat if primer3 reads from STDIN. If GNU Parallel itself takes up significant time, increase -N1 to a larger value: with 40000 records it is probably fine to split into bigger chunks than 1 record.