Hey!
I am simulating RNA-Seq data with Polyester to evaluate the performance of different workflows (different combinations of tools). To simulate differentially expressed reads, I used the following fold-change distribution: 4-fold up-regulation (3% of transcripts), 2-fold up (7%), 1.5-fold up (9%), 1.5-fold down-regulation (6%), 2-fold down (3%), and 4-fold down (2%), with the remaining 70% left at baseline (fold change of 1). However, I'm curious whether it would make a difference if I drew the fold changes from a continuous range (between 1.5 and 4) instead of using these fixed values. Alternatively, I could use Polyester's 'simulate_experiment_empirical' function, which simulates reads using counts derived from a real dataset. But I'm concerned that this would introduce bias from whichever tool I use to estimate the abundances. Thanks in advance!
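To make the two options concrete, here is a minimal Python sketch of how the per-transcript fold changes could be generated, either with the fixed categories above or with magnitudes drawn from the 1.5 to 4 range. The function names, the encoding of down-regulation as a reciprocal, and the uniform draw on a linear scale are all assumptions for illustration; nothing here is part of Polyester itself. The resulting vector would still have to be exported and supplied to Polyester's fold-change input on the R side.

```python
import random

# Fixed categories from the post: (fold change, percent of transcripts).
# Down-regulation is encoded as a reciprocal (an assumption, not a Polyester rule).
CATEGORIES = [
    (4.0, 3), (2.0, 7), (1.5, 9),              # up-regulated
    (1 / 1.5, 6), (1 / 2.0, 3), (1 / 4.0, 2),  # down-regulated
]


def discrete_fold_changes(n_transcripts, seed=0):
    """Assign each transcript one of the fixed fold changes; the rest get 1.0."""
    rng = random.Random(seed)
    fcs = []
    for fc, pct in CATEGORIES:
        fcs.extend([fc] * (n_transcripts * pct // 100))
    fcs.extend([1.0] * (n_transcripts - len(fcs)))  # baseline transcripts
    rng.shuffle(fcs)  # spread the DE transcripts randomly over the list
    return fcs


def ranged_fold_changes(n_transcripts, low=1.5, high=4.0, seed=0):
    """Same up/down proportions, but magnitudes drawn uniformly from [low, high].

    Uniform sampling on the linear scale is an assumption; a log-scale draw
    would be an equally reasonable choice.
    """
    rng = random.Random(seed)
    fcs = []
    for fc, pct in CATEGORIES:
        for _ in range(n_transcripts * pct // 100):
            magnitude = rng.uniform(low, high)
            fcs.append(magnitude if fc > 1 else 1 / magnitude)
    fcs.extend([1.0] * (n_transcripts - len(fcs)))
    rng.shuffle(fcs)
    return fcs
```

With 1,000 transcripts, both functions leave 700 at baseline and mark 190 as up-regulated and 110 as down-regulated, so only the magnitudes differ between the two designs.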
Thank you for your reply. I am considering the simulated data as an additional evaluation. I will also be using benchmarking data and RT-qPCR results alongside the simulated data.
I have several combinations of tools, and it's very tedious to run several simulations.