Preprocessing for duplicate sequences & primer sequences from SMARTSeq v4 kit
6.2 years ago
nmbaran • 0

Hi All!

I have just received raw sequencing reads from an RNA-seq experiment that used antibody-based RNA immunoprecipitation (IP), and I have some questions about quality control, trimming, etc. My goal is to compare gene expression patterns between the input mRNA and the immunoprecipitated mRNA. Because the IP results in very low RNA yield, I use the Takara Bio/Clontech SMARTSeq v4 kit to reverse-transcribe the mRNA to cDNA; the kit includes a PCR amplification step to generate enough cDNA to make libraries. Both input and IP samples went through the same number of PCR cycles in this step (18, the maximum recommended). I then used the NEBNext Ultra II kit to make the libraries, which adds another 8 PCR cycles to amplify the adaptor-ligated DNA.

As you might expect, this results in a high rate of sequence duplication in my raw reads. In addition, I can see that some of the over-represented duplicate sequences include the SMARTSeq CDS Primer II A sequence.

What is the best way to go about the preprocessing here? I'm concerned that indiscriminate de-duplication will introduce bias into the data that would change the input/IP comparison. For the CDS Primer II A sequences, can I trim them the way I would trim adapters and, if so, is that wise?
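To make the trimming question concrete, here is a minimal Python sketch of what "trimming the primer like an adapter" would mean at the 5' end of a read. It is only an illustration, not what any particular trimmer does internally: the function names, the mismatch tolerance, and the toy primer in the usage note are all assumptions, and the real CDS Primer II A sequence should be taken from the kit documentation.

```python
def hamming(a, b):
    """Count mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def trim_5prime_primer(seq, qual, primer, max_mismatches=2):
    """Remove `primer` from the start of a read if it matches there.

    `primer` is supplied by the caller (hypothetical placeholder here);
    `max_mismatches` is an assumed tolerance, not a recommended value.
    Returns the (sequence, quality) pair, trimmed or unchanged.
    """
    n = len(primer)
    if len(seq) >= n and hamming(seq[:n].upper(), primer.upper()) <= max_mismatches:
        return seq[n:], qual[n:]
    return seq, qual
```

With a toy primer, `trim_5prime_primer("ACGTACGTTTTGGG", "I" * 14, "ACGTACGT")` returns the read with the first eight bases and quality values removed. In practice an established trimmer such as cutadapt (its `-g` option handles 5' adapters) is likely a better choice than hand-rolled code.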

For what it's worth, I'm planning to use the new "Tuxedo" pipeline (HISAT2, StringTie, Ballgown, etc.) downstream.

I've also reached out to Takara Bio/Clontech technical support, but I thought I would ask the community, too!

Thanks, in advance, for your help!

Nicole (Georgia Tech)

P.S. I'm still quite new to bioinformatics analyses...

RNA-Seq duplification SMARTSeq Preprocessing • 2.3k views

Before you take any additional measures, run the normal analysis first (QC, trimming, alignment, and downstream analysis; some people skip trimming when using STAR) to see what the results look like. If you notice anything strange in the results, you can then look at duplicates and/or backtrack to check the intermediate steps.
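If it helps with the QC step, a rough way to put a number on duplication before deciding anything is to count exact duplicate sequences in each FASTQ file. The sketch below is only an illustration under simple assumptions (plain 4-line FASTQ records, exact sequence matches, everything held in memory); the function names are made up for this example, and it does not reproduce how FastQC or any other tool estimates duplication.

```python
from collections import Counter


def read_sequences(fastq_lines):
    """Yield the sequence line of each 4-line FASTQ record."""
    for i, line in enumerate(fastq_lines):
        if i % 4 == 1:
            yield line.strip()


def duplication_summary(seqs, top_n=3):
    """Return (fraction of reads that are exact duplicates, most common sequences)."""
    counts = Counter(seqs)
    total = sum(counts.values())
    if total == 0:
        return 0.0, []
    # Each distinct sequence contributes one "original"; the rest are duplicates.
    dup_fraction = 1 - len(counts) / total
    return dup_fraction, counts.most_common(top_n)
```

For a gzipped file you could pass `gzip.open(path, "rt")` as `fastq_lines`, where `path` is whatever your file is called. Running this on the input and IP samples separately shows whether the duplication levels and the top over-represented sequences differ between them, which is the comparison that matters for the bias concern.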
