STAR+Cufflinks for RNA-Seq analysis
5.2 years ago
wangdp123 ▴ 340

Hi there,

I am using the STAR + Cufflinks combination to process unstranded paired-end RNA-Seq datasets.

  1. Is there a set of typical parameters for both STAR and Cufflinks?

For STAR:

STAR --runThreadN 1 --runMode alignReads --genomeDir index --readFilesIn sample_r1.fq sample_r2.fq --outFileNamePrefix sample_ --outSAMtype BAM SortedByCoordinate --outSAMattributes All --outSAMstrandField intronMotif

For Cufflinks:

cufflinks -p 1 -G sample.gtf -o sample_clout sample_Aligned.sortedByCoord.out.bam

By running the above two command lines, I encountered a warning message:

"Warning: Using default Gaussian distribution due to insufficient paired-end reads in open ranges. It is recommended that correct parameters (--frag-len-mean and --frag-len-std-dev) be provided."

Does this warning matter? Has anything gone wrong?

  2. I noticed from the STAR manual that for unstranded RNA-Seq data, we should pass the parameter "--outSAMstrandField intronMotif" to STAR. I tested the following three scenarios:

(1) Without --outSAMattributes and without --outSAMstrandField: Cufflinks reports the error "BAM record error: found spliced alignment without XS attribute".

(2) With --outSAMattributes All --outSAMstrandField intronMotif: no error message, and I can see the XS attribute in the BAM file.

(3) With --outSAMattributes Standard --outSAMstrandField intronMotif: no error message, but I can see that there is NO XS attribute in the BAM file.

Does this mean that using either "--outSAMattributes All" or "--outSAMattributes Standard" leads to the same result? Does Cufflinks treat them in the same way?
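
One way to compare the scenarios directly is to count how many spliced alignments carry an XS tag, since the Cufflinks error above complains about spliced records that lack it. The following is only a minimal sketch: the function name `count_spliced_xs` and the toy records are made up for illustration, and it assumes the alignments are available as SAM text lines (for example, decoded from the BAM file).

```python
def count_spliced_xs(sam_lines):
    """Count spliced alignments with and without an XS strand tag.

    Hypothetical helper: expects an iterable of SAM-formatted text lines.
    """
    with_xs = 0
    without_xs = 0
    for line in sam_lines:
        # Skip blank lines and SAM header lines.
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:  # a SAM record has 11 mandatory columns
            continue
        cigar = fields[5]
        # An 'N' operation in the CIGAR marks a skipped region (splice junction).
        if "N" not in cigar:
            continue
        # Optional tags start at column 12; XS is a single-character (A) tag.
        if any(tag.startswith("XS:A:") for tag in fields[11:]):
            with_xs += 1
        else:
            without_xs += 1
    return with_xs, without_xs


# Toy records, invented for illustration only.
xs_sample = [
    "@HD\tVN:1.4\tSO:coordinate",
    "r1\t0\tchr1\t100\t255\t20M500N30M\t*\t0\t0\t*\t*\tNH:i:1\tXS:A:+",
    "r2\t0\tchr1\t900\t255\t25M300N25M\t*\t0\t0\t*\t*\tNH:i:1",
    "r3\t0\tchr1\t1500\t255\t50M\t*\t0\t0\t*\t*\tNH:i:1",
]
print(count_spliced_xs(xs_sample))
```

Running this on the output of each scenario would show whether any spliced records are really missing XS, rather than relying on a visual scan of a few lines.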

Thanks for your help,

Tom

RNA-Seq STAR Cufflinks • 4.8k views
5.2 years ago

The warning has nothing to do with the strandedness of the data.

It is about the estimated fragment sizes for the read pairs. The 9th column of the BAM file contains the TLEN field (template length). The --frag-len-mean and --frag-len-std-dev options ask for the mean value of TLEN and its standard deviation. I am not sure why there seems to be insufficient data there.

Either ignore the warning or figure out the mean and stdev for your data from that column.
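
If you want to estimate the two values yourself, here is a rough sketch in Python. It assumes the alignments are available as SAM text lines; the function name `tlen_stats`, the `max_tlen` cutoff, and the toy records are hypothetical and only for illustration. Keep in mind that TLEN is measured on the genome, so a pair that spans an intron has an inflated value; the sketch simply discards pairs above the cutoff, which is a crude workaround rather than what Cufflinks itself does.

```python
import math


def tlen_stats(sam_lines, max_tlen=1000):
    """Return (mean, std_dev, n) of TLEN over properly paired reads.

    Hypothetical helper. max_tlen is an assumed cutoff used to drop
    pairs that most likely span an intron.
    """
    lengths = []
    for line in sam_lines:
        # Skip blank lines and SAM header lines.
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:  # a SAM record has 11 mandatory columns
            continue
        flag = int(fields[1])
        tlen = int(fields[8])  # 9th column: TLEN
        # Flag bit 0x2 = properly paired. Only positive TLEN is kept, so each
        # pair is counted once (the other mate has the negative value).
        if flag & 0x2 and 0 < tlen <= max_tlen:
            lengths.append(tlen)
    n = len(lengths)
    if n < 2:
        raise ValueError("need at least two usable read pairs")
    mean = sum(lengths) / n
    variance = sum((x - mean) ** 2 for x in lengths) / (n - 1)  # sample variance
    return mean, math.sqrt(variance), n


# Toy records, invented for illustration only.
tlen_sample = [
    "@HD\tVN:1.4\tSO:coordinate",
    "r1\t99\tchr1\t100\t255\t50M\t=\t250\t200\t*\t*",
    "r1\t147\tchr1\t250\t255\t50M\t=\t100\t-200\t*\t*",
    "r2\t99\tchr1\t300\t255\t50M\t=\t510\t260\t*\t*",
    "r2\t147\tchr1\t510\t255\t50M\t=\t300\t-260\t*\t*",
    "r3\t99\tchr1\t400\t255\t50M\t=\t5350\t5000\t*\t*",  # above cutoff, dropped
]
mean, sd, n = tlen_stats(tlen_sample)
print(f"--frag-len-mean {mean:.0f} --frag-len-std-dev {sd:.0f} (from {n} pairs)")
```

On real data the lines could come from something like `samtools view -f 2 sample_Aligned.sortedByCoord.out.bam` piped into the script via standard input, and the rounded values would then be passed to Cufflinks.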
