Error when running QoRTs for JunctionSeq
0
0
Entering edit mode
3.2 years ago
n.tear ▴ 80

Hi,

I am trying to run QoRTs on my STAR aligned data however I have been getting errors I dont understand. I have attempted to re-sort my files with samtools and have increased the memort to no avail.

I have included my full pipeline below:

 #STAR alignment

#Indexing

STAR --runThreadN 6 \
--runMode genomeGenerate \
--genomeDir STARindex \
--genomeFastaFiles /mnt/z/NathanHaffordTear/ReferenceGenomes/Homo_sapiens.GRCh38.dna.primary_assembly.fa \
--sjdbGTFfile /mnt/z/NathanHaffordTear/ReferenceGenomes/Homo_sapiens.GRCh38.93.chr_patch_hapl_scaff.gtf \
--sjdbOverhang 149

#cd to location of raw data
#aligning

for f in `ls *.fq.gz | sed 's/_[12].fq.gz//g' | \
sort -u`; do echo $f && \
STAR --genomeDir STARindex \
--runThreadN 6 \
--readFilesIn <(gunzip -c ${f}_1.fq.gz ${f}_2.fq.gz) \
--outFileNamePrefix /mnt/z/NathanHaffordTear/STARalignments/STARalignment.hg38v93/${f}.sorted \
--outSAMtype BAM SortedByCoordinate \
--outSAMunmapped Within \
--outSAMattributes Standard; done

#sorting again to make sure

for f in `ls *.bam | sed 's/.bam//g' | \
sort -u`; do echo $f && \ 
samtools sort -@ 4 ${f}.bam -o ${f}.samtoolsort.bam;
done

#JunctionSeq
#cd to analysis output location

java -Xmx16G -jar /usr/local/QoRTs.jar QC --stranded --runFunctions writeKnownSplices,writeNovelSplices,writeSpliceExon /mnt/z/NathanHaffordTear/STARalignments/STARalignment.hg38v93/C18.sortedAligned.sortedByCoord.out.samtoolsort.bam /mnt/z/NathanHaffordTear/ReferenceGenomes/Homo_sapiens.GRCh38.93.chr_patch_hapl_scaff.gtf RawCounts/

#I have done this with both ensembl and gencode alignments with and without samtools sorting with the same errors

QoRTs .log file:

Starting QoRTs v1.3.6 (Compiled Tue Sep 25 11:21:46 EDT 2018)
Starting time: (Tue Feb 16 17:38:53 GMT 2021)
INPUT_COMMAND(QC)
  INPUT_ARG(infile)=/mnt/z/NathanHaffordTear/STARalignments/STARalignment.hg38v93/C18.sortedAligned.sortedByCoord.out.samtoolsort.bam
  INPUT_ARG(gtffile)=/mnt/z/NathanHaffordTear/ReferenceGenomes/Homo_sapiens.GRCh38.93.chr_patch_hapl_scaff.gtf
  INPUT_ARG(outdir)=RawCounts/
  INPUT_ARG(stranded)=true
  INPUT_ARG(runFunctions)=List(writeKnownSplices, writeNovelSplices, writeSpliceExon)
Created Log File: RawCounts//QC.jpnPkkWzfKwU.log
Warning: run-in-progress file "RawCounts//QC.QORTS_RUNNING" already exists. Is there another QoRTs job running?
Starting QC
[Time: 2021-02-16 17:38:53] [Mem usage: [21MB / 536MB]] [Elapsed Time: 00:00:00.0001]
QoRTs is Running in paired-end mode.
QoRTs is Running in any-sorted mode.
No parameter --genomeFA found.
NOTE: Function "writeKnownSplices" requires function "JunctionCalcs". Adding "JunctionCalcs" to the active function list...
Running functions: JunctionCalcs, writeKnownSplices, writeNovelSplices, writeSpliceExon 

Checking first 10000 reads. Checking SAM file for formatting errors...
   Stats on the first 10000 reads:
        Num Reads Primary Map:    3159
        Num Reads Paired-ended:   0
        Num Reads mapped pair:    0
        Num Pair names found:     0
        Num Pairs matched:        0
        Read Seq length:          150 to 150
        Unclipped Read length:    150 to 150
        Final maxReadLength:      150
        maxPhredScore:            37
        minPhredScore:            2
   WARNING WARNING WARNING! Running in paired-end mode, but reads appear to be single-end! Errors may follow.
           Strongly recommend using the '--singleEnded' option
   Note: Data appears to be paired-ended.
   Warning: Have not found any matched read-pairs in the first 10000 reads. Is data paired-end? Is data sorted?
   Sorting Note: Reads appear to be grouped by read-pair, probably sorted by name
   Sorting Note: Reads are sorted by position 
Done checking first 10000 reads. WARNINGS FOUND!
SAMRecord Reader Generated. Read length: 150.
[Time: 2021-02-16 17:38:56] [Mem usage: [315MB / 536MB]] [Elapsed Time: 00:00:02.0624]
> Init JunctionCalcs utility
Compiling flat feature annotation, internally in memory...
FlatteningGtf: starting...(2021-02-16 17:38:56)
    FlatteningGtf: gtf file read complete.(2021-02-16 17:40:24)
    FlatteningGtf: Splice Junction Map read.(2021-02-16 17:40:26)
    FlatteningGtf: gene Sets generated.(2021-02-16 17:40:28)
    FlatteningGtf: Aggregate Sets built.
    FlatteningGtf: Compiling Aggregate Info . . . (2021-02-16 17:40:28)
    FlatteningGtf: Finished Compiling Aggregate Info. (2021-02-16 17:40:28)
    FlatteningGtf: Iterating through the step-vector...(2021-02-16 17:40:28)
    FlatteningGtf: Adding the aggregate genes themselves...(2021-02-16 17:40:30)
    FlatteningGtf: Iterating through the splice junctions...(2021-02-16 17:40:31)
    FlatteningGtf: Sorting the aggregate genes...(2021-02-16 17:40:33)
    FlatteningGtf: Folding the FlatGtfLine iterator...(2021-02-16 17:40:34)
    FlatteningGtf: Features Built.(2021-02-16 17:40:34)
Internal flat feature annotation compiled!
length of knownSpliceMap after instantiation: 393388
length of knownCountMap after instantiation: 393388
> Init MinorUtils Utility
QC Utilities Generated!
[Time: 2021-02-16 17:40:39] [Mem usage: [1914MB / 9GB]] [Elapsed Time: 00:01:46.0350]
============================FATAL_ERROR============================
QoRTs encountered a FATAL ERROR. For general help, use command:
          java -jar path/to/jar/QoRTs.jar --man
============================FATAL_ERROR============================
Error info:
Exception in thread "main" java.lang.IllegalStateException: Inappropriate call if not paired read        
        at net.sf.samtools.SAMRecord.requireReadPaired(SAMRecord.java:655)
        at net.sf.samtools.SAMRecord.getFirstOfPairFlag(SAMRecord.java:713)
        at internalUtils.commonSeqUtils$$anon$4.next(commonSeqUtils.scala:1020)
        at internalUtils.commonSeqUtils$$anon$4.next(commonSeqUtils.scala:978)
        at internalUtils.stdUtils$IteratorProgressReporter$$anon$14.next(stdUtils.scala:969)
        at scala.collection.Iterator.foreach(Iterator.scala:929)
        at scala.collection.Iterator.foreach$(Iterator.scala:929)
        at internalUtils.stdUtils$IteratorProgressReporter$$anon$14.foreach(stdUtils.scala:963)
        at qcUtils.runAllQC$.runOnSeqFile(runAllQC.scala:1300)
        at qcUtils.runAllQC$.run(runAllQC.scala:970)
        at qcUtils.runAllQC$allQC_runner.run(runAllQC.scala:680)
        at runner.runner$.main(runner.scala:98)
        at runner.runner.main(runner.scala)

If anyone can assist me I would be very gateful

Nathan

QoRTs JunctionSeq RNA-Seq • 643 views
ADD COMMENT

Login before adding your answer.

Traffic: 1146 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6