Weird fastq format
7.0 years ago
tw30282 ▴ 20

Hi all,

I downloaded some raw RNA-seq data and found it in an unfamiliar format. The data were generated on the ABI SOLiD platform. The file is SRR179593.fastq, and here are a few lines from it:

@SRR179593.1 mendel_20110320_FRAG_BC_Ryan_RNA_Seq_2_58_240/1
T.20.2312.0100.0111200312.0.1300.0.01.0202.0.100020
+
!!<:!7<=)!>/?:!78;-.A6-3%!&!(*?%!0!%,!.*%9!0!)5;1%+
@SRR179593.2 mendel_20110320_FRAG_BC_Ryan_RNA_Seq_2_58_300/1
T.30.2010.0313.1231232001.1.1112.0.12.0210.0.120003
+
!!9%!>48'!9:.%!7;%058A2%:!5!%<>;!+!-.!+'*'!)!',,)&+
@SRR179593.3 mendel_20110320_FRAG_BC_Ryan_RNA_Seq_2_58_638/1
T.23.1101.3222.2301222112.2.1332.2.22.2212.1.002023
+
!!-'!'%3(!'%3'!1%).).1)%'!5!%2-7!'!'5!%%0-!%!(.'(%+

I tried to map it to the genome using STAR:

#!/bin/bash
cd .
/usr/local/star/2.4.1c/bin/STAR --runThreadN 4 \
--genomeDir /escratch4/tw30282/tw30282_Mar_17/STAR/ENSEMBL.homo_sapiens.release-75 \
--readFilesIn SRR179593.fastq.gz \
--outFileNamePrefix Wei_NPC_rep1_STAR_SE. \
--outSAMtype BAM SortedByCoordinate \
--readFilesCommand zcat

But STAR returned an error saying the input file is in the wrong format.

Is there any way to convert or fix the format?

Thanks, Tianming

fastq format RNA-Seq sra • 1.5k views
7.0 years ago
mastal511 ★ 2.1k

ABI SOLiD data is in colorspace rather than basespace, which is why the sequence lines contain 0123 instead of ACGT. You need to use an aligner that works with SOLiD data. Normally, if you weren't getting the data from SRA, the SOLiD data would come as two separate files: one with the sequence reads and one with the read qualities.
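
If you want to check a downloaded file before choosing an aligner, something like the sketch below may help. It assumes a four-line-per-record FASTQ in which colorspace reads start with one primer base followed by the digits 0-3 (with "." for missing calls), as in the reads posted above. The function name fastq_encoding is made up for this example and is not part of STAR or the SRA toolkit.

```shell
#!/bin/bash
# Hypothetical helper: guess whether a FASTQ file holds colorspace or
# basespace reads by looking at the sequence line of the first records.
fastq_encoding() {
    local file="$1"
    # Read plain or gzip-compressed input; only the first 1000 records are checked.
    if [[ "$file" == *.gz ]]; then
        gzip -dc "$file"
    else
        cat "$file"
    fi | head -n 4000 | awk '
        # Line 2 of every 4-line record is the sequence.
        NR % 4 == 2 {
            # Colorspace: one primer base, then only 0-3 or "." (missing call).
            if ($0 ~ /^[ACGT][0-3.]+$/) cs++
            else bs++
        }
        END {
            if (cs > bs) print "colorspace"
            else print "basespace"
        }'
}
```

For example, `fastq_encoding SRR179593.fastq.gz` should print "colorspace" for the file in the question, since every sequence line has the form T followed by digits and dots.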

TopHat2 works with colorspace data; you have to specify options indicating that your reads and genome index are in colorspace. I don't see anything about colorspace in the HISAT manual, so I don't think it supports it.
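
As a rough sketch of what that could look like: the commands below build a colorspace Bowtie 1 index with bowtie-build -C and then run TopHat2 with --bowtie1 and --color. The file names genome.fa, genome_cs and tophat_cs_out are placeholders, the read file is assumed to be decompressed, and the installed Bowtie 1 is assumed to be a build with colorspace support. The script only prints the commands unless DRY_RUN=0 is set, so check them against your installed versions first.

```shell
#!/bin/bash
# Sketch only: colorspace alignment with TopHat2 on top of Bowtie 1.
# genome.fa, genome_cs and tophat_cs_out are hypothetical names.

# Print each command by default; execute it only when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

align_colorspace() {
    local genome_fa="$1" index_prefix="$2" reads="$3"
    # Build a colorspace index (-C) with Bowtie 1's index builder.
    run bowtie-build -C "$genome_fa" "$index_prefix"
    # --bowtie1: use Bowtie 1, since Bowtie 2 does not handle colorspace reads.
    # --color:   tell TopHat the reads and the index are in colorspace.
    run tophat --bowtie1 --color -p 4 -o tophat_cs_out "$index_prefix" "$reads"
}

align_colorspace genome.fa genome_cs SRR179593.fastq
```

Run as-is, it prints the two command lines; with DRY_RUN=0 in the environment it executes them.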
