Analyzing Older Illumina Paired-End Data
10.7 years ago
danrdanny ▴ 70

Hi all, I've got some reads from mid-2010 that I'd like to re-align, and I'm not sure how best to proceed. They are from an Illumina Genome Analyzer IIx: 40 bp, paired-end. Here's what the text file of the raw reads looks like:

@ILLUMINA-B22060_0001:6:1:2:1931#0/2
ATTGATTCGTCACGCAGATTTCGAAACATTAAATGCAATC
+ILLUMINA-B22060_0001:6:1:2:1931#0/2
`aWabGSb`^`b^a]aaaY_\aaUDUaaG[`^a\Sa[a`B
@ILLUMINA-B22060_0001:6:1:2:545#0/2
CATCGCCGATCAGGTCACTTACCCGGAGAATTTTGATAGG
+ILLUMINA-B22060_0001:6:1:2:545#0/2
X__]^V_baYO\V\DYYPX\``]BBBBBBBBBBBBBBBBB
@ILLUMINA-B22060_0001:6:1:2:1500#0/2
TGGAGGGGAGCAAAGAACCGAAGCTGAAGTTGACTTTCTT
+ILLUMINA-B22060_0001:6:1:2:1500#0/2
]_aGab`abZ[bbbaa`a^a[`a^\a\aYDH\a_IRSYX]

I'm assuming this isn't a standard fasta/fastq format, but I could be wrong. I'd like to align these with bowtie. Is there a simple way to convert these to fastq or fasta?

Thanks in advance.

illumina paired-end • 1.9k views

I reformatted your post, hope you don't mind. The data was stretched out on a single line, and it does not appear that's what you wanted. If this format change reflects what your data looks like, then I don't see anything unusual about this fastq, and you can try the things I mentioned in my answer below. If you see something wrong, please change it back or clarify what you think is unusual about the data.

10.7 years ago
SES 8.6k

This is fastq data you have here. Judging by the /2 suffix on the read names, it looks like a sample of the reverse reads from a paired-end run. If you are unsure of the quality encoding, one way to check is to run your data through FastQC. If it's from an older run with Phred+64 qualities, you can use the option --phred64-quals with bowtie.
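If you'd rather check the encoding by hand, here is a rough sketch in Python. It assumes the conventional ASCII ranges (Phred+33 qualities stay at or below 'J', code 74; the +64 encodings never go below ';', code 59), so treat the result as a heuristic, not a replacement for FastQC. The function names guess_offset and to_phred33 are made up for illustration, and the sample strings are the three quality lines from the question. The conversion is a plain offset shift, which only makes sense if the qualities really are Phred+64 and not the older Solexa scale.

```python
def guess_offset(quality_strings):
    """Guess the quality offset (33 or 64) from the characters observed.

    Heuristic only: returns None when the observed range fits both encodings.
    """
    codes = [ord(ch) for q in quality_strings for ch in q]
    if min(codes) < 59:
        return 33  # characters this low cannot occur in a +64 encoding
    if max(codes) > 74:
        return 64  # characters this high are not expected in Phred+33
    return None    # ambiguous, look at more reads


def to_phred33(quality_string):
    """Shift a Phred+64 quality string to Phred+33 (assumes true Phred+64)."""
    return "".join(chr(ord(ch) - 31) for ch in quality_string)


# Quality lines copied from the question (raw strings because of the backslashes).
sample = [
    r"`aWabGSb`^`b^a]aaaY_\aaUDUaaG[`^a\Sa[a`B",
    r"X__]^V_baYO\V\DYYPX\``]BBBBBBBBBBBBBBBBB",
    r"]_aGab`abZ[bbbaa`a^a[`a^\a\aYDH\a_IRSYX]",
]

print(guess_offset(sample))  # 64
```

With these reads the guess comes out as 64, which matches the advice to pass --phred64-quals. Converting is optional if the aligner accepts the Phred+64 flag directly.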
