Question

Analyzing Older Illumina Paired-End Data

0

10.7 years ago

danrdanny ▴ 70

Hi all, I've got some reads from mid-2010 that I'd like to re-align, but I'm not sure how best to proceed. They're 40-bp, paired-end reads from an Illumina Genome Analyzer IIx. Here's what the text file of the raw reads looks like:

@ILLUMINA-B22060_0001:6:1:2:1931#0/2
ATTGATTCGTCACGCAGATTTCGAAACATTAAATGCAATC
+ILLUMINA-B22060_0001:6:1:2:1931#0/2
`aWabGSb`^`b^a]aaaY_\aaUDUaaG[`^a\Sa[a`B
@ILLUMINA-B22060_0001:6:1:2:545#0/2
CATCGCCGATCAGGTCACTTACCCGGAGAATTTTGATAGG
+ILLUMINA-B22060_0001:6:1:2:545#0/2
X__]^V_baYO\V\DYYPX\``]BBBBBBBBBBBBBBBBB
@ILLUMINA-B22060_0001:6:1:2:1500#0/2
TGGAGGGGAGCAAAGAACCGAAGCTGAAGTTGACTTTCTT
+ILLUMINA-B22060_0001:6:1:2:1500#0/2
]_aGab`abZ[bbbaa`a^a[`a^\a\aYDH\a_IRSYX]

I'm assuming this isn't a standard fasta/fastq format, but I could be wrong. I'd like to align these with bowtie. Is there a simple way to convert these to fastq or fasta?

Thanks in advance.

illumina paired-end • 1.9k views

updated 10.7 years ago by SES 8.6k • written 10.7 years ago by danrdanny ▴ 70

1

I reformatted your post; I hope you don't mind. The data was stretched out on a single line, and it does not appear that's what you wanted. If this formatting reflects what your data looks like, then I don't see anything unusual about this fastq, and you can try the things I mentioned in my answer below. If you see something wrong, please change it back or clarify what you think is unusual about the data.

10.7 years ago by SES 8.6k

score 1 · Answer 1 · 2013-07-29

1

10.7 years ago

SES 8.6k

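This is fastq data you have here: each record is the usual four lines (an @ header, the sequence, a + line, and the quality string). It looks like a sample of the reverse reads from a paired-end run, going by the /2 suffix on the read names. One way to check the quality encoding, if you are unsure, is to run your data through FastQC. If it's from an older run that uses Phred+64 quality scores, you can use the option --phred64-quals with bowtie.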
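If FastQC isn't handy, the encoding can also be guessed from the range of characters in the quality lines. Here is a rough sketch in Python, not a definitive tool: the function names guess_offset and phred64_to_phred33 are made up for illustration, the cutoffs are the usual rule-of-thumb character ranges, and the guess can be wrong for a small or uniformly high-quality sample.

```python
def guess_offset(quality_strings):
    """Guess the quality offset (33 or 64) from the lowest quality character.

    Heuristic only: Phred+33 strings can contain characters below ';' (ASCII 59),
    while Phred+64 strings start at '@' (ASCII 64). Values of 59..63 suggest the
    old Solexa+64 scale, which this sketch does not handle, so None is returned.
    """
    codes = [ord(c) for q in quality_strings for c in q]
    if not codes:
        raise ValueError("no quality characters supplied")
    lowest = min(codes)
    if lowest < 59:
        return 33
    if lowest >= 64:
        return 64
    return None


def phred64_to_phred33(quality):
    """Shift a Phred+64 quality string down by 31 so it becomes Phred+33."""
    return "".join(chr(ord(c) - 31) for c in quality)


# Quality lines copied from the reads in the question (raw strings keep the backslashes).
sample = [
    r"`aWabGSb`^`b^a]aaaY_\aaUDUaaG[`^a\Sa[a`B",
    r"X__]^V_baYO\V\DYYPX\``]BBBBBBBBBBBBBBBBB",
    r"]_aGab`abZ[bbbaa`a^a[`a^\a\aYDH\a_IRSYX]",
]
print(guess_offset(sample))
```

A result of 64 would be consistent with passing --phred64-quals to bowtie. Converting with phred64_to_phred33 is only needed if a downstream tool insists on Phred+33 input.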
