Computing A Short Read's Score
12.0 years ago
Raygozak ★ 1.4k

Hi, I have the following read:

 @HWI-ST261:7:1:16113:8289#0/1
.CTCGGATAACCGTAGTAATTCTAGAGCTAATACGTGCAACAAACCCCGACTTCCGGGAGGGGCGCATTTATTAG
+
BWURTY[YYVacaccccccca_ccc\cc_ccccccccac_c_aYVcccc_c_accc_ZccUUUUVYV_c_[ZVV^

The 4th row is the score. As far as I understand, each base has a probability that is interpreted as its quality. How can I compute the overall quality of the read, so that I can say, for example, that the quality of the read is >50%? I know there are several types of scores.

Thanks

next-gen • 2.6k views
12.0 years ago
brentp 24k

With just the sequence, you can calculate the number of Q30 or Q20 bases -- the number of bases with scores greater than 30 and 20 respectively. So you could say X% of the bases have a quality score greater than 20. Once you have done the mapping, then you'll also get a mapping quality which is a single value that indicates the quality of the mapping.
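The Q20/Q30 count described above can be sketched in a few lines of Python. This is only an illustration, not a reference implementation: the function names `phred_scores` and `fraction_at_least` are made up for this example, the Phred+64 offset is an assumption based on the look of the quality line (Sanger and Illumina 1.8+ files use an offset of 33), and the threshold is read as "at or above", which is the usual meaning of Q20/Q30.

```python
# Minimal sketch: share of bases at or above a Phred threshold.
# Assumption: Phred+64 encoding; pass offset=33 for Sanger / Illumina 1.8+ data.


def phred_scores(quality, offset=64):
    """Decode a FASTQ quality string into integer Phred scores."""
    return [ord(ch) - offset for ch in quality]


def fraction_at_least(quality, threshold, offset=64):
    """Fraction of bases whose Phred score is >= threshold."""
    scores = phred_scores(quality, offset)
    if not scores:
        return 0.0
    return sum(score >= threshold for score in scores) / len(scores)


# Quality line of the read in the question
# (raw string, because the line contains a backslash).
qual = r"BWURTY[YYVacaccccccca_ccc\cc_ccccccccac_c_aYVcccc_c_accc_ZccUUUUVYV_c_[ZVV^"

for threshold in (20, 30):
    pct = 100 * fraction_at_least(qual, threshold)
    print(f"Q{threshold}: {pct:.1f}% of bases")
```

The result is a statement of the form "X% of the bases have a quality of at least 20", which is a per-read summary that does not depend on any mapping.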

12.0 years ago
JC 13k

The overall quality score can be calculated as the average of the individual base scores (in this case on the Phred+64 scale), but this average only represents how "good" your read is, and it can be biased if Illumina's pipeline used "chastity" (you will see a run of B's close to the end).

To understand the scale, please check: http://en.wikipedia.org/wiki/Phred_quality_score and http://en.wikipedia.org/wiki/FASTQ_format
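The averaging described in this answer could look roughly like the sketch below. It assumes Phred+64 encoding, and the helper names (`mean_phred`, `mean_error_probability`, `strip_b_tail`) are hypothetical, chosen only for this example. Since a Phred score Q corresponds to an error probability of 10^(-Q/10), the sketch also averages the per-base error probabilities, which is not the same as converting the mean score and is closer to the "probability" the question asks about. Dropping a trailing run of B's before averaging is one possible way to deal with the bias mentioned above, not something the pipeline prescribes.

```python
# Minimal sketch: average quality of a single read.
# Assumption: Phred+64 encoding; pass offset=33 for Sanger / Illumina 1.8+ data.


def mean_phred(quality, offset=64):
    """Arithmetic mean of the per-base Phred scores."""
    if not quality:
        raise ValueError("empty quality string")
    return sum(ord(ch) - offset for ch in quality) / len(quality)


def mean_error_probability(quality, offset=64):
    """Mean per-base error probability, using P = 10 ** (-Q / 10)."""
    if not quality:
        raise ValueError("empty quality string")
    probs = [10 ** (-(ord(ch) - offset) / 10) for ch in quality]
    return sum(probs) / len(probs)


def strip_b_tail(quality):
    """Drop a trailing run of 'B' characters (hypothetical helper for this example)."""
    return quality.rstrip("B")


# Quality line of the read in the question
# (raw string, because the line contains a backslash).
qual = r"BWURTY[YYVacaccccccca_ccc\cc_ccccccccac_c_aYVcccc_c_accc_ZccUUUUVYV_c_[ZVV^"

trimmed = strip_b_tail(qual)  # this read has no trailing B's, so nothing is removed
print(f"mean Phred score: {mean_phred(trimmed):.1f}")
print(f"mean error probability: {mean_error_probability(trimmed):.5f}")
```

One minus the mean error probability gives an average per-base accuracy, which can be compared against a cutoff such as 50%.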
