10.5 years ago
edwardhust
I used bwa to map raw paired-end reads to the reference, which produced two .sai files. I then used bwa sampe to convert the two .sai files into one SAM file, which took a long time. I finally got my result, but there are unexpected garbled characters in the QUAL field.
My command follows:
nohup bwa sampe -f /home/liucj/projects/TWINS/WGC007813/WGC007813.sai.sam \
-r "@RG\tID:WGC007813\tLB:WGC007813\tSM:WGC007813\tPL:ILLUMINA" \
/home/liucj/data/ReferAll/index/ucsc.hg19 \
/home/liucj/projects/TWINS/WGC007813/WGC007813_1.fq.sai \
/home/liucj/projects/TWINS/WGC007813/WGC007813_2.fq.sai \
/home/liucj/data/SAMPLES/TWINS/WGC007813/WGC007813_1.fq \
/home/liucj/data/SAMPLES/TWINS/WGC007813/WGC007813_2.fq \
/home/liucj/projects/TWINS/WGC007813/no.saitosam.out &
Here is part of the SAM file: http://postimg.org/image/yr3qt1ph9/
Can you post what version of bwa you're using and the commands used to generate the sai files? While I assume you don't have a bunch of special characters in the original fastq files, you might just double check to ensure they're not corrupt.
Good point. Look at the qualities in the FASTQ file.
Everything above is really bad quality. Perhaps you have a different encoding, and once it passes through bwa it gets interpreted in a way that shifts these qualities beyond the normal scale. ^D is control character 4.
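To make the shift concrete, here is a minimal sketch of the arithmetic. It assumes the aligner converts "Illumina 1.3+" input by subtracting 31 (that is, 64 - 33) from every quality byte; `shift_qual` is a hypothetical helper name, not part of bwa. If the input is really Phred+33, the low-quality character `#` (ASCII 35) ends up as ASCII 4, which terminals display as ^D.

```shell
# Hypothetical helper: show what happens to one quality character
# when 31 is subtracted from it (assumed Phred+64 -> Phred+33 conversion).
shift_qual() {
  # $1: a single quality character
  # od -An -tu1 prints the byte as an unsigned decimal number
  code=$(printf '%s' "$1" | od -An -tu1 | tr -d ' \n')
  echo "$code $((code - 31))"
}

shift_qual '#'   # prints "35 4": ASCII 4 is Ctrl-D
```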
Thanks for your reply, I found the problem. The FASTQ file was not corrupted. In the step that produces the SA coordinates, I set the quality score format to Illumina 1.3+, but the score format of my fq files is actually Illumina 1.8+. A stupid mistake! Anyway, thanks a lot!
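A quick way to check the encoding before aligning is to look at the lowest ASCII code used in the quality lines. The sketch below is a heuristic, not a definitive test: it assumes a four-line-per-record FASTQ file and relies on the fact that Phred+33 qualities start at ASCII 33 while Phred+64 qualities start at ASCII 64 (59 for the older Solexa scale), so any character below 59 can only be Phred+33. `guess_offset` is a hypothetical helper name.

```shell
# Hypothetical helper: guess the quality offset of a FASTQ file
# from the smallest ASCII code found in its quality lines.
guess_offset() {
  # $1: FASTQ file; only the first 10000 quality lines are inspected
  min=$(awk 'NR % 4 == 0' "$1" | head -n 10000 | tr -d '\n' \
        | od -An -v -tu1 | tr -s ' ' '\n' | grep . | sort -n | head -n 1)
  # No quality characters found (empty or malformed file)
  [ -n "$min" ] || { echo "unknown"; return 1; }
  if [ "$min" -lt 59 ]; then
    echo "phred33"   # Sanger / Illumina 1.8+
  else
    echo "phred64"   # likely Illumina 1.3+ (or high-quality-only Phred+33 reads)
  fi
}

# Example call (path taken from the question):
# guess_offset /home/liucj/data/SAMPLES/TWINS/WGC007813/WGC007813_1.fq
```

If this reports `phred33`, the Illumina 1.3+ option (`-I` in `bwa aln`) should not be used when generating the .sai files.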
I have only seen this when the fastq files are corrupted. Check the md5sums you have for the fastq files and the ones the sequencing center provided.
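The checksum comparison can be sketched as follows, assuming GNU `md5sum` is available and that the sequencing center supplied the expected checksum as a plain hex string. `verify_md5` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed only if the file's MD5 matches the expected value.
verify_md5() {
  # $1: file to check, $2: expected MD5 hex digest
  actual=$(md5sum "$1" | awk '{print $1}')
  [ "$actual" = "$2" ]
}

# Example call (the expected digest here is a placeholder, not a real value):
# verify_md5 WGC007813_1.fq "<md5 from the sequencing center>" && echo "OK" || echo "MISMATCH"
```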
I checked the FASTQ files; they were not corrupted. Thanks a lot!