FASTQ Generation Difference between BaseSpace and BCL2FASTQ-v2
6.2 years ago

Hi there.

I ran two analyses. One started from FASTQ files that I downloaded from Illumina BaseSpace; the other used FASTQ files that I generated from the BaseCalls with bcl2fastq2. Interestingly, the FASTQ files from the two sources differ in size.

Is there a reason for this? I currently work at a diagnostic center, so this is very important to me. Quick answers would be appreciated.

illumina basespace bcl2fastq2 fastq • 3.4k views
6.2 years ago
GenoMax 141k

File size is never a good indicator of similarity. Depending on storage architecture the same file may be of different sizes on different storage devices due to differences in sector sizes etc.

Have you compared the read counts and the total number of bases in the two datasets, assuming they are otherwise identical? Keep in mind that BaseSpace may be trimming your data automatically, whereas standalone bcl2fastq can be set up not to do that.
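One way to get those numbers is a small script like the one below. It is only a sketch: it assumes standard 4-line FASTQ records (which is what bcl2fastq writes), and the function names `open_fastq` and `fastq_stats` are made up for illustration. The read-length histogram is included because trimming would show up as reads shorter than the cycle count.

```python
import gzip
from collections import Counter


def open_fastq(path):
    """Open a FASTQ file as text, handling .gz files transparently."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "rt")


def fastq_stats(path):
    """Return (read_count, base_count, Counter of read lengths).

    Assumes strict 4-line records: header, sequence, '+', qualities.
    """
    reads = 0
    bases = 0
    lengths = Counter()
    with open_fastq(path) as handle:
        for line_number, line in enumerate(handle):
            # The sequence is the second line of every 4-line record.
            if line_number % 4 == 1:
                seq = line.strip()
                reads += 1
                bases += len(seq)
                lengths[len(seq)] += 1
    return reads, bases, lengths
```

Calling `fastq_stats()` on the BaseSpace file and on the locally generated file for the same sample and read (R1 against R1) gives numbers that can be compared directly. If the read counts match but the base counts do not, trimming is the first suspect.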


Yes, I have checked the number of lines and bases, and there is still a difference. For some samples the BaseSpace data is larger; for others the bcl2fastq-v2 data is larger. It is not just the file size: the numbers of reads and bases differ as well.


Are these identical samples being processed locally via bcl2fastq and also BaseSpace? Start looking at the scan/trim settings for both methods. While it should not make a difference in theory, are you using the latest bcl2fastq (v.2.20) locally?


huseyin@tani-merkezi:~$ bcl2fastq --version
BCL to FASTQ file converter
bcl2fastq v2.20.0.422
Copyright (c) 2007-2017 Illumina, Inc.

Yes, the version is the latest.


What about settings? Are you using "fastq only" for bcl2fastq in your samplesheets? Same setting for BaseSpace? What is the run configuration (cycles x cycles, index)?

If yes, then you are going to have to start digging into the files to see where the differences are.
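A rough way to start digging is to compare the read identifiers in the two files and see which reads exist in only one of them. The sketch below assumes 4-line FASTQ records and that the part of the header before the first space identifies the cluster, as in recent Illumina headers. The names `read_ids` and `compare_read_ids` are hypothetical, and holding all IDs in memory may not be practical for very large files.

```python
import gzip


def read_ids(path):
    """Collect read identifiers (header text up to the first space)."""
    opener = gzip.open if str(path).endswith(".gz") else open
    ids = set()
    with opener(path, "rt") as handle:
        for line_number, line in enumerate(handle):
            # The header is the first line of every 4-line record.
            if line_number % 4 == 0:
                fields = line[1:].split()
                if fields:  # skip a stray blank line at the end
                    ids.add(fields[0])
    return ids


def compare_read_ids(path_a, path_b):
    """Return (ids only in A, ids only in B, number of shared ids)."""
    a = read_ids(path_a)
    b = read_ids(path_b)
    return a - b, b - a, len(a & b)
```

Looking at a handful of the reads that appear in only one file (their tile, their index sequence, their filter flag) usually points to which setting differs.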

6.2 years ago

For starters, you need to provide the command line used for bcl2fastq, and see if you can find the settings used when BaseSpace made the FASTQs. One obvious thing: while the default compression level in bcl2fastq is 4, it can be set to anything from 1 to 9. This could make the files appear bigger even if they contain the same amount of information. I understand that this does not explain the whole discrepancy in your case, since your read counts differ as well.

You might also check whether one or the other includes reads that did not pass filter. bcl2fastq excludes these by default, but perhaps it was run with the option to include them.
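Both points can be checked with a few lines of Python. This is a sketch under stated assumptions, not a definitive tool: `compressed_sizes` simply runs zlib over the same bytes at several levels to show that the size on disk depends on the level, and `count_filter_flags` assumes Illumina-style headers where the second whitespace-separated field looks like `read:is_filtered:control:index`, with `Y` meaning the read failed the filter. Both function names are made up for illustration.

```python
import gzip
import zlib


def compressed_sizes(data, levels=(1, 4, 9)):
    """Return {level: compressed size in bytes} for the same input bytes."""
    return {level: len(zlib.compress(data, level)) for level in levels}


def count_filter_flags(path):
    """Tally the is-filtered flag (Y/N) over all FASTQ headers.

    Headers without a recognizable flag are counted as 'unknown'.
    Assumes 4-line FASTQ records.
    """
    opener = gzip.open if str(path).endswith(".gz") else open
    counts = {"Y": 0, "N": 0, "unknown": 0}
    with opener(path, "rt") as handle:
        for line_number, line in enumerate(handle):
            # Only look at header lines; ignore stray blank lines.
            if line_number % 4 != 0 or not line.strip():
                continue
            fields = line.split()
            flag = "unknown"
            if len(fields) > 1:
                parts = fields[1].split(":")
                if len(parts) > 1 and parts[1] in ("Y", "N"):
                    flag = parts[1]
            counts[flag] += 1
    return counts
```

If one file reports a non-zero `Y` count and the other does not, that file was most likely generated with failed reads included, which would change both the read count and the base count.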
