Paired layout, but one fastq file
13 months ago
Andy ▴ 120

Good morning everyone,

I found a dataset on GEO, GSE115469, and the authors state that it has a paired FASTQ layout. However, I can find only one FASTQ file. How should I handle this file? I need to re-run the Cell Ranger pipeline on it.

Thanks, Andy


If you're dealing with bulk data (though this is technically possible with single-cell data too), paired-end reads can be interleaved and stored in a single FASTQ file. Always examine the file content; that's part of the sanity check.
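One way to do that sanity check is to compare the names of consecutive records: in an interleaved file, records 1 and 2 share a read name, as do records 3 and 4, and so on. The sketch below is only a heuristic and rests on some assumptions: records are plain 4-line FASTQ, mates are either named identically or carry the older /1 and /2 suffixes, and the function names `read_names` and `looks_interleaved` are made up for this example.

```python
import gzip
from itertools import islice


def read_names(path, pairs=500):
    """Return read names for up to `pairs * 2` records of a FASTQ file."""
    opener = gzip.open if str(path).endswith(".gz") else open
    names = []
    with opener(path, "rt") as handle:
        # Assumes 4 lines per record; the header is the first of each group.
        for i, line in enumerate(islice(handle, pairs * 2 * 4)):
            if i % 4 != 0 or not line.strip():
                continue
            # Drop the leading '@' and keep the ID before any whitespace.
            name = line[1:].split()[0]
            # Strip old-style mate suffixes such as /1 and /2.
            if name.endswith(("/1", "/2")):
                name = name[:-2]
            names.append(name)
    return names


def looks_interleaved(path, pairs=500):
    """Heuristic: True if every two consecutive records share a read name."""
    names = read_names(path, pairs)
    if len(names) < 2 or len(names) % 2:
        return False
    return all(a == b for a, b in zip(names[0::2], names[1::2]))
```

Calling `looks_interleaved("reads.fastq.gz")` on a real file (the file name here is a placeholder) only inspects the first few hundred records, so treat a True result as a strong hint rather than proof.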


That would be unusual, since this is single-cell data.


OP's post does not mention single-cell, which is why I suggested this possibility. For future users who might run into this problem, I think it's important to understand that the number of files may not be a reliable indicator of whether the sequencing was single-end or paired-end.


The GEO accession listed in the original post is for a single-cell dataset.


I initially didn't bother looking up the GEO entry. I assumed it was bulk, so the single-cell option did not cross my mind. I'll move my answer to a comment. I wish OP had mentioned it: if I assumed bulk, so might other people who actually run into this problem in a bulk context.

13 months ago
Andy ▴ 120

I understand why now: the authors only shared a BAM file.


I've moved your comment to an answer and added an answer of my own. Please accept your answer and optionally mine too to mark your post as resolved.


You can recreate the FASTQ files using a tool provided by 10x Genomics called bamtofastq (LINK). It will properly recreate the R1 file (cell barcode + UMI) and the R2 file (RNA read).
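To make the R1 layout concrete, here is a minimal sketch that splits an R1 sequence into its cell barcode and UMI. The lengths are an assumption based on the standard 10x 3' gene expression chemistries (16 bp barcode plus a 10 bp UMI for v2, 12 bp for v3); the thread does not say which chemistry this dataset used, and `R1_LAYOUTS` and `split_r1` are names invented for this example.

```python
# Assumed R1 layouts for 10x 3' gene expression libraries:
# chemistry -> (cell barcode length, UMI length). Check your own library prep.
R1_LAYOUTS = {"v2": (16, 10), "v3": (16, 12)}


def split_r1(seq, chemistry="v2"):
    """Split an R1 sequence into (cell_barcode, umi) for the given chemistry."""
    cb_len, umi_len = R1_LAYOUTS[chemistry]
    if len(seq) < cb_len + umi_len:
        raise ValueError(
            f"R1 is {len(seq)} bp, shorter than the {cb_len + umi_len} bp "
            f"expected for chemistry {chemistry!r}"
        )
    # Any bases past barcode + UMI are ignored.
    return seq[:cb_len], seq[cb_len:cb_len + umi_len]
```

If the R1 reads in the regenerated files are much shorter than this, or vary in length, that is a sign something went wrong in the conversion or that the chemistry assumption does not hold.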


Yes, bamtofastq does solve the problem, and it gives me correct FASTQ files.


I am having a similar issue: there is only one FASTQ file and, as far as I can tell, no BAM file available, so from my understanding I can't use bamtofastq. Is there another solution? Here is the link to the dataset I am trying to download: https://trace.ncbi.nlm.nih.gov/Traces/?view=run_browser&acc=SRR14667226&display=metadata.

I have tried the following:

fasterq-dump SRR14667226 --include-technical -S
fasterq-dump SRR14667226
fastq-dump SRR14667226 --split-3 --skip-technical


There is a BAM file available here: https://sra-pub-src-2.s3.amazonaws.com/SRR14667226/CTRL_possorted_genome_bam.bam.1

Please get that with curl/wget and convert using the 10x utility.

You need to look for BAM files under the Data Access tab (for the file above: https://trace.ncbi.nlm.nih.gov/Traces/?view=run_browser&acc=SRR14667226&display=metadata ). They may not always be available; 10x data submissions at SRA can be hit-and-miss.
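The download-and-convert step could be scripted roughly as follows. This is a sketch under assumptions, not a tested pipeline: it assumes the 10x bamtofastq binary is installed and on your PATH, that it is invoked as `bamtofastq [options] <bam> <output-dir>` with a `--nthreads` option, and the helper names (`download`, `bamtofastq_cmd`, `convert`) are made up for this example. As far as I recall, bamtofastq also expects the output directory not to exist yet.

```python
import shutil
import subprocess
import urllib.request


def download(url, dest):
    """Stream a (potentially very large) file to disk in chunks."""
    with urllib.request.urlopen(url) as resp, open(dest, "wb") as out:
        shutil.copyfileobj(resp, out)
    return dest


def bamtofastq_cmd(bam, out_dir, threads=4):
    """Build the argument list for: bamtofastq [options] <bam> <output-dir>."""
    return ["bamtofastq", f"--nthreads={threads}", str(bam), str(out_dir)]


def convert(bam, out_dir, threads=4):
    """Run bamtofastq, failing early if the binary is not on PATH."""
    if shutil.which("bamtofastq") is None:
        raise FileNotFoundError("bamtofastq not found on PATH")
    # check=True raises CalledProcessError if the conversion fails.
    subprocess.run(bamtofastq_cmd(bam, out_dir, threads), check=True)
```

For the run above, you would call `download()` with the BAM URL given in this reply and a local file name of your choice, then pass that file and a new output directory to `convert()`. The resulting directory should contain the R1/R2 FASTQ files that Cell Ranger expects.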
