Parsing a FASTQ File
16 months ago
mrsmith • 10

I am relatively new to the field, and I could desperately use some help.

I am trying to process a FASTQ file with DADA2, and I would like to separate the forward and reverse reads for each sample out of one very large FASTQ file. The file was initially a large FASTA file. I have already trimmed it with QIIME1 to remove the primers and barcodes, and I still have the mapping file. I then converted the file from FASTA to FASTQ using QIIME1, but I'm at a loss as to what I should do next.

dada2 • 501 views

I do not understand either. How can a file originally have been a FASTA file and then become a FASTQ file? Where do the quality scores come from? But if you simply have a paired-end FASTQ file with both reads in the same file (this is called interleaved) and want to deinterleave it into two separate files, here is some inspiration.
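
As a rough illustration of deinterleaving, here is a minimal Python sketch. It assumes strict 4-line FASTQ records (no wrapped sequence lines, no trailing blank lines) and that mates alternate as forward, reverse, forward, reverse. The function names `read_fastq` and `deinterleave` and the file paths are hypothetical, not part of any existing tool.

```python
from itertools import islice


def read_fastq(handle):
    """Yield FASTQ records as (header, sequence, plus, quality) tuples."""
    while True:
        record = [line.rstrip("\n") for line in islice(handle, 4)]
        if not record:
            return
        if len(record) != 4 or not record[0].startswith("@"):
            raise ValueError("truncated or malformed FASTQ record")
        yield tuple(record)


def deinterleave(in_path, r1_path, r2_path):
    """Write alternating records to r1_path and r2_path; return the pair count."""
    pairs = 0
    with open(in_path) as src, open(r1_path, "w") as r1, open(r2_path, "w") as r2:
        records = read_fastq(src)
        for fwd in records:
            # The record right after each forward read is assumed to be its mate.
            rev = next(records, None)
            if rev is None:
                raise ValueError("odd number of records; file is not interleaved")
            r1.write("\n".join(fwd) + "\n")
            r2.write("\n".join(rev) + "\n")
            pairs += 1
    return pairs
```

Calling `deinterleave("interleaved.fastq", "R1.fastq", "R2.fastq")` would then produce one file per read direction. This only works if the file really is interleaved; it cannot recover pairing from a file where the reads were concatenated or reordered.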


I am sorry but the question is not clear to me. What do you want to achieve?

And are you talking about demultiplexing?


QIIME1 has a script, split_sequence_file_on_sample_ids.py, which takes a fastq or fasta file that was demultiplexed with split_libraries.py and writes a separate file for each sample. But this will not separate forward reads from reverse reads if your forward and reverse reads are all in one file.
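
To show the idea behind per-sample splitting (this is not the QIIME1 script's actual implementation), here is a small Python sketch. It assumes each record header starts with a label of the form SampleID_number, which is how I understand split_libraries.py labels its output, and that records are exactly 4 lines long. The names `sample_id_from_header` and `split_by_sample` are hypothetical.

```python
import os
from contextlib import ExitStack
from itertools import islice


def sample_id_from_header(header):
    """Return the sample ID from a header such as '@S1_42 extra fields'."""
    label = header[1:].split()[0]
    # Assumption: the label is '<SampleID>_<running number>'.
    sample, sep, _ = label.rpartition("_")
    if not sep:
        raise ValueError("no SampleID_number label in header: " + header.strip())
    return sample


def split_by_sample(in_path, out_dir):
    """Write one FASTQ per sample into out_dir; return read counts per sample."""
    os.makedirs(out_dir, exist_ok=True)
    counts = {}
    handles = {}
    with ExitStack() as stack, open(in_path) as src:
        while True:
            record = list(islice(src, 4))
            if not record:
                break
            if len(record) != 4:
                raise ValueError("truncated FASTQ record")
            sample = sample_id_from_header(record[0])
            if sample not in handles:
                # One output file is kept open per sample seen so far.
                path = os.path.join(out_dir, sample + ".fastq")
                handles[sample] = stack.enter_context(open(path, "w"))
            handles[sample].writelines(record)
            counts[sample] = counts.get(sample, 0) + 1
    return counts
```

With a very large number of samples this could hit the operating system's open-file limit, in which case appending to each file per record would be a slower but safer alternative. As with the QIIME1 script, forward and reverse reads stay mixed within each sample file.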

19 months ago
swbarnes2 5.7k
United States

Converting a fastq to a fasta results in a total loss of the quality scores. You are going to need the original quality scores to call variants.
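
One quick way to check whether a FASTQ still carries real quality scores is to look at how many distinct quality characters it contains. The Python sketch below is only a heuristic under stated assumptions: it assumes 4-line records, samples the first `max_records` reads, and treats a single repeated quality character as a sign of placeholder values. The function names are hypothetical.

```python
from itertools import islice


def distinct_quality_chars(path, max_records=10000):
    """Collect the set of quality characters seen in the first max_records reads."""
    seen = set()
    with open(path) as handle:
        for _ in range(max_records):
            record = list(islice(handle, 4))
            if len(record) < 4:
                break
            # The fourth line of each record is the quality string.
            seen.update(record[3].rstrip("\n"))
    return seen


def looks_like_placeholder_quality(path, max_records=10000):
    """Return True if every sampled base has the same quality character."""
    return len(distinct_quality_chars(path, max_records)) == 1
```

If this returns True, the qualities were most likely filled in during a conversion and carry no information. A False result does not prove the scores are the original ones.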

So stop playing around with fastas, and get the original fastqs. The originals will also have read1 and read2 separate.

19 months ago
National Centre for Cell Science, Pune

Some points to clear up first:

  1. If you have single FASTQ files then your data is not paired-end.
  2. You are talking about separating reads. Do you mean demultiplexing, i.e. separating the reads of each sample? DADA2 assumes that you have demultiplexed FASTQ files.
  3. DADA2 needs raw FASTQ files to detect variants.

For more information, please refer to the DADA2 tutorial.


If you have single FASTQ files then your data is not paired-end.

Interleaved FQ files do indeed exist. See my comment above.
