Question

How to analyse single reads (not paired-end) for 16S diversity study


4.9 years ago

rehab1171 • 0

I am a first-time Galaxy user, trying to analyze Illumina FASTQ files for a 16S diversity study. Unfortunately, my data quality is poor, so I cannot create contigs from the forward and reverse reads. I sequenced 2x300 bp, and the last 150 bp of each read are low quality; when I generate contigs, they come out 600 bp long instead of 300 bp. So I need to trim to 150 bp and run the analysis on single reads, right? How do I combine just my read 1 FASTQ files, retaining the sample name for all the sequences in each file, so that I can proceed with the alignment and classification steps? There was an older post on this; I followed it and it still doesn't work. The post suggested using Fastq.info, then combining the FASTA files (I used Concatenate files) and creating a group file, and then using both the combined FASTA file and the group file in the downstream analysis of the tutorial pipeline. But it didn't work. Any help, please.
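The steps described above (convert each read 1 FASTQ to FASTA, concatenate the results, and build a group file) could be sketched outside Galaxy roughly as follows. This is only a minimal sketch under stated assumptions: the function names, the `samples` mapping, the output file names, and the 150 bp cut-off are illustrative and not part of any Galaxy or mothur tool. It assumes standard four-line FASTQ records and writes a two-column, tab-separated group file (sequence name, sample name). Replacing colons in read names with underscores is also an assumption, made so that the names in the FASTA and group files match and stay simple.

```python
import gzip
from pathlib import Path


def open_text(path):
    """Open a plain or gzip-compressed FASTQ file in text mode."""
    path = Path(path)
    if path.suffix == ".gz":
        return gzip.open(path, "rt")
    return open(path, "r")


def read_fastq(handle):
    """Yield (read_id, sequence) from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline()
        if not header.strip():
            return  # end of file (or trailing blank line)
        seq = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line
        handle.readline()  # quality line, not needed for FASTA output
        # Keep the first whitespace-delimited token, without the leading '@'
        yield header[1:].split()[0], seq


def combine_read1(samples, fasta_out, group_out, keep=150):
    """Truncate read 1 sequences and write one FASTA plus one group file.

    samples: dict mapping sample name -> read 1 FASTQ path (hypothetical layout).
    keep:    number of leading bases to retain from each read (assumed 150).
    Returns the number of reads written.
    """
    written = 0
    with open(fasta_out, "w") as fasta, open(group_out, "w") as groups:
        for sample, fastq_path in samples.items():
            with open_text(fastq_path) as handle:
                for read_id, seq in read_fastq(handle):
                    # Assumption: swap ':' for '_' so names are identical
                    # in both output files and free of colons.
                    name = read_id.replace(":", "_")
                    fasta.write(f">{name}\n{seq[:keep]}\n")
                    groups.write(f"{name}\t{sample}\n")
                    written += 1
    return written
```

A call such as `combine_read1({"sampleA": "sampleA_R1.fastq.gz", "sampleB": "sampleB_R1.fastq.gz"}, "combined.fasta", "combined.groups")` (file names hypothetical) would produce one FASTA file and one group file that could then be uploaded to Galaxy for the alignment and classification steps.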

sequence • 1.1k views



As a suggestion, you can use seqtk (https://github.com/lh3/seqtk) mergepe to merge (interleave) fq1 and fq2. Usage: seqtk mergepe <in1.fq> <in2.fq>. Then use Trimmomatic (http://www.usadellab.org/cms/?page=trimmomatic) in single-read mode to trim adapters and low-quality sequence.
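To make clear what interleaving two paired FASTQ files means, here is a minimal Python sketch of the idea. It is not seqtk's implementation; the function names are hypothetical, and it assumes uncompressed four-line FASTQ records in the same order in both files. Note that interleaving only alternates the records of the two files in one output; it does not assemble overlapping reads into contigs.

```python
from itertools import islice


def records(handle):
    """Yield FASTQ records as lists of four lines, each ending in a newline."""
    while True:
        rec = list(islice(handle, 4))
        if len(rec) < 4:
            return  # end of file or incomplete trailing record
        yield [line.rstrip("\n") + "\n" for line in rec]


def interleave(fq1, fq2, out):
    """Write read 1 and read 2 records alternately; return the pair count."""
    pairs = 0
    with open(fq1) as h1, open(fq2) as h2, open(out, "w") as dest:
        # zip stops at the shorter file, so unpaired trailing reads are dropped
        for rec1, rec2 in zip(records(h1), records(h2)):
            dest.writelines(rec1)
            dest.writelines(rec2)
            pairs += 1
    return pairs
```

The output alternates one record from the first file with its mate from the second, so every pair occupies eight consecutive lines.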

4.9 years ago by Leon ▴ 130

score 0 · Answer 1 · 2019-05-25

Thank you, Leon, for taking the time to answer. I understand that there is no direct way to run single reads in Galaxy, correct? One has to run other programs (the links you provided). I know I am asking very basic questions, my apologies; everyone is a beginner when running things for the first time :) Thanks, RE