Hi all, and sorry if this question has already been answered.
I am about to do the full bioinformatics processing of my WGS data for the first time, so I naturally have some doubts, especially since it is sometimes hard to tell whether a step worked correctly.
For each individual I have 8 FASTQ files. The data are paired-end, so there are 4 R1 and 4 R2 files.
I would like to know when and how I should merge them. If I merge the FASTQ files, can I simply run cat file1 file2 file3 file4 > newfile?
And if I merge them into a single R1 file and a single R2 file per individual, will the information about which read is the mate of which be kept? I need that information for structural variant analyses.
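Concatenate the paired files in the very same order: if the merged R1 file is built from lanes 1, 2, 3, 4, the merged R2 file must be built from lanes 1, 2, 3, 4 as well. Mates are matched by their position in the two files, so the pairing is kept as long as the order is identical.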
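To make the "same order" advice concrete, here is a minimal Python sketch. It assumes standard 4-line FASTQ records, plain or gzip-compressed; the function names (`concat_fastq`, `read_names`, `mates_in_sync`) are made up for illustration and do not come from any existing tool. It concatenates the per-lane files in the order given and then checks that R1 and R2 still list the same read names in the same order.

```python
import gzip
import shutil
from itertools import zip_longest
from pathlib import Path


def concat_fastq(parts, out_path):
    """Append the files in `parts` to `out_path`, in the order given.

    Works for plain or gzip-compressed FASTQ: concatenated gzip members
    form a valid gzip stream, so no recompression is needed.
    """
    with open(out_path, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)


def read_names(path):
    """Yield the read name of every record (assumes 4 lines per record)."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        for i, line in enumerate(handle):
            if i % 4 != 0:
                continue
            fields = line[1:].split()  # drop '@' and any comment after the name
            if not fields:
                continue  # tolerate a blank trailing line
            name = fields[0]
            if name.endswith(("/1", "/2")):
                name = name[:-2]  # old-style mate suffix
            yield name


def mates_in_sync(r1_path, r2_path):
    """Return True if both files list the same read names in the same order."""
    pairs = zip_longest(read_names(r1_path), read_names(r2_path))
    return all(a == b for a, b in pairs)
```

For one individual you would call `concat_fastq` once with the four R1 files and once with the four R2 files, both lists in the same lane order, and then run `mates_in_sync` on the two outputs before going further. The check reads both files completely, so it is a one-off sanity test rather than something to run at every step.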
I have 592 FASTQ files (74 individuals * 4 lanes * 2 reads).
So your recommendation is to keep those 592 files separate, run the alignment, and then merge the BAMs to get 74 BAM files?
I ask because I originally intended to make 74 * 2 = 148 FASTQ files and start from there for quality control and alignment.
Thanks for your reply
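Yes: keep the per-lane files separate, align each R1/R2 pair, then merge the BAMs to get one per individual.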
You can always work from the merged FASTQ files for QC. But if the FASTQ files come from 4 lanes, you will want a QC report per lane and per sample anyway.
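As a rough sketch of the per-lane workflow discussed above, the Python below groups FASTQ files by individual and lane and prints the commands one might run: one alignment per lane, then one merge per individual. Several things here are assumptions and not from this thread: the Illumina-style file names (`<sample>_S<n>_L<lane>_R<1|2>_001.fastq.gz`), the choice of `bwa mem` and `samtools` as aligner and BAM tool, the reference name `ref.fa`, and the read-group layout. Adapt them to your own data and pipeline.

```python
import re
from collections import defaultdict

# Assumed naming scheme: <sample>_S<n>_L<lane>_R<1|2>_001.fastq.gz
FASTQ_NAME = re.compile(
    r"^(?P<sample>.+)_S\d+_L(?P<lane>\d{3})_R(?P<mate>[12])_001\.fastq\.gz$"
)


def group_by_sample_and_lane(filenames):
    """Map sample -> lane -> {'1': R1 file, '2': R2 file}."""
    groups = defaultdict(lambda: defaultdict(dict))
    for name in filenames:
        match = FASTQ_NAME.match(name)
        if match is None:
            raise ValueError(f"unexpected FASTQ name: {name}")
        groups[match["sample"]][match["lane"]][match["mate"]] = name
    return groups


def plan_commands(groups, reference="ref.fa"):
    """Return shell command strings: one alignment per lane, one merge per sample."""
    commands = []
    for sample, lanes in sorted(groups.items()):
        lane_bams = []
        for lane, mates in sorted(lanes.items()):
            if set(mates) != {"1", "2"}:
                raise ValueError(f"{sample} lane {lane} lacks an R1 or R2 file")
            bam = f"{sample}.L{lane}.bam"
            # One read group per lane, same SM tag for all lanes of a sample
            # (hypothetical layout; "\\t" stays a literal \t for bwa to expand).
            read_group = f"@RG\\tID:{sample}.L{lane}\\tSM:{sample}\\tPL:ILLUMINA"
            commands.append(
                f"bwa mem -R '{read_group}' {reference} {mates['1']} {mates['2']} "
                f"| samtools sort -o {bam} -"
            )
            lane_bams.append(bam)
        commands.append(f"samtools merge {sample}.bam {' '.join(lane_bams)}")
    return commands
```

With 592 files named this way, `group_by_sample_and_lane` would give 74 samples with 4 lanes each, and `plan_commands` would return 74 * 4 = 296 alignment commands plus 74 merge commands. The same grouping can drive per-lane QC, since every R1/R2 pair stays identifiable by sample and lane.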