I have the same samples in multiple lanes. What steps should be taken before downstream analysis?
8.6 years ago
nalandaatmi ▴ 100

Dear All,

I need some clarification on the following scenario (RNA-Seq experiment):

  • Lane 1 of the flow cell contains 4 samples (A, B, C, D); the same samples were also loaded on lane 2 (A, B) and lane 3 (C, D).
  • Before performing downstream analysis, do I need to merge the FASTQ reads from lane 1 with the FASTQ reads from lane 2 (and similarly the reads from lane 1 with the reads from lane 3)?
  • If I instead align lane 1, lane 2, and lane 3 separately against the human reference without merging the FASTQ files, does that affect my analysis? I would then merge the two BAM files from lane 1 and lane 2 (and similarly from lane 1 and lane 3) using samtools.
fastq alignment next-gen-sequencing RNASeq • 13k views

Is your data barcoded? Multiplexed?


Dear Pelin, yes, it is barcoded and demultiplexed. Please find the necessary details below.

FCID,Lane,SampleID,SampleRef,Index,Description,Control,Recipe,Operator,SampleProject
D0JR0ACXX,1,A,Human,ATCACG,,N,R1,RS,RNAseq
D0JR0ACXX,1,B,Human,CGATGT,,N,R1,RS,RNAseq
D0JR0ACXX,1,C,Human,ACAGTG,,N,R1,RS,RNAseq
D0JR0ACXX,1,D,Human,GCCAAT,,N,R1,RS,RNAseq

D0JR0ACXX,2,A,Human,ATCACG,,N,R1,RS,RNAseq
D0JR0ACXX,2,B,Human,CGATGT,,N,R1,RS,RNAseq

D0JR0ACXX,3,C,Human,ACAGTG,,N,R1,RS,RNAseq
D0JR0ACXX,3,D,Human,GCCAAT,,N,R1,RS,RNAseq
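
To see at a glance which lanes belong to which sample, the sheet above can be grouped by SampleID. This is only a minimal sketch: it embeds a trimmed copy of the sheet (FCID, Lane, SampleID, and Index columns only), and the function name `lanes_per_sample` is made up for illustration.

```python
import csv
import io
from collections import defaultdict

# Trimmed copy of the sample sheet above (only the columns used here).
SAMPLE_SHEET = """\
FCID,Lane,SampleID,Index
D0JR0ACXX,1,A,ATCACG
D0JR0ACXX,1,B,CGATGT
D0JR0ACXX,1,C,ACAGTG
D0JR0ACXX,1,D,GCCAAT
D0JR0ACXX,2,A,ATCACG
D0JR0ACXX,2,B,CGATGT
D0JR0ACXX,3,C,ACAGTG
D0JR0ACXX,3,D,GCCAAT
"""


def lanes_per_sample(sheet_text):
    """Return {SampleID: sorted list of lanes the sample was loaded on}."""
    lanes = defaultdict(list)
    for row in csv.DictReader(io.StringIO(sheet_text)):
        lanes[row["SampleID"]].append(int(row["Lane"]))
    return {sample: sorted(found) for sample, found in lanes.items()}


if __name__ == "__main__":
    for sample, lanes in lanes_per_sample(SAMPLE_SHEET).items():
        print(f"sample {sample}: lanes {lanes}")
```

Each sample ends up with two lanes, so there are two FASTQ (or BAM) files per sample to combine.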
8.6 years ago

I'm assuming that only a single library was made from each sample and then split on multiple lanes (in a somewhat weird way, I might add). If multiple libraries were made then you will need to give further details.

It doesn't matter much if you concatenate the fastq files before alignment or merge the BAM files afterwards and you should get essentially the same results either way ("essentially" because there's always some randomness to alignment).
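
For the "concatenate the fastq files" route, a minimal sketch is below. It assumes the per-lane files for one sample are either all gzip-compressed or all plain text; concatenated gzip files are still valid gzip, so no decompression is needed. The function names and any file names are hypothetical. If the data are paired-end, the lanes would need to be concatenated in the same order for both mates.

```python
import gzip
import shutil


def concatenate_fastq(lane_files, merged_path):
    """Append the per-lane FASTQ files byte for byte into merged_path."""
    with open(merged_path, "wb") as merged:
        for lane_file in lane_files:
            with open(lane_file, "rb") as source:
                shutil.copyfileobj(source, merged)
    return merged_path


def count_reads(fastq_gz_path):
    """Count reads in a gzip-compressed FASTQ file (4 lines per read)."""
    with gzip.open(fastq_gz_path, "rt") as handle:
        return sum(1 for _ in handle) // 4
```

A quick sanity check after concatenating is that `count_reads` on the merged file equals the sum of the counts for the individual lanes.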


Dear Devon Ryan,

They made a single library and then loaded it into different lanes.

Actually, samples from two projects were loaded on the flow cell. After loading, one of the lanes was empty, so instead of leaving it blank they loaded all 4 samples in that lane (lane 1).

For project A they loaded samples in lanes 1, 2, and 3, and for project B in lanes 4, 5, 6, 7, and 8.


Cool, they can be concatenated at any point then.


Thanks Devon Ryan.

Instead of merging at the FASTQ level, I am going to merge the BAM files. I received an accepted_hits.bam file for each sample as output after running TopHat (which uses Bowtie 2).

I am planning to merge those BAM files using the steps mentioned in this post. In that post, they merge SAM files, so I am going to convert my BAM files to SAM, sort the SAM files, and finally merge them. Then I am going to use the merged SAM or BAM files for the Cufflinks step. Correct me if I am wrong.


I would merge the BAM files directly and make sure to remove duplicates (using either "rmdup" from samtools or MarkDuplicates from Picard tools).
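
Merging the BAM files directly could look roughly like the sketch below, which builds a `samtools merge` call without any conversion to SAM. It assumes samtools is on the PATH and that the input BAMs are sorted the same way (TopHat's accepted_hits.bam is normally coordinate-sorted, but that is worth checking). The function names and file names are hypothetical.

```python
import shlex
import subprocess


def build_merge_command(output_bam, input_bams):
    """Build the argument list for: samtools merge out.bam in1.bam in2.bam ..."""
    if len(input_bams) < 2:
        raise ValueError("need at least two BAM files to merge")
    return ["samtools", "merge", output_bam, *input_bams]


def merge_bams(output_bam, input_bams):
    """Run samtools merge; raises CalledProcessError if samtools fails."""
    subprocess.run(build_merge_command(output_bam, input_bams), check=True)


if __name__ == "__main__":
    # Hypothetical per-lane TopHat outputs for sample A.
    command = build_merge_command(
        "A_merged.bam",
        ["A_lane1/accepted_hits.bam", "A_lane2/accepted_hits.bam"],
    )
    print(shlex.join(command))
```

The same call would be repeated once per sample, pairing lane 1 with lane 2 for A and B, and lane 1 with lane 3 for C and D.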


Duplicates should typically not be marked or removed from RNAseq data. For highly expressed genes you end up capping your signal.


Hi Devon. I have a similar question as Nalandaatmi. The only difference is that I have a sample that was sequenced twice using different adapters. I have trimmed and normalized reads for both libraries and generated 2 bam files for this sample. Can I use samtools merge to merge the 2 bam files or should I use Picard? Thanks for your help!
