Question

Detection of transcripts from mRNA and sRNA

0

6.2 years ago

kamel ▴ 70

Hello,

I have two FASTQ files of mRNA and sRNA sequenced from a patient infected with a pathogenic bacterium. My goal is to identify the transcripts of both the patient and the bacterium.

Is it logical to map against two reference sequences (one for the human genome and the other for the bacterium)?

Could you suggest a workflow to detect the transcripts?

Thank you in advance

RNA-Seq rna-seq alignment assembly • 1.3k views

ADD COMMENT • link updated 6.2 years ago by h.mon 35k • written 6.2 years ago by kamel ▴ 70

1

6.2 years ago

h.mon 35k

Yes, you can build a new reference combining both human and bacterial references. Another option is to use some tool to split the reads between the correct genomes. There are other tools, but I can recommend bbsplit.sh from the BBTools / BBMap package.

ADD COMMENT • link 6.2 years ago by h.mon 35k
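
The read-splitting option mentioned above could look roughly like the following. This is a minimal sketch, assuming single-end reads and that bbsplit.sh from BBTools is installed; the file names (reads.fq, human.fa, bacteria.fa, split_%.fq, unassigned.fq) and the BBSPLIT variable are placeholders for illustration, not anything given in this thread.

```shell
#!/bin/sh
# Path to the BBSplit launcher; override BBSPLIT if it is not on your PATH.
BBSPLIT=${BBSPLIT:-bbsplit.sh}

split_reads() {
    # $1 = input FASTQ, $2 = human reference FASTA, $3 = bacterial reference FASTA
    # (placeholder arguments, not files from this thread).
    # '%' in basename is replaced by each reference's name, giving one output
    # file per genome; reads matching neither reference go to outu.
    "$BBSPLIT" in="$1" ref="$2,$3" basename=split_%.fq outu=unassigned.fq
}

# Example call with hypothetical file names (runs only if the inputs exist).
if [ -f reads.fq ] && [ -f human.fa ] && [ -f bacteria.fa ]; then
    split_reads reads.fq human.fa bacteria.fa
fi
```

Paired-end data would need the paired input and output parameters instead; check the BBSplit usage text for your version before relying on this.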

0

Thanks, h.mon, for your help. This is a tool I'm going to use, but I saw that it produces a separate .bam file for each reference, which leads to the question I asked Devon: how can I count the transcripts for both organisms at once?

ADD REPLY • link 6.2 years ago by kamel ▴ 70

score 3 · Accepted Answer · 2018-02-17

3

6.2 years ago

Devon Ryan 104k

The ideal procedure would be to concatenate the two genomes and align to that.

ADD COMMENT • link 6.2 years ago by Devon Ryan 104k

0

Thank you, Devon Ryan, for your reply. In that case I will get a single alignment file, but how can I count the transcripts of both organisms from a single file? I cannot find a good methodology for my case.

ADD REPLY • link 6.2 years ago by kamel ▴ 70

2

They'll have quite different chromosome names, so either run featureCounts twice (once with each GTF file), or concatenate the GTF files and run featureCounts using that. I would suggest running featureCounts twice, since I expect you'll want to analyse the two organisms separately anyway.

ADD REPLY • link 6.2 years ago by Devon Ryan 104k
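
Running featureCounts twice on the same alignment file, once per annotation, might be sketched as below. The file names (aligned_to_combined.bam, human.gtf, bacteria.gtf and the output tables) and the FEATURECOUNTS variable are assumptions for illustration. Only the basic -a (annotation) and -o (output) options are used; paired-end libraries need additional options, and the feature type (-t) or attribute (-g) may need adjusting for the bacterial annotation.

```shell
#!/bin/sh
# Name of the featureCounts executable; override if it is not on your PATH.
FEATURECOUNTS=${FEATURECOUNTS:-featureCounts}

# Hypothetical alignment file produced by mapping to the concatenated genome.
BAM=aligned_to_combined.bam

count_with() {
    # $1 = GTF annotation for one organism, $2 = output count table
    "$FEATURECOUNTS" -a "$1" -o "$2" "$BAM"
}

# One run per organism; reads on the other organism's chromosomes are
# simply not assigned to any feature in that run.
if [ -f "$BAM" ] && [ -f human.gtf ] && [ -f bacteria.gtf ]; then
    count_with human.gtf human_counts.txt
    count_with bacteria.gtf bacteria_counts.txt
fi
```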

0

Thank you, Devon, for your answer. I will run featureCounts twice to analyze the two organisms separately. Coming back to the concatenation of the two reference genomes: is this the right command? cat reference1.fasta reference2.fasta > all_genomes.fasta (each reference sequence has different chromosomes)

ADD REPLY • link 6.2 years ago by kamel ▴ 70

1

Yes, just cat the files together, exactly like that.

ADD REPLY • link 6.2 years ago by Devon Ryan 104k
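
As a small illustration of the concatenation step, the sketch below builds two toy FASTA files (made-up names and sequences standing in for the real references), concatenates them, and checks that no sequence name appears twice. The duplicate check is an added precaution, not something requested in the thread; it matters because the organism of each alignment is later inferred from the chromosome name.

```shell
#!/bin/sh
# Toy stand-ins for the real reference FASTA files (hypothetical content).
printf '>chr1\nACGTACGT\n' > reference1.fasta
printf '>bact_chromosome\nGGCCGGCC\n' > reference2.fasta

# Concatenate the two references into one combined FASTA.
cat reference1.fasta reference2.fasta > all_genomes.fasta

# Collect any sequence name that occurs more than once. An empty file means
# every alignment can be traced back to one organism by its reference name.
grep '^>' all_genomes.fasta | cut -d' ' -f1 | sort | uniq -d > duplicate_names.txt
```

If the first file does not end with a newline, cat will glue its last sequence line to the next header, so it is worth confirming that the number of header lines in the combined file matches the sum of the inputs.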

0

Hi Devon, I used featureCounts for quantification, but its output is complicated. Do you know how I can generate a count matrix with featureCounts?

ADD REPLY • link 6.1 years ago by kamel ▴ 70
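
One possible way to get a plain count matrix, sketched under the assumption that the table has the usual featureCounts layout: a comment line starting with '#', then a header with Geneid, Chr, Start, End, Strand, Length, followed by one count column per BAM file. The toy table below (counts.txt, with made-up genes and counts) stands in for real output.

```shell
#!/bin/sh
# Toy table imitating the featureCounts output layout (hypothetical values).
printf '# Program:featureCounts (toy example)\n' > counts.txt
printf 'Geneid\tChr\tStart\tEnd\tStrand\tLength\tsampleA.bam\tsampleB.bam\n' >> counts.txt
printf 'geneX\tchr1\t100\t500\t+\t401\t12\t7\n' >> counts.txt
printf 'geneY\tchr1\t900\t1500\t-\t601\t0\t3\n' >> counts.txt

# Drop the comment line, then keep the gene ID (column 1) and the count
# columns (column 7 onward), discarding the coordinate and length columns.
grep -v '^#' counts.txt | cut -f1,7- > count_matrix.tsv
```

Passing several BAM files to a single featureCounts run yields one count column per file, so the same two commands give a genes-by-samples matrix.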

0

Another question, please. For sRNA mapping, I have seen some people select reads of 18-30 nt and others 18-26 nt before mapping. Which size range should I select?

ADD REPLY • link 6.1 years ago by kamel ▴ 70

1

It depends on the type of RNA you're interested in. You size select for those types.

ADD REPLY • link 6.1 years ago by Devon Ryan 104k
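
Whatever range fits the RNA class of interest, the size selection itself can be sketched with awk. This assumes a standard four-line-per-record FASTQ with adapters already trimmed; the 18 and 30 nt bounds are just the values quoted in the question, not a recommendation, and the toy reads are made up.

```shell
#!/bin/sh
# Toy FASTQ with reads of 20, 10 and 36 nt (hypothetical data).
printf '@read1\nACGTACGTACGTACGTACGT\n+\nIIIIIIIIIIIIIIIIIIII\n' > reads.fastq
printf '@read2\nACGTACGTAC\n+\nIIIIIIIIII\n' >> reads.fastq
printf '@read3\nACGTACGTACGTACGTACGTACGTACGTACGTACGT\n+\nIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII\n' >> reads.fastq

# Keep only records whose sequence length lies within [min, max].
awk -v min=18 -v max=30 '
    NR % 4 == 1 { header = $0 }
    NR % 4 == 2 { seq = $0 }
    NR % 4 == 3 { plus = $0 }
    NR % 4 == 0 {
        n = length(seq)
        if (n >= min && n <= max) print header "\n" seq "\n" plus "\n" $0
    }
' reads.fastq > reads_18_30.fastq
```

Changing min and max is all that is needed to switch to another window such as 18-26 nt.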