Hello,
I have two FASTQ files, one of mRNA and one of sRNA, sequenced from a patient infected with a pathogenic bacterium. My goal is to identify the transcripts of both the patient and the bacterium.
Does it make sense to map against two reference sequences (one for the human genome and one for the bacterium)?
Could you suggest a workflow to detect the transcripts?
Thank you in advance
Thank you, Devon Ryan, for your reply. In this case I will get a single alignment file, but how can I count the transcripts of both organisms from a single file? I cannot find a good methodology for my case.
They'll have quite different chromosome names, so either run featureCounts twice (once with each GTF file), or concatenate the GTF files and run featureCounts using that. I would suggest running featureCounts twice, since I expect you'll want to analyse the two organisms separately anyway.
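A minimal sketch of the "run featureCounts twice" option, in case it helps. All file names here (aligned.bam, human.gtf, bacteria.gtf and the output names) are placeholders, and the `run` helper is only a hypothetical convenience that prints each command unless DRY_RUN=0 is set. The flags used (-a, -o, -T) are standard featureCounts options; depending on how the bacterial annotation is written you may also need -t and -g to pick the right feature type and ID attribute.

```shell
# Hypothetical file names; replace them with your own.
BAM=aligned.bam          # single alignment against the concatenated genome
HUMAN_GTF=human.gtf
BACT_GTF=bacteria.gtf

# Print each command instead of running it unless DRY_RUN=0 is set,
# so the sketch can be inspected before anything is executed.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# One featureCounts run per annotation, both on the same BAM file.
# -a: annotation file, -o: output table, -T: number of threads.
run featureCounts -T 4 -a "$HUMAN_GTF" -o human_counts.txt "$BAM"
run featureCounts -T 4 -a "$BACT_GTF" -o bacteria_counts.txt "$BAM"
```

Reads that align to bacterial chromosomes simply find no features in the human GTF (and vice versa), so each output table only contains counts for one organism.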
Thank you, Devon, for your answer. I will run featureCounts twice to analyze the two organisms separately. Coming back to the concatenation of the two reference genomes: is this command suitable? cat reference1.fasta reference2.fasta > all_genomes.fasta (each reference sequence has different chromosomes)
Yes, just cat the files together, exactly like that.
Hi Devon, I used featureCounts for quantification, but its output is complicated. Do you know how I can generate a count matrix with featureCounts?
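One possible way to get a count matrix, sketched under the assumption that the table has the usual featureCounts layout: a leading comment line starting with "#", then a header row, with the gene ID in column 1, annotation details (Chr, Start, End, Strand, Length) in columns 2-6, and one count column per BAM file from column 7 onward. The file names and the numbers in the mock table below are invented purely for illustration.

```shell
# Mock featureCounts output for illustration only (all values are made up).
printf '# Program:featureCounts (mock header line)\n' > human_counts.txt
printf 'Geneid\tChr\tStart\tEnd\tStrand\tLength\tsample1.bam\tsample2.bam\n' >> human_counts.txt
printf 'geneA\tchr1\t100\t500\t+\t401\t12\t7\n' >> human_counts.txt
printf 'geneB\tchr2\t900\t1500\t-\t601\t0\t3\n' >> human_counts.txt

# Drop the leading comment line, then keep the gene ID (column 1)
# and the per-sample count columns (column 7 onward).
grep -v '^#' human_counts.txt | cut -f1,7- > human_count_matrix.tsv

cat human_count_matrix.tsv
```

If several BAM files are passed to a single featureCounts call, each one becomes its own column, so the result is already a genes-by-samples matrix once the annotation columns are removed.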
Another question, please. For sRNA mapping, I see that some people select reads of 18-30 nt and others 18-26 nt before mapping. Which size range should I select?
It depends on the type of RNA you're interested in. You size select for those types.
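Whatever range you settle on, the selection itself is simple. Here is a rough sketch that keeps only reads within a length window, assuming an uncompressed, adapter-trimmed FASTQ file with exactly four lines per record. The file names, the mock reads and the 18-30 window are placeholders, not a recommendation.

```shell
# Mock FASTQ for illustration only: reads of 20, 10 and 32 nt.
printf '@r1\nACGTACGTACGTACGTACGT\n+\nIIIIIIIIIIIIIIIIIIII\n' > srna.fastq
printf '@r2\nACGTACGTAC\n+\nIIIIIIIIII\n' >> srna.fastq
printf '@r3\nACGTACGTACGTACGTACGTACGTACGTACGT\n+\nIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII\n' >> srna.fastq

# Placeholder size window; set it for the RNA class you are interested in.
MIN=18
MAX=30

# Buffer the first three lines of each record, then print the whole
# record on its fourth line if the sequence length is inside the window.
awk -v min="$MIN" -v max="$MAX" '
    NR % 4 == 1 { header = $0 }
    NR % 4 == 2 { seq = $0 }
    NR % 4 == 3 { plus = $0 }
    NR % 4 == 0 {
        n = length(seq)
        if (n >= min && n <= max) print header "\n" seq "\n" plus "\n" $0
    }
' srna.fastq > srna_selected.fastq
```

Trimming tools can usually do this in the same step as adapter removal, for example cutadapt with its -m and -M length options.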