Recognizing an organism which is a combination of other organisms
6.8 years ago

I have RNA-seq data from an organism that is a combination of some other organisms. I don't know the exact genome; all I have is the transcriptome. I want to work out which other organisms it is made of. I used Bridger and Trinity to assemble the genes, but there are more than 800 genes in the result file. I tried using BLAST to compare my assembled genes against the NCBI database, but the results are not conclusive: many hits have high similarity. Is there any way I can do this? Thanks

RNA-Seq rna-seq Assembly blast • 1.9k views

What do you mean by

"organism which is combination of other organisms"?

Are you talking about a metatranscriptome?


I mean that the new genome is made up of parts of other genomes. The mRNA of its genes was then measured, so RNA-seq data is now available. I used Trinity and Bridger to assemble the genes, and what I got is a file containing 800 genes.


I am sorry, but I cannot really understand. You have an organism (bacterium, yeast, human), you take samples, and you extract the mRNA. Unless there is some contamination, the statement

"genome is made up of part of other genomes"

doesn't make sense. If you have not just one organism but a mix (e.g. a mix of bacteria, or human and bacteria), then you have a metatranscriptome. Either way, you should have some idea of what is in your samples.


See also OP's other question: Finding gene names of a fasta file contains gene sequences (which is the same problem, I guess)


I suggest you try BBSketch, which will work on the reads or assembly:

sendsketch.sh in=transcriptome.fa nt

or

sendsketch.sh in=reads.fq reads=1m nt

It will only take a few seconds, and give you a taxonomic breakdown of the species present (provided they are in NCBI's nt database; you can alternately use the flag "refseq" to query RefSeq).
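If you would rather stay with the BLAST results mentioned in the question, one way to cut through the many high-similarity hits is to keep only the best hit per contig and count how often each organism appears. The following is a minimal sketch, not a tested pipeline. It assumes a tab-separated BLAST table whose columns are qseqid, sseqid, pident, length, evalue, bitscore, sscinames (that layout, the function name best_hit_tally, and the file name hits.tsv are all assumptions for illustration; sscinames also requires the BLAST taxonomy database to be installed locally).

```shell
# Tally the best BLAST hit per contig by subject organism.
# Assumed input: tab-separated BLAST output written with, for example,
#   blastn -query transcriptome.fa -db nt \
#     -outfmt "6 qseqid sseqid pident length evalue bitscore sscinames"
# The column layout above is an assumption; adjust $6 and $7 if yours differs.
best_hit_tally() {
    awk -F'\t' '
        # Keep the highest-bitscore row (column 6) for each query (column 1).
        !($1 in best) || $6 + 0 > best[$1] { best[$1] = $6 + 0; org[$1] = $7 }
        END {
            # Count how many contigs have each organism as their best hit.
            for (q in org) count[org[q]]++
            for (o in count) printf "%d\t%s\n", count[o], o
        }
    ' "$1" | sort -t "$(printf '\t')" -k1,1nr -k2,2
}

# Usage (hits.tsv is a hypothetical file name):
#   best_hit_tally hits.tsv
```

The output is one line per organism, most frequent first. A handful of dominant organisms would point to a mixed sample, though best-hit counts are only a rough guide and depend on what is present in the database.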


organism which is a combination of some other organism

Are the rest contaminants? How come you only have 800 genes? I am not familiar with Bridger (is that somehow responsible for this small number?).


Because the target genome is made up of only the necessary parts of other genomes. No, I think Bridger's output is OK (I checked it against Trinity and the results are very close).


Sounds to me like a problem for phylogenetics. I don't know how much it will help you, but have you tried clustering by conservation information?


What do you mean by clustering? I don't know anything about the new genome or the conservation information in it.


I don't know anything about the new genomes and conservation information in it.

Then how did you say this above?

Because the target genome is made up of only necessary parts of other genomes

Just because a bioinformatics program ran and produced output does not mean the output is right.

We have 10+ comments in this thread and it is still not clear what type of experiment this is or what rationale you are using to analyze the data. Your best bet for now is to try @Brian's suggestion.
