Augustus gene on forward and reverse strand
6.6 years ago
ciemanek ▴ 140

Hi all,

I am trying to identify a gene of known sequence among a set of coding sequences predicted with Augustus. I have two candidate genes on opposite strands whose sequences are identical (100% similarity) once I reverse complement one of them. I don't quite understand the concept of overlapping genes and forward/reverse strands: if my candidate genes are reverse complements of each other and sit on opposite strands, does that mean they could in fact be one gene, with the Augustus predictions for the forward and the complementary strand simply overlapping for this gene?

Thanks a lot, Agata
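For readers who want to reproduce the comparison described in the question, here is a minimal sketch of a reverse-complement check in plain Python. The function names and the toy sequences are made up for illustration; they are not Augustus output and not the sequences from this scaffold.

```python
# Translation table for DNA bases; N (unknown base) maps to itself.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def is_reverse_complement_pair(seq_a: str, seq_b: str) -> bool:
    """True if seq_b, once reverse complemented, is identical to seq_a."""
    return seq_a.upper() == reverse_complement(seq_b).upper()


# Toy sequences (hypothetical, for illustration only).
forward_copy = "ATGGCCTTAGGCTAA"
reverse_copy = reverse_complement(forward_copy)

print(reverse_copy)
print(is_reverse_complement_pair(forward_copy, reverse_copy))
```

A `True` result only says the two sequences are exact reverse complements; it does not by itself distinguish one gene predicted twice from two inverted copies at different locations.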

gene prediction annotation augustus ngs • 1.8k views

Just to clarify: are these two predictions on different strands and in two separate locations?


Yes, they are within the same scaffold. To give you an overview:

g1 start:719963 end:721093 strand:-

g2 start:535611 end:536738 strand:+

Moreover, I am investigating a variant in the gene of interest. In my alignment visualisation there is a variant at corresponding positions in both of them (that is, its distance from the start of one gene equals its distance from the end of the other), and the two variants are complementary to each other. Would that also support the theory that it's the same gene?
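The "same distance from the start of one gene as from the end of the other" relationship can be written out as coordinate arithmetic. The sketch below uses the g2 start and g1 end quoted above; the variant position and alleles are hypothetical, and it assumes the start of g2 pairs exactly with the end of g1 with no indels between the copies. By the quoted coordinates g1 spans 1131 bp and g2 spans 1128 bp, so that anchoring should be checked against an actual alignment.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def mirrored_position(pos: int, plus_start: int, minus_end: int) -> int:
    """Map a position in the + strand copy to the inverted (- strand) copy.

    Assumes plus_start pairs with minus_end and the copies have no indels.
    """
    offset = pos - plus_start      # distance from the start of the + copy
    return minus_end - offset      # same distance back from the end of the - copy


def mirrored_snv(pos, ref, alt, plus_start, minus_end):
    """Return (position, ref, alt) expected for the same SNV in the inverted copy."""
    return (mirrored_position(pos, plus_start, minus_end),
            COMPLEMENT[ref], COMPLEMENT[alt])


G2_START = 535611   # g2, strand +, from the post
G1_END = 721093     # g1, strand -, from the post

# Hypothetical variant: C>T, 100 bp downstream of the g2 start.
print(mirrored_snv(535711, "C", "T", G2_START, G1_END))
```

If the observed variant in g1 sits at the mapped position with complementary alleles, the two calls are consistent with reads from one sequence being placed on both copies.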


The chance of there being two copies of the gene seems small, since they share a variation at the same relative position while being physically apart. Is this a novel genome? Could the assembly be incorrect? Is the region around the genes similar?


Yes, the scaffolds were assembled, but the genome is highly repetitive and assembly is quite challenging. So would you suggest that this is more likely a result of assembly issues?

In fact, BLAST also showed two short 'genes' (228 b) close to the ones carrying the variant, which also have very high similarity, but to the other end of the query genes. I tried extracting the whole region containing the short gene, the long gene and the intergenic region between them, for both the forward and the reverse strand genes, and those regions are also 100% identical, including the intergenic region. Is it possible that it's a prediction error and that it is in fact one long gene, but again, due to the poor assembly, there is a gap in the prediction?
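The region comparison described here (extract both spans from the scaffold, reverse complement one, compare) can be sketched as follows. The scaffold is a toy string built to contain an inverted duplication, and `extract` and `percent_identity` are illustrative helper names; coordinates are taken as 1-based and inclusive, as in GFF output.

```python
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-cased DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def extract(scaffold: str, start: int, end: int) -> str:
    """Slice a region using 1-based, inclusive coordinates."""
    return scaffold[start - 1:end]


def percent_identity(a: str, b: str) -> float:
    """Ungapped identity between two equal-length, non-empty sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


# Toy scaffold: a unit, a spacer, then the unit's reverse complement.
unit = "ATGGCGTTTAAACCGGATTAG"
spacer = "GATTACAGAT"
scaffold = "CC" + unit + spacer + reverse_complement(unit) + "TT"

plus_start, plus_end = 3, 2 + len(unit)
minus_start = plus_end + len(spacer) + 1
minus_end = minus_start + len(unit) - 1

plus_region = extract(scaffold, plus_start, plus_end)
minus_region = extract(scaffold, minus_start, minus_end)
print(percent_identity(plus_region, reverse_complement(minus_region)))
```

Extending both spans outwards until the identity drops would show how far the duplicated block reaches beyond the predicted genes; regions of unequal length need a real aligner rather than this ungapped comparison.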
