Is there a "default" RefSeq transcript for genes?
15 months ago

I'm working on cancer genes and I have a question about the RefSeq transcripts.

As you may know, a gene can have several transcripts due to alternative splicing. I have seen on the St. Jude PeCan website that, for genes with several transcripts, they show a "default" transcript, but I don't understand how they choose it. On NCBI, there is no obvious way to tell whether one transcript is better or more common than another.

Do you guys know if there is a way to determine the "default" RefSeq transcript, if such a thing exists at all?

Thank you

MANE (Matched Annotation from NCBI and EMBL-EBI) is a new joint project between NCBI and EMBL-EBI that addresses this specific question. A beta version of the data is now available.

We’re leveraging public deep sequencing datasets to optimize 5’ and 3’ UTR endpoints to more accurately reflect transcriptional processes. To pick representative transcripts, we’ve developed computational methods to evaluate and integrate transcript expression levels, protein conservation, support from archived transcript submissions, clinical relevance, and other factors. Complex genes are subject to review by annotation experts from both groups to agree on a representative transcript and often make improvements to both annotation sets.
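If you download the MANE summary table, picking the "default" transcript per gene comes down to a simple filter. Here is a minimal sketch, assuming a tab-separated file with columns named `symbol`, `RefSeq_nuc`, `Ensembl_nuc` and `MANE_status`, and assuming the representative transcript is flagged as `MANE Select`. Check the header of the release you actually download, since these names are my assumption. The rows below are made-up placeholders, not real accessions, and `load_default_transcripts` is just a helper name I chose.

```python
import csv
import io

# Illustrative rows only: the column names are an assumption about the MANE
# summary file layout, and the gene names and accessions are placeholders.
SAMPLE_SUMMARY = (
    "symbol\tRefSeq_nuc\tEnsembl_nuc\tMANE_status\n"
    "GENE_A\tNM_PLACEHOLDER1.1\tENST_PLACEHOLDER1.1\tMANE Select\n"
    "GENE_B\tNM_PLACEHOLDER2.3\tENST_PLACEHOLDER2.5\tMANE Select\n"
    "GENE_B\tNM_PLACEHOLDER3.2\tENST_PLACEHOLDER3.1\tMANE Plus Clinical\n"
)


def load_default_transcripts(handle, status="MANE Select"):
    """Return {gene symbol: (RefSeq transcript, Ensembl transcript)}.

    Only rows whose MANE_status equals `status` are kept, so each gene
    ends up with the single transcript flagged as representative.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    defaults = {}
    for row in reader:
        if row["MANE_status"] == status:
            defaults[row["symbol"]] = (row["RefSeq_nuc"], row["Ensembl_nuc"])
    return defaults


# In practice you would pass an open file handle instead of the inline sample.
defaults = load_default_transcripts(io.StringIO(SAMPLE_SUMMARY))
print(defaults.get("GENE_B"))
```

For a real file, replace the `io.StringIO(...)` call with something like `open(path, newline="")` (or `gzip.open(path, "rt")` if the file is compressed). Genes that are not yet covered by MANE will simply be absent from the dictionary, so you still need a fallback rule for those.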

The assembly and maintenance process for RefSeq records is described in this handbook, in the section "How data are assembled and maintained".

Thank you very much, very interesting project!


