Convert list of Ensembl human genes (GRCh38.p13) from BioMart to hg38 coordinates
3.7 years ago
Alewa ▴ 150

Hi all,

I downloaded a few Ensembl genes with coordinates (Human genes (GRCh38.p13)) from BioMart and want to convert them to UCSC hg38 coordinates. Some of the contigs are not in the hg38 reference dictionary.

See this example (full file below): CHR_HSCHR5_1_CTG1_1 70376388 70411404 ENSG00000276910 GTF2H2

I didn't see an option for this in hgLiftOver: https://genome.ucsc.edu/cgi-bin/hgLiftOver

Any suggestions on how to accomplish this?

Thanks, Sam

5   87318416    87412930    CCNH    ENSG00000134480 q14.3
19  4090321 4124122 MAP2K2  ENSG00000126934 p13.3
17  28346633    28357527    POLDIP2 ENSG00000004142 q11.2
CHR_HSCHR6_MHC_MANN_CTG1    30952827    30958749        ENSG00000233149 GTF2H4
15  80152490    80186946    FAH ENSG00000103876 q25.1
2   216107464   216206303   XRCC5   ENSG00000079246 q35
1   241497603   241519755   FH  ENSG00000091483 q43
5   157142933   157255185   ITK ENSG00000113263 q33.3
6   111299028   111483715   REV3L   ENSG00000009413 q21
10  70597348    70602759    PRF1    ENSG00000180644 q22.1
CHR_HSCHR5_1_CTG1_1 70376388    70411404        ENSG00000276910 GTF2H2
Ensembl genome hg38 assembly liftover • 1.6k views

Are these genes specifically in patches? If so, they may not have been present in the original hg38 release. The reference sequence (as released by the GRC) used by all annotators should be identical. The CCNH gene is in UCSC.
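Since the underlying reference sequence is the same, genes on the primary chromosomes should only need their chromosome names changed, while the `CHR_...` scaffolds need separate handling. Here is a minimal sketch of that split in Python. It assumes Ensembl names the primary GRCh38 chromosomes `1`-`22`, `X`, `Y` and `MT`, and that UCSC names them `chr1`-`chr22`, `chrX`, `chrY` and `chrM`. The function name `ensembl_to_ucsc` and the row layout are made up for illustration.

```python
# Primary-assembly names Ensembl uses for GRCh38: 1-22, X, Y, MT.
PRIMARY = {str(n) for n in range(1, 23)} | {"X", "Y", "MT"}


def ensembl_to_ucsc(chrom):
    """Return the UCSC-style name for a primary chromosome, else None."""
    if chrom == "MT":
        return "chrM"
    if chrom in PRIMARY:
        return "chr" + chrom
    # Anything else (e.g. CHR_HSCHR5_1_CTG1_1) is a patch or alternate scaffold.
    return None


# Three rows taken from the list in the question: (chrom, start, end, gene ID).
rows = [
    ("5", 87318416, 87412930, "ENSG00000134480"),
    ("CHR_HSCHR5_1_CTG1_1", 70376388, 70411404, "ENSG00000276910"),
    ("CHR_HSCHR6_MHC_MANN_CTG1", 30952827, 30958749, "ENSG00000233149"),
]

converted, skipped = [], []
for chrom, start, end, gene_id in rows:
    ucsc = ensembl_to_ucsc(chrom)
    if ucsc is None:
        skipped.append(gene_id)  # needs a manual look-up, not a rename
    else:
        converted.append((ucsc, start, end, gene_id))

print(converted)
print(skipped)
```

The coordinates themselves are left untouched; only the name changes. The gene IDs that end up in `skipped` are the ones to check by hand, for example by looking the gene symbol up on the primary assembly.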


Thanks @genomax for the explanation. Yes, it seems these genes are also in the patches. What is usually done in the community? This is a list of DNA repair genes that I want to intersect with my VCFs. Any suggestions on how to go about this correctly?
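For the intersection step, one common route is to write the gene intervals to a BED file and give that to an interval tool such as bedtools or bcftools. A rough sketch of the conversion follows. It assumes the BioMart coordinates are 1-based and inclusive, that the chromosome names have already been changed to the UCSC style, and that the VCFs use the same names. The helper `to_bed_line` is hypothetical.

```python
def to_bed_line(chrom, start, end, name):
    """Convert a 1-based inclusive interval to a 0-based half-open BED line."""
    # BED starts are 0-based, so subtract 1; the end stays the same.
    return f"{chrom}\t{start - 1}\t{end}\t{name}"


# Two genes from the list in the question, with UCSC-style chromosome names.
genes = [
    ("chr19", 4090321, 4124122, "MAP2K2"),
    ("chr17", 28346633, 28357527, "POLDIP2"),
]

bed_lines = [to_bed_line(*gene) for gene in genes]
print("\n".join(bed_lines))
```

If the VCFs use `19` rather than `chr19`, the names should be left in the Ensembl style instead; what matters is that the BED file and the VCF header agree.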


Sorry if I was not clear. These genes do not seem to be ONLY in patches, i.e. they were present in the original assembly release. UCSC and Ensembl may annotate genes differently. For example, UCSC has many entries for the CCNH gene, and one of them (in bold below) overlaps the entry in your list, though the stop nucleotide in UCSC is different:

CCNH (ENST00000256897.9) at chr5:87394274-87412930 - Homo sapiens cyclin H (CCNH), transcript variant 4, mRNA. (from RefSeq NM_001364075)
CCNH (ENST00000651575.1) at chr5:87392860-87412794 - Homo sapiens cyclin H (CCNH), transcript variant 9, non-coding RNA. (from RefSeq NR_157071)
CCNH (ENST00000646883.1) at chr5:87376253-87399451 - cyclin H (from HGNC CCNH)

**CCNH (ENST00000645953.1) at chr5:87318416-87412908 - Homo sapiens cyclin H (CCNH), transcript variant 8, non-coding RNA. (from RefSeq NR_157070)**

CCNH (ENST00000607486.1) at chr5:87376256-87377180 - cyclin H (from HGNC CCNH)
CCNH (ENST00000513499.5) at chr5:87404842-87412889 - cyclin H (from HGNC CCNH)
CCNH (ENST00000511207.5) at chr5:87394280-87401774 - cyclin H (from HGNC CCNH)
CCNH (ENST00000510921.5) at chr5:87392707-87404950 - cyclin H (from HGNC CCNH)
CCNH (ENST00000510020.5) at chr5:87408038-87412871 - cyclin H (from HGNC CCNH)
CCNH (ENST00000508855.5) at chr5:87393417-87411294 - Belongs to the cyclin family. (from UniProt D6RG18)
CCNH (ENST00000505587.5) at chr5:87391494-87408180 - cyclin H (from HGNC CCNH)
CCNH (ENST00000505230.1) at chr5:87409073-87412850 - cyclin H (from HGNC CCNH)
CCNH (ENST00000504878.1) at chr5:87394274-87412904 - Homo sapiens cyclin H (CCNH), transcript variant 2, mRNA. (from RefSeq NM_001199189)
CCNH (ENST00000504115.1) at chr5:87394326-87408186 - cyclin H (from HGNC CCNH)
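The comparison above can be checked with a few lines of Python. This is only a sketch: it treats both sets of coordinates as 1-based and inclusive, and the helper `overlap_length` is made up for illustration.

```python
def overlap_length(a_start, a_end, b_start, b_end):
    """Number of shared bases between two 1-based inclusive intervals."""
    return max(0, min(a_end, b_end) - max(a_start, b_start) + 1)


ensembl_gene = (87318416, 87412930)     # CCNH, ENSG00000134480, from the BioMart list
ucsc_transcript = (87318416, 87412908)  # ENST00000645953.1, as listed by UCSC

shared = overlap_length(*ensembl_gene, *ucsc_transcript)
end_gap = ensembl_gene[1] - ucsc_transcript[1]
print(shared, end_gap)
```

The start positions are identical and only the end differs. The BioMart end (87412930) matches the end of ENST00000256897.9 in the list, which is consistent with the Ensembl gene interval spanning all of the gene's transcripts rather than a single one.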
