I am working on human cancer data and want to extract sequences from the human genome to analyze SNPs. For that, I am trying to fetch exon start and end positions for the human genome build GRCh37. I used the UCSC Genome Browser to download this data (using the knownGene table).
The data has the same values for cds start and cds end. I am confused as to why this is the case, since cds refers to the coding region. Should I ignore this and use the exon start and end positions instead?
Any help is highly appreciated!
Thanks!
Please post an example of "The data has the same values for cds start and cds end".
Here is the example of one entry:
If you look at the 6th and 7th columns (cdsStart and cdsEnd), the values are the same. I am confused as to why that's the case.
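For anyone checking their own download, here is a minimal Python sketch that parses one tab-separated knownGene row, pulls out the exon intervals, and flags rows where cdsStart equals cdsEnd. It assumes the standard knownGene column order (name, chrom, strand, txStart, txEnd, cdsStart, cdsEnd, exonCount, exonStarts, exonEnds, ...), which matches cdsStart and cdsEnd being the 6th and 7th columns. As far as I know, UCSC gene tables use equal cdsStart and cdsEnd to mark a transcript with no annotated coding region, so for such rows the exon coordinates are the ones to use. The function names and the sample row are made up for illustration and are not a real entry.

```python
from typing import List, NamedTuple, Tuple


class Transcript(NamedTuple):
    name: str
    chrom: str
    strand: str
    tx_start: int
    tx_end: int
    cds_start: int
    cds_end: int
    exons: List[Tuple[int, int]]  # (start, end) pairs, 0-based half-open as in UCSC tables


def parse_known_gene_line(line: str) -> Transcript:
    """Parse one tab-separated row, assuming the standard knownGene column order."""
    fields = line.rstrip("\n").split("\t")
    name, chrom, strand = fields[0], fields[1], fields[2]
    tx_start, tx_end, cds_start, cds_end = (int(x) for x in fields[3:7])
    exon_count = int(fields[7])
    # exonStarts / exonEnds are comma-separated lists with a trailing comma
    starts = [int(x) for x in fields[8].split(",") if x]
    ends = [int(x) for x in fields[9].split(",") if x]
    if not (len(starts) == len(ends) == exon_count):
        raise ValueError(f"exon count mismatch for {name}")
    return Transcript(name, chrom, strand, tx_start, tx_end,
                      cds_start, cds_end, list(zip(starts, ends)))


def has_coding_region(tx: Transcript) -> bool:
    """True only if the annotated CDS spans at least one base."""
    return tx.cds_start < tx.cds_end


# Hypothetical row (invented values, not a real transcript)
example_line = "\t".join([
    "tx_demo", "chr1", "+", "1000", "5000", "5000", "5000",
    "2", "1000,4000,", "1500,5000,",
])

tx = parse_known_gene_line(example_line)
print(tx.name, "coding" if has_coding_region(tx) else "no annotated CDS")
for start, end in tx.exons:
    print(tx.chrom, start, end)
```

To run this over a whole download, call `parse_known_gene_line` on each non-header line of the file and keep `tx.exons` for the transcripts of interest.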