How To Get A List Of All Human Genes Above A Certain Length
11.0 years ago
Whetting ★ 1.6k

Hi,

I want to assemble a list of human genes longer than a specified length. Any ideas on how to accomplish such a feat?

Thanks!

human gene • 3.8k views
11.0 years ago

Try this command:

$ mysql --user=genome -N --host=genome-mysql.cse.ucsc.edu -A -D hg19  -e "select name,name2,txEnd - txStart from refGene"  > Gene_sizes.txt

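Gene_sizes.txt now holds one row per transcript, with the length (txEnd - txStart) in the third column, so you can sort the file on that column. Note that refGene lists every RefSeq transcript, so a gene with several transcripts appears on several rows. UCSC offers many other tables and fields; see the table schemas in the Table Browser: http://genome.ucsc.edu/cgi-bin/hgTables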
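As a rough sketch of that sorting step, the Python below reads rows in the three-column, tab-separated layout the command writes (transcript accession, gene symbol, length), keeps the longest transcript per gene, and lists the genes above a cutoff. The function name `longest_per_gene`, the 1000-base cutoff, and the sample rows are made up for illustration; picking the longest transcript is only one of several reasonable ways to collapse transcripts to genes.

```python
import csv
from io import StringIO

def longest_per_gene(rows, min_length):
    """Return (gene, length) pairs longer than min_length, longest first.

    rows: iterable of (transcript_id, gene_symbol, length) triples,
    i.e. the three columns selected from refGene.
    """
    best = {}
    for transcript_id, gene, length in rows:
        length = int(length)
        # Keep only the longest transcript seen so far for each gene symbol.
        if length > best.get(gene, -1):
            best[gene] = length
    hits = [(gene, n) for gene, n in best.items() if n > min_length]
    return sorted(hits, key=lambda item: item[1], reverse=True)

# Made-up sample rows in the same layout as Gene_sizes.txt (not real data).
sample = (
    "NM_A1\tGENEA\t1500\n"
    "NM_A2\tGENEA\t2500\n"
    "NM_B1\tGENEB\t800\n"
    "NM_C1\tGENEC\t120000\n"
)
rows = csv.reader(StringIO(sample), delimiter="\t")
result = longest_per_gene(rows, min_length=1000)
print(result)
```

To run it on the real file, replace the sample with `open("Gene_sizes.txt")` and pass that file handle to `csv.reader` in the same way.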


I was about to paste this, which keeps only transcripts longer than 1000 bases and shows the first 20 rows:

mysql --user=genome --host=genome-mysql.cse.ucsc.edu -A -D hg19 -e ' select distinct name, chrom, txStart, txEnd, (txEnd - txStart) as length from refGene where (txEnd - txStart) > 1000 limit 20'

I'm curious whether the OP wants this length (from transcription start to end) or the length of the final transcript (i.e. the length after the introns have been spliced out).


The question was not specific. There are a few ways to define the length: transcription start to end, the sum of all exon lengths in a transcript, etc.
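To make the difference between the two definitions concrete, here is a minimal sketch. It assumes you also select the `exonStarts` and `exonEnds` columns from refGene, which UCSC stores as comma-separated lists with a trailing comma; the helper name `spliced_length` and the coordinates are invented for the example.

```python
def spliced_length(exon_starts, exon_ends):
    """Sum exon lengths from UCSC-style comma-separated coordinate strings."""
    # Drop the empty field left by the trailing comma.
    starts = [int(x) for x in exon_starts.split(",") if x]
    ends = [int(x) for x in exon_ends.split(",") if x]
    if len(starts) != len(ends):
        raise ValueError("exonStarts and exonEnds have different lengths")
    # Coordinates are 0-based, half-open, so each exon spans end - start bases.
    return sum(end - start for start, end in zip(starts, ends))

# Invented transcript with three exons: 100-200, 500-650, 900-1000.
tx_start, tx_end = 100, 1000
exon_starts = "100,500,900,"
exon_ends = "200,650,1000,"

genomic = tx_end - tx_start                         # start-to-end span
spliced = spliced_length(exon_starts, exon_ends)    # introns excluded
print(genomic, spliced)
```

For this invented transcript the genomic span is 900 bases while the spliced length is 350, so the choice of definition can change which genes pass a given cutoff.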


Wouldn't it be simplest just to sort the human Swiss-Prot entries by protein size?
