How can I query repeatmasker to give me the divergence for all the repeat families?
9.2 years ago

I want to get a file that shows me the basic parameters of each of the subfamilies in repeatmasker, as demonstrated here, but for all the subfamilies in a nice table so I can play with it.

I've used this before:

mysql --user=genome --host=genome-mysql.cse.ucsc.edu -A -D hg19 -e 'select chrom,chromStart,chromEnd,name from nestedRepeats'

but it doesn't give me the divergence. As an extension to the question, is there a guide to querying the RepeatMasker tables with MySQL?

Thanks in advance

repeatmasker • 2.5k views
9.2 years ago
mysql --user=genome --host=genome-mysql.cse.ucsc.edu -A -D hg19 -e 'select *,milliDiv/1000 as divergence from rmsk limit 10'

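This returns the first 10 rows of the rmsk table with the columns described here, plus a computed divergence column. The milliDiv column holds base mismatches in parts per thousand, so dividing it by 1000 gives the divergence as a fraction.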

If you want to operate on all the repeats in the human genome, you are probably better off downloading the entire rmsk table as a tab-delimited text file rather than repeatedly querying for the entire table from MySQL. To answer your last question, if you are querying MySQL, you'll need to know some SQL, although SQL is pretty straightforward to learn, particularly on a single table.
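As a rough sketch of the download-and-summarise route, something like the following could build a per-subfamily table from a local copy of rmsk. It assumes the file is the uncompressed, headerless tab-delimited dump (rmsk.txt.gz under the hg19 database directory on the UCSC download server) and that the columns follow the hg19 rmsk layout, with milliDiv in column 3, repName in column 11, repClass in column 12 and repFamily in column 13. The function name and output layout are made up for illustration, so check the column positions against the table schema before relying on it.

```shell
# Sketch: mean divergence per repeat subfamily from a local rmsk dump.
# Assumed columns (hg19 rmsk): 3 = milliDiv, 11 = repName,
# 12 = repClass, 13 = repFamily. Verify against the schema first.
mean_div_by_subfamily() {
    # $1: path to the uncompressed, tab-delimited rmsk file (no header line)
    awk -F'\t' '
        {
            key = $11 "\t" $12 "\t" $13   # subfamily, class, family
            sum[key] += $3                # accumulate milliDiv
            n[key]++                      # count copies
        }
        END {
            # Output: repName, repClass, repFamily, copies, mean divergence (fraction)
            for (key in sum)
                printf "%s\t%d\t%.4f\n", key, n[key], sum[key] / n[key] / 1000
        }
    ' "$1" | sort
}

# Example usage (file names are assumptions):
#   gunzip rmsk.txt.gz
#   mean_div_by_subfamily rmsk.txt > subfamily_divergence.tsv
```

The result is one row per subfamily, which should be small enough to load into R or a spreadsheet.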


OK, thanks. I guess I was asking how I can get the names of the tables and fields so that I can write the queries.
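For what it's worth, table and field names can usually be listed with the standard MySQL statements SHOW TABLES and DESCRIBE. A minimal sketch, reusing the connection settings from the commands above; the wrapper function name is hypothetical, and whether the public server allows these statements for the genome user is an assumption:

```shell
# Sketch: small wrapper around the public UCSC MySQL server.
# ucsc_sql is a made-up helper name; connection flags are the ones used above.
ucsc_sql() {
    # $1: a single SQL statement to run against the hg19 database
    mysql --user=genome --host=genome-mysql.cse.ucsc.edu -A -D hg19 -e "$1"
}

# Example usage (needs network access and the mysql client):
#   ucsc_sql 'SHOW TABLES'                 # every table in hg19
#   ucsc_sql 'SHOW TABLES LIKE "%rmsk%"'   # only tables whose name contains rmsk
#   ucsc_sql 'DESCRIBE rmsk'               # field names and types of rmsk
```

DESCRIBE only gives names and types; the meaning of each field is in the table schema pages on the UCSC site.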
