Question

Identifying gene parts based on genomic regions

0

6.9 years ago

niutster ▴ 110

Hi

Does anybody know of a tool that takes a genomic region such as chr3:345216-345250 and returns its role in the genome, for example promoter, enhancer, ...? The UCSC Genome Browser does that, but I want a tool.

gene-part genome genomic-location • 1.2k views

ADD COMMENT • link updated 10 months ago by Ram 43k • written 6.9 years ago by niutster ▴ 110

score 1 · Answer 1 · 2017-05-21

1

6.9 years ago

Alex Reynolds 35k

With BEDOPS, you can convert an annotation file to BED with convert2bed and then do a bedmap --echo-map operation on it to find overlaps.

For example, here's how to get a set of Gencode v21 annotations into BED format:

$ wget -qO- ftp://ftp.sanger.ac.uk/pub/gencode/Gencode_human/release_21/gencode.v21.annotation.gff3.gz | gunzip -c - | convert2bed -i gff - > annotations.bed

And here's how to do an ad-hoc search of this file, for instance, looking for any annotations which overlap chr3:345216-345250:

$ echo -e 'chr3\t345216\t345250' | bedmap --echo --echo-map --delim '\t' - annotations.bed > answer.bed

If you have a BED file of intervals you want to query, you can specify it directly, instead of using echo -e:

$ bedmap --echo --echo-map --delim '\t' intervals.bed annotations.bed > answer.bed

In each case, you get the interval and any overlapping annotation (including that annotation's type: gene, etc.).
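If only the annotation type is of interest, one possible follow-up is to tally the feature-type column of the overlapping annotations. The sketch below rests on two assumptions: that convert2bed places the GFF3 type in column 8 of annotations.bed, and that the overlapping annotations have been written one per line to a file (for example with bedops --element-of 1 annotations.bed intervals.bed > hits.bed). The function name count_types and the file name hits.bed are made up for illustration.

```shell
# count_types: hypothetical helper that tallies the feature types found in
# column 8 of a tab-delimited BED file read from standard input.
# Assumption: the input came from "convert2bed -i gff", so column 8 holds
# the GFF3 type (gene, transcript, exon, ...).
count_types() {
  awk -F'\t' 'NF >= 8 { n[$8]++ } END { for (t in n) print t "\t" n[t] }' | sort
}

# Usage (hits.bed is an assumed file of overlapping annotations):
#   count_types < hits.bed
```

Each output line is a feature type followed by the number of overlapping annotations of that type.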

0

I have run BEDOPS, but when I try to sort my file I get this error: "Non-numeric end coordinate. See line 1 in test.bed". This is the beginning of my file:

chr7    2480493 2480535
chr5    3325150 3325561
chr17   48970893    48971001
chr5    67586433    67586928
chr14   104623477   104623651
chr18   20713759    20714217
chr2    237702390   237702580
chr10   82173535    82173570
chr11   63852804    63853037
chr10   105238964   105239205
chr11   126197930   126198043
chr1    3086487 3086509
chr6    152127812   152128258
chr9    92291210    92291269

What is wrong?

ADD REPLY • link 6.9 years ago by niutster ▴ 110

0

Run cat -te test.bed | head to make sure your file doesn't have extra tabs or non-Unix line endings. Then fix the file if it does. Then sort-bed the fixed file.
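As a rough sketch of the "fix the file" step, assuming the problem turns out to be leading spaces, space-separated columns, or Windows-style line endings, the following rewrites the file with a single tab between columns. The function name normalize_bed and the output file names are made up for illustration. Because it splits on any whitespace, it is only appropriate for files whose fields contain no spaces, such as a three-column interval file.

```shell
# normalize_bed: hypothetical helper that cleans up a whitespace-delimited
# interval file so that columns are separated by single tabs.
#   - tr removes carriage returns left by Windows-style line endings
#   - awk drops blank lines, strips leading/trailing whitespace, and
#     rejoins the fields with tabs ("$1 = $1" forces the record rebuild)
normalize_bed() {
  tr -d '\r' < "$1" | awk 'BEGIN { OFS = "\t" } NF { $1 = $1; print }'
}

# Usage (sort-bed is part of BEDOPS):
#   normalize_bed test.bed > test.fixed.bed
#   sort-bed test.fixed.bed > test.sorted.bed
```

Running cat -te on the fixed file again should show ^I between columns and a bare $ at the end of each line.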

ADD REPLY • link 6.9 years ago by Alex Reynolds 35k