bedGraphToBigWig error - end coordinate bigger than chr
6.8 years ago
varsha619 ▴ 90

Does anyone know a way to fix the bedGraphToBigWig error - end coordinate bigger than chr? My input is a bedGraph generated using MACS2. This link suggests using bedClip - https://groups.google.com/forum/embed/#!topic/macs-announcement/gXdf115Xy5Q. But I would like to know if there is a command line option to fix it. Thank you for your help.

bedGraphToBigWig macs2 • 4.3k views

Fixed it with bedClip, thank you for your help!

6.8 years ago
genecats.ucsc ▴ 580

bedClip is a command line program to do this:

bedClip input.bed http://hgdownload.cse.ucsc.edu/goldenPath/hg38/bigZips/hg38.chrom.sizes output.bed

You can download bedClip from the directory appropriate to your operating system within our directory of utilities.

If you have questions about running bedClip, feel free to send a question to one of our mailing lists:

  • genome@soe.ucsc.edu for general questions
  • genome-mirror@soe.ucsc.edu for questions involving mirrors or gbibs
  • genome-www@soe.ucsc.edu for questions involving private data

ChrisL from the UCSC Genome Browser

6.8 years ago

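Generate an awk script that will clip your BED records (as written it prints only chrom, start and end, so for a bedGraph you would also need to print the value column, $4):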

mysql --user=genome -N --host=genome-mysql.cse.ucsc.edu -A -D hg19  -e 'select chrom,size from chromInfo '  |\
awk '{printf("($1==\"%s\") {L=%d;B=int($2);E=int($3);B=(B>=L?L:B);E=(E>=L?L:E);printf(\"%s\\t%%d\\t%%d\\n\",B,E);next;}\n",$1,$2,$1);}' > script.awk



$ head  script.awk
($1=="chr1") {L=249250621;B=int($2);E=int($3);B=(B>=L?L:B);E=(E>=L?L:E);printf("chr1\t%d\t%d\n",B,E);next;}
($1=="chr2") {L=243199373;B=int($2);E=int($3);B=(B>=L?L:B);E=(E>=L?L:E);printf("chr2\t%d\t%d\n",B,E);next;}
($1=="chr3") {L=198022430;B=int($2);E=int($3);B=(B>=L?L:B);E=(E>=L?L:E);printf("chr3\t%d\t%d\n",B,E);next;}
($1=="chr4") {L=191154276;B=int($2);E=int($3);B=(B>=L?L:B);E=(E>=L?L:E);printf("chr4\t%d\t%d\n",B,E);next;}
($1=="chr5") {L=180915260;B=int($2);E=int($3);B=(B>=L?L:B);E=(E>=L?L:E);printf("chr5\t%d\t%d\n",B,E);next;}
($1=="chr6") {L=171115067;B=int($2);E=int($3);B=(B>=L?L:B);E=(E>=L?L:E);printf("chr6\t%d\t%d\n",B,E);next;}
($1=="chr7") {L=159138663;B=int($2);E=int($3);B=(B>=L?L:B);E=(E>=L?L:E);printf("chr7\t%d\t%d\n",B,E);next;}
($1=="chrX") {L=155270560;B=int($2);E=int($3);B=(B>=L?L:B);E=(E>=L?L:E);printf("chrX\t%d\t%d\n",B,E);next;}
($1=="chr8") {L=146364022;B=int($2);E=int($3);B=(B>=L?L:B);E=(E>=L?L:E);printf("chr8\t%d\t%d\n",B,E);next;}
($1=="chr9") {L=141213431;B=int($2);E=int($3);B=(B>=L?L:B);E=(E>=L?L:E);printf("chr9\t%d\t%d\n",B,E);next;}

Then run the generated script on your input file:

awk -f  script.awk input.bed
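
If you already have a local chrom.sizes file, the same idea can probably be written as a single two-file awk command instead of a generated script. This is only a sketch under assumptions: the file names and the toy chromosomes (chrA, chrB, chrC) are made up for illustration, records that start at or beyond the chromosome end are dropped, ends that overrun are truncated, and the bedGraph value column is passed through unchanged.

```shell
# Toy inputs for illustration only (hypothetical names and sizes, not real assembly data).
tmp=$(mktemp -d)
printf 'chrA\t100\nchrB\t50\n' > "$tmp/chrom.sizes"
printf 'chrA\t0\t60\t1.5\nchrA\t60\t120\t2.0\nchrB\t50\t70\t3.0\nchrC\t0\t10\t4.0\n' > "$tmp/input.bedGraph"

# First file: remember each chromosome's size.
# Second file: skip unknown chromosomes and records starting at or after the
# chromosome end, truncate ends that overrun, and print the whole record
# (so the value column is kept).
awk 'BEGIN { FS = OFS = "\t" }
     NR == FNR              { size[$1] = $2; next }
     !($1 in size)          { next }
     $2 + 0 >= size[$1] + 0 { next }
     { if ($3 + 0 > size[$1] + 0) $3 = size[$1]; print }' \
    "$tmp/chrom.sizes" "$tmp/input.bedGraph" > "$tmp/clipped.bedGraph"

cat "$tmp/clipped.bedGraph"
```

With the toy data, the chrA record ending at 120 is truncated to 100, and the chrB and chrC records are dropped. Whether truncating or dropping is the better choice depends on your data; this sketch just picks one behaviour for each case.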
6.8 years ago

To solve this problem more generically, make a BED file of whole-chromosome intervals and use it as a mask with the BEDOPS command bedops --element-of:

$ fetchChromSizes hg38 | awk '{ print $1"\t0\t"$2; }' | sort-bed - > hg38.bed
$ bedops --element-of 100% in.bedGraph hg38.bed > masked.in.bedGraph

Then convert the masked bedGraph file to bigWig format with bedGraphToBigWig.

But mainly I'd be concerned about having signal get generated in regions that don't or shouldn't exist. That might point to a potential data problem or code smell, somewhere.
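
One way to judge how serious that is would be to count the offending records per chromosome before clipping anything. A rough sketch, assuming a local two-column chrom.sizes file; the file names and toy data below are made up for illustration:

```shell
# Toy inputs for illustration only (hypothetical names and sizes).
tmp=$(mktemp -d)
printf 'chrA\t100\nchrB\t50\n' > "$tmp/chrom.sizes"
printf 'chrA\t0\t60\t1.5\nchrA\t60\t120\t2.0\nchrB\t40\t70\t3.0\nchrB\t70\t90\t1.0\n' > "$tmp/in.bedGraph"

# For each chromosome, report how many records end past the chromosome end
# and the largest overshoot in bases.
awk 'BEGIN { FS = OFS = "\t" }
     NR == FNR { size[$1] = $2; next }
     ($1 in size) && $3 + 0 > size[$1] + 0 {
         n[$1]++
         over = $3 - size[$1]
         if (over > max[$1]) max[$1] = over
     }
     END { for (c in n) print c, n[c], max[c] }' \
    "$tmp/chrom.sizes" "$tmp/in.bedGraph" | sort > "$tmp/report.tsv"

cat "$tmp/report.tsv"
```

The columns are chromosome, number of out-of-bounds records, and maximum overshoot. A handful of records overshooting by a few bases would be less worrying than many large overshoots, which might hint at a chrom.sizes file from a different assembly than the one the reads were aligned to.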
