I am using Galaxy to try to generate lists of genes (in .bed format) that do not flank each other within 2 kb. The strategy is to generate a list of flanks using FlankBed on a gene list downloaded from the UCSC Table Browser, then use SubtractBed to remove any overlaps between the original gene list and the FlankBed list. I have used this method successfully with the UCSC gene list, but I would also like to try it with RefSeq genes.
I run into a problem with .bed files generated by the UCSC Table Browser from RefSeq. I send the data to Galaxy, where it loads successfully and appears in the correct format. FlankBed runs successfully (the original file contains ~110,000 regions and the FlankBed output contains ~220,000 regions, indicating it worked). However, when I run SubtractBed on the RefSeq gene list and the FlankBed-generated list, I get the following error:
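To make the flanking step concrete, here is a minimal Python sketch of what I understand FlankBed to be doing: producing one upstream and one downstream interval per gene. The function name `flanks`, the 2000 bp size, and the chromosome length used in the example are my own illustrative assumptions, not anything taken from Galaxy or the tool itself.

```python
from typing import List, Tuple

Interval = Tuple[str, int, int]  # (chrom, start, end), 0-based half-open like BED


def flanks(chrom: str, start: int, end: int, size: int, chrom_len: int) -> List[Interval]:
    """Return the left and right flanking intervals of a feature.

    Flanks are clipped to [0, chrom_len]; empty flanks are dropped.
    This is an illustrative approximation, not the tool's actual code.
    """
    left = (chrom, max(0, start - size), start)
    right = (chrom, end, min(chrom_len, end + size))
    return [iv for iv in (left, right) if iv[1] < iv[2]]


# Hypothetical gene at chr1:5000-6000 on a 100,000 bp chromosome
print(flanks("chr1", 5000, 6000, 2000, 100000))
```

Two flanks per gene would also explain why ~110,000 regions become ~220,000 after this step.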
Fatal error: Exit code 1 ()
Error: Invalid record in file /galaxy-repl/main/files/019/102/dataset_19102006.dat. Record is
chr1 248936684 248931412 XR_949374.2 0 + 248936684 248936684 0 2 1660,103, 0,5169,
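One thing that stands out in this record is that the second column (start, 248936684) is larger than the third column (end, 248931412). I am assuming, without having confirmed it, that this is what SubtractBed is rejecting. A minimal Python sketch for spotting such lines in a file could look like this; the function name `start_after_end` is just something I made up for illustration.

```python
def start_after_end(line: str) -> bool:
    """Return True if a BED line has a start coordinate greater than its end."""
    fields = line.split()
    start, end = int(fields[1]), int(fields[2])
    return start > end


# The record reported in the error message above
record = (
    "chr1 248936684 248931412 XR_949374.2 0 + "
    "248936684 248936684 0 2 1660,103, 0,5169,"
)
print(start_after_end(record))  # True for this record
```

Running the same check over every line of the FlankBed output would show whether this is a single bad record or a recurring pattern.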
Has anyone run into this problem? I've successfully manipulated the UCSC gene list; should I just consider the RefSeq list a lost cause?
Thanks.