Trouble with BioMart Perl API installation
2.4 years ago
tlorin • 250
Switzerland

Dear all, I am trying to install the BioMart Perl API (in order to automate ortholog retrieval in other species for a given gene), and I have followed the instructions here.

And I'm almost done!!!

However, I get the following error when I try to run my example script:

perl apiExample17112015.pl

junk after document element at line 18, column 0, byte 465 at /home/tlorin/perl5/perlbrew/perls/perl-5.20.2/lib/site_perl/5.20.2/x86_64-linux-thread-multi/XML/Parser.pm line 187.

Here is my example script:

----------------------------------------

# An example script demonstrating the use of BioMart API.
# This perl API representation is only available for configuration versions >= 0.5

use strict;
use lib '/home/tlorin/biomart-perl/lib/';
use BioMart::Initializer;
use BioMart::Query;
use BioMart::QueryRunner;

my $confFile = "../conf/martURLLocation.xml";

#
# NB: change action to 'clean' if you wish to start a fresh configuration
# and to 'cached' if you want to skip configuration step on subsequent runs from the same registry
#

my $action='clean';
my $initializer = BioMart::Initializer->new('registryFile'=>$confFile, 'action'=>$action);
my $registry = $initializer->getRegistry;

my $query = BioMart::Query->new('registry'=>$registry,'virtualSchemaName'=>'default');

$query->setDataset("drerio_gene_ensembl");
$query->addFilter("ensembl_gene_id", ["ENSDARG00000003732"]);
$query->addAttribute("ensembl_gene_id");
$query->addAttribute("ensembl_transcript_id");
$query->addAttribute("pformosa_homolog_ensembl_gene");
$query->addAttribute("gmorhua_homolog_ensembl_gene");
$query->addAttribute("amexicanus_homolog_ensembl_gene");
$query->addAttribute("cintestinalis_homolog_ensembl_gene");
$query->addAttribute("lchalumnae_homolog_ensembl_gene");
$query->addAttribute("trubripes_homolog_ensembl_gene");

$query->formatter("TSV");

my $query_runner = BioMart::QueryRunner->new();

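############################## GET COUNT ############################
# $query->count(1);
# $query_runner->execute($query);
# print $query_runner->getCount();
#####################################################################

############################## GET RESULTS ##########################
# to obtain unique rows only
$query_runner->uniqueRowsOnly(1);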

$query_runner->execute($query);
$query_runner->printHeader();
$query_runner->printResults();
$query_runner->printFooter();

-------------------------------------

I should point out that I am not familiar with Perl and have no idea what is going wrong here. Any ideas? Has anyone run into this before?

Thanks a lot for your time!

2.4 years ago
tlorin • 250
Switzerland

I got it, thanks to @Jean-Karim Heriche:

My XML file was:

---------

<!DOCTYPE MartRegistry>

<MartRegistry>
  <MartURLLocation name = "msd" displayName = "My BioMart Database" host = "www.biomart.org" port = "80" visible = "1" default = "" includeDatasets = "" martUser = "" >
</MartRegistry>
<MartRegistry>
  <MartURLLocation database="ensembl_mart_82" default="1" displayName="Ensembl Genes 82" host="www.ensembl.org" includeDatasets="" martUser="" name="ENSEMBL_MART_ENSEMBL" path="/biomart/martservice" port="80" serverVirtualSchema="default" visible="1" />
  <MartURLLocation database="sequence_mart_82" default="" displayName="Sequence" host="www.ensembl.org" includeDatasets="" martUser="" name="ENSEMBL_MART_SEQUENCE" path="/biomart/martservice" port="80" serverVirtualSchema="default" visible="" />
  <MartURLLocation database="ontology_mart_82" default="" displayName="Ontology" host="www.ensembl.org" includeDatasets="" martUser="" name="ENSEMBL_MART_ONTOLOGY" path="/biomart/martservice" port="80" serverVirtualSchema="default" visible="" />
  <MartURLLocation database="genomic_features_mart_82" default="" displayName="Genomic features 82" host="www.ensembl.org" includeDatasets="" martUser="" name="ENSEMBL_MART_GENOMIC" path="/biomart/martservice" port="80" serverVirtualSchema="default" visible="" />
  <MartURLLocation database="snp_mart_82" default="" displayName="Ensembl Variation 82" host="www.ensembl.org" includeDatasets="" martUser="" name="ENSEMBL_MART_SNP" path="/biomart/martservice" port="80" serverVirtualSchema="default" visible="1" />
  <MartURLLocation database="regulation_mart_82" default="" displayName="Ensembl Regulation 82" host="www.ensembl.org" includeDatasets="" martUser="" name="ENSEMBL_MART_FUNCGEN" path="/biomart/martservice" port="80" serverVirtualSchema="default" visible="1" />
  <MartURLLocation database="vega_mart_82" default="" displayName="Vega 62" host="www.ensembl.org" includeDatasets="" martUser="" name="ENSEMBL_MART_VEGA" path="/biomart/martservice" port="80" serverVirtualSchema="default" visible="1" />
  <MartURLLocation database="pride_mart_1" default="1" displayName="PRIDE (EBI UK)" host="www.ebi.ac.uk" includeDatasets="" martUser="" name="pride" path="/pride/biomart/martservice" port="80" redirect="1" serverVirtualSchema="default" visible="1" />
</MartRegistry>

--------------

because I had pasted the registry shown on the Perl page (the one you get when you run a BioMart search) at the end of the original martURLLocation.xml file, which initially contained only:

-------

<!DOCTYPE MartRegistry>

<MartRegistry>
  <MartURLLocation name = "msd" displayName = "My BioMart Database" host = "www.biomart.org" port = "80" visible = "1" default = "" includeDatasets = "" martUser = "" >
</MartRegistry>

-------

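The file therefore contained two <MartRegistry> root elements, whereas an XML document may have only one, which is what "junk after document element" refers to. So I just deleted this first part and it's now working!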

-----------

<!DOCTYPE MartRegistry>

<MartRegistry>
  <MartURLLocation database="ensembl_mart_82" default="1" displayName="Ensembl Genes 82" host="www.ensembl.org" includeDatasets="" martUser="" name="ENSEMBL_MART_ENSEMBL" path="/biomart/martservice" port="80" serverVirtualSchema="default" visible="1" />
  <MartURLLocation database="sequence_mart_82" default="" displayName="Sequence" host="www.ensembl.org" includeDatasets="" martUser="" name="ENSEMBL_MART_SEQUENCE" path="/biomart/martservice" port="80" serverVirtualSchema="default" visible="" />
  <MartURLLocation database="ontology_mart_82" default="" displayName="Ontology" host="www.ensembl.org" includeDatasets="" martUser="" name="ENSEMBL_MART_ONTOLOGY" path="/biomart/martservice" port="80" serverVirtualSchema="default" visible="" />
  <MartURLLocation database="genomic_features_mart_82" default="" displayName="Genomic features 82" host="www.ensembl.org" includeDatasets="" martUser="" name="ENSEMBL_MART_GENOMIC" path="/biomart/martservice" port="80" serverVirtualSchema="default" visible="" />
  <MartURLLocation database="snp_mart_82" default="" displayName="Ensembl Variation 82" host="www.ensembl.org" includeDatasets="" martUser="" name="ENSEMBL_MART_SNP" path="/biomart/martservice" port="80" serverVirtualSchema="default" visible="1" />
  <MartURLLocation database="regulation_mart_82" default="" displayName="Ensembl Regulation 82" host="www.ensembl.org" includeDatasets="" martUser="" name="ENSEMBL_MART_FUNCGEN" path="/biomart/martservice" port="80" serverVirtualSchema="default" visible="1" />
  <MartURLLocation database="vega_mart_82" default="" displayName="Vega 62" host="www.ensembl.org" includeDatasets="" martUser="" name="ENSEMBL_MART_VEGA" path="/biomart/martservice" port="80" serverVirtualSchema="default" visible="1" />
  <MartURLLocation database="pride_mart_1" default="1" displayName="PRIDE (EBI UK)" host="www.ebi.ac.uk" includeDatasets="" martUser="" name="pride" path="/pride/biomart/martservice" port="80" redirect="1" serverVirtualSchema="default" visible="1" />
</MartRegistry>

--------------

Thanks again :D

17 months ago
EMBL Heidelberg, Germany

As the error message suggests, you have a malformed XML document. My guess is the culprit is martURLLocation.xml. Check that it is valid XML.

Edit: I just checked what EnsEMBL gives as martURLLocation.xml. There is a blank line at the top, which can cause XML parsers to throw an error.
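For anyone who wants to check a registry file before handing it to BioMart, here is a minimal sketch of a well-formedness test. It is written in Python rather than Perl purely as a quick standalone check; Python's ElementTree and Perl's XML::Parser both wrap the expat parser, so the error wording should look familiar. The helper name `check_well_formed` and the two shortened registry strings are made up for illustration (the real entries carry many more attributes), so treat this as a sketch under those assumptions rather than as part of the BioMart API.

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_text):
    """Return None if xml_text parses, otherwise the parser's error message."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        return str(err)
    return None


# Hypothetical, shortened registry with two <MartRegistry> root elements,
# mimicking a second registry pasted after the first one.
two_roots = """<!DOCTYPE MartRegistry>
<MartRegistry>
  <MartURLLocation name="msd" host="www.biomart.org" port="80" />
</MartRegistry>
<MartRegistry>
  <MartURLLocation name="ENSEMBL_MART_ENSEMBL" host="www.ensembl.org" port="80" />
</MartRegistry>
"""

# The same content with a single root element.
one_root = """<!DOCTYPE MartRegistry>
<MartRegistry>
  <MartURLLocation name="ENSEMBL_MART_ENSEMBL" host="www.ensembl.org" port="80" />
</MartRegistry>
"""

# Two roots: the parser complains about junk after the document element.
print("two roots:", check_well_formed(two_roots))

# One root: parses cleanly, so the helper returns None.
print("one root:", check_well_formed(one_root))

# A leading blank line is accepted here because the file has no
# <?xml ...?> declaration that would have to come first.
print("leading blank line:", check_well_formed("\n" + one_root))
```

To test a real file, read it into a string first (for example with `open(path).read()`) and pass that to the helper.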


See my answer: it was indeed a problem with this file, but not because of the initial blank line. Thanks :D

19 months ago
mcclintock • 10
China/Wuhan/HUST

Processing Cached Registry: /share/home/software/biomart-perl/conf/cachedRegistries/martURLLocation.xml.cached

Problems with the web server: 302 Found

Gene stable ID Transcript stable ID HGNC symbol UniProtKB/Swiss-Prot ID

Why do I get this with nearly the same code?
