I'm relatively new to bioinformatics and programming and need some help running a script that I'm having problems with.
I'm trying to replace sequence IDs in a tree file (Newick format) with organism names. The sequence IDs and their corresponding organism names are stored in a separate tab-delimited file. I found a solution to this problem in a previous post, "How To Replace Sequence Id'S In A Text (Tree) File With Taxonomy Strings From A Corresponding Tab Delimited Taxonomy File", which uses the following script:
use strict;
use warnings;
my $treeFile = pop;
my %taxonomy = map { /(\S+)\s+(.+)/; $1 => $2 } <>;
push @ARGV, $treeFile;
while ( my $line = <> ) {
    $line =~ s/\b$_\b/$taxonomy{$_}/g for keys %taxonomy;
    print $line;
}
The script won't run all the way through, however. After adding print "StepX\n"; statements throughout the script, I found that it gets stuck at this line:
my %taxonomy = map { /(\S+)\s+(.+)/; $1 => $2 } <>;
I've gone back and reformatted the tab file to mirror the format of the one in that post, but the script still gets stuck at the same spot. The tree file and the taxonomy file are stored in the directory that Perl runs from. Is there something simple I have missed that causes the script to fail, perhaps in how I specify the directory or file names? The script seems to have worked for others, but I'm at a loss as to why it isn't working for me.
Thanks in advance.
For reference, my TSV and tree files look like this:
WP010933552.1 Chlorobaculum;__tepidum
WP011361294.1 Chlorobium;__chlorochromatii
WP006366269.1 Chlorobium;__ferrooxidans
WP012466994.1 Chlorobium;__limicola
WP011745973.1 Chlorobium;__phaeobacteroides
WP012498899.1 Chloroherpeton;__thalassium
WP012509156.1 Pelodictyon;__phaeoclathratiforme
WP012506474.1 Prosthecochloris;__aestuarii
WP014433201.1 Caldilinea;__aerophila
WP013218375.1 Dehalogenimonas;__lykanthroporepellens
and
(WP014737726.1:1.8525851341,((((((((WP027358538.1:0.3690143012,((WP004512544.1:0.1039871466,WP014551809.1:0.0491224567)100:0.3547853057,(WP012469611.1:0.4207143406,(WP012532312.1:0.1128063030,WP015839165.1:0.0280057978)100:0.3010198201)99:0.0913692177)79:0.0631776135)100:1.1860644760,(((((WP005505390.1:0.2412841810,WP027856531.1:0.1524078700)100:0.1071877723, etc.
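One thing that may be worth ruling out (an assumption, since the exact command line isn't shown above): the script takes both file names from the command line, taxonomy file first and tree file last. If it is started with no file names, `pop` returns undef from the empty @ARGV and `<>` falls back to reading STDIN, so the script sits at the `map ... <>` line waiting for keyboard input instead of failing. Below is a minimal sketch of a variant that opens both files explicitly and prints a usage message instead of waiting. The sub names `read_taxonomy` and `relabel` and the file names in the usage message are hypothetical, and the `\Q...\E` quoting is an addition so that the "." in IDs such as WP010933552.1 is matched literally.

```perl
use strict;
use warnings;

# Read "ID<whitespace>name" lines into a hash; lines that do not match are skipped.
sub read_taxonomy {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open '$path': $!\n";
    my %taxonomy;
    while ( my $line = <$fh> ) {
        $line =~ s/\r?\n\z//;    # strip Unix or Windows line endings
        next unless $line =~ /^(\S+)\s+(.+)$/;
        $taxonomy{$1} = $2;
    }
    close $fh;
    return \%taxonomy;
}

# Replace every known ID in one line of Newick text.
sub relabel {
    my ( $line, $taxonomy ) = @_;
    for my $id ( keys %$taxonomy ) {
        # \Q...\E quotes regex metacharacters in the ID (the "." in particular)
        $line =~ s/\b\Q$id\E\b/$taxonomy->{$id}/g;
    }
    return $line;
}

if ( @ARGV == 2 ) {
    my ( $taxFile, $treeFile ) = @ARGV;
    my $taxonomy = read_taxonomy($taxFile);
    open my $in, '<', $treeFile or die "Cannot open '$treeFile': $!\n";
    print relabel( $_, $taxonomy ) while <$in>;
    close $in;
}
else {
    # Report the expected arguments instead of silently reading STDIN
    warn "Usage: perl $0 taxonomy.tsv tree.nwk > relabelled.nwk\n";
}
```

With the two files in the current directory, this would be run as `perl script.pl taxonomy.tsv tree.nwk > relabelled.nwk`, where all three file names are placeholders. The original script should be invoked the same way, with the tree file as the last argument.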