Trying To Pull Regions From An Online Bam File Without Downloading
11.8 years ago
Wjeck ▴ 490

I am trying to pull regions from a number of BAM files on an online server. I'd like to extract the reads mapping to a particular chunk of roughly 1 kb and download only those for analysis. The files are far too large to download in full, and it is impractical even to wget them one at a time and extract the regions with samtools (I tried it; it worked, but it took forever). Since I'll have to do this for a number of regions that I won't know in advance, I need a better way.

I noticed that samtools is capable of running 'samtools view' against a web address. Sadly, this data is protected behind an https server, which samtools doesn't know how to handle. I notice that IGV is able to read the BAM files off the net by asking for my login and querying only the specific regions that I bring up to view, but I don't have a way of automating that process across hundreds of files.

Does anyone have any ideas on how to run something like samtools view on specific regions over an https connection?

samtools igv bam • 5.6k views

Did you try putting your password in the URL? E.g. "https://userid:password@anywhere.org/bams/my.bam"
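One practical wrinkle with this approach: if the user name or password contains characters such as "@", ":" or "/", they need to be percent-encoded before being placed in the URL. Below is a minimal POSIX shell sketch, assuming ASCII-only credentials. The `urlencode` helper, the credentials and the region in the final comment are hypothetical placeholders, and the samtools call only works with a build that can open https URLs. Note also that credentials embedded in a URL can end up in shell history and process listings.

```shell
#!/bin/sh
# Percent-encode a string for use in the userinfo part of a URL.
# Assumes ASCII input; the helper name is illustrative only.
urlencode() {
    s=$1
    out=
    while [ -n "$s" ]; do
        rest=${s#?}          # everything after the first character
        c=${s%"$rest"}       # the first character
        s=$rest
        case $c in
            [A-Za-z0-9._~-]) out=$out$c ;;                   # unreserved: keep as-is
            *) out=$out$(printf '%%%02X' "'$c") ;;           # everything else: %XX
        esac
    done
    printf '%s\n' "$out"
}

# Hypothetical credentials; the host and path are the ones from the example above.
user='userid'
pass='p@ss:w/rd'
url="https://$(urlencode "$user"):$(urlencode "$pass")@anywhere.org/bams/my.bam"
printf '%s\n' "$url"

# With a samtools build that supports https (region is a placeholder):
# samtools view "$url" chr1:10000-11000
```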


This does work for me in my case, thanks Pierre


That doesn't seem to work for me. I am not sure that samtools recognizes https as a web address. The response I get is:

open: No such file or directory
[main_samview] fail to open "https://uname:*******@website/my.bam" for reading.

(with my own website, password, etc., of course)

11.8 years ago

samtools, and other direct-BAM-access programs like IGV, are capable of opening local files as well as remotely and publicly served files. These remote locations include the ftp and http protocols, but unfortunately do not include encrypted transfer protocols such as scp, ssh or https. It's not a matter of authentication, which is solved for http and ftp, but of handling data encryption, which is far more complicated. The only ways to work with all the BAM files you are interested in are either to ask the server managers to serve them over http, or to download them all and work with them locally.


Just a note: contrary to this post, IGV does seem to be able to handle https and the associated encryption, but samtools does not.


Good to know. Thanks for the information.

10.0 years ago

This can possibly be done through curl. In the following command, the --negotiate option enables SPNEGO in curl. The -u option is required, but the user name is ignored. The -b and -c options are used to send and store HTTP cookies, respectively. The -s option silences curl's status output. (Try typing the command as is, since the BAM file exists.)

curl --negotiate -u : -b ~/cookienumnumnum.txt -c ~/cookienumnumnum.txt -s http://gasv.googlecode.com/files/Example.bam | samtools view -h - | head

will give

@SQ    SN:chr17    LN:78774742
chr17_15_201_1:0:0_1:0:0_209a6    163    chr17    15    60    50M    =    152    187    GTTCCTGCATAGATAATTGCATGACAATTGCCTTGTCCCTCCTGAATGTG    22222222222222222222222222222222222222222222222222    XT:A:U    NM:i:1    SM:i:37    AM:i:23    X0:i:1    X1:i:0    XM:i:1    XO:i:0    XG:i:0    MD:Z:40G9
chr17_45_242_0:0:0_1:0:0_50f5f    99    chr17    45    60    50M    =    193    198    CCTTGTCCCTGCTGAATGTGCTCTGGGGTCTCTGGGGTCTCACCCACGAC    22222222222222222222222222222222222222222222222222    XT:A:U    NM:i:0    SM:i:37    AM:i:37    X0:i:1    X1:i:0    XM:i:0    XO:i:0    XG:i:0    MD:Z:50
chr17_123_290_3:0:0_1:0:0_27b3b    163    chr17    123    60    50M    =    241    168    ATAACAAACATATGTCCAGCGAATACCTGCATCCCTAGAAGTGAAGCGAC    22222222222222222222222222222222222222222222222222    XT:A:U    NM:i:3    SM:i:25    AM:i:25    X0:i:1    X1:i:0    XM:i:3    XO:i:0    XG:i:0    MD:Z:0T10C35C2
chr17_15_201_1:0:0_1:0:0_209a6    83    chr17    152    60    50M    =    15    -187    CATCCCTAGAAGTGAAGCCACCGCCCAAAGACACGCCCATATCCAGCTTA    22222222222222222222222222222222222222222222222222    XT:A:U    NM:i:1    SM:i:23    AM:i:23    X0:i:1    X1:i:1    XM:i:1    XO:i:0    XG:i:0    MD:Z:40G9
chr17_164_380_1:0:0_0:0:0_aa5e4    99    chr17    164    60    50M    =    331    217    TGAAGCCACCGCCCAATGACACGCCCATGTCCAGCTTAACCTGCATCCCT    22222222222222222222222222222222222222222222222222    XT:A:U    NM:i:1    SM:i:37    AM:i:37    X0:i:1    X1:i:0    XM:i:1    XO:i:0    XG:i:0    MD:Z:16A33
chr17_45_242_0:0:0_1:0:0_50f5f    147    chr17    193    60    50M    =    45    -198    TCCAGCTTAACCTGCATCCCTAGAAGGGAAGGCACCGCCCAAAGACACGC    22222222222222222222222222222222222222222222222222    XT:A:U    NM:i:1    SM:i:37    AM:i:37    X0:i:1    X1:i:0    XM:i:1    XO:i:0    XG:i:0    MD:Z:26T23
chr17_204_401_0:0:0_0:0:0_43ea1    99    chr17    204    60    50M    =    352    198    CTGCATCCCTAGAAGTGAAGGCACCGCCCAAAGACACGCCCATGTCCAGC    22222222222222222222222222222222222222222222222222    XT:A:U    NM:i:0    SM:i:23    AM:i:23    X0:i:1    X1:i:1    XM:i:0    XO:i:0    XG:i:0    MD:Z:50
chr17_224_415_1:0:0_1:0:0_a2d53    163    chr17    224    60    50M    =    366    192    GCACCGCCCAAAGACACGCCCATGTCCAGCTTATTCTCCCCAGTTCCTCT    22222222222222222222222222222222222222222222222222    XT:A:U    NM:i:1    SM:i:37    AM:i:37    X0:i:1    X1:i:0    XM:i:1    XO:i:0    XG:i:0    MD:Z:37G12
chr17_123_290_3:0:0_1:0:0_27b3b    83    chr17    241    60    50M    =    123    -168    GCCCATGTCCAGCTTATTCTGCCCAGTTCCTCTCCAGATAGGCTGCATGG    22222222222222222222222222222222222222222222222222    XT:A:U    NM:i:1    SM:i:37    AM:i:25    X0:i:1    X1:i:0    XM:i:1    XO:i:0    XG:i:0    MD:Z:38A11
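Because samtools reads from standard input here, it cannot use a BAM index, so a region has to be selected from the SAM text after the fact, and the whole file is still transferred. A rough sketch of such a filter follows, assuming tab-separated SAM on stdin. The `filter_region` name and the coordinates in the usage line are hypothetical, and the test is deliberately simple: it keeps header lines and reads whose start position (column 4) falls inside the interval, without computing true overlap from the CIGAR string.

```shell
#!/bin/sh
# Keep SAM header lines plus reads that START inside chrom:start-end.
# $1 = reference name, $2 = start, $3 = end (1-based, inclusive).
# Approximation: reads that begin before "start" but extend into the
# interval are not reported.
filter_region() {
    awk -F '\t' -v chrom="$1" -v start="$2" -v end="$3" '
        /^@/ { print; next }                                  # pass header through
        $3 == chrom && $4 + 0 >= start + 0 && $4 + 0 <= end + 0 { print }
    '
}

# Usage with the pipeline above (coordinates are placeholders):
# curl -s http://gasv.googlecode.com/files/Example.bam | samtools view -h - | filter_region chr17 100 200
```

This only saves disk space and downstream work, not bandwidth; fetching just the region over the network requires indexed random access to the remote file.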

Best Wishes,
Umer

4.1 years ago

Old thread, but I found a solution using ssh. It may be useful to others ;)

Requirements:

  • samtools must be installed on the remote server

  • the remote BAM file must be indexed (i.e. with samtools index)

  • the path of the BAM file on the remote server must be known

Solution (change user and distantServer to your server information):

ssh user@distantServer 'samtools view /path/to/file.bam chr5:10000-20000'
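To automate this over many files and regions, the same command can be wrapped in a loop. The sketch below is one possible layout, not a tested pipeline: the BAM paths, regions, output naming and the `DRY_RUN` switch are all hypothetical. It uses `samtools view -b` so each result is written locally as BAM, and it assumes the remote paths contain no single quotes or spaces. By default it only prints the ssh commands it would run.

```shell
#!/bin/sh
# Hypothetical host, paths and regions; replace with your own.
host='user@distantServer'
bams='/path/to/file1.bam /path/to/file2.bam'
regions='chr5:10000-20000 chr7:5000-6000'

# Build the command executed on the remote server.
remote_cmd() {
    printf "samtools view -b '%s' '%s'" "$1" "$2"
}

# Local output name: <bam basename>.<region with ':' and '-' as '_'>.bam
out_name() {
    base=$(basename "$1" .bam)
    reg=$(printf '%s' "$2" | sed 's/[:-]/_/g')
    printf '%s.%s.bam' "$base" "$reg"
}

for bam in $bams; do
    for region in $regions; do
        cmd=$(remote_cmd "$bam" "$region")
        out=$(out_name "$bam" "$region")
        if [ "${DRY_RUN:-1}" = 1 ]; then
            # Default: show what would be run without contacting the server.
            printf 'ssh %s "%s" > %s\n' "$host" "$cmd" "$out"
        else
            ssh "$host" "$cmd" > "$out"
        fi
    done
done
```

Run it with `DRY_RUN=0` to actually fetch the regions. Only the reads in each region cross the network, since the index lookup happens on the server.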