Downloading BAM files of phase 3 of the 1000 Genomes Project
6.0 years ago
Ana ▴ 200

Hello everyone,

I am new to the 1000 Genomes Project data. I want to download all BAM files belonging to phase 3. Can anyone tell me how to download all of them (from the command line)? Do you have an estimate of how long it would take?

I want to compute the depth of coverage only for some specific intervals, not the entire genome. Is there a way to do this without downloading the data? I found the following command, but I am not sure whether it is relevant to what I want to do:

samtools view -b  ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/data/HG01375/alignment/HG01375.mapped.ILLUMINA.bwa.CLM.low_coverage.20120522.bam 2:1,000,000-2,000,000 | genomeCoverageBed -ibam stdin -bg > coverage.bg

I would appreciate it if anyone could guide me.

1000genomes bam • 4.3k views
ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/data/

The data has moved.


Hi Pierre, thanks. So you mean I can use your command above without downloading the BAM files? Can I also run it in a loop over all of the BAM files? There are 2504 individuals.


Thanks, so you mean I can use your command above without downloading the bam files?

Yes. However, the index file (*.bai) is still downloaded.
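To keep those downloaded index files from piling up, one option is to run each query from a throwaway directory. This is only a sketch: `in_scratch_dir` is a hypothetical helper name, and it assumes samtools writes the fetched `.bai` into the current working directory, as the cleanup step in the one-liner further down this thread suggests.

```shell
#!/usr/bin/env bash
# in_scratch_dir (hypothetical helper): run a command inside a temporary
# directory and delete that directory afterwards, whatever the command did.
# Any files the command drops into its working directory are removed with it.
in_scratch_dir() {
    local dir status=0
    dir=$(mktemp -d) || return 1
    # Subshell so the caller's working directory is left unchanged.
    ( cd "$dir" && "$@" ) || status=$?
    rm -rf "$dir"
    return "$status"
}
```

It could be used as `in_scratch_dir samtools view -c "$url" 2:1000000-2000000`, where `$url` is the remote BAM. Only relative paths inside the command are affected by the directory change, so remote URLs work unchanged.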

Can I also run it through loops for all of the bam files? there are 2504 individuals

http://tldp.org/HOWTO/Bash-Prog-Intro-HOWTO-7.html
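As a rough illustration of such a loop, the sketch below reads BAM URLs from standard input and counts the reads in one region for each. The function name `count_region`, the `SAMTOOLS` override variable, and the input file `bam_urls.txt` are all hypothetical; it assumes `samtools` is on the PATH and that each line of the input is one BAM URL.

```shell
#!/usr/bin/env bash
# count_region (hypothetical helper): for each BAM URL on stdin, print
# "<url><TAB><number of reads in REGION>". A failed query is reported as NA.
# SAMTOOLS may point at a specific samtools binary; it defaults to "samtools".
count_region() {
    local region="$1" url n
    while IFS= read -r url; do
        [ -n "$url" ] || continue          # skip blank lines
        # </dev/null keeps the command from consuming the URL list on stdin
        n=$("${SAMTOOLS:-samtools}" view -c "$url" "$region" </dev/null) || n="NA"
        printf '%s\t%s\n' "$url" "$n"
    done
}
```

A call might look like `count_region 2:1000000-2000000 < bam_urls.txt > counts.tsv`. With 2504 individuals this runs one remote query per file in sequence, so it may take a while.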


To download the data, I just typed in ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/data/ directly, but I could not download it!


For the loop I am trying this, but I still get the message "no such file or directory":

 for file in http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/phase3/data/HG*/alignment/*.bam;
    do /data/programs/samtools-1.3.1/samtools view -c "${file}" 2:1000000-2000000
    done

Am I doing something wrong here?


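Try:

 # List every path in the FTP tree, keep those ending in .bam, and count the
 # reads in the region for each one; the downloaded index is removed afterwards.
 wget -q -O - "ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/current.tree" |
   cut -f 1 | grep '\.bam$' |
   while read -r B; do
     echo -n "$B " &&
     ~/packages/samtools/samtools view -c "http://ftp.1000genomes.ebi.ac.uk/vol1/$B" "2:1000000-2000000" &&
     rm -f *.bam.bai
   done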
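If it helps to inspect or filter the file list before querying anything, the listing step can be split off on its own. The sketch below is a variation, not the command above: `bam_urls_from_tree` and `bam_urls.txt` are hypothetical names, and it assumes `current.tree` is tab-separated with the path (relative to `/vol1/`) in the first column, as the `cut -f 1` step implies.

```shell
#!/usr/bin/env bash
# bam_urls_from_tree (hypothetical helper): read current.tree-style lines on
# stdin and print one full URL per path that ends in ".bam".
# The optional first argument is the base URL the paths are relative to.
bam_urls_from_tree() {
    local base="${1:-http://ftp.1000genomes.ebi.ac.uk/vol1}"
    awk -F '\t' -v base="$base" '$1 ~ /\.bam$/ { print base "/" $1 }'
}
```

For example, `wget -q -O - "ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/current.tree" | bam_urls_from_tree > bam_urls.txt` would save the list, which can then be narrowed with `grep` (for instance to `low_coverage` files) before looping over it.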