6.6 years ago
alongalor
Could someone refer me to where I could download the smallest possible .bam files to test my GATK Best Practices Variant Discovery pipeline? My pipeline uses the -L option to parallelize over different chromosomes - I would like to test this functionality and so I would like a full .bam file that has data from all chromosomes and will not cause GATK to crash.
Thanks a lot!
Just use any BAM that you have on disk, make a small BED file with one interval per chromosome (e.g. positions 6000000-6100000 on each of chr1-22), and use samtools view to extract that subset of the whole BAM:
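A minimal sketch of those two steps, not a tested recipe for your data. The file names `regions.bed`, `input.bam` and `subset.bam` are placeholders, and the `chr1`..`chr22` contig names are an assumption: they must match the names in your BAM header (some references use `1`..`22` instead). The samtools step is skipped if samtools or the input file is not available.

```shell
# Placeholder file names; change these to match your own data.
bed=regions.bed
in_bam=input.bam
out_bam=subset.bam

# Write one 100 kb interval per autosome (BED columns are tab-separated).
# Assumes contigs are named chr1..chr22 in the BAM header.
: > "$bed"
for i in $(seq 1 22); do
    printf 'chr%s\t6000000\t6100000\n' "$i" >> "$bed"
done

# Keep only the alignments overlapping the BED intervals, then index the result.
if command -v samtools >/dev/null 2>&1 && [ -f "$in_bam" ]; then
    samtools view -b -L "$bed" -o "$out_bam" "$in_bam"
    samtools index "$out_bam"
fi
```

The subset keeps the full header of the original BAM, so all contigs are still declared even though only the listed intervals contain reads.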
That should be sufficient for testing purposes.
This worked perfectly for splitting the BAM file, but it caused GATK to crash with a strange error. My pipeline still works with the original BAM file from before the split.