Can I use HOMER to create a tag directory from a 400 GB BAM file?
20 months ago
biostart • 290
Germany

Hello,

Does anyone have experience working with very large files in HOMER? I am using a 400 GB BAM file as input to create a HOMER tag directory, and I expect more than 1 TB of intermediate files to be created in the process. I just wanted to clarify how HOMER reads the input: does it load the whole input file into memory, or does it process it in small portions? Please tell me before I kill the university's computer cluster :)

Thanks

RNA-Seq ChIP-Seq • 1.3k views
17 months ago
Netherlands

makeTagDirectory basically parses through the alignment file and splits the tags into separate files based on their chromosome. As a result, several *.tags.tsv files are created in the output directory. These are made to very efficiently return to the data during downstream analysis. This also helps speed up the analysis of very large data sets without running out of memory.

I think it's already taken care of, but in case it isn't, Chris (cbenner@salk.edu) would be the best person to answer this question, since he wrote the program. I reckon it's already designed to handle large files and reads them line by line. It's written in Perl.
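To make "line by line" concrete, here is a toy sketch of the general idea. It is not HOMER's code: it just streams tab-separated records from stdin and appends each one to a per-chromosome file, so only one record is held in memory at a time. The function name split_by_chrom and the .demo.tsv file naming are made up for the illustration, and the input is a few fake records rather than a real BAM.

```shell
#!/bin/sh
# Illustration only, NOT HOMER's implementation.
# Hypothetical names: split_by_chrom, <outdir>/<chrom>.demo.tsv

split_by_chrom() {
    outdir=$1
    mkdir -p "$outdir"
    # awk reads stdin one record at a time; column 3 of a SAM record
    # is the reference (chromosome) name.
    awk -F '\t' -v dir="$outdir" '
        /^@/ { next }    # skip SAM header lines, if present
        {
            f = dir "/" $3 ".demo.tsv"
            print >> f   # append the record to its chromosome file
            close(f)     # keep the number of open files small
        }
    '
}

# Tiny fake input; with real data the records could instead come from
#   samtools view input.bam | split_by_chrom some_dir
demo_dir=$(mktemp -d)
printf 'r1\t0\tchr1\t100\nr2\t16\tchr2\t200\nr3\t0\tchr1\t300\n' |
    split_by_chrom "$demo_dir"
ls "$demo_dir"
```

Memory use in a pipeline like this does not grow with the size of the input, which is why streaming tools can cope with very large files. Disk space for the output is a separate matter.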

Otherwise, you can always split the BAM file by chromosome, run makeTagDirectory on the pieces in parallel (how exactly depends on which cluster you are using), and later combine the tag directories into one.
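A rough sketch of how that could be scripted, written as a dry run that only prints the commands so nothing touches the cluster until you have looked them over. The helper names plan_split and plan_combine and the chrN.bam / tags_chrN/ naming are my own assumptions, and it assumes the BAM is coordinate-sorted and indexed so that samtools can extract a region.

```shell
#!/bin/sh
# Dry run: prints the commands instead of executing them.
# Hypothetical names: plan_split, plan_combine, <chrom>.bam, tags_<chrom>/

plan_split() {
    bam=$1; shift
    for chrom in "$@"; do
        # Extract one chromosome (needs a sorted, indexed BAM).
        echo "samtools view -b -o ${chrom}.bam ${bam} ${chrom}"
        # Build a tag directory from that chromosome only.
        echo "makeTagDirectory tags_${chrom}/ ${chrom}.bam"
    done
}

plan_combine() {
    out=$1; shift
    dirs=""
    for chrom in "$@"; do
        dirs="${dirs} tags_${chrom}/"
    done
    # -d takes the existing tag directories to merge.
    echo "makeTagDirectory ${out}/ -d${dirs}"
}

plan_split input.bam chr1 chr2
plan_combine Combined chr1 chr2
```

The chromosome names could be taken from `samtools idxstats input.bam | cut -f1` (the last line, `*`, stands for unmapped reads), and each printed pair of commands could be submitted as its own job with whatever scheduler your cluster uses.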

To combine tag directories, for example when combining two separate experiments into one, do the following:

makeTagDirectory Combined-PU.1-ChIP-Seq/ -d Exp1-ChIP-Seq/ Exp2-ChIP-Seq/ Exp3-ChIP-Seq/
