Tool: New ultraperformant version of SICER/epic: epic2
5.4 years ago
endrebak852 ▴ 110

epic2, a new version of the popular ChIP-Seq peak caller SICER, is out.

See https://github.com/biocore-ntnu/epic2 for more.

Performance:

[Figure: performance benchmark; image not included]

CLI:

usage: epic2 [-h] [--treatment TREATMENT [TREATMENT ...]]
             [--control CONTROL [CONTROL ...]] [--genome GENOME]
             [--keep-duplicates] [--bin-size BIN_SIZE]
             [--gaps-allowed GAPS_ALLOWED] [--fragment-size FRAGMENT_SIZE]
             [--false-discovery-rate-cutoff FALSE_DISCOVERY_RATE_CUTOFF]
             [--effective-genome-fraction EFFECTIVE_GENOME_FRACTION]
             [--chromsizes CHROMSIZES] [--e-value E_VALUE] [--quiet]
             [--example]

epic2. (Visit github.com/endrebak/epic2 for examples and help.)

optional arguments:
  -h, --help            show this help message and exit
  --treatment TREATMENT [TREATMENT ...], -t TREATMENT [TREATMENT ...]
                        Treatment (pull-down) file(s) in one of these formats:
                        bed, bedpe, bed.gz, bedpe.gz or (single-end) bam, sam.
                        Mixing file formats is allowed.
  --control CONTROL [CONTROL ...], -c CONTROL [CONTROL ...]
                        Control (input) file(s) in one of these formats: bed,
                        bedpe, bed.gz, bedpe.gz or (single-end) bam, sam.
                        Mixing file formats is allowed.
  --genome GENOME, -gn GENOME
                        Which genome to analyze. Default: hg19. If
                        --chromsizes and --egf flag is given, --genome is not
                        required.
  --keep-duplicates, -kd
                        Keep reads mapping to the same position on the same
                        strand within a library. Default: False.
  --bin-size BIN_SIZE, -bin BIN_SIZE
                        Size of the windows to scan the genome. BIN-SIZE is
                        the smallest possible island. Default 200.
  --gaps-allowed GAPS_ALLOWED, -g GAPS_ALLOWED
                        This number is multiplied by the window size to
                        determine the number of gaps (ineligible windows)
                        allowed between two eligible windows. Must be an
                        integer. Default: 3.
  --fragment-size FRAGMENT_SIZE, -fs FRAGMENT_SIZE
                        (Single end reads only) Size of the sequenced
                        fragment. Each read is extended half the fragment size
                        from the 5' end. Default 150 (i.e. extend by 75).
  --false-discovery-rate-cutoff FALSE_DISCOVERY_RATE_CUTOFF, -fdr FALSE_DISCOVERY_RATE_CUTOFF
                        Remove all islands with an FDR below cutoff. Default
                        0.05.
  --effective-genome-fraction EFFECTIVE_GENOME_FRACTION, -egf EFFECTIVE_GENOME_FRACTION
                        Use a different effective genome fraction than the one
                        included in epic2. The default value depends on the
                        genome and readlength, but is a number between 0 and
                        1.
  --chromsizes CHROMSIZES, -cs CHROMSIZES
                        Set the chromosome lengths yourself in a file with two
                        columns: chromosome names and sizes. Useful to analyze
                        custom genomes, assemblies or simulated data. Only
                        chromosomes included in the file will be analyzed.
  --e-value E_VALUE, -e E_VALUE
                        The E-value controls the genome-wide error rate of
                        identified islands under the random background
                        assumption. Should be used when not using a control
                        library. Default: 1000.
  --quiet, -q           Do not write output messages to stderr.
  --example, -ex        Show the paths of the example data and print example command.
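
To make the --fragment-size, --bin-size and --gaps-allowed descriptions above more concrete, here is a minimal Python sketch of one way to read them. It is not epic2's actual implementation: the function names `read_to_bin` and `merge_islands` are hypothetical, and the sketch assumes that a "window" is identified by its start coordinate and that an island may contain at most `gaps_allowed` consecutive ineligible windows.

```python
from typing import Iterable, List, Tuple


def read_to_bin(start: int, end: int, strand: str,
                fragment_size: int = 150, bin_size: int = 200) -> int:
    """Return the start of the window a single-end read is counted in.

    Assumption: the read is shifted by half the fragment size from its
    5' end, then assigned to the window containing that position.
    """
    half = fragment_size // 2
    # The 5' end is `start` on the + strand and `end` on the - strand.
    pos = start + half if strand == "+" else end - half
    pos = max(pos, 0)
    return pos - pos % bin_size


def merge_islands(eligible_bins: Iterable[int], bin_size: int = 200,
                  gaps_allowed: int = 3) -> List[Tuple[int, int]]:
    """Merge eligible window starts into half-open (start, end) islands.

    Assumption: two eligible windows join the same island when at most
    `gaps_allowed` ineligible windows lie between them.
    """
    max_step = (gaps_allowed + 1) * bin_size
    islands: List[List[int]] = []  # [first_window_start, last_window_start]
    for b in sorted(set(eligible_bins)):
        if islands and b - islands[-1][1] <= max_step:
            islands[-1][1] = b  # extend the current island
        else:
            islands.append([b, b])  # start a new island
    return [(first, last + bin_size) for first, last in islands]


if __name__ == "__main__":
    print(read_to_bin(1000, 1025, "+"))
    print(read_to_bin(1000, 1025, "-"))
    print(merge_islands([0, 200, 1000, 2000]))
```

With the defaults, the windows starting at 200 and 1000 are separated by exactly three ineligible windows (400, 600, 800) and so end up in the same island, while the window at 2000 is too far away and starts a new one.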

Hi endrebak

I hope my question is not off-topic here. I would like to use epic2 on my ChIP-seq dataset, but I get an error. I get the same error when I run the example:

epic2 -t /usr/local/lib/python3.7/dist-packages/epic2/examples/test.bed.gz -c /usr/local/lib/python3.7/dist-packages/epic2/examples/control.bed.gz > deleteme.txt

Found a median readlength of 25.0

Using genome hg19.

Using an effective genome length of ~2510 * 1e6

Parsing ChIP file(s): /usr/local/lib/python3.7/dist-packages/epic2/examples/test.bed.gz
terminate called after throwing an instance of 'std::bad_alloc'
  what(): std::bad_alloc
Abandon

Do you have any idea what's wrong?

Thanks

I suggest you open an issue at the GitHub repository of this tool: https://github.com/biocore-ntnu/epic2

I would love to try to help you. Can you post an issue on the repo?

5.1 years ago
endrebak ▴ 960

epic2 has been accepted into the journal Bioinformatics. I've also implemented the SICER-df and SICER-rb-df algorithms for differential enrichment. Will update with citation info later :)

Hi, endrebak

I was wondering whether epic2 is specific to dispersed histone marks, or whether there is a way to call peaks for sharp marks with epic2?

Thanks! Kun

MACS2 is better for sharp peaks :) I've used epic2 for medium-width marks like PolII and H3K4me3, but not for TFs with very sharp peaks.
