Question

Understanding columns of ChIP-Seq BED file format

4.7 years ago

singram • 0

I am trying to analyze some H3K27ac ChIP-Seq data in BED format and cannot for the life of me find anywhere that says what the final two columns are. I understand that the first four are: chr, start, stop, name. However, I am really stumped on the final two columns, and the GEO Accession viewer does not seem to state the column headers anywhere. I am quite new to the BED file format, so I wanted to check whether there is a convention I am simply missing. Here is a link to the GEO Accession viewer page: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM521887

Thanks in advance for any help.

Edit#1 (example lines):

chr1 713462 713661 SOLEXA2_1:1:69:1236:763 -
chr1 713719 713918 SOLEXA2_1:1:16:1735:216 -
chr1 713724 713923 SOLEXA2_1:1:30:1080:1399 -
chr1 713738 713937 SOLEXA2_1:1:70:1407:1946 -

ChIP-Seq sequencing genome BED • 3.3k views

updated 3.7 years ago by Biostar 20 • written 4.7 years ago by singram • 0

Please show example data from this BED. People might be more helpful if you do not require them to download data first.

4.7 years ago by ATpoint 82k

Thanks. I have updated the post.

4.7 years ago by singram • 0

score 2 · Answer 1 · 2019-09-02

4.7 years ago

ATpoint 82k

That appears to be the mapped reads in BED format. The 4th column is the read name and the 5th is the strand the read mapped to. The strand carries no biological meaning in ChIP-seq, as this is not a stranded experiment, although peak callers such as MACS2 use it to estimate the fragment size. With this file you could call peaks, e.g. with MACS2.
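To make the column layout concrete, here is a minimal Python sketch that labels the five fields of one of the example lines from the question. The `Read` type and `parse_read_line` function are names made up for illustration, and the sketch assumes whitespace-separated fields and the usual BED coordinate convention (0-based start, exclusive end).

```python
from typing import NamedTuple


class Read(NamedTuple):
    """One mapped read from the 5-column file (hypothetical helper type)."""
    chrom: str
    start: int   # assumed 0-based, as in standard BED
    end: int     # assumed exclusive, as in standard BED
    name: str    # column 4: read name
    strand: str  # column 5: '+' or '-'


def parse_read_line(line: str) -> Read:
    """Split one whitespace-separated line into its five labelled fields."""
    chrom, start, end, name, strand = line.split()
    return Read(chrom, int(start), int(end), name, strand)


# First example line from the question.
example = "chr1 713462 713661 SOLEXA2_1:1:69:1236:763 -"
read = parse_read_line(example)
print(read.name, read.strand, read.end - read.start)
# prints: SOLEXA2_1:1:69:1236:763 - 199
```

Note that this layout differs from standard 6-column BED, where column 5 is a score and the strand sits in column 6, so tools that expect BED6 may need a placeholder score column inserted.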

Do you want to put that in an answer? I've come to the same conclusion: col4 is the read name and col5 is the strand (there are about 5 million '+' and about 5 million '-' reads in this file), which is meaningless for ChIP-seq.

4.7 years ago by graeme.thorn ▴ 100
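A strand tally like the one mentioned above can be reproduced with a few lines of Python. This is only a sketch: `count_strands` is a hypothetical helper, it assumes the strand is the fifth whitespace-separated field, and it is run here on the four example lines from the question rather than on the full file.

```python
from collections import Counter
from typing import Iterable


def count_strands(lines: Iterable[str]) -> Counter:
    """Count the values in column 5, skipping lines with fewer than five fields."""
    counts: Counter = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) < 5:
            continue  # blank or otherwise short line
        counts[fields[4]] += 1
    return counts


# The four example lines from the question.
sample = [
    "chr1 713462 713661 SOLEXA2_1:1:69:1236:763 -",
    "chr1 713719 713918 SOLEXA2_1:1:16:1735:216 -",
    "chr1 713724 713923 SOLEXA2_1:1:30:1080:1399 -",
    "chr1 713738 713937 SOLEXA2_1:1:70:1407:1946 -",
]
print(count_strands(sample))
# prints: Counter({'-': 4})
```

For the full file, an open file handle can be passed in place of `sample`, since the function only iterates over lines.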