Understanding columns of ChIP-Seq BED file format
4.7 years ago
singram • 0

I am trying to analyze H3K27ac ChIP-Seq data in BED format and cannot find anywhere that says what the final two columns are. I understand that the first four are: chr, start, stop, name. However, I am stumped on the final two columns, and the GEO Accession viewer does not appear to state the column headers anywhere. I am quite new to the BED file format, so I wanted to check whether there is a convention I am simply missing. Here is a link to the GEO Accession viewer page: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM521887

Thanks in advance for any help.

Edit#1 (example lines):

chr1 713462 713661 SOLEXA2_1:1:69:1236:763 -

chr1 713719 713918 SOLEXA2_1:1:16:1735:216 -

chr1 713724 713923 SOLEXA2_1:1:30:1080:1399 -

chr1 713738 713937 SOLEXA2_1:1:70:1407:1946 -

ChIP-Seq sequencing genome BED • 3.3k views

Please show example data from this BED. People might be more helpful if you do not require them to download data first.


Thanks. I have updated the post.

4.7 years ago
ATpoint 82k

Those appear to be mapped reads in BED format. The 4th column is the read name and the 5th is the strand the read is mapped to (which is meaningless in ChIP-seq, as this is not a stranded experiment). With this file you could call peaks, e.g. with macs2.
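To make the column reading above concrete, here is a minimal Python sketch that splits two of the example lines from the question into named fields. The field names (chrom, start, end, name, strand) and the helper parse_read_line are illustrative labels of my own, not headers from the GEO file, and I am assuming the fields are whitespace-separated as they appear in the post.

```python
from collections import namedtuple

# Field names are illustrative labels, not headers taken from the file.
AlignedRead = namedtuple("AlignedRead", ["chrom", "start", "end", "name", "strand"])


def parse_read_line(line):
    """Split one five-column line into named fields (hypothetical helper)."""
    chrom, start, end, name, strand = line.split()
    return AlignedRead(chrom, int(start), int(end), name, strand)


# Two of the example lines quoted in the question.
example_lines = [
    "chr1 713462 713661 SOLEXA2_1:1:69:1236:763 -",
    "chr1 713719 713918 SOLEXA2_1:1:16:1735:216 -",
]

reads = [parse_read_line(line) for line in example_lines]
for read in reads:
    # BED intervals are 0-based and half-open, so the length is end - start.
    print(read.name, read.strand, read.end - read.start)
```

A line with a different number of columns would raise a ValueError at the unpacking step, which is a quick way to notice if the real file does not match this layout.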


Do you want to put that in an answer? I've come to the same conclusion: col4 is the read name and col5 is the strand (there are roughly 5m '+' and 5m '-' entries in this file), which is meaningless for ChIP-seq.
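For anyone who wants to repeat the strand tally, a rough Python sketch follows. It assumes the strand is the last whitespace-separated field on each line; count_strands is a hypothetical helper name, and the sample is just the four lines quoted in the question rather than the full file.

```python
from collections import Counter


def count_strands(lines):
    """Tally the last whitespace-separated field of each non-empty line."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        if fields:  # skip blank lines
            counts[fields[-1]] += 1
    return counts


# The four example lines from the question; all of them are on the '-' strand.
sample = [
    "chr1 713462 713661 SOLEXA2_1:1:69:1236:763 -",
    "chr1 713719 713918 SOLEXA2_1:1:16:1735:216 -",
    "chr1 713724 713923 SOLEXA2_1:1:30:1080:1399 -",
    "chr1 713738 713937 SOLEXA2_1:1:70:1407:1946 -",
]

strand_counts = count_strands(sample)
print(strand_counts)
```

On the real data, an open file handle can be passed instead of the list, since the function only iterates over lines.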
