Hello,
I am trying to make a coverage histogram using ClicO (Circos browser interface), but I have too many lines. My data is a .txt file produced with the bedtools genomecov function. It looks like this:
NC_009636.1 0 5 0
NC_009636.1 5 25 40
NC_009636.1 25 26 30
NC_009636.1 26 35 0
NC_009636.1 35 36 10
NC_009636.1 36 37 230
NC_009636.1 37 39 240
NC_009636.1 39 40 250
NC_009636.1 40 41 260
...
The columns are the chromosome, the start and stop coordinates, and the coverage.
I have over 300,000 lines, which I would like to bring down to below 25,000, preferably by increasing the bin width. At the moment the bins are defined by the start and stop coordinates, and some of them are a single nucleotide long. I am trying to think of a way to do this in bash or R, perhaps with bins of 100 kb or so.
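To make the goal concrete, here is a minimal sketch of the rebinning I have in mind, written in Python only for illustration. It assumes 0-based, half-open intervals as in the rows above, and it assumes the value wanted per bin is the length-weighted mean coverage (a sum or maximum would be equally possible). The function names and the demo bin width of 20 are hypothetical and are not part of bedtools or ClicO.

```python
from collections import defaultdict


def read_bedgraph(path):
    """Yield (chrom, start, end, coverage) from a whitespace-delimited file."""
    with open(path) as handle:
        for line in handle:
            fields = line.split()
            if len(fields) < 4:
                continue  # skip blank or malformed lines
            yield fields[0], int(fields[1]), int(fields[2]), float(fields[3])


def rebin_bedgraph(rows, bin_width):
    """Collapse variable-width intervals into fixed-width bins.

    Each output row is (chrom, bin_start, bin_end, mean_coverage), where the
    mean is weighted by how many bases of each input interval fall in the bin.
    """
    weighted = defaultdict(float)  # (chrom, bin index) -> sum of coverage * bases
    covered = defaultdict(int)     # (chrom, bin index) -> bases seen in the bin
    chrom_end = defaultdict(int)   # chrom -> largest end coordinate seen

    for chrom, start, end, cov in rows:
        chrom_end[chrom] = max(chrom_end[chrom], end)
        pos = start
        while pos < end:
            idx = pos // bin_width
            # An interval may span several bins, so split it at bin borders.
            chunk_end = min(end, (idx + 1) * bin_width)
            n_bases = chunk_end - pos
            weighted[(chrom, idx)] += cov * n_bases
            covered[(chrom, idx)] += n_bases
            pos = chunk_end

    result = []
    for chrom, idx in sorted(weighted):
        bin_start = idx * bin_width
        # The last bin of a chromosome is truncated to the last coordinate seen.
        bin_end = min(bin_start + bin_width, chrom_end[chrom])
        mean_cov = weighted[(chrom, idx)] / covered[(chrom, idx)]
        result.append((chrom, bin_start, bin_end, mean_cov))
    return result


if __name__ == "__main__":
    # The first rows of the data shown above, with a tiny bin width for the demo.
    demo = [
        ("NC_009636.1", 0, 5, 0), ("NC_009636.1", 5, 25, 40),
        ("NC_009636.1", 25, 26, 30), ("NC_009636.1", 26, 35, 0),
        ("NC_009636.1", 35, 36, 10), ("NC_009636.1", 36, 37, 230),
        ("NC_009636.1", 37, 39, 240), ("NC_009636.1", 39, 40, 250),
        ("NC_009636.1", 40, 41, 260),
    ]
    for chrom, start, end, cov in rebin_bedgraph(demo, 20):
        print(chrom, start, end, cov, sep="\t")
```

On the real file, something like `rebin_bedgraph(read_bedgraph("coverage.txt"), 100_000)` would be the equivalent call, where `coverage.txt` is a placeholder file name.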
Best,
J
Please use the format bar and especially the code option (10101) to highlight code and data examples. I did it for you this time.