Is there a simple way to get the max coverage of a bedgraph file in Python?
8.1 years ago
rioualen ▴ 710

Everything is in the question! I guess I should load the bedgraph file and take the maximum value of the coverage column, but I can't seem to get this done.

I've tried using numpy:

>>> np.loadtxt('GSM1470159_sickle-se-q20_bwa.bedgraph', usecols=3)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/dist-packages/numpy/lib/npyio.py", line 722, in loadtxt
    usecols = list(usecols)
TypeError: 'int' object is not iterable

This kind of thing is a lot easier to deal with in R, but calling R from Python looks far too complicated for such a simple task: https://sites.google.com/site/aslugsguidetopython/data-analysis/pandas/calling-r-from-python.

I must be missing something here. Thanks in advance for any help!
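
The traceback above fails at `usecols = list(usecols)`, which suggests that this NumPy version expects `usecols` to be a sequence rather than a bare integer. A minimal sketch of the same call with a one-element tuple follows; the demo file and its values are made up for illustration, and the file is assumed to contain no `track` or `browser` header lines.

```python
import os
import tempfile

import numpy as np

# Tiny made-up bedGraph (chrom, start, end, coverage), for illustration only.
demo = "chr1\t0\t100\t3\nchr1\t100\t250\t17\nchr1\t250\t400\t8\n"

with tempfile.NamedTemporaryFile("w", suffix=".bedgraph", delete=False) as fh:
    fh.write(demo)
    path = fh.name

# Passing usecols as a tuple avoids the "'int' object is not iterable" error.
coverage = np.loadtxt(path, usecols=(3,))
max_coverage = coverage.max()
os.remove(path)

print(max_coverage)
```

Only the selected column is converted to a number, so the chromosome names in the first column do not get in the way.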

python bedgraph • 2.2k views

Oh thanks, indeed! In the meantime I found another solution with pandas (see below).

8.1 years ago
rioualen ▴ 710

It seems I found an easier way of doing it with the pandas library:

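import pandas as pd
# header=None keeps the first interval as data instead of using it as column names
# (assumes the file has no track/browser header lines)
tab = pd.read_table("file.bedgraph", header=None)
cov = tab.iloc[:, 3]  # the fourth column holds the coverage
max_cov = int(cov.max())  # renamed to avoid shadowing the built-in max()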
8.1 years ago

You could do a reverse-numerical sort on the fourth column and pull off the first value:

$ cut -f4 foo.bedgraph | sort -nr | head -1 > answer.txt
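
For comparison, the same result can be had in plain Python without sorting, since a single pass over the fourth column is enough to find the maximum. This is only a sketch: the function name `max_bedgraph_coverage` is made up for illustration, and it assumes whitespace-separated columns and that any header lines start with `track`, `browser` or `#`.

```python
import io


def max_bedgraph_coverage(lines):
    """Return the largest value in the fourth column, or None if there are no data rows."""
    best = None
    for line in lines:
        # Skip blank lines and the optional header lines of a bedGraph file.
        if not line.strip() or line.startswith(("track", "browser", "#")):
            continue
        value = float(line.split()[3])
        if best is None or value > best:
            best = value
    return best


# Made-up demo data; in practice pass an open file handle instead.
demo = io.StringIO(
    "track type=bedGraph\n"
    "chr1\t0\t100\t3\n"
    "chr1\t100\t250\t17.5\n"
    "chr1\t250\t400\t8\n"
)
result = max_bedgraph_coverage(demo)
print(result)
```

With a real file this would be called as `with open("foo.bedgraph") as fh: max_bedgraph_coverage(fh)`, which reads one line at a time and so keeps memory use flat even for large files.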