This is a beta test.
Question: Bam to RPM Bedgraph Conversion
0
Entering edit mode

I want to convert my ChIP-seq bam files to an RPM bedgraph file. I want to use the following command:

bedtools genomecov -bg -split -scale 1000000 -trackline -trackopts “name=bam_file_name description=bam_file” -ibam bamfilename > bedgraphfilename

I want to clarify that, for an rpm file, I should use the value one million following the "-scale"? Or should the scale factor be 6 because it assumes a 10 base?

Thanks in advance!

ADD COMMENTlink 20 months ago virlana.shchuka • 0 • updated 16 months ago h.mon 25k
Entering edit mode
0

I tried the scaling factor command and receive the following output: (standard_in) 1: illegal character: \342 (standard_in) 1: illegal character: \200 (standard_in) 1: illegal character: \234 -bash: 1000000/10743694": No such file or directory

I checked the spacing between all the parts of the command and they are correct. Does anyone know what I am doing wrong?

ADD REPLYlink 20 months ago
virlana.shchuka
• 0
3
Entering edit mode

The value of -scale is the one that each value in column 4 of the bg is multiplied with, means you have to calculate the scaling factor externally. Here is a two-liner that can do it:

 ## Scaling factor for single-end data, counting every mapped read (bitwise flag = 0)
 TmpScale=$(bc <<< "scale=6;1000000/$(samtools view -f 0 -c in.bam)")

 ## Now get the actual bedGraph:
 echo '==> RPM scaling factor:' $TmpScale
 bedtools genomecov -bga -ibam in.bam -scale $TmpScale | sort -k1,1 -k2,2n > out.bedGraph

For matters of completeness, there are tools that can output a RPM-normalized bedGraph or bigwig in one go, like deeptools bamCoverage. Still, both genomecov and deeptools are pretty slow. There is a newer tool, mosdepth, which outputs a compressed bedGraph and is lightning fast. The RPM-normalization can be done using something like awk once the bedGraph has been generated.

ADD COMMENTlink 20 months ago ATpoint 17k
Entering edit mode
0

Thank you for your reply.

1) Do I need to download specific packages to carry out these commands? 2) Can I just lift over these commands, or are all the arrow characters you've used unimportant?

ADD REPLYlink 20 months ago
virlana.shchuka
• 0
Entering edit mode
0

Would it be possible for you to explain what exactly your first command is doing?

ADD REPLYlink 16 months ago
m93
• 150
Entering edit mode
1

It counts total number of mapped reads in the bam file, and uses it to calculate a per-million scaling factor. Still, I recommend deeptools bamCoverage for creating browser tracks. Is offers many options that are quiet convinient.

ADD REPLYlink 16 months ago
ATpoint
17k
Entering edit mode
0

Thanks, makes sense. Last question: how would it work for paired-end data?

ADD REPLYlink 16 months ago
m93
• 150
Entering edit mode
1

Thinking about it again, for single end, I would probably do -f 0 -F 256 (read mapped, and not supplementary alignment), and for paired, -f 1 -F 256 (read paired, but not supplementary). Bitwise flags explained here.

ADD REPLYlink 16 months ago
ATpoint
17k
Entering edit mode
0

Sorry to bother you again, but do you happen to know if bamCoverage requires for your input Bam file to be sorted? It doesn't look like but I can't seem to find a definitive answer anywhere.. It definitely seems to want a .bai to be present in the same directory as your bam file.

ADD REPLYlink 16 months ago
m93
• 150
Entering edit mode
1

It requires indexed BAM, and indexing requires sorted BAM, so yes bamCoverage requires sorted BAM. Use samtools sort to sort your bam and then samtools index to index the sorted bam.

ADD REPLYlink 16 months ago
ATpoint
17k

Login before adding your answer.

Powered by the version 1.6