Get The Reads Values From A Bam File
3
0
Entering edit mode
12.5 years ago
Jesse ▴ 10

Can anyone help me reading the reads values from a BAM file? I need to analyse a bam file with java and PICARD, retrieving this information,but unfortunately I have no idea how to do it.

It's quite a simple, I have only to get the name and the total reads values from each gene to do some calculus with it.

Thanks in advance

bam java • 6.0k views
ADD COMMENT
2
Entering edit mode
12.5 years ago
toni ★ 2.2k

I have no experience with Picard Java API (only the tools) but if this is not compulsory to use Picard & Java you could use bamtools C++ API to parse the whole BAM file and retrieve the information you want. Here follows a sample code to parse the BAM file, and where you can add your own specific processing:

//let's call this file do_bam.cpp

#include <iostream>
#include "api/BamReader.h"
#include "api/BamWriter.h"

using namespace std;
using namespace BamTools;

int main(int argc, char* argv[])
{
    string  inputFileName;

    if ( argc == 2 ) 
    {
            inputFileName  = argv[1] ;
    }
    else {
        cerr << "Wrong number of arguments." << endl;
        return 1;
    }

    BamReader reader;
    if (!reader.Open(inputFileName)) {
        cerr << "Could not open input BAM file." << endl;
        return 1;
    }

    BamAlignment al;
    while (reader.GetNextAlignment(al)) {

        // Add your own processing here
        // to get the desired information

    }

reader.Close();
return 0;
}

Compile with :

c++ -I /path/to/bamtools/include/ -L /path/to/bamtools/lib/ -o do_bam do_bam.cpp -lz -lbamtools

Documentation of BamTools API classes :

Beside this, you may need a bed file with the genomic intervals of your genes of interest.

ADD COMMENT
1
Entering edit mode
12.5 years ago
import java.io.File;
import net.sf.samtools.SAMFileReader;
import net.sf.samtools.SAMFileWriter;
import net.sf.samtools.SAMFileWriterFactory;
import net.sf.samtools.SAMRecord;
import net.sf.samtools.SAMRecordIterator;
import net.sf.samtools.SAMFileReader.ValidationStringency;

public class Biostar13454 {

static java.util.List<YourClassGene> my_list_of_genes;

public static void main(String[] args) throws Exception
    {
    if(args.length!=1) return;
    //init  my_list_of_genes ...
    SAMFileReader inputSam=new SAMFileReader(new File(args[0]));
    inputSam.setValidationStringency(ValidationStringency.SILENT);
    SAMRecordIterator iter=inputSam.iterator();

    while(iter.hasNext())
        {
        SAMRecord rec=iter.next();
        if(rec.getReadUnmappedFlag()) continue;
        for(YourClassGene g: my_list_of_genes)
           {
           if(!(rec.getReferenceName().equals(g.chromosome) && g.contains(rec.getAlignmentStart(),rec.getAlignmentStart())) continue;
           //do something
           }
        }
    iter.close();
    inputSam.close();
    }
 }
ADD COMMENT
0
Entering edit mode
12.4 years ago

"I have only to get the name and the total reads values from each gene to do some calculus with it."

Sounds like you might just want to do something simple - get count of reads for each gene from a bam file?

If I interpreted your question correctly, read about coverageBed from bedtools, it will take a bam file and a bed file of genes and give you count of reads on each gene.

ADD COMMENT
0
Entering edit mode

Yes that it is what I wanna do. I read about the bedtools and have none idea how I can use it with the data coming from my bam files.

can you help me with some tips?

ADD REPLY
0
Entering edit mode

There's a very good manual.

https://docs.google.com/viewer?url=http://bedtools.googlecode.com/files/BEDTools-User-Manual.v4.pdf

coverageBed -abam myfile.bam -b genes.bed > counts.txt

ADD REPLY

Login before adding your answer.

Traffic: 3082 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6