C++ parse GTF file
8.4 years ago
scchess ▴ 640

I am looking for a library to parse GTF files in C++. I'm using htslib for SAM/BAM, and it works well. However, I am not aware of any library that handles the GTF format. Is there a good library I can use to parse a GTF file, or should I implement a parser myself?

gtf • 3.0k views
8.4 years ago

I have C code for parsing GTF in convert2bed on my GitHub page: https://github.com/alexpreynolds/convert2bed

The relevant function parses an input buffer into a c2b_gtf_t struct. You could leave out the last line of the conversion function, if you do not need BED output.
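If you end up writing the parser yourself, the core of it is small. Below is a minimal C++ sketch, not the convert2bed code: it splits one line into the nine tab-separated GTF columns and reads the attribute column as `key "value";` pairs. The names `GtfRecord`, `parseAttributes` and `parseGtfLine` are hypothetical, and the sketch assumes attribute values contain no semicolons and that each key appears once per line.

```cpp
#include <cstddef>
#include <exception>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record type; fields follow the nine GTF columns.
struct GtfRecord {
    std::string seqname, source, feature;
    long start = 0, end = 0;   // 1-based, inclusive coordinates
    std::string score;         // kept as text because "." means "no score"
    char strand = '.';
    std::string frame;         // "0", "1", "2" or "."
    std::map<std::string, std::string> attributes;
};

// Remove leading and trailing spaces and tabs.
static std::string trim(const std::string& s) {
    const std::size_t b = s.find_first_not_of(" \t");
    if (b == std::string::npos) return "";
    const std::size_t e = s.find_last_not_of(" \t");
    return s.substr(b, e - b + 1);
}

// Convert a whole string to long; reject empty or partly numeric text.
static bool parseLong(const std::string& s, long& out) {
    try {
        std::size_t pos = 0;
        out = std::stol(s, &pos);
        return pos == s.size();
    } catch (const std::exception&) {
        return false;
    }
}

// Parse text such as: gene_id "G1"; transcript_id "T1";
// Assumption: values contain no ';' and keys are not repeated.
static std::map<std::string, std::string> parseAttributes(const std::string& text) {
    std::map<std::string, std::string> out;
    std::istringstream in(text);
    std::string item;
    while (std::getline(in, item, ';')) {
        item = trim(item);
        const std::size_t sp = item.find(' ');
        if (item.empty() || sp == std::string::npos) continue;
        std::string value = trim(item.substr(sp + 1));
        if (value.size() >= 2 && value.front() == '"' && value.back() == '"') {
            value = value.substr(1, value.size() - 2);  // strip the quotes
        }
        out[item.substr(0, sp)] = value;
    }
    return out;
}

// Returns false for comments, blank lines and malformed lines;
// `out` is only modified on success.
bool parseGtfLine(const std::string& line, GtfRecord& out) {
    if (line.empty() || line[0] == '#') return false;
    std::vector<std::string> cols;
    std::istringstream in(line);
    std::string col;
    while (std::getline(in, col, '\t')) cols.push_back(col);
    if (cols.size() != 9) return false;

    GtfRecord r;
    r.seqname = cols[0];
    r.source = cols[1];
    r.feature = cols[2];
    if (!parseLong(cols[3], r.start) || !parseLong(cols[4], r.end)) return false;
    r.score = cols[5];
    r.strand = cols[6].empty() ? '.' : cols[6][0];
    r.frame = cols[7];
    r.attributes = parseAttributes(cols[8]);
    out = r;
    return true;
}
```

To read a file, call `parseGtfLine` on each line obtained from `std::getline` on a `std::ifstream` and skip the lines for which it returns false. Files with Windows line endings would need the trailing `\r` stripped first.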

8.4 years ago

What do you need the library to do? I have a C library that can parse GTF files and build an interval tree from them, if that's what you need.
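To illustrate the kind of query an interval index answers ("which features overlap this region?"), here is a small C++ sketch. It is not the C library mentioned above and it is not a true interval tree: it swaps in a simpler structure, a vector sorted by start plus a running maximum of end positions, which is enough for a read-only annotation. `Interval` and `IntervalIndex` are hypothetical names, coordinates are assumed to be 1-based and inclusive as in GTF, and you would keep one index per chromosome.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical feature interval on a single chromosome.
struct Interval {
    long start;
    long end;
    std::string name;
};

class IntervalIndex {
public:
    explicit IntervalIndex(std::vector<Interval> items) : items_(std::move(items)) {
        // Sort by start so a binary search can bound the candidates.
        std::sort(items_.begin(), items_.end(),
                  [](const Interval& a, const Interval& b) { return a.start < b.start; });
        // maxEnd_[i] holds the largest end among items_[0..i].
        maxEnd_.reserve(items_.size());
        long best = 0;
        for (std::size_t i = 0; i < items_.size(); ++i) {
            best = (i == 0) ? items_[i].end : std::max(best, items_[i].end);
            maxEnd_.push_back(best);
        }
    }

    // Names of all intervals overlapping [qs, qe], in descending start order.
    std::vector<std::string> overlapping(long qs, long qe) const {
        std::vector<std::string> hits;
        // Intervals starting after qe cannot overlap the query.
        const auto it = std::upper_bound(
            items_.begin(), items_.end(), qe,
            [](long value, const Interval& iv) { return value < iv.start; });
        std::size_t i = static_cast<std::size_t>(it - items_.begin());
        // Walk left; stop once nothing further left can reach qs.
        while (i > 0 && maxEnd_[i - 1] >= qs) {
            --i;
            if (items_[i].end >= qs) hits.push_back(items_[i].name);
        }
        return hits;
    }

private:
    std::vector<Interval> items_;
    std::vector<long> maxEnd_;
};
```

The running maximum lets the scan stop early, but a single very long feature near the start of the chromosome can still force a long walk. A real interval tree avoids that worst case and also supports insertions.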

8.4 years ago

It isn't a full-blown library, but it's a start:

https://github.com/zeeev/coverup/blob/master/lib/genCodeClass.hpp

I'd be happy to pull in changes.

Here is an example of looping over the GTF:

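  gcClass gtf("data/gencode.v23.annotation.gtf");

  // Build the index, then load it before iterating.
  gtf.index();
  gtf.loadIndex();

  while (true) {
    gene gf;
    gene *transcript = nullptr;

    // Stop when getNextGene reports that there is no further gene.
    if (!gtf.getNextGene(gf)) {
      break;
    }
    // Skip genes that are not protein coding.
    if (!gf.isProteinCoding()) {
      continue;
    }
    gf.getLongestChild(&transcript);

    // Stop at the first protein-coding gene that is not on chr1.
    if (gf.getSeqid().compare("chr1") != 0) {
      break;
    }
  }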
8.4 years ago
dendrov.kan ▴ 20

This is part of our core library, but it may be helpful for you:
https://local.ugene.unipro.ru/svn/ugene/trunk/src/corelibs/U2Formats/src/GTFFormat.h
https://local.ugene.unipro.ru/svn/ugene/trunk/src/corelibs/U2Formats/src/GTFFormat.cpp

Note the implementation of methods parseDocument and parseAndValidateLine.
