GTF file for HIV strain pNL4-3
5.8 years ago
caggtaagtat ★ 1.9k

Hi,

I have RNA-seq data from HIV-infected cells, which I now want to map to a mixed human-HIV genome. To create that genome, I need a GTF file for my HIV strain. I didn't find any strain-specific annotation files for HIV. Do you know where one could find something like that, or a better way to evaluate HIV transcript abundance in RNA-seq data?

OK, I thought I could edit my Geneious annotations by hand in the GFF text file to turn it into a GTF file, but I am very uncertain whether my annotations are sufficient for that.

My GFF file looks like this:

pNL4-3  Geneious    region  1   9709    .   +   0   Is_circular=true
pNL4-3  Geneious    insertion   1186    1186    .   +   .   Name=p17/p24
pNL4-3  Geneious    polyA_signal    9602    9607    .   +   .   Name=POLY_A
pNL4-3  Geneious    LTR 9076    9709    .   +   .   Name=3'_LTR
pNL4-3  Geneious    LTR 1   634 .   +   .   Name=5'_LTR
pNL4-3  Geneious    invisible_Parent    8888    15012   .   +   .   Name=GvHzdFvgSWWztDH65o8llFeG9ws.1346934513538.20
pNL4-3  Geneious    invisible_Parent    5304    8887    .   +   .   Name=GvHzdFvgSWWztDH65o8llFeG9ws.1346934513382.19
pNL4-3  Geneious    misc_feature    5005    5034    .   .   .   Name=Fragment3
pNL4-3  Geneious    misc_feature    5743    5744    .   +   .   Name=JNCTN_NY5/LAV
pNL4-3  Geneious    repeat_region   454 551 .   +   .   Name=R
pNL4-3  Geneious    repeat_region   9529    9626    .   +   .   Name=R
pNL4-3  Geneious    repeat_region   552 634 .   +   .   Name=U5
pNL4-3  Geneious    intron  744 5776    .   +   .   Name=TAT/REV/NEF_I
pNL4-3  Geneious    intron  6045    8368    .   +   .   Name=TAT_II
pNL4-3  Geneious    intron  6045    8368    .   +   .   Name=TAT/REV/NEF_II
pNL4-3  Geneious    intron  6045    8368    .   +   .   Name=REV_II
pNL4-3  Geneious    CDS 2085    5096    .   +   .   Name=POL
pNL4-3  Geneious    CDS 5969    8643    .   +   .   Name=REV
pNL4-3  Geneious    CDS 5830    8414    .   +   .   Name=TAT
pNL4-3  Geneious    CDS 6221    8785    .   +   .   Name=ENV
pNL4-3  Geneious    CDS 790 2292    .   +   .   Name=GAG
pNL4-3  Geneious    CDS 8787    9407    .   +   .   Name=NEF
pNL4-3  Geneious    CDS 5041    5619    .   +   .   Name=VIF
pNL4-3  Geneious    CDS 5559    5849    .   +   .   Name=VPR
pNL4-3  Geneious    CDS 6061    6306    .   +   .   Name=VPU
pNL4-3  Geneious    splicing signal 5059    5060    .   +   .   Name=SD2b
pNL4-3  Geneious    splicing signal 4963    4964    .   +   .   Name=SD2
pNL4-3  Geneious    splicing signal 5974    5975    .   +   .   Name=SA5
pNL4-3  Geneious    splicing signal 6720    6721    .   +   .   Name=(SD5)
pNL4-3  Geneious    splicing signal 744 745 .   +   .   Name=SD1
pNL4-3  Geneious    splicing signal 6045    6046    .   +   .   Name=SD4
pNL4-3  Geneious    splicing signal 5388    5389    .   +   .   Name=SA2
pNL4-3  Geneious    splicing signal 8367    8368    .   +   .   Name=SA7
pNL4-3  Geneious    splicing signal 5464    5465    .   +   .   Name=SD3
pNL4-3  Geneious    splicing signal 5775    5776    .   +   .   Name=SA3
pNL4-3  Geneious    splicing signal 5952    5953    .   +   .   Name=SA4a
pNL4-3  Geneious    splicing signal 5934    5935    .   +   .   Name=SA4c
pNL4-3  Geneious    splicing signal 5958    5959    .   +   .   Name=SA4b
pNL4-3  Geneious    splicing signal 4911    4912    .   +   .   Name=SA1
pNL4-3  Geneious    splicing signal 6602    6603    .   +   .   Name=(SA6)
pNL4-3  Geneious    invisible_Parent    5786    7812    .   +   .   Name=GvHzdFvgSWWztDH65o8llFeG9ws.1346938163915.21
pNL4-3  Geneious    invisible_Parent    7813    15494   .   +   .   Name=GvHzdFvgSWWztDH65o8llFeG9ws.1346938164320.22
pNL4-3  Geneious    invisible_Parent    5304    7812    .   +   .   Name=GvHzdFvgSWWztDH65o8llFeG9ws.1346938256243.23
pNL4-3  Geneious    invisible_Parent    7813    15012   .   +   .   Name=GvHzdFvgSWWztDH65o8llFeG9ws.1346938256306.24
pNL4-3  Geneious    invisible_Parent    639 5785    .   +   .   Name=GvHzdFvgSWWztDH65o8llFeG9ws.1346938340538.25
pNL4-3  Geneious    invisible_Parent    5786    10347   .   +   .   Name=GvHzdFvgSWWztDH65o8llFeG9ws.1346938340632.26
pNL4-3  Geneious    invisible_Parent    639 5303    .   +   .   Name=GvHzdFvgSWWztDH65o8llFeG9ws.1346938465400.27
pNL4-3  Geneious    invisible_Parent    5304    10347   .   +   .   Name=GvHzdFvgSWWztDH65o8llFeG9ws.1346938465494.28
pNL4-3  Geneious    invisible_Parent    712 5785    .   +   .   Name=GvHzdFvgSWWztDH65o8llFeG9ws.1346938540694.29
pNL4-3  Geneious    invisible_Parent    5786    10420   .   +   .   Name=GvHzdFvgSWWztDH65o8llFeG9ws.1346938540787.30
pNL4-3  Geneious    invisible_Parent    712 5303    .   +   .   Name=GvHzdFvgSWWztDH65o8llFeG9ws.1346938613678.31
pNL4-3  Geneious    invisible_Parent    5304    10420   .   +   .   Name=GvHzdFvgSWWztDH65o8llFeG9ws.1346938613756.32
pNL4-3  Geneious    invisible_Parent    5786    8465    .   +   .   Name=GvHzdFvgSWWztDH65o8llFeG9ws.1346941525286.33
pNL4-3  Geneious    invisible_Parent    8466    15494   .   +   .   Name=GvHzdFvgSWWztDH65o8llFeG9ws.1346941525411.34
pNL4-3  Geneious    invisible_Parent    5304    8465    .   +   .   Name=GvHzdFvgSWWztDH65o8llFeG9ws.1346941600376.35
pNL4-3  Geneious    invisible_Parent    8466    15012   .   +   .   Name=GvHzdFvgSWWztDH65o8llFeG9ws.1346941600470.36
pNL4-3  Geneious    invisible_Parent    712 5303    .   +   .   Name=GvHzdFvgSWWztDH65o8llFeG9ws.1347027336363.0
pNL4-3  Geneious    invisible_Parent    5304    10420   .   +   .   Name=GvHzdFvgSWWztDH65o8llFeG9ws.1347027336628.1

Do I have to look up the exon borders and insert them manually? Should I delete the first line, and do I have to delete the splice signal entries?

Do you know another way to get, e.g., an example HIV GTF file for comparison? Or even the one I need?

HIV annotation mapping • 2.3k views

If you have the GenBank file, you could try using a genbank2gtf-type program to generate one. Here is one repo.

Thank you! I have the annotation in Geneious and can download the GFF file from there. I then just have to convert it, which I guess can be done by hand, since the file is not that large.
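
A hand conversion like this can also be scripted. The following is a minimal Python sketch, not a tested pipeline. It assumes the real Geneious export is tab-separated (the listing in the question shows collapsed whitespace), keeps only the CDS rows, and writes each one as a single `exon` row with `gene_id` and `transcript_id` taken from `Name=`. The function name `gff_cds_to_gtf` is made up for illustration. Treating every CDS as one unspliced exon is a simplification: the TAT and REV rows span an annotated intron and would have to be split before use.

```python
def gff_cds_to_gtf(gff_text):
    """Turn the CDS rows of a tab-separated GFF text into GTF 'exon' rows.

    Assumption: nine tab-separated columns and a Name=... attribute,
    as in the Geneious export shown in the question.
    """
    gtf_rows = []
    for line in gff_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        cols = line.split("\t")
        if len(cols) != 9:
            continue  # skip rows that are not well-formed GFF
        seqid, source, ftype, start, end, score, strand, _phase, attrs = cols
        if ftype != "CDS":
            continue  # region, LTR, splicing signal, invisible_Parent, ... are dropped
        # Extract the value of Name= from the GFF attribute column
        name = None
        for item in attrs.split(";"):
            key, _, value = item.partition("=")
            if key.strip() == "Name":
                name = value.strip()
        if name is None:
            continue
        gtf_attrs = f'gene_id "{name}"; transcript_id "{name}";'
        # Simplification: one exon per CDS, frame left as "."
        gtf_rows.append("\t".join(
            [seqid, source, "exon", start, end, score, strand, ".", gtf_attrs]
        ))
    return gtf_rows


if __name__ == "__main__":
    example = (
        "pNL4-3\tGeneious\tLTR\t1\t634\t.\t+\t.\tName=5'_LTR\n"
        "pNL4-3\tGeneious\tCDS\t790\t2292\t.\t+\t.\tName=GAG\n"
    )
    for row in gff_cds_to_gtf(example):
        print(row)
```

This drops the `region` line and the splice signal entries automatically, because only `CDS` rows are kept.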

Hi caggtaagtat,

Did your HIV NL4-3 GFF/GTF file work in the end? I have the same problem and could not find a GFF/GTF for NL4-3 despite an intensive Google search.

Best,

Xiao

The sequence for HIV NL4-3 is available here. You could download the GenBank-format file and then try to make the GTF file from it.
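
If you go the GenBank route, spliced coding regions usually appear as `join(...)` locations in the feature table. Here is a rough Python sketch of turning such a location string into 1-based intervals that could become GTF rows. The helper name `parse_genbank_location` is hypothetical, it only handles simple plus-strand locations, and the example string is built from the TAT CDS and TAT_II intron rows in the GFF above, not copied from an actual GenBank record.

```python
import re


def parse_genbank_location(location):
    """Return a list of (start, end) tuples, 1-based inclusive.

    Handles plain 'a..b' and 'join(a..b,c..d)' forms only.
    Minus-strand ('complement') locations are out of scope for this sketch.
    """
    if "complement" in location:
        raise ValueError("complement() locations are not handled here")
    # '<' and '>' mark partial features in GenBank; they are ignored here
    spans = re.findall(r"<?(\d+)\.\.>?(\d+)", location)
    if not spans:
        raise ValueError(f"no a..b interval found in {location!r}")
    return [(int(start), int(end)) for start, end in spans]


if __name__ == "__main__":
    # Assumed example: TAT CDS 5830-8414 minus the intron 6045-8368
    for start, end in parse_genbank_location("join(5830..6044,8369..8414)"):
        print(start, end)
```

Each returned interval would then be written as one row of the GTF, all sharing the same `transcript_id`.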

The GTF file should contain all the transcripts of NL4-3, not just the DNA sequences. There are no such annotations of NL4-3 transcripts on the Internet.
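
Transcript models can, at least in principle, be assembled from intron annotations such as those in the GFF posted above. The Python sketch below derives exon coordinates for one transcript from its introns and formats them as GTF `exon` rows. Everything beyond the intron coordinates is an assumption: the transcript is taken to run from the first base of the 5' R repeat (454) to the last base of the 3' R repeat (9626), and the names `exons_from_introns`, `exon_gtf_rows`, and the transcript ID are invented for illustration. Check the boundaries against your own annotation before using anything like this.

```python
def exons_from_introns(tx_start, tx_end, introns):
    """Return exon (start, end) tuples, 1-based inclusive.

    introns: list of (start, end) tuples lying strictly inside the transcript.
    """
    exons = []
    pos = tx_start
    for i_start, i_end in sorted(introns):
        if i_start <= pos or i_end >= tx_end:
            raise ValueError("intron overlaps a neighbour or the transcript ends")
        exons.append((pos, i_start - 1))  # exon ends one base before the intron
        pos = i_end + 1                   # next exon starts one base after it
    exons.append((pos, tx_end))
    return exons


def exon_gtf_rows(seqid, gene_id, transcript_id, exons, strand="+", source="manual"):
    """Format exon intervals as tab-separated GTF rows."""
    rows = []
    for number, (start, end) in enumerate(exons, start=1):
        attrs = (f'gene_id "{gene_id}"; transcript_id "{transcript_id}"; '
                 f'exon_number "{number}";')
        rows.append("\t".join(
            [seqid, source, "exon", str(start), str(end), ".", strand, ".", attrs]
        ))
    return rows


if __name__ == "__main__":
    # Introns TAT/REV/NEF_I and TAT/REV/NEF_II from the GFF in the question;
    # transcript start/end (454, 9626) are assumed from the R repeat rows.
    exons = exons_from_introns(454, 9626, [(744, 5776), (6045, 8368)])
    for row in exon_gtf_rows("pNL4-3", "HIV", "fully_spliced_example", exons):
        print(row)
```

Repeating this for each splice donor/acceptor combination gives one set of exon rows per transcript isoform.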
