Find the 3' UTRs, 5' UTRs and their read counts from BAM files
18 months ago

Hello,

I have BAM files for 8 samples (4 normal and 4 diseased), produced by aligning small-RNA sequencing data with Novoalign. I excluded the miRNAs from the BAM files with the following command:

bedtools intersect -v -a "sample.bam" -b "hg19_mirna.gff3" > output.bam


In this way I excluded the miRNAs from all the samples. Now I want to find the UTR (3' and 5') sequences present in the resulting BAM files, and the read count for each UTR.

Could anyone suggest a way to do this?

You can try featureCounts with meta-feature "UTR" level.

Thank you. Actually, I want to get a GTF file of the UTRs and then count the reads using htseq-count. But I am unable to find GTF files of the 3' and 5' UTRs. Do you know where I could get them?

At Ensembl you can find GTF files for many organisms. They contain "UTR" annotations (I know this for human and mouse).
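If it helps, here is a minimal sketch of pulling just the UTR rows out of such a GTF with awk. The function name `extract_utrs` is made up for illustration, and the feature labels are an assumption: some annotation releases label column 3 as "UTR", others as "five_prime_utr" / "three_prime_utr", so check what your file actually contains.

```shell
# Hypothetical helper: keep only the UTR rows of a GTF file.
# Assumption: column 3 (feature type) is "UTR", "five_prime_utr"
# or "three_prime_utr"; adjust to the labels used in your release.
extract_utrs() {
    awk -F '\t' '$0 !~ /^#/ && ($3 == "UTR" || $3 == "five_prime_utr" || $3 == "three_prime_utr")' "$1"
}
```

Something like `extract_utrs annotation.gtf > utrs.gtf` (file names are placeholders) would then give a UTR-only GTF to pass to a read counter.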

Hi, I am currently using hg19 (GRCh37). I am unable to find GTF / GFF3 files of the UTRs for this version. Could you please link me to them? Thanks a lot.

I recommend you upgrade to hg38! Or search in the archives somewhere.

19 months ago
MPI IE, Freiburg, Germany

Using BioMart at Ensembl you can get the 3' and 5' UTRs for hg19 at this address:

https://grch37.ensembl.org/biomart/martview/

Then I guess you can get the read coverage over these regions using, for example, deepTools multiBamSummary in BED-file mode.
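As a rough sketch of that last step, assuming you end up with UTR coordinates in GTF form rather than BED: the hypothetical helper `utr_gtf_to_bed` below converts UTR rows to 6-column BED. The feature labels and all file names are assumptions, and the multiBamSummary call is shown only as a commented example, so check the options against your installed deepTools version.

```shell
# Hypothetical helper: convert the UTR rows of a GTF into 6-column BED.
# GTF coordinates are 1-based and inclusive; BED starts are 0-based,
# so the start is shifted down by one. Feature labels are an assumption.
utr_gtf_to_bed() {
    awk -F '\t' -v OFS='\t' '
        $0 !~ /^#/ && ($3 == "UTR" || $3 == "five_prime_utr" || $3 == "three_prime_utr") {
            print $1, $4 - 1, $5, $3, ".", $7
        }' "$1"
}

# Possible usage (placeholder file names; BAM files need to be indexed):
#   utr_gtf_to_bed annotation.gtf > utrs.bed
#   multiBamSummary BED-file --BED utrs.bed \
#       --bamfiles normal1.bam diseased1.bam \
#       -o utr_counts.npz --outRawCounts utr_counts.tab
```

One thing worth checking before counting: the chromosome names in the BED file have to match those in the BAM header (for example "1" versus "chr1"), otherwise the counts may come out empty.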