Question: How do I identify and differentiate between unidirectional and bidirectional promoters?

I have a set of genes that contain a protein of interest at the TSS. I would like to be able to separate these genes into two classes: genes with a unidirectional promoter, and genes with a bidirectional promoter.

I have access to paired-end GRO-seq data, but no RNA-seq data. Is there a way to do this?

cbio, 4.0 years ago (updated 4.0 years ago by ivivek_ngs)

Technically, do you want to get the 5' reads that go in opposite directions but overlap each other (or lie within a certain distance of each other, say 400 bp)? Like enhancer RNAs, which are transcribed bidirectionally?

geek_y, 4.0 years ago

Yes, this is what I'd like to do. I had previously thought I could simply use bedtools to look for overlapping regions in the positive- and negative-strand GRO-seq coverage bedGraphs within 1 kb of annotated TSSs, but this did not work.

cbio, 4.0 years ago

Do you have separate files for the 5' reads? When you say paired-end data, do you know which reads originate from the 5' end of a transcript?

geek_y, 4.0 years ago

I do not have a separate file for these. What I currently have are bedtools genomecov bedGraphs that contain coverage over the entire read (not restricted to 5' ends with the -5 option), which I generated using:

genomeCoverageBed -bg -strand + -ibam $infile -g $genome > $outdir/genomecoveragebed/$outfile3

genomeCoverageBed -bg -strand - -ibam $infile -g $genome | awk -F '\t' -v OFS='\t' '{ $4 = -$4; print $0 }' > $outdir/genomecoveragebed/$outfile4

I'm very new to GRO-seq, and the data wasn't generated by my lab, so getting information about how it was generated has been difficult at best.
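For reference, bedtools genomecov has a -5 option that counts only the 5' position of each alignment instead of the whole interval. The sketch below shows the same idea in plain awk, so the coordinate logic is visible. It assumes BED6 input (for example from bedtools bamtobed); the function name five_prime_ends is hypothetical and not part of bedtools.

```shell
# Sketch: collapse BED6 alignments to their strand-aware 5' base.
# Input  (stdin):  chrom, start, end, name, score, strand (0-based, half-open)
# Output (stdout): a 1-bp BED6 interval at the 5' end of each alignment
five_prime_ends() {
    awk -F '\t' -v OFS='\t' '
        # plus strand: the five-prime end is the first base of the interval
        $6 == "+" { print $1, $2, $2 + 1, $4, $5, $6 }
        # minus strand: the five-prime end is the last base of the interval
        $6 == "-" { print $1, $3 - 1, $3, $4, $5, $6 }
    '
}

# Hypothetical usage (read1.bam is a placeholder file name):
#   bedtools bamtobed -i read1.bam | five_prime_ends > read1_5p.bed
```

Which mate of a pair marks the 5' end of the nascent RNA, and whether it maps to the same or the opposite strand, depends on the library protocol, so that still has to be confirmed with whoever generated the data.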

cbio, 4.0 years ago

If you have paired-end data, you somehow need to separate out the reads that originate from the 5' end of the transcript. Otherwise you will not be able to identify bidirectional transcripts precisely. Anyway, if you would like to check which regions on the forward strand are close to regions on the reverse strand, you could use closestBed.

closestBed -a Fw_strand.bed -b Rv_strand.bed -d | awk -v OFS="\t" '$NF >= 0 && $NF <= 400 { print $1, $2, $3 }' | sort -k1,1 -k2,2n | uniq | wc -l

But this won't be exclusive to bidirectional transcripts. In fact, it is not meaningful at all because, in general, paired-end reads map in fr or rf orientation, so you will definitely end up with many regions that are close to each other on the forward and reverse strands.

Ask the people who generated the data whether they can tell you how to separate the reads that originate from 5' ends. Then I can tell you how to get bidirectional transcripts.
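Once 5'-end regions are available for each strand, the closestBed output could be narrowed to divergent pairs only, meaning the reverse-strand region starts upstream of the forward-strand region. This is a minimal sketch, assuming both inputs were 3-column BED files (so closestBed -d emits seven columns) and reusing the 400 bp figure from above as a default; the function name divergent_pairs is hypothetical.

```shell
# Sketch: keep only divergent plus/minus pairs from `closestBed -d` output.
# Assumes A = plus-strand regions and B = minus-strand regions, both BED3,
# so each input line is:
#   a_chrom a_start a_end  b_chrom b_start b_end  distance
divergent_pairs() {
    # $1 = maximum gap in bp (defaults to 400, an assumed cutoff)
    awk -F '\t' -v OFS='\t' -v max="${1:-400}" '
        # skip lines with no partner (distance -1) or a partner too far away
        $NF < 0 || $NF > max { next }
        # divergent: the minus-strand region starts upstream of the plus-strand one
        $5 < $2 { print $1, $2, $3, $5, $6, $NF }
    '
}

# Hypothetical usage, with the file names from the command above:
#   closestBed -a Fw_strand.bed -b Rv_strand.bed -d | divergent_pairs 400
```

Pairs where the reverse-strand region lies downstream are convergent rather than divergent, which is why they are dropped here.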

geek_y, 4.0 years ago

I believe that when you extract the list of genes from your data you have the strand information, right? Then you can tell which genes lie on which strand, + or -, which gives you a strand-specific feature, and you can grep your output by strand.

This will give you two lists of promoters, one for each strand. Once you have them, you can overlap the two lists to find bidirectional genes, since entries that overlap at the level of RefSeq IDs or gene symbols should be shared between both strands. I believe this will help.
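One possible reading of this suggestion, using annotation alone, is to look for head-to-head gene pairs: a minus-strand gene whose TSS sits just upstream of a plus-strand gene's TSS. The sketch below assumes a BED6 gene annotation and a 1000 bp cutoff, neither of which comes from the thread, and only compares neighbouring TSSs; the function name bidirectional_pairs is hypothetical.

```shell
# Sketch: find head-to-head (divergent) gene pairs from a BED6 annotation.
# Input  (stdin):  chrom, start, end, gene name, score, strand
# Output (stdout): chrom, minus-strand gene, plus-strand gene, TSS-to-TSS distance
bidirectional_pairs() {
    # $1 = maximum TSS-to-TSS distance in bp (1000 is an assumed cutoff)
    awk -F '\t' -v OFS='\t' '
        $6 == "+" { print $1, $2, $4, $6 }       # plus-strand TSS is the start
        $6 == "-" { print $1, $3 - 1, $4, $6 }   # minus-strand TSS is the last base
    ' | sort -k1,1 -k2,2n |
    awk -F '\t' -v OFS='\t' -v max="${1:-1000}" '
        # a minus-strand TSS immediately followed by a plus-strand TSS within max bp
        $4 == "+" && strand == "-" && $1 == chrom && $2 - pos <= max {
            print $1, gene, $3, $2 - pos
        }
        # remember the current TSS for the next comparison
        { chrom = $1; pos = $2; gene = $3; strand = $4 }
    '
}

# Hypothetical usage (genes.bed is a placeholder file name):
#   bidirectional_pairs 1000 < genes.bed
```

Genes in the set of interest that do not appear in any pair would be the unidirectional candidates under this definition. Annotation alone cannot show unannotated antisense transcription, which is where the stranded GRO-seq signal would come in.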

ivivek_ngs, 4.0 years ago (updated 17 months ago by RamRS)
