How to extract the promoter region from featureCounts output?
6.7 years ago
shoujun.gu ▴ 380

Here is what I plan to do:

  1. map reads to the reference
  2. count reads with featureCounts
  3. get differentially expressed genes (DEGs) with edgeR
  4. extract the promoter region (1000 bp upstream) of each DEG

I'm now at step 2. After getting the count file, I found that some genes have more than one start site in the 'Start' column of the featureCounts output file. For example:

    Geneid         Chr   Start    End
    4933401J01Rik  chr1  3073253  3074322
    Gm26206        chr1  3102016  3102125
    Xkr4           chr1;chr1;chr1;chr1;chr1;chr1;chr1  3205901;3206523;3213439;3213609;3214482;3421702;3670552  3207317;3207317;3215632;3216344;3216968;3421901;3671498
    Gm18956        chr1  3252757  3253236

If each gene had just one start site in the 'Start' column, I think I could extract the promoter region using bedtools. But since some genes have more than one TSS (e.g. Xkr4), how can I extract all of their promoter regions? Any suggestions?

Thanks.

RNA-Seq next-gen gene • 2.5k views
6.7 years ago

If you are quantifying genes using featureCounts, you will have only one line per gene in your output. Can you post your featureCounts command and the head of your GTF?

There is no standard definition of a promoter unless you have histone modification data (H3K4me3) or open-chromatin regions to define one. If you see multiple promoters for a gene, they probably belong to different transcripts. Depending on your goal, you can either consider all the promoters, i.e. +/- 500 bp (core promoter) around every TSS, or consider only the promoter of the transcript that shows the highest expression.
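
For the "all promoters" option, here is a minimal Python sketch that builds one strand-aware window per distinct transcript start from the GTF rather than from the featureCounts table. As far as I know, the semicolon-separated Start/End values in featureCounts output are the coordinates of the exons grouped under each gene, so they are not a reliable list of TSSs. The function name `promoter_windows`, its parameters, and the toy annotation are made up for illustration, and the code assumes your GTF has `transcript` feature lines with a `gene_id` attribute (some GTFs only contain exon lines).

```python
import re

GENE_ID = re.compile(r'gene_id "([^"]+)"')


def promoter_windows(gtf_lines, upstream=1000, downstream=0):
    """Return sorted BED-style (chrom, start, end, gene_id, strand) tuples,
    one per distinct annotated transcript start site."""
    windows = set()  # a set merges transcripts that share the same TSS
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "transcript":
            continue
        chrom, strand = fields[0], fields[6]
        start, end = int(fields[3]), int(fields[4])  # GTF: 1-based, inclusive
        match = GENE_ID.search(fields[8])
        if match is None or strand not in ("+", "-"):
            continue
        if strand == "+":
            # TSS is the transcript start; upstream means lower coordinates.
            lo, hi = start - 1 - upstream, start - 1 + downstream
        else:
            # TSS is the transcript end; upstream means higher coordinates.
            lo, hi = end - downstream, end + upstream
        # BED is 0-based, half-open; clamp at the chromosome start only.
        windows.add((chrom, max(lo, 0), hi, match.group(1), strand))
    return sorted(windows)


# Toy annotation with hypothetical coordinates, only to show the call.
toy_gtf = [
    'chr1\ttoy\ttranscript\t5000\t6000\t.\t+\t.\tgene_id "GeneA"; transcript_id "A1";',
    'chr1\ttoy\ttranscript\t5500\t6000\t.\t+\t.\tgene_id "GeneA"; transcript_id "A2";',
    'chr1\ttoy\ttranscript\t9000\t9800\t.\t-\t.\tgene_id "GeneB"; transcript_id "B1";',
]
for chrom, lo, hi, gene, strand in promoter_windows(toy_gtf):
    print(chrom, lo, hi, gene, ".", strand, sep="\t")
```

Calling it with `upstream=500, downstream=500` gives the +/- 500 bp core promoter mentioned above. The sketch does not clip windows at chromosome ends. The printed BED6 lines can then be restricted to your DEGs and passed to `bedtools getfasta` with `-s` so that minus-strand promoters are reverse-complemented.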


I updated my post.

Yes, it looks like they are different transcripts. My questions are: 1. How do I determine which transcript is dominant? 2. If I want to extract all the promoter region sequences, is there an easy way to do it?
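
On question 1, gene-level featureCounts counts cannot separate transcripts, so one possible approach is to take transcript-level abundance estimates from a transcript quantifier and pick the transcript with the highest mean abundance per gene. The sketch below assumes you already have such estimates as (gene, transcript, abundance) records, one per transcript per sample; the function name `dominant_transcripts` and the toy values are hypothetical.

```python
from collections import defaultdict


def dominant_transcripts(records):
    """Map each gene to the transcript with the highest mean abundance.

    records: iterable of (gene_id, transcript_id, abundance) tuples,
    one per transcript per sample (e.g. TPM values).
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    gene_of = {}
    for gene, tx, value in records:
        totals[tx] += value
        counts[tx] += 1
        gene_of[tx] = gene

    best = {}  # gene_id -> (transcript_id, mean abundance)
    for tx, total in totals.items():
        mean = total / counts[tx]
        gene = gene_of[tx]
        # On ties, the transcript seen first in the input is kept.
        if gene not in best or mean > best[gene][1]:
            best[gene] = (tx, mean)
    return {gene: tx for gene, (tx, _mean) in best.items()}


# Toy records with made-up abundances for two samples.
toy_records = [
    ("GeneA", "A1", 10.0), ("GeneA", "A2", 30.0),
    ("GeneA", "A1", 50.0), ("GeneA", "A2", 20.0),
    ("GeneB", "B1", 5.0),
]
print(dominant_transcripts(toy_records))
```

The TSS of the chosen transcript can then be used as the single promoter anchor for that gene. Whether mean abundance across all samples is the right criterion depends on your design; you may prefer to choose per condition.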
