Specific sequence abundance using RNA-Seq
6.8 years ago
user230613 ▴ 360

Hi,

I would like to measure the expression (abundance) of a given sequence, a specific 9-mer, using RNA-Seq data. I know that the 9-mer is not unique in the genome and is not specific to a single gene: it is present in more than one isoform of gene A and is also present in gene B. How can I get a final number (TPM, FPKM) for the expression of this specific 9-mer?

I hope the question is understandable :)

expression RNA-Seq • 1.4k views
6.8 years ago

To find the expression of a specific 9-mer, "ACGTACGTA", using BBMap:

kmercountexact.sh in=reads.fq out=kmers.fa k=9
bbduk.sh in=kmers.fa outm=filtered.fa k=9 mm=f literal=ACGTACGTA

"filtered.fa" should contain exactly one entry, something like:

>6721
ACGTACGTA

...though it might come out reverse-complemented. The number in the header is the number of times the k-mer occurred in the file.
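To make it concrete what that count represents, here is a minimal Python sketch of the same idea. It is not how BBMap works internally. It assumes that a k-mer and its reverse complement are counted as the same sequence, which would explain why the output can appear reverse-complemented. The function names (`reverse_complement`, `count_kmer`, `read_fastq_sequences`) are made up for illustration, and the FASTQ reader assumes plain four-line records.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def count_kmer(reads, kmer):
    """Count overlapping occurrences of kmer, or its reverse complement, in reads.

    Assumption: both strands are counted together, since a read may come
    from either strand of the original molecule.
    """
    kmer = kmer.upper()
    targets = {kmer, reverse_complement(kmer)}
    k = len(kmer)
    total = 0
    for read in reads:
        read = read.upper()
        # Slide a window of length k along the read.
        for i in range(len(read) - k + 1):
            if read[i:i + k] in targets:
                total += 1
    return total


def read_fastq_sequences(path):
    """Yield sequence lines from an uncompressed four-line-per-record FASTQ file."""
    with open(path) as handle:
        for line_number, line in enumerate(handle):
            if line_number % 4 == 1:
                yield line.strip()
```

With a real file this would be called as `count_kmer(read_fastq_sequences("reads.fq"), "ACGTACGTA")`. For anything beyond a small test file the BBMap commands above will be far faster.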

@Brian, I want to measure the expression of that sequence using RNA-Seq data; I don't want to extract the k-mer sequence from my reads.

We might be miscommunicating... in my view the number resulting from this method is the expression of that sequence in the RNA-Seq data. I'm not sure that it makes much sense to translate it to FPKM, though.

6.8 years ago

The closest you could get is to replace the lengths that appear in the formula above with the number of times the k-mer appears in each transcript, then apply the formula as usual.
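The formula being referred to is not shown in this thread, so the following is only a sketch of one literal reading of this suggestion. It assumes the usual TPM definition (per-transcript read count divided by length, rescaled so the values sum to one million) and puts the k-mer occurrence count where the length would go. The function name `tpm_like`, the input dictionaries, and the choice to skip transcripts that do not contain the k-mer are all assumptions made for illustration.

```python
def tpm_like(read_counts, kmer_occurrences, scale=1e6):
    """TPM-style values with k-mer occurrence counts in place of transcript lengths.

    read_counts:      transcript ID -> number of reads assigned to it
    kmer_occurrences: transcript ID -> times the k-mer appears in its sequence

    Assumption: transcripts with zero occurrences are left out, because the
    substituted "length" would be zero and the rate would be undefined.
    """
    rates = {}
    for transcript, count in read_counts.items():
        n = kmer_occurrences.get(transcript, 0)
        if n > 0:
            rates[transcript] = count / n
    total = sum(rates.values())
    if total == 0:
        return {transcript: 0.0 for transcript in rates}
    return {transcript: rate / total * scale for transcript, rate in rates.items()}


# Hypothetical numbers: two isoforms of gene A and one transcript of gene B
# contain the k-mer; C1 does not and is therefore skipped.
example_counts = {"A1": 100, "A2": 300, "B1": 200, "C1": 500}
example_occurrences = {"A1": 1, "A2": 3, "B1": 1}
example_values = tpm_like(example_counts, example_occurrences)
```

Note that this yields one value per transcript that contains the k-mer rather than a single number for the k-mer itself, which is part of why converting a k-mer count to TPM or FPKM is awkward.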
