Question

Using 'tximport' library for downstream DGE after quantifying with Kallisto

8.0 years ago

Sinji ★ 3.2k

I'm quite new to RNA-sequencing and am playing around with data to get a handle on it. I have quantified with Kallisto and am using tximport to summarize transcript counts for differential gene expression analysis.

I am running into a problem associating gene IDs with my transcripts for the summarization step. I believe the likely cause is that the TxDb package I am using does not match the transcriptome file I quantified against, but I am not sure, and my attempts at solving this haven't been successful.

I am working with human samples. I quantified my transcripts using this transcriptome file for Homo sapiens. I have 6 samples: 3 WT replicates and 3 KO replicates.

I created a vector pointing to my kallisto files as detailed in the tximport manual.

files <- file.path(dir, "kallisto", samples$run, "abundance.tsv")

I created a data.frame from a TxDb object to construct the tx2gene table.

library(TxDb.Hsapiens.UCSC.hg38.knownGene)

txdb <- TxDb.Hsapiens.UCSC.hg38.knownGene

k <- keys(txdb, keytype = "GENEID")

df <- select(txdb, keys = k, keytype = "GENEID", columns = "TXNAME")

tx2gene <- df[, 2:1] # tx ID, then gene ID

But head(tx2gene) produces:

      TXNAME GENEID
1 uc002qsd.4      1
2 uc002qsf.2      1
3 uc003wyw.1     10
4 uc002xmj.3    100
5 uc010xbn.1   1000
6 uc002kwg.2   1000

This obviously isn't right.
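A quick way to test the suspected mismatch is to measure how many of the quantified transcript IDs actually appear in tx2gene$TXNAME. The following is only a minimal sketch: id_overlap is a hypothetical helper, and the Ensembl-style IDs in the toy data are made up for illustration (the post does not say which naming scheme the transcriptome file uses).

```r
# Hypothetical helper (not part of tximport): fraction of quantified
# transcript IDs that are present in the TXNAME column of tx2gene.
id_overlap <- function(target_ids, tx2gene) {
  mean(target_ids %in% tx2gene$TXNAME)
}

# Toy stand-ins, NOT real data. The Ensembl-style IDs are an assumption
# about what the transcriptome might contain; the UCSC-style names mirror
# the head(tx2gene) output above.
target_ids <- c("ENST00000000001.1", "ENST00000000002.1")
tx2gene_toy <- data.frame(
  TXNAME = c("uc002qsd.4", "uc002qsf.2"),
  GENEID = c("1", "1"),
  stringsAsFactors = FALSE
)

# No identifiers are shared, so the overlap is 0.
overlap <- id_overlap(target_ids, tx2gene_toy)
print(overlap)
```

With the real data, target_ids could be taken from the target_id column of one kallisto output, for example read.delim(files[1])$target_id. An overlap at or near 0 would suggest that the TxDb and the transcriptome use different transcript identifiers.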

I then called the tximport function from the tximport package:

library(tximport)

library(readr)

txi <- tximport(files, type = "kallisto", tx2gene = tx2gene, reader = read_tsv)

names(txi)

Printing txi gives the following:

txi

$abundance

sample 1 sample 2 sample 3 sample 4 sample 5 sample 6

$counts

sample 1 sample 2 sample 3 sample 4 sample 5 sample 6

$length

sample 1 sample 2 sample 3 sample 4 sample 5 sample 6

$countsFromAbundance

[1] "no"

head(txi$counts) likewise shows only the sample columns and no rows:

head(txi$counts)

sample 1 sample 2 sample 3 sample 4 sample 5 sample 6

I'm not completely sure what I'm doing incorrectly. I'll give it another shot after lunch; it might just be frustration at this point, but any help is appreciated.

RNA-Seq Kallisto tximport R • 5.1k views

updated 8.0 years ago by Michael Love ★ 2.6k • written 8.0 years ago by Sinji ★ 3.2k

score 2 · Accepted Answer · 2016-04-19

Hi,

I'm the tximport maintainer. If you have software problems, could you post them to http://support.bioconductor.org?

Biostars is a great forum, but it would take extra time for me to go around and check many sites for potential software issues, so I generally only check the Bioc support site now. It is actually a fork of the Biostars software, dedicated to getting responses from Bioconductor software maintainers. Feel free to ignore this though if you only want to use Biostars.

Then, regarding cross-posting on both forums, I'm ok with this as long as the poster makes it clear that they have done so and adds links to both posts so other users can see answers on the other forum.