Error when running tximport on salmon files
6.5 years ago
Lila M ★ 1.2k

Hi guys, I'm trying to analyze some RNA-seq data using salmon as follows:

#create the index:
salmon index -t gencode.v27.transcripts.fa -i human_index

#create the quant.sf files:
salmon quant -i human_index/ -l OSR -1 R1.fastq -2 R2.fastq -o salmon_quant

After that, my idea is to process all the files (1Q_S1_quant.sf, 2Q_S2_quant.sf, ..., 16Q_S16_quant.sf) in R for downstream analysis with DESeq2. To do that, I've tried:

library(GenomicFeatures)
library(tximport)
library(readr)
library(rjson)

## Create a transcript-to-gene matching table (tx2gene) that will be used to aggregate
## Salmon transcript quantifications to the gene level

txdb <-makeTxDbFromGFF("gencode.v27.annotation.gtf")
k <- keys(txdb, keytype = "GENEID")
df <- select(txdb, keys = k,  columns = "TXNAME", keytype = "GENEID")
tx2gene <- df[, 2:1]
head(tx2gene)

## load salmon files
files <- list.files( pattern = "quant.sf",full.names = TRUE)
names(files) <- paste0("sample", 1:16)
all(file.exists(files))
#TRUE

txi_salmon <- tximport(files = files, type = "salmon", txOut = FALSE, tx2gene = tx2gene)
reading in files with read_tsv
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 
Error in summarizeToGene(txi, tx2gene, ignoreTxVersion, countsFromAbundance) : 
    None of the transcripts in the quantification files are present
  in the first column of tx2gene. Check to see that you are using
  the same annotation for both.

But that is not true at all, because I looked in both files (quant.sf and tx2gene) and the same transcript for the same gene is present in both, e.g.:

#tx2gene
TXNAME                    GENEID
ENST00000373031.4   ENSG00000000005.5
ENST00000485971.1   ENSG00000000005.5

#1Q_S1.quant.sf
ENST00000373031.4|ENSG00000000005.5|OTTHUMG00000022001.1|OTTHUMT00000057481.1|TNMD-201|TNMD|1339|protein_coding|    1339    1156.86 0   0
ENST00000485971.1|ENSG00000000005.5|OTTHUMG00000022001.1|OTTHUMT00000057482.1|TNMD-202|TNMD|542|processed_transcript|   542 360.895 0   0

Any suggestions about what's going on with this funny error?

Thanks!

RNA-Seq salmon quantification R error txtimport • 6.7k views

Hint: compare the first columns of the two files you posted. You'll note that they're not exactly the same. That's causing the error.
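
One way to check this yourself is to count how many tx2gene IDs occur verbatim in the Name column of a quant.sf file, before and after cutting each name at the first |. This is only an illustrative sketch: the demo files and the names workdir, tx_ids.txt, exact and trimmed are made up here, and the rows are the two quoted in the question.

```shell
# Build two small demo files from the rows quoted in the question (hypothetical paths).
workdir=$(mktemp -d)
{
  printf 'Name\tLength\tEffectiveLength\tTPM\tNumReads\n'
  printf 'ENST00000373031.4|ENSG00000000005.5|OTTHUMG00000022001.1|OTTHUMT00000057481.1|TNMD-201|TNMD|1339|protein_coding|\t1339\t1156.86\t0\t0\n'
  printf 'ENST00000485971.1|ENSG00000000005.5|OTTHUMG00000022001.1|OTTHUMT00000057482.1|TNMD-202|TNMD|542|processed_transcript|\t542\t360.895\t0\t0\n'
} > "$workdir/quant.sf"
printf 'ENST00000373031.4\tENSG00000000005.5\nENST00000485971.1\tENSG00000000005.5\n' > "$workdir/tx2gene.tsv"

# Transcript IDs as tx2gene has them, one per line.
cut -f1 "$workdir/tx2gene.tsv" > "$workdir/tx_ids.txt"

# Names in quant.sf that equal a tx2gene ID exactly (-F fixed strings, -x whole line).
# grep -c exits non-zero when the count is 0, hence the "|| true".
exact=$(cut -f1 "$workdir/quant.sf" | grep -c -F -x -f "$workdir/tx_ids.txt" || true)

# The same count after keeping only the text before the first '|'.
trimmed=$(cut -f1 "$workdir/quant.sf" | cut -d '|' -f1 | grep -c -F -x -f "$workdir/tx_ids.txt" || true)

echo "exact matches: $exact, matches after trimming: $trimmed"
```

With these demo rows the exact count is 0 and the trimmed count is 2, which is the situation the tximport error message describes.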


Hi Devon, can you explain how I can solve it? Thanks!


You can probably do something like sed -e 's/|[^[:space:]]*//' 1Q_S1.quant.sf, which removes everything from the first | up to the end of the name field.
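
A note on the pattern, as a hedged sketch: in GNU sed's default syntax an unescaped | is a literal bar while \| means alternation, and .* is greedy, so it is safer to match the bar literally and stop at the first whitespace. The directory demo_dir and the file names demo_quant.sf and demo_quant.trimmed.sf are made up for illustration; the data row is the first one quoted in the question. Check the result on one real file before touching the rest.

```shell
# Hypothetical one-row quant.sf for illustration.
demo_dir=$(mktemp -d)
printf 'Name\tLength\tEffectiveLength\tTPM\tNumReads\n' > "$demo_dir/demo_quant.sf"
printf 'ENST00000373031.4|ENSG00000000005.5|OTTHUMG00000022001.1|OTTHUMT00000057481.1|TNMD-201|TNMD|1339|protein_coding|\t1339\t1156.86\t0\t0\n' >> "$demo_dir/demo_quant.sf"

# Delete from the first literal '|' through the last non-whitespace character of the Name field.
# Only the first match per line is replaced, so the numeric columns are left untouched,
# and the header line (which contains no '|') passes through unchanged.
sed -e 's/|[^[:space:]]*//' "$demo_dir/demo_quant.sf" > "$demo_dir/demo_quant.trimmed.sf"

cat "$demo_dir/demo_quant.trimmed.sf"
```

Writing to a new file rather than editing in place keeps the original salmon output available for comparison.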

6.5 years ago
e.rempel ★ 1.1k

It looks like they are different after all, since ENST00000373031.4 is not ENST00000373031.4|ENSG00000000005.5|OTTHUMG00000022001.1|OTTHUMT00000057481.1|TNMD-201|TNMD|1339|protein_coding|. You could split the names in XXX.quant.sf files using limma::strsplit2 using "|" as separator.


I'm a bit stuck here, can you please let me know how to do that? or at which point? Thanks!


After you have checked that all files are there, you could read a quant.sf file into R (called quant here, with the transcript names as row names) and do something like

rownames(quant) <- limma::strsplit2(rownames(quant), split = "|", fixed = TRUE)[, 1]

meaning that you split your rownames taking | as separator and then take only the first entry


I have a follow up question...

If I'm using file.path to import all my quant.sf files into R, is there a way of correcting this space issue for all files? I'm getting the same error message and I know it is because of the lack of a space between my transcript_id and the "|".

dir <- "/mnt/data/BM/Total_RNAseq/salmon/protein_coding"
files <- file.path(dir, samplefile$sampleID, "quant.sf")
annotation_transcript <- elementMetadata(import(gtf_file, feature.type = "transcript"))
tx2gene <- annotation_transcript[, c("transcript_id", "gene_id")]
txi.salmon <- tximport(files, type = "salmon", tx2gene = tx2gene)

Thanks in advance!
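
For many samples, one option is to write a trimmed copy of each quant.sf and point file.path at the copies. This is a sketch under assumptions: it uses a throwaway directory in place of the real salmon output folder, the sample names sampleA and sampleB, the abbreviated demo row and the output name quant.trimmed.sf are all hypothetical, and the original files are left untouched.

```shell
# Stand-in for the salmon output directory (hypothetical layout: <dir>/<sample>/quant.sf).
# Replace with the real path and drop the demo-file loop when using actual data.
salmon_dir=$(mktemp -d)
for s in sampleA sampleB; do
  mkdir -p "$salmon_dir/$s"
  printf 'Name\tLength\tEffectiveLength\tTPM\tNumReads\n' > "$salmon_dir/$s/quant.sf"
  printf 'ENST00000373031.4|ENSG00000000005.5|TNMD-201|\t1339\t1156.86\t0\t0\n' >> "$salmon_dir/$s/quant.sf"
done

# Write a trimmed copy next to every quant.sf, keeping only the text
# before the first '|' in column 1; all other columns are passed through.
for f in "$salmon_dir"/*/quant.sf; do
  awk 'BEGIN { FS = OFS = "\t" } { sub(/\|.*/, "", $1); print }' "$f" > "${f%quant.sf}quant.trimmed.sf"
done

ls "$salmon_dir"/*/quant.trimmed.sf
```

In R, the file.path call would then end in "quant.trimmed.sf" instead of "quant.sf". Two alternatives avoid editing files at all: tximport has an ignoreAfterBar argument, and salmon index has a --gencode flag for GENCODE-style FASTA headers. Check the documentation of your installed versions before relying on either.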

6.5 years ago
Lila M ★ 1.2k

problem solved! Thanks!
