Salmon quantification in alignment-based mode
4 weeks ago
Patadu94 • 0

Hi All,

I am new to RNA-seq analysis and am currently trying to quantify TPM using Salmon. I read the manual and decided to use the alignment-based quantification mode. As far as I understand, with this method I have to align my reads to the transcriptome. For the transcript sequences, I converted the .gff file to FASTA using gffread. Then I obtained the transcriptome alignments using STAR with the --quantMode TranscriptomeSAM and --outSAMtype BAM SortedByCoordinate options. However, when I run the Salmon command given in the manual, I get these errors:

[2024-03-27 09:57:03.269] [jointLog] [warning] Transcript "*****" appears in the reference but did not appear in the BAM

[2024-03-27 09:57:03.298] [jointLog] [critical] Transcript "****" appeared in the BAM header, but was not in the provided FASTA file

This is the code I used for quantification:

cd /media/scratchpad_01/guest21

/media/bulk_01/users/guest21/miniconda3/bin/salmon quant -t /media/scratchpad_01/guest21/path/to//file_gtf.fa \
-l A \
-a  /media/scratchpad_01/guest21/path/to/Aligned.toTranscriptome.out.bam \
-o /media/scratchpad_01/guest21/Output_RNAseq_30mpi

I should add that I am working on my university's cluster and downloaded the transcript/genome files from NCBI.

Salmon RNA-seq TPM • 429 views

Did you check the headers in your FASTA file and the reference names that appear in the BAM? It looks like they do not match.
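
One way to run this check is sketched below, under some assumptions: the BAM header has first been dumped to a text file with `samtools view -H`, and the helper names (`fasta_names`, `sam_header_names`, `compare_names`) and file names are hypothetical, chosen only for illustration. The functions list the names on each side and print those that occur on only one side.

```shell
# Sketch: compare FASTA record names with reference names in a SAM/BAM header.
# Function and file names here are illustrative, not part of Salmon or STAR.

# Print FASTA record names: the text after '>' up to the first whitespace.
fasta_names() {
    awk '/^>/ { sub(/^>/, "", $1); print $1 }' "$1" | LC_ALL=C sort -u
}

# Print reference names from SAM header text (the SN: field of @SQ lines).
sam_header_names() {
    awk -F '\t' '$1 == "@SQ" {
        for (i = 2; i <= NF; i++) if ($i ~ /^SN:/) print substr($i, 4)
    }' "$1" | LC_ALL=C sort -u
}

# Print names found on only one side.
# Column 1: only in the FASTA; column 2 (tab-indented): only in the header.
compare_names() {
    tmpdir=$(mktemp -d) || return 1
    fasta_names "$1" > "$tmpdir/fasta.txt"
    sam_header_names "$2" > "$tmpdir/bam.txt"
    LC_ALL=C comm -3 "$tmpdir/fasta.txt" "$tmpdir/bam.txt"
    rm -rf "$tmpdir"
}

# Example (placeholder file names):
#   samtools view -H Aligned.toTranscriptome.out.bam > bam_header.sam
#   compare_names file_gtf.fa bam_header.sam
```

An empty result from `compare_names` would mean the two name sets agree, at least up to the first whitespace in each FASTA header.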


Indeed, the FASTA file and the BAM file have different headers. In the FASTA, some characters are added. Is there a way I could make the two files match?
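
If the added characters are annotations that follow a space after the transcript ID, one possible fix is to trim every header to its first whitespace-delimited token. The sketch below assumes exactly that; the function name `trim_fasta_headers` and the output file name are hypothetical. If the difference is instead a prefix or suffix attached directly to the ID, a different substitution would be needed.

```shell
# Sketch: keep only the first whitespace-delimited token of each FASTA header.
# Sequence lines pass through unchanged.
trim_fasta_headers() {
    awk '/^>/ { print $1; next } { print }' "$1"
}

# Example (placeholder file names):
#   trim_fasta_headers file_gtf.fa > file_gtf.trimmed.fa
```

The trimmed file would then be passed to `salmon quant -t` in place of the original FASTA.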


I have always wondered why people use alignment-based mode. Just run salmon directly on your fastq files and quantify against the transcriptome (see the salmon manual). I see no advantage in alignment-based mode, just more steps to perform.


To be honest, I find it difficult to understand how to do it. I have to create a decoy file following generateDecoyTranscriptome.sh, but that seems a difficult task to me.


While using a decoy is recommended, it is not necessary, especially if you are running into problems. Just use the mapping-based mode as described here: https://salmon.readthedocs.io/en/latest/salmon.html#quantifying-in-mapping-based-mode


Thanks. In the meantime, I managed to make a decoy file using the following code:

grep "^>" <(gunzip -c /path/to/file.fa.gz) | cut -d " " -f 1 > path/to/decoys.txt

However, when I run this code:

/media/bulk_01/users/guest21/miniconda3/bin/salmon index \
-t /path/to/file_gtf.fa \
-i /path/to/Indexed_Salmon \
--decoys /path/to/decoy.txt -k 31

salmon quant -i /path/to//VdLS17_Indexed_Salmon \
--libType A -1 path/to/R1_2_forward_paired.fastq.gz \
-2 /path/to/R2_2_reverse_paired.fastq.gz \
-o /path/to//Salmon_XS_30mpi_transcripts.salmon \
--validateMappings

I get the following error:

[2024-03-27 21:24:43.090] [jLog] [info] building index
[2024-03-27 21:24:43.091] [jointLog] [error] The decoy file /path/to//decoy.txt does not exist.

I gzipped my transcript.fa and then used the gunzip command on it. Could this affect the file?

I do get the quant.sf file in the end, but is it accurate if Salmon does not detect the decoy?


/path/to//decoy.txt does not exist.

Did you check to make sure the file is there? Also, it is possible that by simply cutting the names after the first space you may have lost other parts of the names.

Again, decoys are recommended but not essential for salmon. The quant file you got should be usable.
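
As posted, the grep command writes to path/to/decoys.txt while salmon index is given /path/to/decoy.txt, so the name and the leading slash are worth checking (this may only be an artifact of how the paths were anonymized). The grep output also still carries the leading ">" on each name, which the decoy recipe in the Salmon documentation strips, and the decoy sequences themselves need to be present in the FASTA that is indexed. A minimal sketch of a pre-flight check follows; the function name `check_decoys` and the file names are hypothetical.

```shell
# Sketch: sanity-check a decoy list before running `salmon index`.
# Succeeds only if the list exists and is non-empty, no name starts with '>',
# and every name is a record in the FASTA that will be indexed.
check_decoys() {
    decoys=$1
    fasta=$2
    if [ ! -s "$decoys" ]; then
        echo "decoy list missing or empty: $decoys" >&2
        return 1
    fi
    if grep -q '^>' "$decoys"; then
        echo "decoy names still start with '>'" >&2
        return 1
    fi
    # First pass collects FASTA record names; second pass reports
    # decoy names that were not seen.
    missing=$(awk '
        NR == FNR {
            if (/^>/) { sub(/^>/, "", $1); seen[$1] = 1 }
            next
        }
        NF && !($1 in seen) { print $1 }
    ' "$fasta" "$decoys")
    if [ -n "$missing" ]; then
        echo "decoys not found in FASTA:" >&2
        echo "$missing" >&2
        return 1
    fi
    return 0
}

# Example (placeholder file names):
#   check_decoys decoys.txt gentrome.fa && echo "decoy list looks consistent"
```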
