Splice junction information with HISAT2
6.0 years ago

I am new to Linux and NGS. Can anyone help me get information about splice junctions using HISAT2?

The command I'm using writes only the alignments, in SAM format, to a single file. The command is as follows:

./hisat2 -p 64 --max-intronlen 10000 -x /data/memona/hisat2-2.1.0/hisat_index -1 /data/memona/SRR959590_A_1P.fq -2 /data/memona/SRR959590_A_2P.fq -S /data/memona/results/hisat_align.sam &

Hi, I am using the same mapping strategy and getting an error saying that the index does not exist:

(base) macserver@MACs-Mac-Pro hisat2_alingment % hisat2 -p 8 -x /Users/macserver/Desktop/buffalo_refence_genome/buf_rna_index -1 /Volumes/AshishIVRI/OLD/ashishanalysis/Trim_F/Gr_11_R1C.fastq -2 /Volumes/AshishIVRI/OLD/ashishanalysis/Trim_F/Gr_11_R2C.fastq -s /Users/macserver/Desktop/hisat2_alingment/Gr_11.sam    
(ERR): "/Users/macserver/Desktop/buffalo_refence_genome/buf_rna_index" does not exist
Exiting now ... 

Please suggest a possible solution.


What is the output of

ls -lh /Users/macserver/Desktop/buffalo_refence_genome/buf_rna_index*
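The same check can be sketched in Python for anyone who prefers a script. This is only a minimal sketch: it assumes the index was built with hisat2-build, which normally writes files named `<prefix>.1.ht2` to `<prefix>.8.ht2` (or `.ht2l` for large indexes), and the function name `find_index_files` is hypothetical.

```python
import glob


def find_index_files(prefix):
    """Return the sorted HISAT2 index files that share the given prefix."""
    # Assumption: index files are named <prefix>.N.ht2 (or .ht2l for large indexes).
    return sorted(glob.glob(glob.escape(prefix) + ".*.ht2*"))


if __name__ == "__main__":
    # Prefix taken from the command above; change it to your own index prefix.
    prefix = "/Users/macserver/Desktop/buffalo_refence_genome/buf_rna_index"
    files = find_index_files(prefix)
    if files:
        print("\n".join(files))
    else:
        print("No index files found for prefix:", prefix)
```

If nothing is listed, the value passed to -x is probably not the prefix that was given to hisat2-build, or the index was never built in that directory.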
6.0 years ago
Juke34 8.5k

Hi blooming.daisy333,

It is explained in the manual: you have to use the option --novel-splicesite-outfile.
Update:
Be careful: there is an error in how left splice sites are reported, as I mentioned here.
The splice sites are correctly reported, but the format they use is not necessarily straightforward.


Hi Juke,

Thanks for the kind guidance. However, I'm still unable to get the splice junction information despite using the --novel-splicesite-outfile option.

Here is the command that I used. Kindly point out the mistake and suggest a solution:

./hisat2 --np 0 --pen-noncansplice 10000000 --min-intronlen 20 --max-intronlen 10000 --novel-splicesite-outfile /data/memona/hisat2-2.1.0/result/ --rna-strandness RF --dta -p64 --summary-file -x /data/memona/hisat2-2.1.0/hisat_index -1 /data/memona/Trimmomatic-0.36/SRR959591_E_1P.fq -2 /data/memona/Trimmomatic-0.36/SRR959591_E_2P.fq -S /data/memona/hisat2-2.1.0/result/hisat_align.sam

Thank you so much.


You’re welcome. The problem is that you provided a directory path but no file name. Instead of “/data/memona/hisat2-2.1.0/result/”, use “/data/memona/hisat2-2.1.0/result/splice_sites.tsv” and it should be fine.


If an answer was helpful, you should upvote it; if the answer resolved your question, you should mark it as accepted.

Thanks for the kind guidance. Yes, it has produced the output, but it gives only the fields shown below:

chr1    329728  329839  -
chr1    330066  330757  -
chr1    581256  581357  +

It does not mention the canonical/non-canonical status or the donor and acceptor nucleotides, which are important for classifying the site.

Can you please share the script, tool, or command that you used to extract the nucleotide information of the donor and acceptor sites, like AT/AC, GT/AT, CT/GC, etc.?

Also, is it important to build the index with the --ss and --exon options to determine the splice sites, or is it OK if the index is built without them?


I don’t know what those options do, but you can certainly determine the splice sites without them.

To extract the splice sites I used fasta_domainExtractor_JD.pl from the NBIS/GAAS repository, but it is not really adapted to what you want. You can take inspiration from this script to implement what you really want.
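As a starting point, here is a minimal Python sketch (not the Perl script mentioned above) that reads the splice site file and the genome FASTA and reports the donor/acceptor dinucleotides for each junction. It rests on one assumption you should verify against the HISAT2 manual and a few known junctions: that columns 2 and 3 are the 0-based positions of the exon bases flanking the intron on the left and right. All function names are hypothetical, and only GT/AG is labelled canonical here.

```python
import sys


def read_fasta(path):
    """Load a FASTA file into a dict of {sequence name: uppercase sequence}."""
    seqs, name, chunks = {}, None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if name is not None:
                    seqs[name] = "".join(chunks)
                # Keep only the first word of the header as the sequence name.
                name, chunks = line[1:].split()[0], []
            elif line:
                chunks.append(line.upper())
    if name is not None:
        seqs[name] = "".join(chunks)
    return seqs


_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def revcomp(seq):
    """Reverse-complement a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def splice_motif(genome, chrom, left, right, strand):
    """Return the donor/acceptor dinucleotides, e.g. 'GT/AG'.

    ASSUMPTION: left and right are 0-based positions of the exon bases
    flanking the intron, so the intron spans left+1 .. right-1.
    """
    seq = genome[chrom]
    first = seq[left + 1:left + 3]  # first two intron bases, forward strand
    last = seq[right - 2:right]     # last two intron bases, forward strand
    if strand == "-":
        # On the minus strand the donor is at the right end of the intron.
        return revcomp(last) + "/" + revcomp(first)
    return first + "/" + last


def annotate(splice_file, fasta_file):
    """Yield (chrom, left, right, strand, motif, status) for each junction."""
    genome = read_fasta(fasta_file)
    with open(splice_file) as handle:
        for line in handle:
            fields = line.split()
            if len(fields) < 4:
                continue  # skip blank or malformed lines
            chrom, strand = fields[0], fields[3]
            left, right = int(fields[1]), int(fields[2])
            motif = splice_motif(genome, chrom, left, right, strand)
            status = "canonical" if motif == "GT/AG" else "non-canonical"
            yield chrom, left, right, strand, motif, status


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python splice_motifs.py splice_sites.tsv genome.fa
    for record in annotate(sys.argv[1], sys.argv[2]):
        print("\t".join(str(value) for value in record))
```

The whole genome is loaded into memory, which is simple but can be heavy for large genomes; an indexed FASTA reader would be a reasonable replacement. If most junctions come out as something other than GT/AG, the coordinate assumption is the first thing to check.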
