Question

Alignment of Query-RNA Seq

0

5.3 years ago

skevalkumar • 0

I am trying to align a query sequence against RNA-Seq data (100 bp PE). I have tried Bowtie2 and BWA, but neither reported any reads that align to the query sequence, with or without mismatches. (I am not interested in aligning RNA-Seq data with Reference genome).

RNA-Seq alignment • 1.3k views

ADD COMMENT • link 5.3 years ago by skevalkumar • 0

2

(I am not interested in aligning RNA-Seq data with Reference genome).

That is fine, but why not use the query sequence as your reference to build an index, and then align your data against it?

Note: Doing this always carries the risk that some reads will align to locations they did not originate from.
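
A minimal sketch of that approach with bowtie2, assuming the query is stored in a FASTA file. The function name align_to_query and all file names are placeholders, not something from this thread:

```shell
# Sketch only: build an index from the query and align paired reads to it.
# align_to_query, the index prefix and all file names are placeholders.
align_to_query() {
    local query_fa="$1" r1="$2" r2="$3" prefix="${4:-query_idx}"

    # Index the query sequence so it can serve as the "reference".
    bowtie2-build "$query_fa" "$prefix" &&
    # Local mode allows soft-clipping at read ends; --no-unal keeps
    # unaligned reads out of the SAM output.
    bowtie2 --local -p 4 -x "$prefix" -1 "$r1" -2 "$r2" \
        --no-unal -S "${prefix}.sam"
}
```

Usage would be something like: align_to_query your_query.fa R1.fq.gz R2.fq.gz. bowtie2 prints the overall alignment rate to STDERR at the end of the run.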

ADD REPLY • link 5.3 years ago by GenoMax 142k

0

I indexed my query sequence and aligned the RNA-Seq data against it (trying both the --end-to-end and --local modes). However, the overall alignment rate was 0.00%.

ADD REPLY • link 5.3 years ago by skevalkumar • 0

0

What is the length of the query sequence?

ADD REPLY • link 5.3 years ago by GenoMax 142k

0

The query sequence is 2600 bp long.

ADD REPLY • link 5.3 years ago by skevalkumar • 0

1

I suggest that you try bbmap.sh from BBTools (https://sourceforge.net/projects/bbmap/). Something like this:

bbmap.sh -Xmx10g threads=4 in1=R1.fq.gz in2=R2.fq.gz out=file.bam maxindel=2000 intronlen=10 ambig=random ref=your_query.fa mappedonly=t

This will write only mapped reads to the BAM file (this requires samtools to be in your PATH; otherwise SAM format will be used). If you only want to see how many reads align, omit out=. All the stats will still be written to STDERR.

Was there no alignment even when you used the default settings for bwa and bowtie2?

ADD REPLY • link 5.3 years ago by GenoMax 142k

0

There was no alignment with bowtie2 (default parameters, in both --end-to-end and --local modes) or with bwa mem.

ADD REPLY • link 5.3 years ago by skevalkumar • 0

0

Hi, I have tried bbmap.sh and it shows 0.0011% mapped reads. I am reading the BBMap reference guide (it is new to me). Thank you for your help. Please give me more suggestions.

ADD REPLY • link 5.3 years ago by skevalkumar • 0

1

Not something you want to hear, but here goes:

Your data only has a tiny fraction of reads that map to the query you have
You could try mapping the individual read files (R1 or R2) as single-end data against the query (in the command line above, remove in1= and in2= and use in= with the R1 or R2 file) and see if the mapping improves. If a good number of reads align that way, some other explanation can be explored.
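
The single-end test could look roughly like this, reusing the options from the command line above. It is a sketch only: the function name map_single_end and the output naming are placeholders, and each read file gets its own run:

```shell
# Sketch only: map each read file separately as single-end data.
# map_single_end and the *_se.bam output names are placeholders.
map_single_end() {
    local ref="$1"
    shift
    local fq
    for fq in "$@"; do
        # Same options as the paired-end command, but with in= instead
        # of in1=/in2=. Mapping stats are printed to STDERR per run.
        bbmap.sh -Xmx10g threads=4 in="$fq" ref="$ref" \
            maxindel=2000 intronlen=10 ambig=random mappedonly=t \
            out="$(basename "$fq" .fq.gz)_se.bam"
    done
}
```

Usage would be something like: map_single_end your_query.fa R1.fq.gz R2.fq.gz. Then compare the mapped percentage of each run with the paired-end result.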

Take 10 reads (from R1 and R2) and BLAST them at NCBI to confirm that you are looking at the correct sample set and that the reads align to the right genome. That would eliminate the possibility that you have contamination of some kind.
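
One way to pull a handful of reads into FASTA for pasting into the BLAST web form, as a rough sketch. It assumes gzipped FASTQ with four lines per record and simply takes the first reads in the file rather than a random sample. The function name fastq_head_to_fasta is a placeholder:

```shell
# Sketch only: convert the first N records of a gzipped FASTQ to FASTA.
# Assumes strict 4-line FASTQ records; fastq_head_to_fasta is a placeholder name.
fastq_head_to_fasta() {
    local fq="$1" n="${2:-10}"
    gzip -cd "$fq" | awk -v n="$n" '
        NR % 4 == 1 { print ">" substr($0, 2) }  # header: swap leading @ for >
        NR % 4 == 2 { print }                    # sequence line
        NR >= 4 * n { exit }                     # stop after n records
    '
}
```

Usage would be something like: fastq_head_to_fasta R1.fq.gz 10 > R1_sample.fa, and the same for R2.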

ADD REPLY • link 5.3 years ago by GenoMax 142k

0

It shows an almost identical result (0.0008%). I have checked a few reads at NCBI. They match my plant species.

ADD REPLY • link 5.3 years ago by skevalkumar • 0

0

Then the only plausible explanation is this:

Your data only has a tiny fraction of reads that map to the query you have
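
To put such a percentage in perspective, here is a rough back-of-the-envelope calculation. The total of 20 million reads is a hypothetical figure (the thread does not state the real library size); the 0.0011%, 100 bp and 2600 bp values come from the posts above. The function name mapped_reads_estimate is a placeholder:

```shell
# Sketch only: estimate mapped read count and mean depth over the query.
# The 20000000 total is hypothetical; substitute the real read count.
mapped_reads_estimate() {
    # $1 = total reads, $2 = mapped percentage, $3 = read length, $4 = query length
    awk -v total="$1" -v pct="$2" -v rl="$3" -v ql="$4" 'BEGIN {
        mapped = total * pct / 100
        printf "mapped_reads=%.0f mean_depth=%.1f\n", mapped, mapped * rl / ql
    }'
}

mapped_reads_estimate 20000000 0.0011 100 2600
# prints: mapped_reads=220 mean_depth=8.5
```

Under that assumption, the query would be covered by only a couple of hundred reads, which would fit a transcript that is expressed at a very low level in this sample.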

ADD REPLY • link 5.3 years ago by GenoMax 142k