Alignment of the partial genome sequencing
2
0
Entering edit mode
6.4 years ago
micro32uvas ▴ 10

Hello,

I have a pretty basic question. I hope to get help in this matter.

We did this whole genome targeted sequencing earlier. Now we need only a partial chr1 sequence aligned with the reference genome, instead of doing the alignment and mapping all over again.

I took chr1 of the ref genome as a complete reference and alignment the mate pair fastq files. but I am stuck at this point again. Can anyon help.

Its chicken genome.

Aini

next-gen sequencing alignment • 1.9k views
ADD COMMENT
0
Entering edit mode

Do you want to align only 1 chr1 sequence to the reference genome? Is that your question ?

ADD REPLY
0
Entering edit mode

yes, infact, I have a specific location to align that is associated with my trait of interest e.g. 1:67,246,039- 67,350,164. And I have raw mate pair fastq files.

ADD REPLY
0
Entering edit mode

I have created a bam file using bowtie2. Ref file is chr1 of the reference genome. with 2 mate pair fastq files.

here is the same result:

@HD VN:1.0 SO:unsorted @SQ SN:NC_006088.4 LN:196202544 @PG ID:bowtie2 PN:bowtie2 VN:2.1.0 SRR455103.1 83 NC_006088.4 9011 7 100M = 8970 -141 AAAGATACCCAAAAGAACCGACTCACCCCTCCCGGCGGCAGCGCGCAGGCAGAGCCGAGCTGAGCAATGTGTCCCTCCCCGTTCGTAGCGCCGCGCCGCC #######################################################?A>A8?BAAA@8GBG>DD>DD<>GGHHHHHHHHGHHHDGHDDHGH AS:i:-6 XS:i:-6 XN:i:0 XM:i:3 XO:i:0 XG:i:0 NM:i:3 MD:Z:17T17T0A63 YS:i:-10 YT:Z:CP SRR455103.1 163 NC_006088.4 8970 7 100M = 9011 141 GCAATGGCAAAGCCTTTATTTAAGGAATAGGCACACGAGGGAAAGATACCCAAAAGAATCGACTCACCCCTCCCGGCAGCAGCGCGCAGGCAGAGCCGAG IIHA?BBBGGGDGGGGD?GDD;@GGGGGIDIE?IIIIIGDIHIHGI@GIGID>GBGGDDDGGGDGGGDGFIIFHDEDFGE88>=A?A?HBBH>?A##### AS:i:-10 XS:i:-34 XN:i:0 XM:i:2 XO:i:0 XG:i:0 NM:i:2 MD:Z:24A51T23 YS:i:-6 YT:Z:CP

But i cannot extract the coordinates as you suggested: Antonio R. Franco

samtools view accepted_hits.bam chr1:67246039-67350164 > mylectures.sam

I dont understand, what to do at this point

ADD REPLY
0
Entering edit mode

You are doing right in reding these instructions http://www.htslib.org/doc/samtools.html. samttools view had some requirements, like the presence of the header. If you want, share your bam file with me through dropbox, mega or the like, and I give it a look. My address is arfranco (@) uco.es

ADD REPLY
1
Entering edit mode
6.4 years ago

You have several choices

  1. You can map all of your lectures to the whole reference genome, and thus, interrogate with samtools which of your lectures map to this region. samtools view accepted_hits.bam chr1:67246039-67350164 > mylectures.sam This order assumes that your bam mapped file is named accepted_hits.bam, and that the name of the chromosome in the reference genome is chr1. You can also see this in a graphic genome browser such as IGV, IGB or the like

  2. You can also extract this region only and map using it as a reference

ADD COMMENT
0
Entering edit mode

I got your file..

Your bam file is right. It contains the header. I see you used bowtie2 with samtools view -H SRR733103.bam

with samtools view SRR733103.bam | cut -f3 | uniq -c

I can see that you have 3777516 mapped reads to the reference sequence named NC_006088.4 (is not chr1, you need to use the actual name you used for the reference sequence)

Then you need to index your BAM with samtools index SRR733103.bam

Once you have your BAM file indexed you can run samtools view -h NC_006088.4:67246039-67350164 > selected.sam

In selected.sam you have all the mapped sequences to NC_006088 included in the range of the indicated coordinates, and will include the header into the new sam file. You don't actually need to use a bam file for that as selected.sam file is small

Hope this helps

ADD REPLY
0
Entering edit mode

It worked perfectly. However, now that i have this partial desired mapped sequence, how can i look for the potential snps?? I can call SNPs but what about annotation. I am not clear about what possible should be done afterwards.

I know its little out of this initial questions' scope, but it is just the next step afterwards. I hope i get an elaborate response as before :)

ADD REPLY
0
Entering edit mode

You can look for SNP under different approaches

  1. Look for samtools mpileup

  2. Load the sam file along the reference genome in a visual genomic browser like IGV o IGB. The same can be done with samtools tview but in not such as elegant way

  3. Look for the way of getting a VCF (variant file) using samtools

ADD REPLY
0
Entering edit mode
6.4 years ago

with samtools view selected.sam | wc -l you count the number of mapped reads to this range,. There are 285 mapped reads

ADD COMMENT

Login before adding your answer.

Traffic: 2188 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6