Closed: Mapping sequencing reads to a reference genome with Bowtie2: a step-by-step guide
8.2 years ago

Read mapping is a key step in processing next-generation sequencing data. It finds where each newly sequenced read originates and aligns it to a reference sequence (e.g. a reference genome, a transcriptome or a de novo assembly). Both RNA-seq analysis and variant calling require read mapping to be as accurate as possible. In this tutorial, we explain how to map reads with Bowtie2, one of the most popular tools for read alignment. We also explain how to fine-tune some of the Bowtie2 parameters to achieve the highest mapping sensitivity. We continue with the data from the study of multi-drug resistance in Mycobacterium tuberculosis.
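As a rough orientation, a paired-end Bowtie2 call can be assembled as in the sketch below. It only builds and prints the command line; it does not run Bowtie2. The index prefix `mtb_ref` and the FASTQ and SAM file names are hypothetical placeholders, and `--very-sensitive` is just one of Bowtie2's built-in presets, chosen here for illustration rather than taken from this tutorial.

```python
import shlex


def bowtie2_command(index_prefix, reads_1, reads_2, sam_out,
                    preset="--very-sensitive", threads=4):
    """Build (but do not execute) a paired-end Bowtie2 command line."""
    return [
        "bowtie2",
        preset,              # sensitivity preset: slower, but searches harder
        "-p", str(threads),  # number of alignment threads
        "-x", index_prefix,  # prefix of an index built with bowtie2-build
        "-1", reads_1,       # file with mate 1 reads
        "-2", reads_2,       # file with mate 2 reads
        "-S", sam_out,       # SAM output file
    ]


# Hypothetical file names, for illustration only.
cmd = bowtie2_command("mtb_ref", "sample_R1.fastq", "sample_R2.fastq", "sample.sam")
print(shlex.join(cmd))
```

Building the arguments as a list keeps every option explicit and makes it easy to swap the preset for individually tuned parameters.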

When researchers explore sequenced DNA, they often look for differences between newly sequenced samples and previously sequenced and assembled genomes of the same species, or of a species that is not very distant in evolutionary terms. A genome against which new samples are compared is called a reference genome. To identify the differences in DNA, each read of the newly sequenced sample is aligned to a fragment of the reference genome. The differences between reads and the corresponding fragments of the reference genome can be used to explore intraspecific or interspecific variability.
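To make the idea concrete, here is a toy sketch that compares one read with the reference fragment it is assumed to align to and reports the differing bases. The sequences and the offset are made up, the alignment is ungapped, and positions are 0-based; a real aligner such as Bowtie2 finds the offset itself and also handles gaps.

```python
def find_mismatches(reference, read, offset):
    """List (reference_position, ref_base, read_base) for an ungapped alignment."""
    fragment = reference[offset:offset + len(read)]
    if len(fragment) < len(read):
        raise ValueError("read extends past the end of the reference")
    return [
        (offset + i, ref_base, read_base)
        for i, (ref_base, read_base) in enumerate(zip(fragment, read))
        if ref_base != read_base
    ]


# Made-up sequences, for illustration only.
reference = "ACGTACGTTAGCCGATAGCT"
read = "CGTTAGACGA"
mismatches = find_mismatches(reference, read, offset=5)
print(mismatches)
```

Each reported tuple is a candidate substitution at that reference position.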

In principle, DNA differences can be grouped into four broad classes:

  • single-nucleotide substitutions (SNV)
  • small insertions or deletions (indels)
  • copy number variants (CNV)
  • large structural variants (SV)

SNVs and indels, often loosely referred to together as SNPs, are the most common types of DNA difference used to explore intraspecific and interspecific variability. Insertions, deletions and inversions longer than 1000 bp (some authors use a threshold as low as 50 bp) are normally classified as CNVs and SVs. CNVs and SVs come in many types; the most common are deletions, novel sequence insertions, mobile-element insertions, tandem and interspersed segmental duplications, inversions, and translocations of elements larger than 1 kb (or 50 bp under the lower threshold).
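The size thresholds above can be sketched as a small helper that labels a variant from the lengths of its reference and alternate alleles. This is a simplified illustration, not a variant caller: the function name and labels are hypothetical, and a purely length-based rule cannot tell a CNV from other structural variants or recognise inversions and translocations.

```python
def classify_variant(ref_allele, alt_allele, sv_threshold=1000):
    """Coarse, length-based label for a variant (illustrative only)."""
    if ref_allele == alt_allele:
        raise ValueError("reference and alternate alleles are identical")
    if len(ref_allele) == len(alt_allele):
        return "substitution"
    # Number of bases gained or lost relative to the reference.
    size = abs(len(ref_allele) - len(alt_allele))
    return "CNV/SV" if size > sv_threshold else "indel"


print(classify_variant("A", "G"))                          # single-base change
print(classify_variant("A", "ATT"))                        # 2 bp insertion
print(classify_variant("A" * 60, "A", sv_threshold=50))    # 59 bp deletion, 50 bp threshold
```

Passing `sv_threshold=50` switches to the lower cut-off mentioned in the text.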

Explore more on this here
