How to do NGS Analysis of a Particular Gene?
6.0 years ago
user5212 • 0

I have reads (FASTQ data) for a particular gene from the human genome (hg38). I also have the GenBank (GBK) file and the FASTA file of the gene of interest. I want to know which variants map to each exon of that gene, and the coverage of each exon.

For example, I want to be able to say: variant T > A occurs at hg38 reference position chr12:88813734 which is exon 1/14 of the gene, variant TGGGA > TA occurs at hg38 reference position chr12:88847259 which is exon 5/14 of the gene, and so on. Exon 1 has 10000X coverage, Exon 2 has 11500X coverage, and so on.

Is there a variant caller that does this? If not, what protocol would you use to do this kind of analysis?

I understand that not all reported variants will map to exons. Is there a variant caller that will tell me which intron (IVS) a variant maps to? For example, variant G > C occurs at hg38 reference position chr12:88846432, which is intron 7 of the gene (IVS-7).
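To make the desired output concrete, here is a minimal sketch of the labelling and per-exon averaging described above. It is not what any particular variant caller or annotator does internally. The function names `locate` and `mean_depth` and all coordinates are hypothetical toy values, not the real exon boundaries of the gene.

```python
from typing import Dict, List, Tuple


def locate(pos: int, exons: List[Tuple[int, int]]) -> str:
    """Label a 1-based position as exon N/total, intron N (IVS-N), or outside.

    `exons` holds inclusive (start, end) pairs listed in transcript order,
    so the same logic works for genes on either strand.
    """
    total = len(exons)
    for i, (start, end) in enumerate(exons, start=1):
        if start <= pos <= end:
            return f"exon {i}/{total}"
    for i in range(total - 1):
        # Intron i+1 is the gap between exon i+1 and exon i+2.
        a, b = exons[i], exons[i + 1]
        gap_start = min(a[1], b[1]) + 1
        gap_end = max(a[0], b[0]) - 1
        if gap_start <= pos <= gap_end:
            return f"intron {i + 1} (IVS-{i + 1})"
    return "outside gene"


def mean_depth(depth: Dict[int, int], start: int, end: int) -> float:
    """Mean per-base depth over an inclusive interval; missing bases count as 0."""
    length = end - start + 1
    return sum(depth.get(p, 0) for p in range(start, end + 1)) / length


if __name__ == "__main__":
    toy_exons = [(100, 200), (300, 400), (500, 600)]  # hypothetical coordinates
    toy_depth = {p: 10 for p in range(100, 201)}      # hypothetical depths
    print(locate(150, toy_exons))
    print(locate(250, toy_exons))
    print(mean_depth(toy_depth, 100, 200))
```

In practice the exon coordinates would come from the GenBank file or a gene annotation, and the depths from the alignment.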

dna-seq SNP gene • 1.2k views

If you are only interested in one specific gene, there may be other techniques that are more cost-effective than NGS.

6.0 years ago

Hello,

Variant calling is just one of several steps you need to get the output you want. At a minimum, you first need to map and align the reads in your FASTQ file to the reference genome (e.g. with bwa). Then you can call variants only in your region of interest (e.g. with GATK HaplotypeCaller or freebayes). To find out which intron or exon a variant lies in, you need an annotation step (e.g. with SnpSift/SnpEff). The variant caller also reports the read depth at each variant position. If you need the depth for every region, such as each exon, you have to use a tool like bedtools.
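The steps above could be sketched roughly as follows. This small Python helper only builds the shell command strings and does not run anything. All file names, the sample name, the read-group values and the region are hypothetical placeholders. The SnpEff annotation step is left out because the database name depends on your installation.

```python
import shlex
from typing import List


def build_commands(ref: str, fastq: str, region: str, exons_bed: str,
                   sample: str = "sample1") -> List[str]:
    """Return shell commands for mapping, variant calling and exon coverage.

    Sketch only: nothing is executed, and every path is an assumption.
    """
    q = shlex.quote
    bam = f"{sample}.sorted.bam"
    vcf = f"{sample}.vcf.gz"
    # GATK expects read groups; bwa takes them as a literal "\t"-separated string.
    read_group = f"@RG\\tID:{sample}\\tSM:{sample}\\tPL:ILLUMINA"
    return [
        # One-time reference preparation (indexes and sequence dictionary).
        f"bwa index {q(ref)}",
        f"samtools faidx {q(ref)}",
        f"gatk CreateSequenceDictionary -R {q(ref)}",
        # Map against the whole reference and sort the alignments.
        f"bwa mem -R {q(read_group)} {q(ref)} {q(fastq)}"
        f" | samtools sort -o {q(bam)} -",
        f"samtools index {q(bam)}",
        # Call variants only inside the region of interest.
        f"gatk HaplotypeCaller -R {q(ref)} -I {q(bam)} -L {q(region)} -O {q(vcf)}",
        # Mean depth for each exon listed in the BED file.
        f"bedtools coverage -a {q(exons_bed)} -b {q(bam)} -mean",
    ]


if __name__ == "__main__":
    # Placeholder inputs; the region is NOT the real gene boundary.
    for cmd in build_commands("hg38.fa", "reads.fastq",
                              "chr12:88800000-88900000", "exons.bed"):
        print(cmd)
```

The exon BED file would have to be derived from your annotation, for example from the exon features in the GenBank file, converted to hg38 coordinates.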

As you can see, there is a lot of work to do. But don't hesitate to ask a concrete question along the way.

fin swimmer


Thanks for your help. As a brief follow-up question: I know my gene is located on chromosome 12. Instead of mapping my reads to the entire reference genome (hg38), can I simply map and align my reads to chromosome 12 of the hg38 reference genome?


This is possible, but I wouldn't recommend it. Depending on the method used for library preparation, you will always have some sequences that don't belong to your target region. It's better to map these reads to their real origin. If you don't provide the reference for that origin, these reads may be mapped to the wrong place.

You don't have to worry about the time needed to map against the whole genome. The mapping time depends mainly on the number of reads you have and their length, not on the size of the reference.
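One way to see how many reads fall outside the target after whole-genome mapping is to count them in the SAM output. The sketch below is a simplification: it uses only the leftmost mapped position of each read, and the function name, the MAPQ cutoff of 20, the region and the toy records are all assumptions made for illustration.

```python
from typing import Iterable, Tuple


def count_on_target(sam_lines: Iterable[str], chrom: str, start: int, end: int,
                    min_mapq: int = 20) -> Tuple[int, int]:
    """Return (on_target, other) counts for mapped reads in SAM text.

    A read is on target if its leftmost position lies in chrom:start-end
    and its mapping quality is at least `min_mapq`. Unmapped reads are skipped.
    """
    on_target = other = 0
    for line in sam_lines:
        if line.startswith("@"):  # header line
            continue
        fields = line.rstrip("\n").split("\t")
        flag, rname = int(fields[1]), fields[2]
        pos, mapq = int(fields[3]), int(fields[4])
        if flag & 0x4:  # 0x4 = read unmapped
            continue
        if rname == chrom and start <= pos <= end and mapq >= min_mapq:
            on_target += 1
        else:
            other += 1
    return on_target, other


if __name__ == "__main__":
    toy_sam = [  # hypothetical records, not real data
        "@HD\tVN:1.6\tSO:coordinate",
        "r1\t0\tchr12\t88813700\t60\t50M\t*\t0\t0\t*\t*",
        "r2\t0\tchr5\t1000\t60\t50M\t*\t0\t0\t*\t*",
        "r3\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*",
        "r4\t0\tchr12\t88847200\t3\t50M\t*\t0\t0\t*\t*",
    ]
    print(count_on_target(toy_sam, "chr12", 88800000, 88900000))
```

If you had mapped against chromosome 12 only, a read like `r2` would have no correct place to go and could end up misaligned inside your gene.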

fin swimmer
