Question

How to do NGS Analysis of a Particular Gene?

0

6.0 years ago

user5212 • 0

I have reads (FASTQ data) for a particular gene from the human genome (hg38). I also have the GenBank (GBK) file and the FASTA file of the gene of interest. I want to know which variants map to each exon of that gene, and the coverage of each exon.

For example, I want to be able to say: variant T > A occurs at hg38 reference position chr12:88813734 which is exon 1/14 of the gene, variant TGGGA > TA occurs at hg38 reference position chr12:88847259 which is exon 5/14 of the gene, and so on. Exon 1 has 10000X coverage, Exon 2 has 11500X coverage, and so on.

Is there a variant caller that does this? If not, what protocol would you use to do this kind of analysis?

I understand that not all reported variants will map to exons. Is there a variant caller that will tell me which intron (IVS) a variant maps to? For example, variant G > C occurs at hg38 reference position chr12:88846432, which is intron 7 of the gene (IVS-7).
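To make the requested output concrete, here is a minimal Python sketch of the lookup that an annotation step performs: given a variant position and the exon intervals of one transcript, it reports the exon or intron number. The function name `label_position` and the coordinates in the usage note are hypothetical, not taken from the gene in question. It assumes 1-based inclusive coordinates and numbers intron n as the one between exon n and exon n+1 in transcript order.

```python
from bisect import bisect_right

def label_position(pos, exons, strand="+"):
    """Label a 1-based genomic position as exon, intron, or outside the gene.

    exons:  list of (start, end) tuples, 1-based inclusive, non-overlapping.
    strand: "+" or "-"; on the minus strand the numbering runs from the
            highest genomic coordinate down to the lowest.
    """
    exons = sorted(exons)
    n = len(exons)
    starts = [start for start, _ in exons]
    # Index of the last exon that starts at or before pos (-1 if none does).
    i = bisect_right(starts, pos) - 1
    if i < 0 or (i == n - 1 and pos > exons[-1][1]):
        return "outside gene"
    _, end = exons[i]
    if pos <= end:
        number = i + 1 if strand == "+" else n - i
        return f"exon {number}/{n}"
    # pos lies in the gap between exon i and exon i + 1 (genomic order).
    number = i + 1 if strand == "+" else n - i - 1
    return f"intron {number} (IVS-{number})"
```

With made-up exons `[(100, 200), (300, 400), (500, 600)]`, position 150 is labelled `exon 1/3` and position 250 is labelled `intron 1 (IVS-1)` on the plus strand. In practice the exon intervals would come from the GenBank file or a GTF/GFF annotation, and a real annotator also has to choose between multiple transcripts of the same gene.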

dna-seq SNP gene • 1.2k views

ADD COMMENT • link updated 6.0 years ago by finswimmer 16k • written 6.0 years ago by user5212 • 0

1

If you are only interested in one specific gene, there may be other techniques that are more cost-effective than NGS.

ADD REPLY • link 6.0 years ago by GenoMax 141k

score 2 · Answer 1 · 2018-04-12

2

6.0 years ago

finswimmer 16k

Hello,

Variant calling is just one of several steps you need to get your desired output. At a minimum, you first need to map and align the reads from your FASTQ file to the reference genome (e.g. with bwa). Then you can run variant calling restricted to your region of interest (e.g. with GATK HaplotypeCaller or freebayes). To find out which intron or exon a variant is located in, you need an annotation step (e.g. with SnpEff/SnpSift). The variant caller also reports the read depth at each variant position. If you need the coverage for every region, such as each exon, you have to use a tool like bedtools.
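The steps above could be strung together roughly as follows. This is only a sketch: the Python helper `build_pipeline` is hypothetical and just assembles the shell commands so they can be reviewed before running. It assumes single-end reads, that bwa, samtools, gatk and bedtools are on the PATH, that `snpEff.jar` is in the working directory, and that the reference has already been indexed (`bwa index`, `samtools faidx`, `gatk CreateSequenceDictionary`). All file names, the read group values, the region, and the SnpEff database name are placeholders.

```python
import shlex

def build_pipeline(ref, fastq, sample, region, exons_bed, snpeff_db):
    """Return shell commands for mapping, calling, annotation, and coverage.

    Every argument is a placeholder supplied by the caller; nothing is run.
    """
    q = shlex.quote
    bam = f"{sample}.sorted.bam"
    vcf = f"{sample}.vcf"
    annotated = f"{sample}.annotated.vcf"
    # GATK requires read group information; the platform tag is an assumption.
    read_group = f"@RG\\tID:{sample}\\tSM:{sample}\\tPL:ILLUMINA"
    return [
        # 1. Map reads to the whole reference and sort the alignments.
        f"bwa mem -R {q(read_group)} {q(ref)} {q(fastq)} | samtools sort -o {q(bam)} -",
        # 2. Index the sorted BAM.
        f"samtools index {q(bam)}",
        # 3. Call variants only in the region of interest.
        f"gatk HaplotypeCaller -R {q(ref)} -I {q(bam)} -L {q(region)} -O {q(vcf)}",
        # 4. Annotate variants (exon/intron information) with SnpEff.
        f"java -jar snpEff.jar {q(snpeff_db)} {q(vcf)} > {q(annotated)}",
        # 5. Mean read depth per exon, with exons given as a BED file.
        f"bedtools coverage -a {q(exons_bed)} -b {q(bam)} -mean",
    ]
```

Calling, for example, `build_pipeline("hg38.fa", "reads.fastq", "sample1", "chr12:88800000-88900000", "exons.bed", "GRCh38.99")` and printing the result gives five commands that can be pasted into a shell one at a time. The region and database name in that call are illustrative only; the installed SnpEff databases can be listed with `java -jar snpEff.jar databases`.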

As you can see, there is a lot of work to do, but don't hesitate to ask concrete questions along the way.

fin swimmer

ADD COMMENT • link 6.0 years ago by finswimmer 16k

0

Thanks for your help. As a brief follow-up question: I know my gene is located on chromosome 12. Instead of mapping my reads to the entire reference genome (hg38), can I simply map and align my reads to chromosome 12 of the hg38 reference genome?

ADD REPLY • link 6.0 years ago by user5212 • 0

1

This is possible, but I wouldn't recommend it. Depending on the method used for library preparation, you will always have some sequences that don't belong to your target region. It's better to map these reads to their real origin. If you don't provide the reference for that origin, the reads might be mapped to the wrong place.

You don't have to worry about the time needed for mapping against the whole genome. It depends mainly on the number of reads you have and their length, not on the reference.

fin swimmer

ADD REPLY • link 6.0 years ago by finswimmer 16k