Question

Closed: Whole Exome Sequence Data Analysis, VCF file, help??

5.6 years ago

kelseyca • 0

Hello!

This is my first post here.

I am trying to analyze human whole-exome germline sequence data. There are 120 samples split into two groups (two different conditions), and I would like to detect variants in them. This is my first time working with exome data, so sorry if this is a noob question.

I was given the files in VCF.gz format and uploaded them to Galaxy, intending to run an exome-seq pipeline. However, I cannot do so: Galaxy imported the files as tabular, and the pipeline requires FASTQ input. I tried to convert the file format but couldn't.
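For context, a rough sketch of how the two formats differ at the file level: a VCF file begins with a `##fileformat=VCF...` header line and holds variant calls, while FASTQ records begin with `@` and hold raw reads, so one cannot be converted into the other. The helper names `first_line_of` and `guess_format`, the read name `@read_1`, and the `VCFv4.2` version string are illustrative assumptions, not part of Galaxy or any pipeline, and the check is only a heuristic (SAM headers also start with `@`).

```python
import gzip


def first_line_of(path):
    """Return the first line of a plain or gzip-compressed text file."""
    with open(path, "rb") as handle:
        is_gzip = handle.read(2) == b"\x1f\x8b"  # gzip magic bytes
    opener = gzip.open if is_gzip else open
    with opener(path, "rt") as handle:
        return handle.readline().rstrip("\n")


def guess_format(first_line):
    """Guess the file type from its first line (a heuristic, not a validator)."""
    if first_line.startswith("##fileformat=VCF"):
        return "vcf"    # variant calls: the output of a variant-calling pipeline
    if first_line.startswith("@"):
        return "fastq"  # raw reads: the input of such a pipeline
    return "unknown"


# Hypothetical first lines, for illustration only.
print(guess_format("##fileformat=VCFv4.2"))  # vcf
print(guess_format("@read_1"))               # fastq
```

On a real file, something like `guess_format(first_line_of("sample.vcf.gz"))` would apply the same check, where `sample.vcf.gz` is a placeholder path.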

I previewed the VCF file to see what was inside, and it looks like this (first two lines pasted):

1   2   3   4   5   6   7   8   9   10
chr1    861368  .   CG  C   1020.73 .   AC=1;AF=0.500;AN=2;BaseQRankSum=3.298;DP=146;FS=165.905;MLEAC=1;MLEAF=0.500;MQ=86.82;MQ0=0;MQRankSum=-1.499;QD=6.99;RPA=3,2;RU=G;ReadPosRankSum=2.010;SOR=5.577;STR GT:AD:DP:GQ:PL  0/1:54,67:145:99:1058,0,653
chr1    874544  .   AG  A   971.73  .   AC=1;AF=0.500;AN=2;BaseQRankSum=1.050;DP=60;FS=46.340;MLEAC=1;MLEAF=0.500;MQ=89.05;MQ0=0;MQRankSum=-2.496;QD=16.20;RPA=4,3;RU=G;ReadPosRankSum=-0.361;SOR=2.332;STR GT:AD:DP:GQ:PL  0/1:18,40:59:99:1009,0,175
...
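As a minimal sketch of how the ten numbered columns above map onto the standard VCF fields (CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO, FORMAT, then one sample column), the first pasted line can be split like this. The function name `parse_vcf_record` and the label `SAMPLE` for column 10 are assumptions made for illustration; a real VCF is tab-delimited and names the sample column in its `#CHROM` header line, and a dedicated VCF library would be a safer choice for actual analysis.

```python
# Standard names of the eight fixed VCF columns, followed by FORMAT and one sample.
# "SAMPLE" is a placeholder: the real name comes from the #CHROM header line.
VCF_COLUMNS = ["CHROM", "POS", "ID", "REF", "ALT",
               "QUAL", "FILTER", "INFO", "FORMAT", "SAMPLE"]


def parse_vcf_record(line):
    """Split one single-sample VCF data line into named fields."""
    # Whitespace split works for this preview; real VCF fields are tab-separated.
    record = dict(zip(VCF_COLUMNS, line.split()))

    # INFO is a ';'-separated list of KEY=VALUE pairs, plus bare flags such as STR.
    info = {}
    for item in record["INFO"].split(";"):
        key, _, value = item.partition("=")
        info[key] = value if value else True
    record["INFO"] = info

    # FORMAT names the ':'-separated sub-fields of the sample column.
    keys = record["FORMAT"].split(":")
    values = record["SAMPLE"].split(":")
    record["SAMPLE"] = dict(zip(keys, values))
    return record


line = ("chr1 861368 . CG C 1020.73 . "
        "AC=1;AF=0.500;AN=2;BaseQRankSum=3.298;DP=146;FS=165.905;MLEAC=1;"
        "MLEAF=0.500;MQ=86.82;MQ0=0;MQRankSum=-1.499;QD=6.99;RPA=3,2;RU=G;"
        "ReadPosRankSum=2.010;SOR=5.577;STR "
        "GT:AD:DP:GQ:PL 0/1:54,67:145:99:1058,0,653")

rec = parse_vcf_record(line)
print(rec["CHROM"], rec["POS"], rec["REF"], rec["ALT"])  # chr1 861368 CG C
print(rec["SAMPLE"]["GT"], rec["SAMPLE"]["AD"])          # 0/1 54,67
```

Read this way, the first line describes a heterozygous (`0/1`) single-base deletion (`CG` to `C`) at chr1:861368, with 54 reads supporting the reference allele and 67 supporting the alternate.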

Are exome VCF files normally supposed to look like this?

If I need to analyze this file, can I skip the processing steps and go straight to something like GATK, since the file is already in VCF format?

Thanks! Sorry for all the questions!

exome sequencing • 155 views

updated 5.6 years ago by Tm ★ 1.1k • written 5.6 years ago by kelseyca • 0