Closed: Whole Exome Sequence Data Analysis, VCF file, help??
5.6 years ago
kelseyca • 0

Hello!

This is my first post here.

I am attempting to analyze human whole-exome germline sequence data. There are 120 samples split into two groups (two different conditions), and I would like to run them through variant detection. This is my first time working with exome data, so sorry if this is a noob question.

I was given the files in VCF.gz format, and I uploaded them into Galaxy with the intention of running an exome-seq pipeline. However, I am unable to do so: Galaxy imported the files as tabular data, and the pipeline requires FASTQ input. I tried to convert the file format but couldn't.
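For anyone wanting to check what a .vcf.gz file contains before uploading it anywhere, here is a minimal Python sketch. It assumes the file is a standard gzip-compressed, tab-delimited VCF with `##` metadata lines and a `#CHROM` header row. The function name `summarize_vcf_gz` and the file name in the usage comment are hypothetical, chosen only for illustration.

```python
import gzip

def summarize_vcf_gz(path):
    """Return (metadata_line_count, sample_names, record_count) for a gzipped VCF."""
    meta_lines = 0
    samples = []
    records = 0
    with gzip.open(path, "rt") as handle:
        for line in handle:
            if line.startswith("##"):
                # Metadata lines describe the file format, INFO keys, contigs, etc.
                meta_lines += 1
            elif line.startswith("#CHROM"):
                # Columns after the 9 fixed ones (CHROM..FORMAT) are sample names.
                samples = line.rstrip("\n").split("\t")[9:]
            elif line.strip():
                # Every remaining non-empty line is one variant record.
                records += 1
    return meta_lines, samples, records

# Hypothetical usage; "sample_001.vcf.gz" is a placeholder file name:
# meta, samples, n_records = summarize_vcf_gz("sample_001.vcf.gz")
```

If this reports variant records rather than failing, the file holds variant calls, not raw reads, which would explain why a tool expecting FASTQ cannot use it.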

I previewed the VCF file to see what was inside, and it looks like this (first two lines pasted):

1   2   3   4   5   6   7   8   9   10
chr1    861368  .   CG  C   1020.73 .   AC=1;AF=0.500;AN=2;BaseQRankSum=3.298;DP=146;FS=165.905;MLEAC=1;MLEAF=0.500;MQ=86.82;MQ0=0;MQRankSum=-1.499;QD=6.99;RPA=3,2;RU=G;ReadPosRankSum=2.010;SOR=5.577;STR GT:AD:DP:GQ:PL  0/1:54,67:145:99:1058,0,653
chr1    874544  .   AG  A   971.73  .   AC=1;AF=0.500;AN=2;BaseQRankSum=1.050;DP=60;FS=46.340;MLEAC=1;MLEAF=0.500;MQ=89.05;MQ0=0;MQRankSum=-2.496;QD=16.20;RPA=4,3;RU=G;ReadPosRankSum=-0.361;SOR=2.332;STR GT:AD:DP:GQ:PL  0/1:18,40:59:99:1009,0,175
...
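To make the pasted preview easier to read, here is a rough Python sketch that splits the first record into named fields. It assumes the ten numbered columns correspond to the standard VCF fixed columns (CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO, FORMAT) followed by a single sample column. The label `SAMPLE` and the function name `parse_vcf_record` are placeholders, since the preview shows no header row with the real sample name.

```python
VCF_COLUMNS = ["CHROM", "POS", "ID", "REF", "ALT", "QUAL",
               "FILTER", "INFO", "FORMAT", "SAMPLE"]  # "SAMPLE" is a placeholder label

def parse_vcf_record(line):
    """Split one single-sample VCF record into a dict of named fields."""
    # Split on any whitespace because the pasted preview lost its tabs;
    # a real VCF file is strictly tab-delimited.
    record = dict(zip(VCF_COLUMNS, line.split()))

    # INFO is a ';'-separated list of KEY=VALUE pairs, plus bare flags such as STR.
    info = {}
    for entry in record["INFO"].split(";"):
        key, sep, value = entry.partition("=")
        info[key] = value if sep else True
    record["INFO"] = info

    # FORMAT names the ':'-separated values stored in the sample column.
    record["SAMPLE"] = dict(zip(record["FORMAT"].split(":"),
                                record["SAMPLE"].split(":")))
    return record

line = ("chr1 861368 . CG C 1020.73 . "
        "AC=1;AF=0.500;AN=2;BaseQRankSum=3.298;DP=146;FS=165.905;MLEAC=1;"
        "MLEAF=0.500;MQ=86.82;MQ0=0;MQRankSum=-1.499;QD=6.99;RPA=3,2;RU=G;"
        "ReadPosRankSum=2.010;SOR=5.577;STR "
        "GT:AD:DP:GQ:PL 0/1:54,67:145:99:1058,0,653")

rec = parse_vcf_record(line)
print(rec["CHROM"], rec["POS"], rec["REF"], rec["ALT"])  # chr1 861368 CG C
print(rec["SAMPLE"]["GT"], rec["SAMPLE"]["AD"])          # 0/1 54,67
```

Read this way, the first record is a heterozygous (0/1) one-base deletion (CG to C) at chr1:861368, with 54 reads supporting the reference allele and 67 supporting the alternate.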

Are exome VCF files normally supposed to look like this?

If I need to run this file, can I skip the earlier processing steps and go straight to something like GATK, since the file is already in VCF format?

Thanks! Sorry for all the questions!

exome sequencing • 155 views