5.0 years ago
krt
I need to extract only chrY from my "wgs.vcf.gz". I tried the tabix solution posted here: How to split vcf file by chromosome? However, it produced a final chrY.vcf file of 0 kB. The same approach worked for chr2, giving a chr2.vcf file of 73,000 kB. What would be the best way to do it for chrY?
What was the command line? Are you sure you have chrY (or just 'Y'?) in the VCF? Did you ask tabix to also print the header?
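The three checks above can be sketched in shell. This is only an illustration under stated assumptions: the function names `list_contigs` and `extract_contig` are made up for this example, and `extract_contig` assumes tabix is installed and the VCF is bgzipped and indexed (for example with `tabix -p vcf wgs.vcf.gz`).

```shell
# Hypothetical helper: list the contig names that actually occur in the
# data records of a gzip/bgzip-compressed VCF. Needs no index.
list_contigs() {
    gzip -dc "$1" | grep -v '^#' | cut -f1 | sort -u
}

# Hypothetical helper: extract one contig into a plain-text VCF.
# -h makes tabix print the VCF header before the matching records.
# Assumes tabix is on PATH and "$1" has a .tbi index next to it.
extract_contig() {
    tabix -h "$1" "$2" > "$3"
}
```

With an index in place, `tabix -l wgs.vcf.gz` also lists the indexed contig names, which shows whether the file uses `chrY` or `Y`. A call such as `extract_contig wgs.vcf.gz chrY chrY.vcf` would then write the header plus the chrY records.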
Hi Pierre, I am not sure the VCF contains chrY; I just assumed so because it's a WGS file from Dante Labs. I used:
or
tabix listed chr1 to chr22, plus chrX and chrY.
I re-ran: tabix myvcf.vcf.gz chrY > chrY and now I have a file of 649 kB. Does that seem all right?
Look at the file and see if it worked. Why are you asking us?
Because I have no idea how many SNPs I should expect from this type of test. I don't know if 649 kB is a reasonable size. If you can't answer, just ignore it.
But, as @jared.andrews07 said, why don't you look in the VCF and check that it contains only variants mapped to chrY? Why don't you check that this number is the number expected from your original VCF?
We have no idea if that's an appropriate size or not - we know nothing about the size of your original VCF. We literally can't answer that - only you can. I'm honestly not trying to be a jerk, I just don't know how you expect us to verify something like that. Check the file and see if they're all chrY records. If so, it worked. If not, it didn't.
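One way to run that check, as a rough sketch rather than a definitive recipe: count the records per contig in both files and confirm the extract holds nothing but chrY. The function names and the file name `chrY.vcf` are hypothetical; only standard tools (grep, cut, sort, uniq, awk) are used.

```shell
# Hypothetical helper: read VCF text on stdin and print
# "contig<TAB>record_count" for every contig seen in the data records.
count_by_contig() {
    grep -v '^#' | cut -f1 | sort | uniq -c | awk '{ print $2 "\t" $1 }'
}

# Hypothetical helper: read VCF text on stdin and succeed (exit 0) only if
# there is at least one data record and every record is on contig "$1".
only_contig() {
    awk -F'\t' -v c="$1" '
        !/^#/ { n++; if ($1 != c) bad++ }
        END   { exit !(n > 0 && bad == 0) }'
}

# Example usage (file names are assumptions):
#   count_by_contig < chrY.vcf                 # counts in the extract
#   gzip -dc wgs.vcf.gz | count_by_contig      # counts in the original
#   only_contig chrY < chrY.vcf && echo "only chrY records"
```

If the chrY count from the extract matches the chrY count from the original file and `only_contig` succeeds, the extraction worked, whatever the file size turns out to be.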