How to sort a VCF by "chr1, chr2, ..."
7.2 years ago
scchess ▴ 640

I have a VCF file ordered as "chr1, chr10, chr11 ...". I would like to sort it as "chr1, chr2, chr3 ...".

I tried Picard's SortVcf, but it gives me back the same order, "chr1, chr10, chr11 ...", which is not what I want.

Q: What's the easiest way to sort my VCF file as "chr1, chr2, chr3 ..."?

7.2 years ago

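One way is to use vcf-sort from vcftools (http://vcftools.sourceforge.net/). By default it orders chromosomes as "1, 10, 2 ..."; the -c (chromosomal order) option asks for natural ordering and relies on a sort that supports version sort:

vcf-sort -c your.vcf > sorted.vcf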

The second way is to use grep and sort:

grep "^#" your.vcf > sorted.vcf && grep -v "^#" your.vcf | \
  sort -V -k1,1 -k2,2n >> sorted.vcf

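The first grep selects the header lines and the second grep selects the data lines. sort then orders the data by the first column (chromosome) and, within a chromosome, numerically by the second column (position). Because the chromosome names are alphanumeric, -V sorts them in version-sort order, so chr2 comes before chr10. Some systems do not have -V in sort; in that case use: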

grep "^#" your.vcf > sorted.vcf && grep -v "^#" your.vcf | \
  awk '{tmp=$1; sub(/^chr/,"",tmp); print tmp,$0}' | \
  sort -k1,1n -k3,3n | \
  cut -d' ' -f2- >> sorted.vcf

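By the way, do you have chrX and chrY, and where do you want them to go? With sort -V they end up after the numbered chromosomes. With the numeric awk/sort fallback above, "X" and "Y" both compare as 0, so those records land before chr1 and get mixed together by position.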
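
If you want chrX and chrY placed after the autosomes without relying on -V, one option is to give each chromosome a numeric rank, sort on that rank, and then strip it again. This is only a rough sketch: the function name sort_vcf and the ranks (X=23, Y=24, M/MT=25, everything else 26) are my own assumptions for a human-style layout, so adjust them to your reference.

```shell
# Sketch: sort a VCF as chr1, chr2, ..., chr22, chrX, chrY, chrM.
# Assumed ranks (not from any standard): X=23, Y=24, M/MT=25, other contigs=26.
sort_vcf() {
  vcf="$1"
  # Header lines go first, unchanged.
  grep '^#' "$vcf"
  # Decorate each record with a rank, sort, then remove the rank column.
  grep -v '^#' "$vcf" |
    awk -F'\t' -v OFS='\t' '{
      c = $1; sub(/^chr/, "", c)
      if (c ~ /^[0-9]+$/)              r = c + 0
      else if (c == "X")               r = 23
      else if (c == "Y")               r = 24
      else if (c == "M" || c == "MT")  r = 25
      else                             r = 26
      print r, $0
    }' |
    # Key 1: rank (numeric); key 2: contig name; key 3: position (numeric).
    sort -t "$(printf '\t')" -k1,1n -k2,2 -k3,3n |
    cut -f2-
}
```

Usage would be something like: sort_vcf your.vcf > sorted.vcf. The second sort key keeps unplaced contigs that share rank 26 grouped by name instead of interleaving them by position.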
