Question

vcfR for Whole Genome Data

5.2 years ago

jaafari.omid ▴ 80

Dear all, I have a VCF file generated by a samtools pipeline, and before filtering the SNPs I want to check their quality, mapping quality, and depth using vcfR. However, at the very first step, reading the VCF file fails with an error. Here is the command that I used:

T.vcf <- read.vcfR("species.vcf")

And here is the error I have:

Processed variant 136000
Error in .read_body_gz(file, stats = stats, nrows = nrows, skip = skip, : long vectors not supported yet: memory.c:1668

I will be grateful if anybody can help me with this error.

Regards, Omid

snp genome R VCF vcfR • 1.7k views

score 1 · Answer 1 · 2019-02-20

5.2 years ago

zx8754 11k

Use the right tool for the job: R is not great for whole-genome data. Consider using bcftools for the filtering, then R for further analysis.
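
As a rough sketch of what that filtering step might look like, the shell function below wraps a single `bcftools view` call. The function name `filter_snps`, the thresholds (QUAL of at least 30, INFO/DP of at least 10), and the output file name are illustrative assumptions rather than values from this thread, and the expression assumes the VCF carries a DP tag in its INFO column.

```shell
# Hypothetical helper: keep only SNP records that pass illustrative thresholds.
# Usage: filter_snps species.vcf filtered.vcf.gz
filter_snps() {
    in_vcf=$1
    out_vcf=$2
    # -v snps : keep SNP records only
    # -i EXPR : include records matching the expression (thresholds are assumptions)
    # -Oz -o  : write a bgzip-compressed VCF to the given path
    bcftools view -v snps -i 'QUAL>=30 && INFO/DP>=10' -Oz -o "$out_vcf" "$in_vcf"
}
```

Calling `filter_snps species.vcf filtered.vcf.gz` would stream through the file record by record, so the whole genome never has to fit into a single R vector. The thresholds should be tuned to the data set.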

Thanks for the answer. So you mean I can use the vcfR package only for small VCF files? I ask because I have used this package with a VCF file generated from GBS data, but with the WGS data I run into this problem.

5.2 years ago by jaafari.omid ▴ 80

No, he means to do everything outside of R. Take a look at the functions available in BCFtools. By the way, bcftools query allows you to easily output data in tabular format, which you could then further analyse in R.
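
A minimal sketch of that tabular export, assuming the per-site quality, depth, and mapping quality are stored in QUAL, INFO/DP, and INFO/MQ (common for samtools/bcftools call sets, but worth checking in the VCF header). The function name `export_site_stats`, the column choice, and the output file name are hypothetical.

```shell
# Hypothetical helper: dump per-site metrics as a tab-separated table.
# Usage: export_site_stats species.vcf site_stats.tsv
export_site_stats() {
    in_vcf=$1
    out_tsv=$2
    # Header line so the table is self-describing when loaded elsewhere.
    printf 'CHROM\tPOS\tQUAL\tDP\tMQ\n' > "$out_tsv"
    # One row per variant, with the fields named in the format string.
    bcftools query -f '%CHROM\t%POS\t%QUAL\t%INFO/DP\t%INFO/MQ\n' "$in_vcf" >> "$out_tsv"
}
```

The resulting file could then be loaded in R with something like `read.delim("site_stats.tsv", na.strings = ".")`, on the assumption that missing values are written as `.`. That table holds a few numeric columns per site, which is far lighter than parsing every genotype field of a WGS VCF.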

5.2 years ago by Kevin Blighe 87k