Dataset for NGS
8.0 years ago

Hi,

I am working on a project to compare tools for calling SNPs, indels, and CNVs. I have tried many tools and managed to run a few of them. What I need now are datasets that are known to contain SNPs, indels, or CNVs, so that I can be sure there is something for the tools to find. My hardware is limited (an i5 processor with 2 GB of RAM), so large files would take a long time to process. Can you suggest some small datasets that are known to contain such variants, so that I can run the analysis and compare the tools on them?

SNP cnv indel dataset • 1.4k views
8.0 years ago
mastal511 ★ 2.1k

Try some data from the 1000 genomes project.

If you have limited computing resources, you could try looking at data from a single chromosome at a time.
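
As a rough illustration of the single-chromosome idea, here is a minimal Python sketch that streams through SAM-format text and keeps only the header and the reads aligned to one chromosome. It assumes uncompressed SAM text (for example, the output of `samtools view -h`) and that the chromosome is named "20" in the file; the function name `keep_chromosome` and the toy records are made up for this example. For an indexed BAM file, a region query such as `samtools view -b in.bam 20 > chr20.bam` is the more usual route.

```python
from typing import Iterable, Iterator


def keep_chromosome(sam_lines: Iterable[str], chrom: str) -> Iterator[str]:
    """Yield SAM header lines plus alignments whose RNAME equals `chrom`.

    Lines are processed one at a time, so memory use stays small
    even when the input file is large.
    """
    for line in sam_lines:
        if line.startswith("@"):
            # Header lines are passed through unchanged.
            yield line
            continue
        fields = line.split("\t")
        # Column 3 of a SAM record (index 2) is the reference name.
        if len(fields) > 2 and fields[2] == chrom:
            yield line


# Toy input: two header lines and two reads on different chromosomes.
example = [
    "@HD\tVN:1.6\tSO:coordinate\n",
    "@SQ\tSN:20\tLN:1000\n",
    "read1\t0\t20\t100\t60\t4M\t*\t0\t0\tACGT\tIIII\n",
    "read2\t0\t21\t200\t60\t4M\t*\t0\t0\tACGT\tIIII\n",
]
subset = list(keep_chromosome(example, "20"))
```

With a real file you would pass an open file handle instead of the list, e.g. `keep_chromosome(open("sample.sam"), "20")`, and write the yielded lines to a new file.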

8.0 years ago
rbagnall ★ 1.8k

You could use some of the ICR142 NGS validation series: "The dataset includes high-quality exome sequence data from 142 samples together with Sanger sequence data at 730 sites; 409 sites with variants and 321 sites at which variants were called by an NGS analysis tool, but no variant is present in the corresponding Sanger sequence. The dataset includes 286 indel variants and 275 negative indel sites, and thus the ICR142 validation dataset is of particular utility in evaluating indel calling performance.

The FASTQ files and Sanger sequence results can be accessed in the European Genome-phenome Archive under the accession number EGAS00001001332. (Requires submission of an application form)"
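
Since a validation set like this has both confirmed variant sites and confirmed negative sites, one way to compare tools is to count how many of each a tool calls. The sketch below is only an illustration under assumptions: the sites are represented as hypothetical (chromosome, position) pairs, matching is done by site only (alleles are ignored), and the function name `score_calls` and the toy data are invented here, not taken from the ICR142 files.

```python
from typing import Dict, Set, Tuple

Site = Tuple[str, int]  # (chromosome, position); a simplifying assumption


def score_calls(called: Set[Site],
                positives: Set[Site],
                negatives: Set[Site]) -> Dict[str, float]:
    """Compare a tool's calls with validated positive and negative sites."""
    tp = len(called & positives)   # confirmed variants the tool found
    fn = len(positives - called)   # confirmed variants the tool missed
    fp = len(called & negatives)   # calls at sites confirmed to have no variant
    tn = len(negatives - called)   # negative sites correctly left uncalled
    return {
        "true_positives": tp,
        "false_negatives": fn,
        "false_positives": fp,
        "true_negatives": tn,
        "sensitivity": tp / len(positives) if positives else 0.0,
        "false_positive_rate": fp / len(negatives) if negatives else 0.0,
    }


# Toy data, not real validation sites.
positives = {("1", 100), ("1", 200), ("2", 50)}
negatives = {("1", 300), ("2", 75)}
called = {("1", 100), ("2", 50), ("1", 300), ("3", 10)}

result = score_calls(called, positives, negatives)
```

Calls at sites outside the validated set, such as ("3", 10) in the toy data, are ignored, because there is no truth value to score them against.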
