Software For Inferring Population Structure
12.2 years ago

Can we make a list of software for inferring Population Structure from genotype or sequencing data?

This type of software is used to infer how a set of individuals can be subdivided into groups, given their genotypes or genomes. A typical example is the classification of Kenyan thrush samples into three different populations, described in the original paper on the Structure software. In that case, the software was used to determine whether the collected samples belonged to a single population or to different sub-populations, and to identify outliers.

(Kenyan thrush example from the Structure paper)

You can also find many examples on the blog of the Dodecad project, and on Dienekes's blog.

I think that Structure is the most popular software in this field, but a lot of new options have been published recently. Can you share your list of software, and tell us which one is your favorite and why?

population population genetics visualization • 5.7k views
12.2 years ago
1234Jc4321 ▴ 450

There is also a program called ADMIXTURE.

D.H. Alexander, J. Novembre, and K. Lange. Fast model-based estimation of ancestry in unrelated individuals. Genome Research, 19:1655–1664, 2009.


I like ADMIXTURE because it is very fast on large datasets. Also check out distruct, which makes really nice plots from both STRUCTURE and ADMIXTURE output.

12.2 years ago
Botond Sipos ★ 1.7k

Structurama implements the method described in:

Huelsenbeck JP, Andolfatto P (2007) Inference of population structure under a Dirichlet process model. Genetics 175(4):1787–1802.

12.2 years ago

My colleague has done extensive work on this with a Puerto Rican population, whose individuals descend from three ancestral populations: European settlers, native Taíno Indians, and West Africans. He used two programs: STRUCTURE 2.2 (Falush et al. 2003; Pritchard et al. 2000) and IAE3CI (Tsai et al. 2005; Parra et al. 2001). The EIGENSTRAT method (Price et al. 2006), as implemented in HelixTree (Golden Helix, Bozeman, MT, USA), was then used to calculate principal components from the genotypes of 100 ancestry-informative markers in the population.
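To make the principal components step concrete, here is a minimal pure-Python sketch in the spirit of the Price et al. (2006) approach. It is not the EIGENSTRAT or HelixTree code: the function names, the simplified allele-frequency estimate, and the use of power iteration are my own assumptions for illustration. Genotypes are coded as 0/1/2 copies of a reference allele, with one row per individual.

```python
import math
import random


def normalize(genotypes):
    """Center each marker and scale it by sqrt(p * (1 - p)).

    genotypes[i][j] is the 0/1/2 genotype of individual i at marker j.
    Returns one list per retained marker, each indexed by individual.
    """
    n = len(genotypes)
    m = len(genotypes[0])
    cols = []
    for j in range(m):
        col = [genotypes[i][j] for i in range(n)]
        mean = sum(col) / n
        p = mean / 2.0  # simplified allele-frequency estimate (assumption)
        if p <= 0.0 or p >= 1.0:
            continue  # monomorphic markers carry no information
        scale = math.sqrt(p * (1.0 - p))
        cols.append([(g - mean) / scale for g in col])
    if not cols:
        raise ValueError("no polymorphic markers")
    return cols


def top_component(genotypes, n_iter=500, seed=0):
    """Return the first principal component (one coordinate per individual)."""
    cols = normalize(genotypes)
    n = len(genotypes)
    # n x n covariance matrix between individuals
    cov = [[sum(c[i] * c[j] for c in cols) / len(cols) for j in range(n)]
           for i in range(n)]
    rng = random.Random(seed)
    v = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    # Power iteration converges to the eigenvector with the largest eigenvalue
    for _ in range(n_iter):
        w = [sum(cov[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


# Toy data (made up): two individuals from each of two divergent groups
toy = [
    [2, 2, 0, 0],
    [2, 1, 0, 0],
    [0, 0, 2, 2],
    [0, 0, 2, 1],
]
pc1 = top_component(toy)
```

On the toy data, the first two individuals should get coordinates of one sign and the last two the opposite sign. For real datasets you would want a proper eigendecomposition from a numerical library rather than power iteration.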

Added in edit 27 Jan 2012: Keep in mind that these calculations (of admixture) work best when one knows the frequencies of informative markers in each of the ancestral populations.
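To illustrate why known ancestral frequencies help, here is a minimal sketch of estimating one individual's ancestry proportions by expectation-maximisation when the marker frequencies in each ancestral population are treated as fixed and known. This is a simplified model of my own for illustration (each allele copy is drawn independently from one ancestral population); it is not the algorithm used by STRUCTURE or IAE3CI, and the function name and toy frequencies are hypothetical.

```python
def estimate_admixture(genotypes, freqs, n_iter=200):
    """EM estimate of one individual's ancestry proportions.

    genotypes: 0/1/2 counts of the reference allele, one per marker.
    freqs[k][l]: reference-allele frequency of marker l in ancestral
        population k (assumed known and fixed).
    Returns a list of K proportions that sum to 1.
    """
    k_pops = len(freqs)
    q = [1.0 / k_pops] * k_pops  # start from equal proportions
    for _ in range(n_iter):
        totals = [0.0] * k_pops
        n_alleles = 0
        for l, g in enumerate(genotypes):
            # g reference copies and (2 - g) alternate copies at this marker
            for count, is_ref in ((g, True), (2 - g, False)):
                if count == 0:
                    continue
                like = [
                    q[k] * (freqs[k][l] if is_ref else 1.0 - freqs[k][l])
                    for k in range(k_pops)
                ]
                norm = sum(like)
                if norm == 0.0:
                    continue  # allele impossible under the current estimate
                # E-step: probability that each population contributed the copy
                for k in range(k_pops):
                    totals[k] += count * like[k] / norm
                n_alleles += count
        if n_alleles == 0:
            break
        # M-step: proportions are the average responsibilities
        q = [t / n_alleles for t in totals]
    return q


# Toy example (made-up frequencies): three markers, two ancestral populations
freqs = [
    [0.9, 0.8, 0.1],  # population 0
    [0.2, 0.3, 0.7],  # population 1
]
q = estimate_admixture([2, 1, 0], freqs)
```

The more the ancestral frequencies differ at a marker, the more each allele copy tells you about its origin, which is why ancestry-informative markers are chosen for large frequency differences between the ancestral populations.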

Falush D, Stephens M, Pritchard JK (2003) Inference of population structure using multilocus genotype data: linked loci and correlated allele frequencies. Genetics 164:1567–1587.

Parra EJ, Kittles RA, Argyropoulos G, et al. (2001) Ancestral proportions and admixture dynamics in geographically defined African Americans living in South Carolina. Am J Phys Anthropol 114:18–29.

Price AL, Patterson NJ, Plenge RM, et al. (2006) Principal components analysis corrects for stratification in genome-wide association studies. Nat Genet 38:904–909.

Pritchard JK, Stephens M, Donnelly P (2000) Inference of population structure using multilocus genotype data. Genetics 155:945–959.

Tsai HJ, Choudhry S, Naqvi M, et al. (2005) Comparison of three methods to estimate genetic ancestry and control for stratification in genetic association studies among admixed populations. Hum Genet 118:424–433.

12.2 years ago
Amr ▴ 160

I have used eBURST in the past, which can be quite useful, although it only takes MLST data. It was designed to visualise population structure in bacteria. Definitely worth a look:

http://eburst.mlst.net/

