The effect of sample size and relatedness in the target sample on polygenic risk scores
5.1 years ago

Hi everyone,

I'm using PRSice v1.25 to calculate polygenic risk scores (PRS) in my sample (1,400 twins). My phenotype is wellbeing (quantitative), and I'm using the summary statistics of a very large GWAS as my base data (https://www.nature.com/articles/s41588-018-0320-8). The problem is that my PRS results seem to be a little inflated. I have two questions:

1. Does relatedness between individuals in the target sample influence the PRS?
2. Does the target sample size influence the PRS, and if so, how?

Thank you!


In general, I have seen people remove related samples from the analysis.

Do you mean the sample size of the GWAS or of the PRS bins?


Yes, if you want to run a GWAS or an association study you need to remove related samples, but I am not sure whether they also need to be removed for a PRS analysis. I meant the sample size of the PRS target sample.

5.0 years ago
Sam ★ 4.7k

You will definitely want to remove the related samples from your dataset, and I suggest you move on to PRSice-2 instead of PRSice 1.25. We have made some major improvements to the software between the two versions.
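As an illustration of what removing related samples can look like in practice, here is a minimal Python sketch. It assumes you already have pairwise kinship estimates as `(id_a, id_b, kinship)` tuples from some relatedness tool; the function name `prune_related`, the input layout, and the default cutoff of 0.0884 (a commonly used KING-style second-degree threshold) are assumptions for this example, not part of PRSice. It greedily drops the sample with the most remaining relatives until no related pair is left.

```python
from collections import defaultdict


def prune_related(sample_ids, kinship_pairs, threshold=0.0884):
    """Return the samples that remain after greedily removing relatives.

    kinship_pairs: iterable of (id_a, id_b, kinship) tuples (assumed layout).
    threshold: pairs with kinship above this value count as related (assumed).
    """
    # Keep only the pairs that count as related under the chosen threshold.
    relatives = defaultdict(set)
    for id_a, id_b, kinship in kinship_pairs:
        if kinship > threshold and id_a != id_b:
            relatives[id_a].add(id_b)
            relatives[id_b].add(id_a)

    kept = set(sample_ids)
    while True:
        # Number of still-kept relatives for each kept sample that has any.
        degree = {s: len(relatives[s] & kept) for s in kept if relatives[s] & kept}
        if not degree:
            break
        # Drop the sample with the most relatives; ties resolve by sorted ID.
        worst = max(sorted(degree), key=lambda s: degree[s])
        kept.discard(worst)

    # Preserve the original sample order in the output.
    return [s for s in sample_ids if s in kept]


if __name__ == "__main__":
    samples = ["t1a", "t1b", "t2a", "t2b", "u1"]
    pairs = [("t1a", "t1b", 0.5), ("t2a", "t2b", 0.25), ("t1a", "u1", 0.01)]
    print(prune_related(samples, pairs))
```

For a pure twin sample this reduces to keeping one twin per pair. The resulting ID list could then be used to subset the target data before scoring.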

As for the effect of sample size, you might want to read Dudbridge's paper.


Thanks Sam! I divided my twins into two groups so that each group contains only independent samples, and I calculated the PRS again for each group. The R-squared from the whole sample (with related individuals, i.e. twins) seems to be the average of the R-squared values from the two groups. Also, do we need to include covariates such as age, sex, and principal components when calculating PRS using PRSice?
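The split described above can be sketched roughly as follows. This is only an illustration under stated assumptions: each record is a dict with hypothetical keys `fid` (family ID), `prs`, and `pheno`, and `split_twins` and `r_squared` are made-up helper names. One member of each family goes into the first half and, if present, a second member goes into the other half, so neither half contains relatives; R-squared is taken here as the squared Pearson correlation between score and phenotype, without covariates.

```python
import random
from collections import defaultdict


def split_twins(records, seed=0):
    """Split records into two halves with at most one member per family each."""
    rng = random.Random(seed)
    families = defaultdict(list)
    for rec in records:
        families[rec["fid"]].append(rec)

    half_a, half_b = [], []
    for fid in sorted(families):
        members = families[fid][:]
        rng.shuffle(members)  # randomise which twin lands in which half
        half_a.append(members[0])
        if len(members) > 1:
            half_b.append(members[1])
    return half_a, half_b


def r_squared(records):
    """Squared Pearson correlation between the score and the phenotype."""
    xs = [r["prs"] for r in records]
    ys = [r["pheno"] for r in records]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy * sxy / (sxx * syy)
```

Calling `r_squared` on each half and on the combined records gives the three values being compared. Even when the point estimates look similar, p-values and standard errors from the combined sample treat the twins as independent observations, so they tend to look more significant than they should.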


The inclusion of covariates will have no effect on the PRS calculated. However, it will affect the R2 and p-value, and therefore might influence the "best" p-value threshold. It is generally a good idea to include the relevant covariates when performing a PRS analysis so that the R2 and p-value are more meaningful.
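To make the distinction concrete, here is a small Python sketch of one common convention: the variance explained by the score is reported as the gain in R2 when the PRS is added to a covariates-only regression. The helper names `ols_r2` and `incremental_r2` are hypothetical, the regression is plain ordinary least squares written out with the standard library, and this is not taken from the PRSice source. Note that the score column is passed in unchanged; only the regression around it differs.

```python
def ols_r2(y, predictors):
    """R2 of an OLS fit of y on an intercept plus the given predictor columns."""
    n = len(y)
    rows = [[1.0] + [col[i] for col in predictors] for i in range(n)]
    p = len(rows[0])

    # Normal equations (X'X) b = X'y, stored as an augmented p x (p + 1) matrix.
    aug = [
        [sum(row[r] * row[c] for row in rows) for c in range(p)]
        + [sum(row[r] * yi for row, yi in zip(rows, y))]
        for r in range(p)
    ]

    # Gaussian elimination with partial pivoting.
    for k in range(p):
        pivot = max(range(k, p), key=lambda r: abs(aug[r][k]))
        aug[k], aug[pivot] = aug[pivot], aug[k]
        for r in range(k + 1, p):
            factor = aug[r][k] / aug[k][k]
            for c in range(k, p + 1):
                aug[r][c] -= factor * aug[k][c]

    # Back substitution for the coefficients.
    beta = [0.0] * p
    for k in reversed(range(p)):
        tail = sum(aug[k][c] * beta[c] for c in range(k + 1, p))
        beta[k] = (aug[k][p] - tail) / aug[k][k]

    fitted = [sum(b * x for b, x in zip(beta, row)) for row in rows]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def incremental_r2(y, prs, covariates):
    """R2 gained by adding the PRS to a covariates-only (null) model."""
    null_r2 = ols_r2(y, covariates)
    full_r2 = ols_r2(y, covariates + [prs])
    return full_r2 - null_r2
```

Here `covariates` is a list of columns (for example age, sex, and principal components), each the same length as `y`. If a covariate is correlated with both the phenotype and the score, the incremental R2 can differ noticeably from the R2 of a PRS-only model, which is why the apparent best threshold can shift.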


Hi Sam, I would also like to know whether it is feasible to use 446 subjects as target data. After removing related samples from my data, I have only 446 subjects left.


Yes, you can definitely try. You might also look at PRS power calculation tools such as AVENGEME.


How can I find the risk of an individual sample in the target data?
