TY - JOUR
T1 - Leveraging fine-mapping and multipopulation training data to improve cross-population polygenic risk scores
AU - Weissbrod, Omer
AU - Kanai, Masahiro
AU - Shi, Huwenbo
AU - Gazal, Steven
AU - Peyrot, Wouter J.
AU - Khera, Amit V.
AU - Okada, Yukinori
AU - Matsuda, Koichi
AU - Yamanashi, Yuji
AU - Furukawa, Yoichi
AU - Morisaki, Takayuki
AU - Murakami, Yoshinori
AU - Kamatani, Yoichiro
AU - Muto, Kaori
AU - Nagai, Akiko
AU - Obara, Wataru
AU - Yamaji, Ken
AU - Takahashi, Kazuhisa
AU - Asai, Satoshi
AU - Takahashi, Yasuo
AU - Suzuki, Takao
AU - Sinozaki, Nobuaki
AU - Yamaguchi, Hiroki
AU - Minami, Shiro
AU - Murayama, Shigeo
AU - Yoshimori, Kozo
AU - Nagayama, Satoshi
AU - Obata, Daisuke
AU - Higashiyama, Masahiko
AU - Masumoto, Akihide
AU - The Biobank Japan Project
AU - Koretsune, Yukihiro
AU - Martin, Alicia R.
AU - Finucane, Hilary K.
AU - Price, Alkes L.
N1 - Funding Information: We thank A. Schoech and C. Márquez-Luna for helpful discussions. This research was conducted using the UK Biobank resource under application no. 16549 and was funded by the National Institutes of Health (NIH; grant nos. U01 HG009379, U01 HG012009, R37 MH107649, R01 MH101244 and R01 HG006399). M.K. was supported by a Nakajima Foundation Fellowship and the Masason Foundation. W.J.P. was supported by an NWO Veni grant (no. 91619152). A.R.M. was supported by the National Institute of Mental Health (grant no. K99/R00MH117229). H.K.F. was supported by E. and W. Schmidt. A.V.K. was supported by grants (nos. 1K08HG010155 and 1U01HG011719) from the National Human Genome Research Institute and a sponsored research agreement from IBM Research. Y.O. was supported by JSPS KAKENHI (grant nos. 19H01021 and 20K21834) and AMED (grant nos. JP21km0405211, JP21ek0109413, JP21ek0410075, JP21gm4010006 and P21km0405217) and JST Moonshot R&D (grant nos. JPMJMS2021 and JPMJMS2024). Computational analyses were performed on the O2 High-Performance Compute Cluster at Harvard Medical School. Publisher Copyright: © 2022, The Author(s), under exclusive licence to Springer Nature America, Inc.
PY - 2022/4
Y1 - 2022/4
AB - Polygenic risk scores suffer reduced accuracy in non-European populations, exacerbating health disparities. We propose PolyPred, a method that improves cross-population polygenic risk scores by combining two predictors: a new predictor that leverages functionally informed fine-mapping to estimate causal effects (instead of tagging effects), addressing linkage disequilibrium differences, and BOLT-LMM, a published predictor. When a large training sample is available in the non-European target population, we propose PolyPred+, which further incorporates the non-European training data. We applied PolyPred to 49 diseases/traits in four UK Biobank populations using UK Biobank British training data, and observed relative improvements versus BOLT-LMM ranging from +7% in south Asians to +32% in Africans, consistent with simulations. We applied PolyPred+ to 23 diseases/traits in UK Biobank east Asians using both UK Biobank British and Biobank Japan training data, and observed improvements of +24% versus BOLT-LMM and +12% versus PolyPred. Summary statistics-based analogs of PolyPred and PolyPred+ attained similar improvements.
UR - https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85128487891&origin=inward
UR - https://www.ncbi.nlm.nih.gov/pubmed/35393596
UR - http://www.scopus.com/inward/record.url?scp=85128487891&partnerID=8YFLogxK
DO - https://doi.org/10.1038/s41588-022-01036-9
M3 - Article
C2 - 35393596
SN - 1061-4036
VL - 54
SP - 450
EP - 458
JO - Nature Genetics
JF - Nature Genetics
IS - 4
ER -