TY - JOUR
T1 - Genetic Ancestry Estimates within Dutch Family Units and Across Genotyping Arrays
T2 - Insights from Empirical Analysis Using Two Estimation Methods
AU - Beck, Jeffrey J.
AU - Ahmed, Talitha
AU - Finnicum, Casey T.
AU - Zwinderman, Koos
AU - Ehli, Erik A.
AU - Boomsma, Dorret I.
AU - Hottenga, Jouke Jan
N1 - Funding Information: This study makes use of data generated by the Genome of the Netherlands Project. A full list of the investigators is available from www.nlgenome.nl (accessed on 19 July 2023). Funding for the project was provided by the Netherlands Organization for Scientific Research under award number 184021007, dated 9 July 2009 and made available as a Rainbow Project of the Biobanking and Biomolecular Research Infrastructure Netherlands (BBMRI-NL). The sequencing was carried out in collaboration with the Beijing Institute for Genomics (BGI). We would like to thank and acknowledge all the individuals and scientists who participated in the Genome of the Netherlands Project. We would also like to thank all members of twin families registered with the Netherlands Twin Register for their continued support of scientific research. Funding Information: Funding for this project was obtained from the Amsterdam Public Health methodology grants (2018) and the Avera Institute for Human Genetics, Sioux Falls, South Dakota (USA). 
Data collections were funded by the Netherlands Organization for Scientific Research (NWO) and The Netherlands Organization for Health Research and Development (ZonMW) grants 904-61-090, 985-10-002, 912-10-020, 904-61-193, 480-04-004, 463-06-001, 451-04-034, 400-05-717, Addiction-31160008, 016-115-035, 481-08-011, 400-07-080, 056-32-010, Middelgroot-911-09-032, OCW_NWO Gravity program – 024.001.003, NWO-Groot 480-15-001/674, Center for Medical Systems Biology (CMSB, NWO Genomics), NBIC/BioAssist/RK (2008.024), Biobanking and Biomolecular Resources Research Infrastructure (BBMRI-NL, 184.021.007 and 184.033.111), X-Omics 184-034-019, Spinozapremie (NWO-56-464-14192), European Community’s Fifth and Seventh Framework Program (FP5-LIFE QUALITY-CT-2002-2006, FP7-HEALTH-F4-2007-2013, grant 01254: GenomEUtwin, grant 01413: ENGAGE and grant 602768: ACTION), the European Research Council (ERC Starting 284167, ERC Consolidator 771057, ERC Advanced 230374), Rutgers University Cell and DNA Repository (NIMH U24 MH068457-06), the National Institutes of Health (NIH, R01D0042157-01A1, R01MH58799-03, MH081802, DA018673, R01DK092127-04, Grand Opportunity grants 1RC2 MH089951, and 1RC2 MH089995) and the Genetic Association Information Network (GAIN) of the Foundation for the National Institutes of Health. Publisher Copyright: © 2023 by the authors.
PY - 2023/7/1
Y1 - 2023/7/1
AB - Accurate inference of genetic ancestry is crucial for population-based association studies, accounting for population heterogeneity and structure. This study analyzes genome-wide SNP data from the Netherlands Twin Register to compare genetic ancestry estimates. The focus is on the comparison of ancestry estimates between family members and individuals genotyped on multiple arrays (Affymetrix 6.0, Affymetrix Axiom, and Illumina GSA). Two conventional methods, principal component analysis and ADMIXTURE, were implemented to estimate ancestry, each serving its specific purpose, rather than for direct comparison. The results reveal that as the degree of genetic relatedness decreases, the Euclidean distances of genetic ancestry estimates between family members significantly increase (empirical p < 0.001), regardless of the estimation method and genotyping array. Ancestry estimates among individuals genotyped on multiple arrays also show statistically significant differences (empirical p < 0.001). Additionally, this study investigates the relationship between the ancestry estimates of non-identical twin offspring with ancestrally diverse parents and those with ancestrally similar parents. The results indicate a statistically significant weak correlation between the variation in ancestry estimates among offspring and differences in ancestry estimates among parents (Spearman’s rho: 0.07, p = 0.005). This study highlights the utility of current methods in inferring genetic ancestry, emphasizing the importance of reference population composition in determining ancestry estimates.
KW - ADMIXTURE
KW - genetic ancestry estimation
KW - population structure
KW - principal components analysis (PCA)
KW - within-family analysis
UR - http://www.scopus.com/inward/record.url?scp=85166022367&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85166022367&partnerID=8YFLogxK
U2 - https://doi.org/10.3390/genes14071497
DO - 10.3390/genes14071497
M3 - Article
C2 - 37510400
SN - 2073-4425
VL - 14
SP - 1
EP - 16
JO - Genes
JF - Genes
IS - 7
M1 - 1497
ER -