TY - JOUR
T1 - A diverse ancestrally-matched reference panel increases genotype imputation accuracy in an underrepresented population
AU - Mauleekoonphairoj, John
AU - Tongsima, Sissades
AU - Khongphatthanayothin, Apichai
AU - Jurgens, Sean J.
AU - Zimmerman, Dominic S.
AU - Sutjaporn, Boosamas
AU - Wandee, Pharawee
AU - Bezzina, Connie R.
AU - Nademanee, Koonlawee
AU - Poovorawan, Yong
N1 - Funding Information: We thank CMKL University, PMU-C, and Center of Excellence in Medical Genomics, Chulalongkorn University for computational resources. We thank members of the Brugada consortium, the Thai Red Cross Society and Centre of Excellence in Clinical Virology, Faculty of Medicine, Chulalongkorn University for assistance with data collection. This research project was supported by a grant from the National Research Council of Thailand. J.M. has received research support from the Second Century Fund (C2F), Chulalongkorn University. S.J.J. has received research support from an Amsterdam UMC Doctoral Fellowship and the Junior Clinical Scientist Fellowship from the Dutch Heart Foundation (03-007-2022-0035). C.R.B. has received research support from the Dutch Heart Foundation (CVON Predict2), the Netherlands Organization for Scientific Research (VICI fellowship, 016.150.610), Fondation Leducq (17CVD02), and the EJP-RD LQTS-NEXT project (ZonMW project 40-46300-98-19009). Publisher Copyright: © 2023, The Author(s).
PY - 2023/12/1
Y1 - 2023/12/1
N2 - Variant imputation, a common practice in genome-wide association studies, relies on reference panels to infer unobserved genotypes. Multiple public reference panels are currently available, differing in size, sequencing depth, and represented populations. Limited data exist on how well public reference panels perform when used to impute populations underrepresented in those panels. Here, we compare the performance of four public reference panels (1000 Genomes Project, Haplotype Reference Consortium, GenomeAsia 100K, and the recent Trans-Omics for Precision Medicine (TOPMed) program) when used to impute samples from the Thai population. Genotype yields were assessed, and imputation accuracies were examined by comparison with high-depth whole genome sequencing data of the same samples. We found that imputation using the TOPMed panel yielded the largest number of variants (~271 million). Despite being the smallest in size, GenomeAsia 100K achieved the best imputation accuracy, with a median genotype concordance rate of 0.97. For rare variants, GenomeAsia 100K also offered the best accuracy, although rare variants were imputed less accurately than common variants (30.3% reduction in concordance rates). The high accuracy observed when using GenomeAsia 100K is likely attributable to its diverse representation of populations genetically similar to the study cohort, emphasizing the benefits of sequencing populations classically underrepresented in human genomics.
AB - Variant imputation, a common practice in genome-wide association studies, relies on reference panels to infer unobserved genotypes. Multiple public reference panels are currently available, differing in size, sequencing depth, and represented populations. Limited data exist on how well public reference panels perform when used to impute populations underrepresented in those panels. Here, we compare the performance of four public reference panels (1000 Genomes Project, Haplotype Reference Consortium, GenomeAsia 100K, and the recent Trans-Omics for Precision Medicine (TOPMed) program) when used to impute samples from the Thai population. Genotype yields were assessed, and imputation accuracies were examined by comparison with high-depth whole genome sequencing data of the same samples. We found that imputation using the TOPMed panel yielded the largest number of variants (~271 million). Despite being the smallest in size, GenomeAsia 100K achieved the best imputation accuracy, with a median genotype concordance rate of 0.97. For rare variants, GenomeAsia 100K also offered the best accuracy, although rare variants were imputed less accurately than common variants (30.3% reduction in concordance rates). The high accuracy observed when using GenomeAsia 100K is likely attributable to its diverse representation of populations genetically similar to the study cohort, emphasizing the benefits of sequencing populations classically underrepresented in human genomics.
UR - http://www.scopus.com/inward/record.url?scp=85166165358&partnerID=8YFLogxK
U2 - https://doi.org/10.1038/s41598-023-39429-3
DO - https://doi.org/10.1038/s41598-023-39429-3
M3 - Article
C2 - 37524845
SN - 2045-2322
VL - 13
JO - Scientific Reports
JF - Scientific Reports
IS - 1
M1 - 12360
ER -