TY - JOUR
T1 - Probabilistic record linkage is a valid and transparent tool to combine databases without a patient identification number
AU - Meray, Nora
AU - Reitsma, Johannes B.
AU - Ravelli, Anita C. J.
AU - Bonsel, Gouke J.
PY - 2007
Y1 - 2007
N2 - OBJECTIVE: To describe the technical approach and subsequent validation of the probabilistic linkage of the three anonymous, population-based Dutch Perinatal Registries (LVR1 of midwives, LVR2 of obstetricians, and LNR of pediatricians/neonatologists). These registries do not share a unique identification number. STUDY DESIGN AND SETTING: A combination of probabilistic and deterministic record linkage techniques was applied using information about the mother, delivery, and child(ren) to link three known registries. Rewards for agreement and penalties for disagreement between corresponding variables were calculated based on the observed patterns of agreement and disagreement using maximum likelihood estimation. Special measures were developed to overcome linking difficulties in twins. A subsample of linked and nonlinked pairs was validated. RESULTS: Independent validation confirmed that the procedure successfully linked the three Dutch perinatal registries despite nontrivial error rates in the linking variables. CONCLUSIONS: Probabilistic linkage techniques allowed the creation of a high-quality linked database from crude registry data. The developed procedures are generally applicable in linkage of health data with partially identifying information. They provide useful source data even if cohorts are only partly overlapping and if, within the cohort, multiple entities and twins exist.
AB - OBJECTIVE: To describe the technical approach and subsequent validation of the probabilistic linkage of the three anonymous, population-based Dutch Perinatal Registries (LVR1 of midwives, LVR2 of obstetricians, and LNR of pediatricians/neonatologists). These registries do not share a unique identification number. STUDY DESIGN AND SETTING: A combination of probabilistic and deterministic record linkage techniques was applied using information about the mother, delivery, and child(ren) to link three known registries. Rewards for agreement and penalties for disagreement between corresponding variables were calculated based on the observed patterns of agreement and disagreement using maximum likelihood estimation. Special measures were developed to overcome linking difficulties in twins. A subsample of linked and nonlinked pairs was validated. RESULTS: Independent validation confirmed that the procedure successfully linked the three Dutch perinatal registries despite nontrivial error rates in the linking variables. CONCLUSIONS: Probabilistic linkage techniques allowed the creation of a high-quality linked database from crude registry data. The developed procedures are generally applicable in linkage of health data with partially identifying information. They provide useful source data even if cohorts are only partly overlapping and if, within the cohort, multiple entities and twins exist.
U2 - https://doi.org/10.1016/j.jclinepi.2006.11.021
DO - https://doi.org/10.1016/j.jclinepi.2006.11.021
M3 - Article
C2 - 17689804
SN - 0895-4356
VL - 60
SP - 883
EP - 891
JO - Journal of Clinical Epidemiology
JF - Journal of Clinical Epidemiology
IS - 9
ER -