Record linkage: making the most out of errors in linking variables

M. Tromp, J. B. Reitsma, A. C. J. Ravelli, N. Méray, G. J. Bonsel

Research output: Contribution to journal › Article › Academic › peer-review

20 Citations (Scopus)

Abstract

This paper presents a refinement of the probabilistic medical record linking algorithm. We introduced "close agreement" to account for typical errors in administrative variables used for record linkage. Linking data on early pregnancy determinants with data on late child outcomes was used as a case study. We analyzed whether the addition of close agreement resulted in a higher discriminating power of the linking key, reflected in a reduction of the number of links with an uncertain linking status. Incorporating close agreement for postal code and date of birth in the record linking algorithm resulted in a 95% reduction in the number of pairs in the uncertain region. We showed that adding a third outcome, "close", when comparing values of corresponding linking variables led to a major improvement in our probabilistic record linkage study. Similar improvements are likely in other studies because the frequency, nature, and type of errors in other large databases will not be substantially different.
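As an illustration of the idea in the abstract, the following is a minimal sketch of a three-outcome (agree, close, disagree) comparison feeding a Fellegi-Sunter style log-likelihood weight. It is not the authors' implementation: the m- and u-probabilities, the definitions of "close" for date of birth and postal code, and all function names are hypothetical assumptions chosen only to make the example runnable.

```python
import math

# Hypothetical probabilities (NOT taken from the paper).
# For each outcome: (m, u) with m = P(outcome | true match),
# u = P(outcome | non-match). Each column sums to 1 per variable.
PROBS = {
    "birth_date": {
        "agree": (0.95, 0.001),
        "close": (0.04, 0.01),
        "disagree": (0.01, 0.989),
    },
    "postal_code": {
        "agree": (0.85, 0.0005),
        "close": (0.10, 0.005),
        "disagree": (0.05, 0.9945),
    },
}


def compare_date(a, b):
    """Three-level comparison of ISO dates 'YYYY-MM-DD'.

    Assumption: 'close' means exactly one of year, month, day differs.
    """
    if a == b:
        return "agree"
    diffs = sum(x != y for x, y in zip(a.split("-"), b.split("-")))
    return "close" if diffs == 1 else "disagree"


def compare_postal(a, b):
    """Three-level comparison of postal codes.

    Assumption: 'close' means the first four characters match.
    """
    if a == b:
        return "agree"
    return "close" if a[:4] == b[:4] else "disagree"


def weight(variable, outcome):
    """Log2 likelihood ratio m/u for one variable and outcome."""
    m, u = PROBS[variable][outcome]
    return math.log2(m / u)


def pair_weight(rec_a, rec_b):
    """Total weight of a record pair plus the per-variable outcomes."""
    outcomes = {
        "birth_date": compare_date(rec_a["birth_date"], rec_b["birth_date"]),
        "postal_code": compare_postal(rec_a["postal_code"], rec_b["postal_code"]),
    }
    total = sum(weight(v, o) for v, o in outcomes.items())
    return total, outcomes
```

With only two outcomes, a pair whose date of birth differs by a single component would receive the same strongly negative weight as a pair with unrelated dates. Under the assumed probabilities above, the "close" outcome instead receives a positive weight, which is one way such pairs could move out of the uncertain region.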
Original language: English
Pages (from-to): 779-783
Journal: AMIA ... Annual Symposium proceedings / AMIA Symposium. AMIA Symposium
Publication status: E-pub ahead of print - 2006
