Data Extraction and Management in Networks of Observational Health Care Databases for Scientific Research: A Comparison of EU-ADR, OMOP, Mini-Sentinel and MATRICE Strategies

Rosa Gini, Martijn Schuemie, Jeffrey Brown, Patrick Ryan, Edoardo Vacchi, Massimo Coppola, Walter Cazzola, Preciosa Coloma, Roberto Berni, Gayo Diallo, José Luis Oliveira, Paul Avillach, Gianluca Trifirò, Peter Rijnbeek, Mariadonata Bellentani, Johan van der Lei, Niek Klazinga, Miriam Sturkenboom

Research output: Contribution to journal › Article › Academic › peer-review

Abstract

We see increased use of existing observational data in order to achieve fast and transparent production of empirical evidence in health care research. Multiple databases are often used to increase power, to assess rare exposures or outcomes, or to study diverse populations. For privacy and sociological reasons, original data on individual subjects cannot be shared, requiring a distributed network approach where data processing is performed prior to data sharing. We created a conceptual framework distinguishing three steps in local data processing: (1) data reorganization into a data structure common across the network; (2) derivation of study variables not present in original data; and (3) application of study design to transform longitudinal data into aggregated data sets for statistical analysis. We applied this framework to four case studies to identify similarities and differences in the United States and Europe: Exploring and Understanding Adverse Drug Reactions by Integrative Mining of Clinical Records and Biomedical Knowledge (EU-ADR), Observational Medical Outcomes Partnership (OMOP), the Food and Drug Administration's (FDA's) Mini-Sentinel, and the Italian network, the Integration of Content Management Information on the Territory of Patients with Complex Diseases or with Chronic Conditions (MATRICE). National networks (OMOP, Mini-Sentinel, MATRICE) all adopted shared procedures for local data reorganization. The multinational EU-ADR network needed locally defined procedures to reorganize its heterogeneous data into a common structure. Derivation of new data elements was centrally defined in all networks, but the procedure was not shared in EU-ADR. Application of study design was a common and shared procedure in all the case studies. Computer procedures were implemented in different programming languages, including SAS, R, SQL, Java, and C++.
Using our conceptual framework we found several areas that would benefit from research to identify optimal standards for production of empirical knowledge from existing databases.
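
The three local processing steps named in the abstract can be illustrated with a minimal sketch. It is not taken from any of the four networks: the record layout, the field names (`pid`, `code`, `kind`, `day`), the codes `A10` and `I21`, and every function name below are hypothetical, and the "study design" is reduced to a simple exposure-by-outcome cross-tabulation that ignores timing.

```python
from collections import Counter

# Hypothetical local records; real sites each have their own layout.
LOCAL_RECORDS = [
    {"pid": 1, "code": "A10", "kind": "drug", "day": 5},
    {"pid": 1, "code": "I21", "kind": "diagnosis", "day": 40},
    {"pid": 2, "code": "A10", "kind": "drug", "day": 12},
    {"pid": 3, "code": "I21", "kind": "diagnosis", "day": 7},
]


def reorganize(records):
    """Step 1: map local records into a structure common across the network."""
    return [
        {
            "person_id": r["pid"],
            "event_type": r["kind"],
            "event_code": r["code"],
            "event_day": r["day"],
        }
        for r in records
    ]


def derive_variables(events, exposure_code="A10", outcome_code="I21"):
    """Step 2: derive per-person study variables absent from the original data."""
    persons = {}
    for e in events:
        p = persons.setdefault(e["person_id"], {"exposed": False, "outcome": False})
        if e["event_type"] == "drug" and e["event_code"] == exposure_code:
            p["exposed"] = True
        if e["event_type"] == "diagnosis" and e["event_code"] == outcome_code:
            p["outcome"] = True
    return persons


def apply_study_design(persons):
    """Step 3: collapse person-level data into aggregated counts."""
    return dict(Counter((p["exposed"], p["outcome"]) for p in persons.values()))


def run_local_pipeline(records):
    """Run all three steps locally; only the aggregate would be shared."""
    return apply_study_design(derive_variables(reorganize(records)))


if __name__ == "__main__":
    print(run_local_pipeline(LOCAL_RECORDS))
```

In this sketch only the output of `apply_study_design`, a table of counts keyed by (exposed, outcome), would leave the site, which mirrors the idea that data processing is performed prior to data sharing.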
Original language: English
Pages (from-to): 1189
Journal: EGEMS (Washington, DC)
Volume: 4
Issue number: 1
Publication status: Published - 2016
