The Dementias Platform UK (DPUK) Data Portal

Sarah Bauermeister, Christopher Orton, Simon Thompson, Roger A. Barker, Joshua R. Bauermeister, Yoav Ben-Shlomo, Carol Brayne, David Burn, Archie Campbell, Catherine Calvin, Siddharthan Chandran, Nishi Chaturvedi, Geneviève Chêne, Iain P. Chessell, Anne Corbett, Daniel H.J. Davis, Mike Denis, Carole Dufouil, Paul Elliott, Nick Fox, Derek Hill, Scott M. Hofer, Michele T. Hu, Christoph Jindra, Frank Kee, Chi Hun Kim, Changsoo Kim, Mika Kivimaki, Ivan Koychev, Rachael A. Lawson, Gerry J. Linden, Ronan A. Lyons, Clare Mackay, Paul M. Matthews, Bernadette McGuiness, Lefkos Middleton, Catherine Moody, Katrina Moore, Duk L. Na, John T. O’Brien, Sebastien Ourselin, Shantini Paranjothy, Ki Soo Park, David J. Porteous, Marcus Richards, Craig W. Ritchie, Jonathan D. Rohrer, Martin N. Rossor, James B. Rowe, Rachael Scahill, Christian Schnier, Jonathan M. Schott, Sang W. Seo, Matthew South, Matthew Steptoe, Sarah J. Tabrizi, Andrea Tales, Therese Tillin, Nicholas J. Timpson, Arthur W. Toga, Pieter Jelle Visser, Richard Wade-Martins, Tim Wilkinson, Julie Williams, Andrew Wong, John E.J. Gallacher

Research output: Contribution to journal, Article, Academic, peer-review

38 Citations (Scopus)

Abstract

The Dementias Platform UK Data Portal is a data repository facilitating access to data for 3 370 929 individuals in 42 cohorts. The Data Portal is an end-to-end data management solution providing a secure, fully auditable, remote access environment for the analysis of cohort data. All projects utilising the data are by default collaborations with the cohort research teams generating the data. The Data Portal uses UK Secure eResearch Platform infrastructure to provide three core utilities: data discovery, access, and analysis. These are delivered using a seven-layered architecture comprising data ingestion, data curation, platform interoperability, data discovery, access brokerage, data analysis, and knowledge preservation. Automated, streamlined, and standardised procedures reduce the administrative burden for all stakeholders, particularly for requests involving multiple independent datasets, where a single request may be forwarded to multiple data controllers. Researchers are provided with their own secure ‘lab’ using VMware, which is accessed using two-factor authentication. Over the last 2 years, 160 project proposals involving 579 individual cohort data access requests were received. These were received from 268 applicants spanning 72 institutions (56 academic, 13 commercial, 3 government) in 16 countries, with 84 requests involving multiple cohorts. Projects are varied, including multi-modal, machine learning, and Mendelian randomisation analyses. Data access is usually free at point of use, although a small number of cohorts require a data access fee.

Original language: English
Pages (from-to): 601-611
Number of pages: 11
Journal: European Journal of Epidemiology
Volume: 35
Issue number: 6
Publication status: Published - 1 Jun 2020

Keywords

  • Cohorts
  • Data access
  • Data management
  • Data platform
  • Data repository
  • Epidemiology
