Accuracy of approximations to recover incompletely reported logistic regression models depended on other available information

Toshihiko Takada, Jeroen Hoogland, Chris van Lieshout, Ewoud Schuit, Gary S. Collins, Karel G. M. Moons, Johannes B. Reitsma

Research output: Contribution to journal › Article › Academic › peer-review


Objective: To provide approximations for recovering the full regression equation across different scenarios of incompletely reported prediction models developed using binary logistic regression.

Study design and setting: In a case study, we considered four common scenarios and illustrated the corresponding approximations:
(A) Missing: the intercept. Available: the regression coefficients of the predictors, the overall frequency of the outcome, and descriptive statistics of the predictors.
(B) Missing: the regression coefficients and the intercept. Available: a simplified score.
(C) Missing: the regression coefficients and the intercept. Available: a nomogram.
(D) Missing: the regression coefficients and the intercept. Available: a web calculator.

Results: In scenario A, a simplified approach based on the predicted probability corresponding to the average linear predictor was inaccurate. An approximation based on the overall outcome frequency and an approximation of the linear predictor distribution was more accurate; however, the appropriateness of the underlying assumptions cannot be verified in practice. In scenario B, the recovered equation was inaccurate because of rounding and categorization of risk scores. In scenarios C and D, the full regression equation could be recovered with minimal error.

Conclusion: The accuracy of the approximations in recovering the regression equation varied depending on the available information.
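The two intercept approximations described for scenario A can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the example coefficients, and the summary statistics are all hypothetical, the predictors are treated as uncorrelated, and the linear predictor is assumed to be normally distributed. The abstract does not specify how the linear predictor distribution was approximated, so the normal assumption here is a stand-in.

```python
import math


def expit(x):
    """Inverse logit, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def lp_moments(betas, means, sds):
    """Mean and SD of the linear predictor, excluding the intercept.

    ASSUMPTION: predictors are uncorrelated, so the variance is the
    sum of (beta * sd)^2. Correlated predictors would need covariances.
    """
    mu = sum(b * m for b, m in zip(betas, means))
    sigma = math.sqrt(sum((b * s) ** 2 for b, s in zip(betas, sds)))
    return mu, sigma


def intercept_simple(prevalence, mu):
    """Simplified approach: make the predicted probability at the
    average linear predictor equal the overall outcome frequency."""
    return logit(prevalence) - mu


def mean_risk(intercept, mu, sigma, n_grid=2001, width=8.0):
    """Average predicted risk if the linear predictor is Normal(mu, sigma).

    Computed as a weighted average over an evenly spaced grid of
    standard-normal z values spanning +/- `width` SDs.
    """
    if sigma == 0.0:
        return expit(intercept + mu)
    step = 2.0 * width / (n_grid - 1)
    total = 0.0
    weight = 0.0
    for i in range(n_grid):
        z = -width + i * step
        w = math.exp(-0.5 * z * z)  # unnormalised normal density
        total += w * expit(intercept + mu + sigma * z)
        weight += w
    return total / weight


def intercept_distributional(prevalence, mu, sigma, tol=1e-10):
    """Find, by bisection, the intercept at which the average predicted
    risk equals the overall outcome frequency."""
    centre = logit(prevalence) - mu
    lo = centre - 8.0 * sigma - 1.0  # mean risk is below prevalence here
    hi = centre + 8.0 * sigma + 1.0  # mean risk is above prevalence here
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mean_risk(mid, mu, sigma) < prevalence:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical reported information (illustrative values only).
betas = [0.8, -0.5]   # reported regression coefficients
means = [1.2, 3.0]    # reported predictor means
sds = [1.0, 1.5]      # reported predictor standard deviations
prevalence = 0.10     # reported overall outcome frequency

mu, sigma = lp_moments(betas, means, sds)
a_simple = intercept_simple(prevalence, mu)
a_dist = intercept_distributional(prevalence, mu, sigma)
print("simple intercept:", round(a_simple, 3))
print("distribution-based intercept:", round(a_dist, 3))
```

With an outcome frequency below 50%, the simplified intercept implies an average predicted risk above the observed frequency, because the logistic function is convex in that range; the distribution-based intercept corrects for this by construction. As the abstract notes, whether the distributional assumptions hold cannot be verified from the reported information alone.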
Original language: English
Pages (from-to): 81-90
Number of pages: 10
Journal: Journal of Clinical Epidemiology
Publication status: Published - 1 Mar 2022


Keywords

  • Equation
  • Intercept
  • Logistic regression
  • Prediction model
  • Reporting
  • Reverse engineering
