LIMSI@CLEF eHealth 2017 task 2: Logistic regression for automatic article ranking

Christopher Norman; Mariska Leeflang; Aurélie Névéol

Research output: Contribution to conference › Paper › Academic

2 Citations (Scopus)

Abstract

This paper describes the participation of the LIMSI-MIROR team at CLEF eHealth 2017, task 2. The task addresses the automatic ranking of articles in order to assist with the screening process of Diagnostic Test Accuracy (DTA) Systematic Reviews. We used a logistic regression classifier and handled class imbalance using a combination of class reweighting and undersampling. We also experimented with two strategies for relevance feedback. Our best run obtained an overall Average Precision of 0.179 and Work Saved over Sampling @95% Recall of 0.650. This run uses stochastic gradient descent for training but no feature selection or relevance feedback. We observe high performance variation within the queries in the test set. Nonetheless, our results suggest that automatic assistance is promising for ranking the DTA literature, as it could reduce the screening workload for review writers by 65% on average.
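
The two scores reported in the abstract can be illustrated with a short sketch. It assumes the commonly used definitions: Average Precision as the mean of the precision values at the rank of each relevant article, and Work Saved over Sampling at recall R as the fraction of the ranked list that can be left unscreened once recall R is reached, minus (1 - R). The function names and the toy ranking below are hypothetical and are not taken from the paper or from the official task evaluation script, which may differ in details.

```python
def first_rank_reaching_recall(ranked_labels, recall):
    """Return the 1-based rank at which the target recall is first reached.

    ranked_labels: 0/1 relevance labels, ordered from top-ranked to last.
    """
    total_relevant = sum(ranked_labels)
    if total_relevant == 0:
        raise ValueError("the ranking contains no relevant articles")
    found = 0
    for rank, label in enumerate(ranked_labels, start=1):
        found += label
        if found >= recall * total_relevant:
            return rank
    return len(ranked_labels)


def wss_at_recall(ranked_labels, recall=0.95):
    """Work Saved over Sampling: unscreened fraction minus (1 - recall)."""
    n = len(ranked_labels)
    rank = first_rank_reaching_recall(ranked_labels, recall)
    return (n - rank) / n - (1.0 - recall)


def average_precision(ranked_labels):
    """Mean of precision@k over the ranks k that hold a relevant article."""
    total_relevant = sum(ranked_labels)
    if total_relevant == 0:
        raise ValueError("the ranking contains no relevant articles")
    found = 0
    precision_sum = 0.0
    for rank, label in enumerate(ranked_labels, start=1):
        if label:
            found += 1
            precision_sum += found / rank
    return precision_sum / total_relevant


if __name__ == "__main__":
    # Toy ranking of 10 articles, 3 of them relevant (made-up data).
    labels = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
    print(round(wss_at_recall(labels, 0.95), 3))   # 0.55
    print(round(average_precision(labels), 3))     # 0.917
```

In the toy ranking, all relevant articles are found after screening 4 of 10 articles, so 60% of the list is left unread; subtracting the 5% allowance for the recall target gives 0.55. Under this definition, a score of 0.650 corresponds to the roughly 65% reduction in screening workload mentioned in the abstract.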

Original language	English
Publication status	Published - 2017
Event	18th Working Notes of CLEF Conference and Labs of the Evaluation Forum, CLEF 2017 - Dublin, Ireland
Duration	11 Sept 2017 → 14 Sept 2017

Conference

Conference	18th Working Notes of CLEF Conference and Labs of the Evaluation Forum, CLEF 2017
Country/Territory	Ireland
City	Dublin
Period	11/09/2017 → 14/09/2017

Keywords

Evidence based medicine
Information storage and retrieval
Review literature as topic
Supervised machine learning

Cite this

@conference{f2340497cf70466db97aad2ab26cf0c7,
title = "LIMSI@CLEF eHealth 2017 task 2: Logistic regression for automatic article ranking",
abstract = "This paper describes the participation of the LIMSI-MIROR team at CLEF eHealth 2017, task 2. The task addresses the automatic ranking of articles in order to assist with the screening process of Diagnostic Test Accuracy (DTA) Systematic Reviews. We used a logistic regression classifier and handled class imbalance using a combination of class reweighting and undersampling. We also experimented with two strategies for relevance feedback. Our best run obtained an overall Average Precision of 0.179 and Work Saved over Sampling @95% Recall of 0.650. This run uses stochastic gradient descent for training but no feature selection or relevance feedback. We observe high performance variation within the queries in the test set. Nonetheless, our results suggest that automatic assistance is promising for ranking the DTA literature, as it could reduce the screening workload for review writers by 65% on average.",
keywords = "Evidence based medicine, Information storage and retrieval, Review literature as topic, Supervised machine learning",
author = "Christopher Norman and Mariska Leeflang and Aur{\'e}lie N{\'e}v{\'e}ol",
note = "Funding Information: This project has received funding from the European Union{\textquoteright}s Horizon 2020 research and innovation programme under the Marie Sklodowska-Curie grant agreement No 676207.; 18th Working Notes of CLEF Conference and Labs of the Evaluation Forum, CLEF 2017 ; Conference date: 11-09-2017 Through 14-09-2017",
year = "2017",
language = "English",
}

TY - CONF
T1 - LIMSI@CLEF eHealth 2017 task 2
T2 - 18th Working Notes of CLEF Conference and Labs of the Evaluation Forum, CLEF 2017
AU - Norman, Christopher
AU - Leeflang, Mariska
AU - Névéol, Aurélie
N1 - Funding Information: This project has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Sklodowska-Curie grant agreement No 676207.
PY - 2017
Y1 - 2017
N2 - This paper describes the participation of the LIMSI-MIROR team at CLEF eHealth 2017, task 2. The task addresses the automatic ranking of articles in order to assist with the screening process of Diagnostic Test Accuracy (DTA) Systematic Reviews. We used a logistic regression classifier and handled class imbalance using a combination of class reweighting and undersampling. We also experimented with two strategies for relevance feedback. Our best run obtained an overall Average Precision of 0.179 and Work Saved over Sampling @95% Recall of 0.650. This run uses stochastic gradient descent for training but no feature selection or relevance feedback. We observe high performance variation within the queries in the test set. Nonetheless, our results suggest that automatic assistance is promising for ranking the DTA literature, as it could reduce the screening workload for review writers by 65% on average.
AB - This paper describes the participation of the LIMSI-MIROR team at CLEF eHealth 2017, task 2. The task addresses the automatic ranking of articles in order to assist with the screening process of Diagnostic Test Accuracy (DTA) Systematic Reviews. We used a logistic regression classifier and handled class imbalance using a combination of class reweighting and undersampling. We also experimented with two strategies for relevance feedback. Our best run obtained an overall Average Precision of 0.179 and Work Saved over Sampling @95% Recall of 0.650. This run uses stochastic gradient descent for training but no feature selection or relevance feedback. We observe high performance variation within the queries in the test set. Nonetheless, our results suggest that automatic assistance is promising for ranking the DTA literature, as it could reduce the screening workload for review writers by 65% on average.
KW - Evidence based medicine
KW - Information storage and retrieval
KW - Review literature as topic
KW - Supervised machine learning
UR - http://www.scopus.com/inward/record.url?scp=85034747440&partnerID=8YFLogxK
M3 - Paper
Y2 - 11 September 2017 through 14 September 2017
ER -
