TY - JOUR
T1 - Inventory of tools for Dutch clinical language processing
AU - Cornet, Ronald
AU - van Eldik, Armand
AU - de Keizer, Nicolette
PY - 2012
Y1 - 2012
N2 - Automated encoding of free-text clinical narratives using concepts from terminological systems is widely performed. However, the majority of natural language processing (NLP) tools and terminological systems target the English language. As parts of the NLP process are language independent, and tools for various languages are available, an overview is needed to determine their applicability to NLP of Dutch medical texts. To this end, an inventory of tools was created. A literature study and internet search were performed to describe available components for a Dutch NLP system that can encode Dutch text as structured SNOMED CT output without the need to translate SNOMED CT into Dutch. We found 31 papers describing a variety of NLP frameworks and tools for the various NLP components for processing English and Dutch free text. Most of them are suitable for English free text; some are (also) usable for Dutch. To enable automated encoding of Dutch free-text narratives, further research is needed to create a spelling checker, a negation detector, a domain-specific abbreviation/acronym list, and a concept mapper (to map Dutch terms to concepts in a terminological system). Furthermore, evaluation of performance on Dutch 'medical' language is needed.
AB - Automated encoding of free-text clinical narratives using concepts from terminological systems is widely performed. However, the majority of natural language processing (NLP) tools and terminological systems target the English language. As parts of the NLP process are language independent, and tools for various languages are available, an overview is needed to determine their applicability to NLP of Dutch medical texts. To this end, an inventory of tools was created. A literature study and internet search were performed to describe available components for a Dutch NLP system that can encode Dutch text as structured SNOMED CT output without the need to translate SNOMED CT into Dutch. We found 31 papers describing a variety of NLP frameworks and tools for the various NLP components for processing English and Dutch free text. Most of them are suitable for English free text; some are (also) usable for Dutch. To enable automated encoding of Dutch free-text narratives, further research is needed to create a spelling checker, a negation detector, a domain-specific abbreviation/acronym list, and a concept mapper (to map Dutch terms to concepts in a terminological system). Furthermore, evaluation of performance on Dutch 'medical' language is needed.
M3 - Article
C2 - 22874189
SN - 0926-9630
VL - 180
SP - 245
EP - 249
JO - Studies in health technology and informatics
JF - Studies in health technology and informatics
ER -