Automated generation of ICD-11 cluster codes for precision medical record classification
Feng, Jiayi
2c955d5e-0116-49ff-8b7f-e56d9412e6a3
Zhang, Runtong
f7e5c4f9-6235-490e-a41c-0377c97f9324
Chen, Donghua
599b7788-df53-45ec-96b7-4bbd60a944b1
Shi, Lei
f1a82e79-8ed6-43d9-8d49-2b05437cc502
Li, Zhaoxing
65935c45-a640-496c-98b8-43bed39e1850
4 January 2024
Feng, Jiayi, Zhang, Runtong, Chen, Donghua, Shi, Lei and Li, Zhaoxing
(2024)
Automated generation of ICD-11 cluster codes for precision medical record classification.
International Journal of Computers, Communications and Control, 19 (1), [6251].
(doi:10.15837/ijccc.2024.1.6251).
Abstract
Accurate clinical coding using the International Classification of Diseases (ICD) standard is essential for healthcare analytics. ICD-11 introduces new coding guidelines and cluster structures, which pose challenges for existing coding tools. This research presents an automated approach to generating valid ICD-11 cluster codes from medical text. Natural language records are represented as vectors and compared to an ICD-11 corpus using cosine similarity. A bidirectional matching technique then refines the similarity estimates. Experiments demonstrate that the method achieves an F1 score of up to 0.91, significantly outperforming a baseline tool. This work enables efficient, high-quality ICD-11 coding to support healthcare informatics.
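The vector comparison step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the bag-of-words term-count representation, the function names (`vectorize`, `cosine`, `rank_codes`), and the three-entry corpus with placeholder codes are all assumptions made for illustration. The bidirectional matching refinement is not reproduced, since the abstract does not specify how it works.

```python
import math
import re
from collections import Counter


def vectorize(text):
    """Turn text into a term-count vector (assumed bag-of-words representation)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


def rank_codes(record, corpus):
    """Rank corpus entries by cosine similarity to the record, best first."""
    rec_vec = vectorize(record)
    scored = [(code, cosine(rec_vec, vectorize(desc))) for code, desc in corpus.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical mini corpus: the codes are placeholders, not real ICD-11 codes.
corpus = {
    "CODE-A": "type 2 diabetes mellitus",
    "CODE-B": "essential hypertension",
    "CODE-C": "acute bronchitis",
}

ranked = rank_codes("patient with type 2 diabetes", corpus)
for code, score in ranked:
    print(f"{code}: {score:.3f}")
```

In this toy setup the record shares three terms with the first description and none with the others, so that entry ranks first. A realistic system would likely use weighted or learned vectors rather than raw term counts.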
Text: 6251 - Version of Record
More information
e-pub ahead of print date: 4 January 2024
Published date: 4 January 2024
Additional Information:
Publisher Copyright:
© 2024 by the authors. Licensee Agora University, Oradea, Romania. This is an open access article distributed under the terms and conditions of the Creative Commons Attribution-NonCommercial 4.0 International License. Journal’s webpage: http://univagora.ro/jour/index.php/ijccc/. All Rights Reserved.
Keywords:
ICD code, ICD-11, clinical coding, machine learning, text similarity
Identifiers
Local EPrints ID: 486483
URI: http://eprints.soton.ac.uk/id/eprint/486483
PURE UUID: 60d4d0f6-b2a0-4b8b-a0e4-40b64e82123b
Catalogue record
Date deposited: 24 Jan 2024 17:35
Last modified: 21 May 2024 02:08
Contributors
Author: Jiayi Feng
Author: Runtong Zhang
Author: Donghua Chen
Author: Lei Shi
Author: Zhaoxing Li