The University of Southampton
University of Southampton Institutional Repository

P319 natural language processing and named entity recognition in inflammatory bowel disease referrals

Introduction: clinical natural language processing (NLP) techniques are evolving, such that over the next few years they will start to support clinicians in interpreting clinical information. Named entity recognition and linkage (NER+L) to standard ontologies with millions of concepts, such as the Unified Medical Language System (UMLS), adds value to otherwise unstructured textual data. However, little research has been done in the field of inflammatory bowel disease (IBD).

Methods: anonymised GP referral letters, triaged between 1st January 2017 and 31st March 2021 as likely new or recurrent IBD by a panel of gastroenterologists using an agreed protocol, were randomly extracted. NLP was applied to the referral free text in Python using MedCAT, a model trained on the UMLS database.
Manual validation was performed to determine sensitivity against ground truth for finding positive mentions of four cardinal clinical signs and symptoms. Sensitivity = TP/(TP + FN), where TP is the number of true positives and FN the number of false negatives, was the outcome of greatest interest. The chi-squared test was used for statistical comparison at the p<0.05 level.
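
The two statistics named in the methods can be sketched in a few lines of standard-library Python. This is an illustrative sketch only, not the authors' code: the function names, the layout of the 2x2 table (human validation vs model, mention found vs not found) and all counts are assumptions, since the abstract does not report them.

```python
from math import erfc, sqrt


def sensitivity(tp: int, fn: int) -> float:
    """Sensitivity = TP / (TP + FN), as defined in the methods."""
    if tp + fn == 0:
        raise ValueError("no positive ground-truth mentions to evaluate")
    return tp / (tp + fn)


def chi2_2x2(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Pearson chi-squared test (no continuity correction) for [[a, b], [c, d]].

    Returns (statistic, p_value). With 1 degree of freedom the p-value
    equals erfc(sqrt(statistic / 2)).
    """
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be non-zero")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    return stat, erfc(sqrt(stat / 2))


# Hypothetical counts for one symptom across 125 letters (not from the study):
# human validation found a positive mention in 40 letters, the model in 25 of those.
sens = sensitivity(tp=25, fn=15)
stat, p = chi2_2x2(40, 85, 25, 100)  # rows: human, model; columns: found, not found
print(f"sensitivity={sens:.3f}, chi2={stat:.2f}, p={p:.3f}")
```

With these made-up counts the model recovers 62.5% of the human-validated mentions, and the difference in detection counts falls below the p<0.05 threshold.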

Results: 125 referral letters were included in this study. Median age was 39 (IQR: 30-50) and 51.2% were male (95% CI: 42.4-60.1). 22.4% (n=28) of the cohort had pre-existing IBD. Table 1 summarises the performance of the algorithm against the human validations, which were taken as correct: [P319 Table 1 NLP model outcome parameters not included]. Diarrhoea and abdominal pain were both the most frequently mentioned and the most successfully detected by MedCAT; however, significant differences were flagged in all cases.
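
The confidence interval around the proportion of male patients can be roughly checked as follows. The abstract does not say which interval method was used, so the normal-approximation (Wald) interval here is an assumption, as is the count of 64 male patients, which is inferred from 51.2% of 125. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def wald_interval(successes: int, n: int, level: float = 0.95) -> tuple[float, float]:
    """Normal-approximation (Wald) confidence interval for a proportion."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need 0 <= successes <= n and n > 0")
    p = successes / n
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    half_width = z * sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


# 64 of 125 is inferred from the reported 51.2%; it is not stated in the abstract.
low, high = wald_interval(64, 125)
print(f"{100 * low:.1f}% to {100 * high:.1f}%")
```

Under these assumptions the bounds come out close to the reported interval; a small discrepancy in the upper bound may reflect a different interval method or rounding.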

Conclusions: significant differences were observed between human validations and model predictions for four common IBD signs and symptoms, suggesting that these models are not yet mature enough for use in clinical practice. Annotations for more difficult concepts, such as rectal bleeding and weight loss, need to be improved in major open-source NLP corpora.

Stammers, Matt, George, Michael, Regas, Constantinos, May, Georgia, Rahmany, Sohail, Davis, Cai, Borca, Florina, Knibbs, Will, Chandrabalan, Vishnu V. and Gwiggner, Markus (2022) P319 natural language processing and named entity recognition in inflammatory bowel disease referrals. Gut, 71, A194-A195. (doi:10.1136/gutjnl-2022-BSG.370).

Record type: Meeting abstract

This record has no associated files available for download.

More information

e-pub ahead of print date: 19 June 2022

Identifiers

Local EPrints ID: 478009
URI: http://eprints.soton.ac.uk/id/eprint/478009
ISSN: 1468-3288
PURE UUID: 4a61ae8d-db31-430f-b7ca-28989f6dbfed
ORCID for Matt Stammers: orcid.org/0000-0003-3850-3116

Catalogue record

Date deposited: 19 Jun 2023 16:55
Last modified: 21 Sep 2024 02:15

Contributors

Author: Matt Stammers
Author: Michael George
Author: Constantinos Regas
Author: Georgia May
Author: Sohail Rahmany
Author: Cai Davis
Author: Florina Borca
Author: Will Knibbs
Author: Vishnu V. Chandrabalan
Author: Markus Gwiggner
