University of Southampton Institutional Repository

PTH-36 identification & service evaluation of a primary sclerosing cholangitis cohort using natural language processing

Introduction: primary sclerosing cholangitis (PSC) is a rare and difficult-to-treat condition. PSC is strongly associated with malignancy; screening and surveillance are therefore paramount. However, PSC does not have a unique UK ICD-10 diagnostic code, so reliable patient cohort identification and thorough service evaluation are challenging. We used natural language processing (NLP) to identify the PSC patient cohort at University Hospital Southampton (UHS) and audited associated outcomes against recently updated British Society of Gastroenterology (BSG) management guidelines.

Method: records of all patients with PSC at our institution between 2008 and 2020 were identified using our NLP methodology. We used fuzzy matching to analyse clinical records, and tokenized and lemmatized key paragraphs to identify key diagnostic patterns and to exclude sentences that express diagnostic uncertainty or exclude the diagnosis. Anonymised discharge summaries, clinic letters, radiology reports, endoscopy records and histology were extracted and digitally trawled to identify the cohort characteristics.
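
The kind of pipeline described in the Method can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the tools, cue words or thresholds used, so the cue list, the toy lemma table, the 0.85 similarity threshold and every function name below are assumptions chosen for illustration, and only the Python standard library is used.

```python
import re
from difflib import SequenceMatcher

TARGET = "primary sclerosing cholangitis"

# Hypothetical cue words marking uncertainty or exclusion of the diagnosis.
# A real deployment would need a clinically validated list.
CUES = {"no", "not", "possible", "query", "unlikely", "exclude", "without"}

# Toy lemma table standing in for a real lemmatizer (assumption).
LEMMAS = {"excluded": "exclude", "excludes": "exclude", "possibly": "possible"}


def tokenize(sentence):
    """Lowercase the sentence and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", sentence.lower())


def lemmatize(tokens):
    """Map each token to its lemma where the toy table knows one."""
    return [LEMMAS.get(token, token) for token in tokens]


def mentions_target(tokens, threshold=0.85):
    """Fuzzy-match sliding token windows against the target phrase.

    Tolerates small misspellings; the bare abbreviation 'psc' also counts.
    """
    width = len(TARGET.split())
    for start in range(len(tokens) - width + 1):
        window = " ".join(tokens[start:start + width])
        if SequenceMatcher(None, window, TARGET).ratio() >= threshold:
            return True
    return "psc" in tokens


def classify_sentence(sentence):
    """Label a sentence 'positive', 'excluded' or 'no_mention'."""
    tokens = lemmatize(tokenize(sentence))
    if not mentions_target(tokens):
        return "no_mention"
    if CUES & set(tokens):
        return "excluded"  # uncertain or negated mention, so not counted
    return "positive"


def document_is_positive(text):
    """A document counts if at least one sentence is a positive mention."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return any(classify_sentence(s) == "positive" for s in sentences)
```

In this sketch, a misspelled mention such as "primary sclerosng cholangitis" is still matched, while "no evidence of primary sclerosing cholangitis" is set aside. Treating any cue word anywhere in the sentence as disqualifying is deliberately crude; scoping cues to the mention itself would be the natural refinement.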

Results: we identified 125 patients with PSC followed up at UHS. 39.2% (49) of these patients were missed in a parallel criterion-based review of case notes. We calculated an age-standardised point prevalence of 12.52 cases per 100,000 patients, 124% higher than typically cited UK figures. Service evaluation revealed high rates of clinic follow-up but lower-than-recommended rates of screening with colonoscopy and imaging (see Table 1). The introduction of a combined PSC/IBD clinic, a targeted service delivery intervention, is addressing this shortfall, with significant impact after 1 year. [PTH-36 Table 1 not included].

Conclusions: PSC cohorts are difficult to identify because there is no unique UK clinical code. An NLP-based methodology proved highly effective at identifying all cases within our institution, with a 64.5% increase in identified patients compared with conventional methods. This allowed rapid patient cohort identification and conversion of unstructured data into clinically useful structured data, and the approach could be reproduced at other institutions and for other diseases.
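
The two headline percentages are consistent with each other, which a short calculation makes explicit. The sketch below assumes that the conventional criterion-based review found exactly the patients the NLP cohort contains minus the 49 it missed (125 - 49 = 76); the abstract does not state this count directly. The implied comparison prevalence is likewise only back-calculated from the stated "124% higher" figure, not a value reported in the abstract. All function names are illustrative.

```python
def missed_fraction(total, missed):
    """Share of the NLP cohort that the conventional review missed."""
    return missed / total


def relative_increase(total, missed):
    """Extra patients found by NLP, relative to the conventional count."""
    conventional = total - missed  # assumption: 125 - 49 = 76 patients
    return missed / conventional


def implied_baseline(observed, percent_higher):
    """Baseline prevalence implied by 'observed is X% higher than baseline'."""
    return observed / (1 + percent_higher / 100)


nlp_total, missed = 125, 49
print(f"missed by conventional review: {missed_fraction(nlp_total, missed):.1%}")
print(f"increase over conventional review: {relative_increase(nlp_total, missed):.1%}")
print(f"implied comparison prevalence: {implied_baseline(12.52, 124):.2f} per 100,000")
```

Under these assumptions, 49 of 125 gives the 39.2% quoted in the Results, and 49 extra patients on a base of 76 gives the 64.5% quoted in the Conclusions. The two figures describe the same 49 patients against different denominators.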

Livingstone, Robert, Phan, Hang, Borca, Florina, Sarkar, Srishti, Minto, Moeed, Dixey, Annie, Qaisar, Razzi, Patel, Janisha, Stammers, Matthew and Gwiggner, Markus (2021) PTH-36 identification & service evaluation of a primary sclerosing cholangitis cohort using natural language processing. Gut, 70, A188-A189. (doi:10.1136/gutjnl-2021-BSG.351).

Record type: Meeting abstract

This record has no associated files available for download.

More information

e-pub ahead of print date: 7 November 2021

Identifiers

Local EPrints ID: 478006
URI: http://eprints.soton.ac.uk/id/eprint/478006
ISSN: 1468-3288
PURE UUID: e7e7daa2-fa4e-4cac-b9fa-12ce22995d03
ORCID for Matthew Stammers: orcid.org/0000-0003-3850-3116

Catalogue record

Date deposited: 19 Jun 2023 16:52
Last modified: 21 Sep 2024 02:15

Contributors

Author: Robert Livingstone
Author: Hang Phan
Author: Florina Borca
Author: Srishti Sarkar
Author: Moeed Minto
Author: Annie Dixey
Author: Razzi Qaisar
Author: Janisha Patel
Author: Matthew Stammers
Author: Markus Gwiggner