PTH-36 identification & service evaluation of a primary sclerosing cholangitis cohort using natural language processing
Introduction: primary sclerosing cholangitis (PSC) is a rare and difficult-to-treat condition. PSC is strongly associated with malignancy; screening and surveillance are therefore paramount. However, PSC does not have a unique UK ICD-10 diagnostic code, so reliable patient cohort identification and thorough service evaluation are challenging. We used natural language processing (NLP) to identify the PSC patient cohort at University Hospital Southampton (UHS) and audited associated outcomes against recently updated British Society of Gastroenterology (BSG) management guidelines.
Method: records of all patients with PSC at our institution between 2008 and 2020 were identified using our NLP methodology. We used fuzzy matching to analyse clinical records, and tokenized and lemmatized key paragraphs to identify key diagnostic patterns and to exclude sentences that expressed diagnostic uncertainty or ruled out the diagnosis. Anonymised discharge summaries, clinic letters, radiology reports, endoscopy records and histology were extracted and digitally trawled to identify the cohort characteristics.
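The abstract does not give implementation details for the NLP step, so the following is only a minimal sketch of the general idea described in the Method paragraph: fuzzy matching of the disease name within tokenized sentences, followed by exclusion of sentences that contain uncertainty or negation cues. The cue list, the 0.85 similarity threshold and all function names are hypothetical choices for illustration, not the authors' method. Lemmatization is omitted here because the Python standard library has no lemmatizer; a dedicated NLP library would normally supply one.

```python
import re
from difflib import SequenceMatcher

TARGET = "primary sclerosing cholangitis"

# Hypothetical cue words marking uncertain or excluded diagnoses.
# Inflected forms are listed explicitly because this sketch does not lemmatize.
UNCERTAIN_OR_NEGATED = {
    "no", "not", "without", "unlikely", "possible", "possibly",
    "query", "exclude", "excluded", "excludes",
}


def tokenize(sentence):
    """Lowercase the sentence and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", sentence.lower())


def fuzzy_mentions_target(tokens, threshold=0.85):
    """Return True if any three-token window is similar enough to TARGET."""
    n = len(TARGET.split())
    for i in range(len(tokens) - n + 1):
        window = " ".join(tokens[i:i + n])
        if SequenceMatcher(None, window, TARGET).ratio() >= threshold:
            return True
    return False


def is_positive_mention(sentence):
    """A sentence counts only if it mentions the target and has no cue word."""
    tokens = tokenize(sentence)
    if not fuzzy_mentions_target(tokens):
        return False
    return not (set(tokens) & UNCERTAIN_OR_NEGATED)


def split_sentences(text):
    """Naive sentence splitter on terminal punctuation followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def record_suggests_psc(text):
    """Flag a record if at least one sentence is a positive mention."""
    return any(is_positive_mention(s) for s in split_sentences(text))
```

For example, `record_suggests_psc("Primary sclerosing cholangitus confirmed on MRCP.")` flags the record despite the misspelling, whereas a sentence such as "No evidence of primary sclerosing cholangitis." is excluded by the cue list. Treating any cue word anywhere in the sentence as disqualifying is deliberately crude; it would also reject sentences where the cue refers to something else.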
Results: we identified 125 patients with PSC followed up at UHS. Of these, 49 (39.2%) were missed in a parallel criterion-based review of case notes.
We calculated an age-standardised point prevalence of 12.52 cases per 100,000 patients, 124% higher than typically cited UK figures. Service evaluation revealed high rates of clinic follow-up but lower-than-recommended rates of screening with colonoscopy and imaging (see Table 1). The introduction of a combined PSC/IBD clinic as a targeted service delivery intervention is addressing this shortfall, with significant impact after 1 year. [PTH-36 Table 1 not included].
Conclusions: PSC cohorts are difficult to identify because of the lack of a UK clinical code. An NLP-based methodology proved highly effective at identifying all cases within our institution, with a 64.5% increase in cases identified compared with conventional methods. This allowed rapid patient cohort identification and conversion of unstructured data to clinically useful structured data, and the approach could be reproduced at other institutions and for other diseases.
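The 39.2% and 64.5% figures use different denominators, which is easy to miss. The short calculation below is a sketch showing how both follow from the two counts reported in the Results, assuming that the conventional (criterion-based) review found the 125 NLP-identified patients minus the 49 it missed. That count of 76 is derived here and is not stated in the abstract; the variable names are illustrative.

```python
# Counts reported in the Results
nlp_cases = 125          # patients identified by the NLP methodology
missed_by_review = 49    # of those, missed by the criterion-based review

# Derived under the assumption stated above (not given in the abstract)
review_cases = nlp_cases - missed_by_review

# Share of the NLP cohort that the review missed (denominator: NLP cohort)
missed_pct = 100 * missed_by_review / nlp_cases

# Relative increase of NLP over the review (denominator: review cohort)
increase_pct = 100 * (nlp_cases - review_cases) / review_cases

print(f"review cohort: {review_cases}")
print(f"missed by review: {missed_pct:.1f}%")
print(f"increase with NLP: {increase_pct:.1f}%")
```

Both percentages describe the same 49 patients: 49 of 125 is the proportion missed, while 49 relative to 76 is the gain over the conventional method.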
A188-A189
Livingstone, Robert
a84784e0-c608-40b2-8aec-83369b4b159e
Phan, Hang
2811b94c-62b7-459d-9cc1-c88057008e3b
Borca, Florina
31fc3965-6bcf-4fd6-85bc-8b0f99f62473
Sarkar, Srishti
da8b5d21-3c9a-4a50-a581-776d6b63dbd3
Minto, Moeed
eab66e61-6d4e-459a-af6d-0637eab3fcef
Dixey, Annie
8284eb96-ef07-4d6b-bfaa-b727107cb524
Qaisar, Razzi
fc2fa660-10e6-4f5b-8c0c-25a571ac9738
Patel, Janisha
e44aaa1e-7bdd-4b67-9fdf-a21767f00bee
Stammers, Matthew
a4ad3bd5-7323-4a6d-9c00-2c34f8ae5bd3
Gwiggner, Markus
af72b597-1ead-4155-a25c-0835f7e560c2
Livingstone, Robert, Phan, Hang, Borca, Florina, Sarkar, Srishti, Minto, Moeed, Dixey, Annie, Qaisar, Razzi, Patel, Janisha, Stammers, Matthew and Gwiggner, Markus
(2021)
PTH-36 identification & service evaluation of a primary sclerosing cholangitis cohort using natural language processing.
Gut, 70, A188-A189.
(doi:10.1136/gutjnl-2021-BSG.351).
Record type:
Meeting abstract
More information
e-pub ahead of print date: 7 November 2021
Identifiers
Local EPrints ID: 478006
URI: http://eprints.soton.ac.uk/id/eprint/478006
ISSN: 1468-3288
PURE UUID: e7e7daa2-fa4e-4cac-b9fa-12ce22995d03
Catalogue record
Date deposited: 19 Jun 2023 16:52
Last modified: 21 Sep 2024 02:15
Contributors
Author:
Robert Livingstone
Author:
Hang Phan
Author:
Florina Borca
Author:
Srishti Sarkar
Author:
Moeed Minto
Author:
Annie Dixey
Author:
Razzi Qaisar
Author:
Janisha Patel
Author:
Matthew Stammers
Author:
Markus Gwiggner