University of Southampton Institutional Repository

BAKER: Bayesian kernel uncertainty in domain-specific document modelling

Azam, Ubaid, Razzak, Imran, Vishwakarma, Shelly, Hacid, Hakim, Zhang, Dell and Jameel, Shoaib (2025) BAKER: Bayesian kernel uncertainty in domain-specific document modelling. In WSDM 2025 - Proceedings of the 18th ACM International Conference on Web Search and Data Mining. Association for Computing Machinery. pp. 382-391. (doi:10.1145/3701551.3703517).

Record type: Conference or Workshop Item (Paper)

Abstract

In critical domains such as healthcare and law, accurately modelling the uncertainty of automatic computational models is essential. For instance, healthcare models must produce reliable estimates to guide human decision-making. However, modelling uncertainty remains challenging, particularly for models handling low-resource datasets and complex, domain-specific vocabulary. Most existing predictive models produce point estimates rather than probability distributions, which limits our ability to quantify model uncertainty. This paper introduces a novel model, BAKER, designed to address these limitations. BAKER combines the strengths of Bayesian inference, known for its effectiveness in modelling uncertainty, and kernel methods, which excel at capturing complex data relationships. Incorporating kernel functions enhances model performance, particularly by reducing overfitting in data-limited scenarios. Our experimental analysis shows that BAKER significantly improves uncertainty reasoning compared to existing models.

Text
3701551.3703517 - Version of Record
Available under License Creative Commons Attribution.

More information

Published date: 10 March 2025
Venue - Dates: 18th ACM International Conference on Web Search and Data Mining, WSDM 2025, Hannover, Germany, 2025-03-10 - 2025-03-14
Keywords: Bayesian Inference, Kernel methods, Language models, Reliability

Identifiers

Local EPrints ID: 503258
URI: http://eprints.soton.ac.uk/id/eprint/503258
PURE UUID: 66147c07-315d-4deb-a24e-1c9a461dfeaf

Catalogue record

Date deposited: 25 Jul 2025 16:38
Last modified: 21 Aug 2025 05:10

Contributors

Author: Ubaid Azam
Author: Imran Razzak
Author: Shelly Vishwakarma
Author: Hakim Hacid
Author: Dell Zhang
Author: Shoaib Jameel
