Conformal Compressors
Standard machine learning predictors become more effective as larger input datasets become available. While this improves predictive power, it often incurs substantial computational cost. One way to mitigate this is to replace large datasets with smaller, carefully crafted representations that retain the essential properties of the original data. In this paper, we revisit this approach through the interaction of coresets (small, provably correct summaries of data) and Conformal Prediction, a robust and general method for calibrating machine learning predictions. Specifically, we build on existing work to introduce Conformal Compressors, a coreset-inspired method that uses Conformal Prediction for data compression. Initial results indicate that these compressors capture meaningful information from the data while showing significantly better stability and reliability than Uniform Random Sampling and state-of-the-art coreset constructions.
machine learning, data reduction, conformal prediction, coresets
Riquelme Granada, Nery
0e85e1ec-9964-4d98-8e80-98d585371b27
Nguyen, Khuong
0f8e7820-0a65-4538-9a40-c642f4623c0a
Luo, Zhiyuan
4d7f51ba-822e-4a6a-ac96-2f8ec041f414
Text
1-s2.0-S0031320325011781-main - Version of Record
More information
Accepted/In Press date: 25 September 2025
e-pub ahead of print date: 3 October 2025
Identifiers
Local EPrints ID: 506384
URI: http://eprints.soton.ac.uk/id/eprint/506384
ISSN: 0031-3203
PURE UUID: 312cdd68-1f25-4834-a2b7-c8c4c1f2dde1
Catalogue record
Date deposited: 05 Nov 2025 17:51
Last modified: 05 Nov 2025 18:11
Contributors
Author:
Nery Riquelme Granada
Author:
Khuong Nguyen
Author:
Zhiyuan Luo