University of Southampton Institutional Repository

IML4DQ: interactive machine learning for data quality with applications in credit risk

Data Quality (DQ) has gained popularity in recent years due to the increasing reliance on data in machine learning (ML). The DQ domain itself can benefit from ML, which is able to learn from large amounts of data, saving the time and resources required by manual DQ assurance. To extend the accessibility of ML solutions and incorporate human input, Interactive ML (IML) integrates ML with a user interface (UI) that facilitates a human-in-the-loop approach. Both high-quality data and human involvement are critical in credit risk management (CRM), where poor DQ can lead to incorrect decisions, causing both ethical issues and financial losses. This paper introduces IML4DQ, a novel IML-based solution designed to ensure DQ in CRM through a dedicated UI. The IML4DQ design is grounded in established IML practices and key UI design principles. A rigorous evaluation using behavioral change theories reveals new insights into the significance of instrumental attitude and of government- and management-based norms in shaping attitudes towards DQ in CRM, as well as a positive attitude towards automating DQ processes with IML.

Tiukhova, Elena, Salcuni, Adriano, Oguz, Can, Forte, Fabio, Baesens, Bart and Snoeck, Monique (2025) IML4DQ: interactive machine learning for data quality with applications in credit risk. In Comuzzi, Marco, Grigori, Daniela, Sellami, Mohamed and Zhou, Zhangbing (eds.) Cooperative Information Systems - 30th International Conference, CoopIS 2024, Proceedings. vol. 15506, Springer Cham. pp. 315-326. (doi:10.1007/978-3-031-81375-7_18).

Record type: Conference or Workshop Item (Paper)

Text
CoopIS_2024_camera_ready - Accepted Manuscript
Restricted to Repository staff only until 14 February 2026.

More information

Accepted/In Press date: 15 September 2024
Published date: 14 February 2025
Keywords: Credit risk, Data quality, Interactive machine learning

Identifiers

Local EPrints ID: 500464
URI: http://eprints.soton.ac.uk/id/eprint/500464
ISSN: 0302-9743
PURE UUID: a5e4ee4c-d3f2-47ca-938d-5dcde5447ab4
ORCID for Bart Baesens: orcid.org/0000-0002-5831-5668

Catalogue record

Date deposited: 30 Apr 2025 16:57
Last modified: 01 May 2025 01:39

Contributors

Author: Elena Tiukhova
Author: Adriano Salcuni
Author: Can Oguz
Author: Fabio Forte
Author: Bart Baesens
Author: Monique Snoeck
Editor: Marco Comuzzi
Editor: Daniela Grigori
Editor: Mohamed Sellami
Editor: Zhangbing Zhou
