Measuring the consistency with prior knowledge of classification models
The growing popularity of the Internet and innovations in storage technology have caused a true data explosion. The overall process of extracting knowledge from this vast amount of data is called data mining. Classification is a subtask of data mining and involves assigning a data point to a predefined class or group according to its predictive characteristics. The classification problem and the accompanying data mining techniques are relevant in a wide variety of domains, such as financial engineering, medical diagnosis and marketing.
The performance of a classification model is typically measured by its accuracy; however, justifiability is also a major requirement in many data mining applications. Justifiability concerns the extent to which the model is in line with prior domain knowledge. The best-known justifiability requirement is the monotonicity constraint (e.g. increasing income should yield an increasing probability of being granted a loan). Several adaptations to existing classification techniques have been proposed to cope with justifiability, yet a measure of the extent to which models conform to these requirements is still lacking. A new measure is proposed that uses decision tables to provide a crisp performance measure for the critical justifiability requirement.
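To make the monotonicity constraint concrete, the following is a minimal sketch of how violations could be counted in a small decision table. It is not the measure proposed in the paper, which the abstract does not specify. The function name `monotonicity_score`, the table layout and the example rows are all hypothetical, chosen only for illustration.

```python
from itertools import combinations


def monotonicity_score(rows, attribute, outcome="decision"):
    """Fraction of comparable row pairs that respect a monotonicity constraint.

    Assumption (illustrative only): two rows are comparable when they agree
    on every condition except `attribute`. A pair violates the constraint
    when the row with the higher attribute value has the lower outcome.
    """
    comparable = 0
    violations = 0
    for a, b in combinations(rows, 2):
        others_equal = all(
            a[k] == b[k] for k in a if k not in (attribute, outcome)
        )
        if not others_equal or a[attribute] == b[attribute]:
            continue
        comparable += 1
        low, high = sorted((a, b), key=lambda r: r[attribute])
        if high[outcome] < low[outcome]:
            violations += 1
    # With no comparable pairs there is nothing to violate.
    return 1.0 if comparable == 0 else 1 - violations / comparable


# Hypothetical loan decision table: income is ordinal (0 = low, 2 = high),
# decision is 1 for "grant" and 0 for "reject".
table = [
    {"income": 0, "collateral": 0, "decision": 0},
    {"income": 1, "collateral": 0, "decision": 1},
    {"income": 2, "collateral": 0, "decision": 0},  # higher income, worse outcome
    {"income": 0, "collateral": 1, "decision": 1},
    {"income": 1, "collateral": 1, "decision": 1},
    {"income": 2, "collateral": 1, "decision": 1},
]

income_score = monotonicity_score(table, "income")
collateral_score = monotonicity_score(table, "collateral")
```

In this toy table, one of the six comparable income pairs breaks the constraint, so the income score falls below 1, while the collateral score stays at 1. A pairwise count is only one of several plausible ways to quantify conformity.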
Martens, David
Vanthienen, Jan
Baesens, Bart
Mues, Christophe
11 September 2006
Martens, David, Vanthienen, Jan, Baesens, Bart and Mues, Christophe (2006) Measuring the consistency with prior knowledge of classification models. Operational Research Conference 48. 10 - 12 Sep 2006.
Record type: Conference or Workshop Item (Paper)
This record has no associated files available for download.
More information
Published date: 11 September 2006
Venue - Dates:
Operational Research Conference 48, 2006-09-10 - 2006-09-12
Identifiers
Local EPrints ID: 80363
URI: http://eprints.soton.ac.uk/id/eprint/80363
PURE UUID: af9d11ab-721e-4e74-9c54-a3a276b59245
Catalogue record
Date deposited: 24 Mar 2010
Last modified: 08 Apr 2022 01:38
Contributors
Author:
David Martens
Author:
Jan Vanthienen