University of Southampton Institutional Repository

Benchmarking classification models for software defect prediction: a proposed framework and novel findings

Lessmann, Stefan, Baesens, Bart, Mues, Christophe and Pietsch, Swantje (2008) Benchmarking classification models for software defect prediction: a proposed framework and novel findings. IEEE Transactions on Software Engineering, 34 (4), pp. 485-496. (doi:10.1109/TSE.2008.35).

Record type: Article


Software defect prediction strives to improve software quality and testing efficiency by constructing predictive classification models from code attributes to enable a timely identification of fault-prone modules. Several classification models have been evaluated for this task. However, due to inconsistent findings regarding the superiority of one classifier over another and the usefulness of metric-based classification in general, more research is needed to improve convergence across studies and further advance confidence in experimental results. We consider three potential sources of bias: comparing classifiers over one or a small number of proprietary datasets, relying on accuracy indicators that are conceptually inappropriate for software defect prediction and cross-study comparisons, and, finally, limited use of statistical testing procedures to secure empirical findings. To remedy these problems, a framework for comparative software defect prediction experiments is proposed and applied in a large-scale empirical comparison of 22 classifiers over ten public domain datasets from the NASA Metrics Data repository. Our results indicate that the importance of the particular classification algorithm may have been overestimated in previous research, since no significant performance differences could be detected among the top 17 classifiers.
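The abstract calls for statistical testing when several classifiers are compared over several datasets, but does not name a procedure. One common choice for this setting is to rank the classifiers on each dataset and compute the Friedman statistic over the average ranks. The following is a minimal sketch under that assumption, not the authors' implementation: the function names and the example scores are hypothetical, and the statistic is the basic form without a correction for ties.

```python
from typing import Dict, List


def average_ranks(scores: Dict[str, List[float]]) -> Dict[str, float]:
    """Rank classifiers on each dataset (rank 1 = highest score) and
    average the ranks over datasets. Tied scores share the mean rank."""
    names = list(scores)
    n_datasets = len(next(iter(scores.values())))
    totals = {name: 0.0 for name in names}
    for d in range(n_datasets):
        ordered = sorted(names, key=lambda n: scores[n][d], reverse=True)
        i = 0
        while i < len(ordered):
            # Extend j over the run of classifiers tied with ordered[i].
            j = i
            while (j + 1 < len(ordered)
                   and scores[ordered[j + 1]][d] == scores[ordered[i]][d]):
                j += 1
            mean_rank = ((i + 1) + (j + 1)) / 2.0
            for name in ordered[i:j + 1]:
                totals[name] += mean_rank
            i = j + 1
    return {name: totals[name] / n_datasets for name in names}


def friedman_statistic(scores: Dict[str, List[float]]) -> float:
    """Friedman chi-square statistic for k classifiers over n datasets:
    12n / (k(k+1)) * (sum of squared average ranks - k(k+1)^2 / 4)."""
    k = len(scores)
    n = len(next(iter(scores.values())))
    ranks = average_ranks(scores)
    sum_sq = sum(r * r for r in ranks.values())
    return 12.0 * n / (k * (k + 1)) * (sum_sq - k * (k + 1) ** 2 / 4.0)


if __name__ == "__main__":
    # Hypothetical per-dataset scores (one list entry per dataset);
    # these numbers are illustrative and are not results from the paper.
    example = {
        "clf_a": [0.80, 0.75, 0.90, 0.70],
        "clf_b": [0.78, 0.74, 0.88, 0.69],
        "clf_c": [0.70, 0.60, 0.85, 0.65],
    }
    print(average_ranks(example))
    print(friedman_statistic(example))
```

The statistic is compared against a chi-square distribution with k - 1 degrees of freedom; a large value suggests that at least one classifier ranks consistently differently from the others, after which a post-hoc test is typically used to locate the differences.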

Full text not available from this repository.

More information

e-pub ahead of print date: 23 May 2008
Published date: July 2008
Keywords: complexity measures, data mining, formal methods, statistical methods
Organisations: Management


Local EPrints ID: 63006
PURE UUID: bb7eb90c-683c-48b2-aa8f-ffbc3b288fd8

Catalogue record

Date deposited: 14 Oct 2008
Last modified: 17 Jul 2017 14:19



Author: Stefan Lessmann
Author: Bart Baesens
Author: Christophe Mues
Author: Swantje Pietsch
