Assessment of a New Three-Group Software Quality Classification Technique: An Empirical Case Study

Authors: Taghi Khoshgoftaar1; Naeem Seliya2; Kehan Gao3

Source: Empirical Software Engineering, Volume 10, Number 2, April 2005 , pp. 183-218(36)

Publisher: Springer

Buy & download fulltext article:

OR

Price: $47.00 plus tax (Refund Policy)

Abstract:

The primary aim of risk-based software quality classification models is to detect, prior to testing or operations, components that are most-likely to be of high-risk. Their practical usage as quality assurance tools is gauged by the prediction-accuracy and cost-effective aspects of the models. Classifying modules into two risk groups is the more commonly practiced trend. Such models assume that all modules predicted as high-risk will be subjected to quality improvements. Due to the always-limited reliability improvement resources and the variability of the quality risk-factor, a more focused classification model may be desired to achieve cost-effective software quality assurance goals. In such cases, calibrating a three-group (high-risk, medium-risk, and low-risk) classification model is more rewarding. We present an innovative method that circumvents the complexities, computational overhead, and difficulties involved in calibrating pure or direct three-group classification models. With the application of the proposed method, practitioners can utilize an existing two-group classification algorithm thrice in order to yield the three risk-based classes. An empirical approach is taken to investigate the effectiveness and validity of the proposed technique. Some commonly used classification techniques are studied to demonstrate the proposed methodology. They include, the C4.5 decision tree algorithm, discriminant analysis, and case-based reasoning. For the first two, we compare the three-group model calibrated using the respective techniques with the one built by applying the proposed method. Any two-group classification technique can be employed by the proposed method, including those that do not provide a direct three-group classification model, e.x., logistic regression and certain binary classification trees, such as CART. Based on a case study of a large-scale industrial software system, it is observed that the proposed method yielded promising results. For a given classification technique, the expected cost of misclassification of the proposed three-group models were significantly better (generally) when compared to the technique’s direct three-group model. In addition, the proposed method is also evaluated against an alternate indirect three-group classification method.

Keywords: Software quality prediction; three-group classification; discriminant analysis; decision trees; case-based reasoning; expected cost of misclassification

Document Type: Research article

DOI: http://dx.doi.org/10.1007/s10664-004-6191-x

Affiliations: 1: Empirical Software Engineering Laboratory, Department of Computer Science and Engineering, Florida Atlantic University, Boca Raton, FL, 33431, USA, Email: taghi@cse.fau.edu 2: Empirical Software Engineering Laboratory, Department of Computer Science and Engineering, Florida Atlantic University, Boca Raton, FL, 33431, USA, Email: nseliya@cse.fau.edu 3: Empirical Software Engineering Laboratory, Department of Computer Science and Engineering, Florida Atlantic University, Boca Raton, FL, 33431, USA, Email: kgao@cse.fau.edu

Publication date: 2005-04-01

Related content

Key

Free Content
Free content
New Content
New content
Open Access Content
Open access content
Subscribed Content
Subscribed content
Free Trial Content
Free trial content

Text size:

A | A | A | A
Share this item with others: These icons link to social bookmarking sites where readers can share and discover new web pages. print icon Print this page