Skip to main content

Agricultural Case Studies of Classification Accuracy, Spectral Resolution, and Model Over-Fitting

Buy Article:

$29.00 plus tax (Refund Policy)


This paper describes the relationship between spectral resolution and classification accuracy in analyses of hyperspectral imaging data acquired from crop leaves. The main scope is to discuss and reduce the risk of model over-fitting. Over-fitting of a classification model occurs when too many and/or irrelevant model terms are included (i.e., a large number of spectral bands), and it may lead to low robustness/repeatability when the classification model is applied to independent validation data. We outline a simple way to quantify the level of model over-fitting by comparing the observed classification accuracies with those obtained from explanatory random data. Hyperspectral imaging data were acquired from two crop‐insect pest systems: (1) potato psyllid (Bactericera cockerelli) infestations of individual bell pepper plants (Capsicum annuum) with the acquisition of hyperspectral imaging data under controlled-light conditions (data set 1), and (2) sugarcane borer (Diatraea saccharalis) infestations of individual maize plants (Zea mays) with the acquisition of hyperspectral imaging data from the same plants under two markedly different image-acquisition conditions (data sets 2a and b). For each data set, reflectance data were analyzed based on seven spectral resolutions by dividing 160 spectral bands from 405 to 907 nm into 4, 16, 32, 40, 53, 80, or 160 bands. In the two data sets, similar classification results were obtained with spectral resolutions ranging from 3.1 to 12.6 nm. Thus, the size of the initial input data could be reduced fourfold with only a negligible loss of classification accuracy. In the analysis of data set 1, several validation approaches all demonstrated consistently that insect-induced stress could be accurately detected and that therefore there was little indication of model over-fitting. In the analyses of data set 2, inconsistent validation results were obtained and the observed classification accuracy (81.06%) was only a few percentage points above that obtained using random data (66.7‐77.4%). Thus, our analysis highlights a potential risk of model over-fitting and emphasizes the importance of testing for this important aspect as part of developing reliable and robust classification models.

Keywords: Biotic stress detection; Crop stress; Hyperspectral imaging; Model classification.; Parsimony principle; Reflectance-based detection

Document Type: Research Article


Affiliations: University of Western Australia, School of Animal Biology, UWA Institute of Agriculture, 35 Stirling Highway, Crawley, Perth, Western Australia 6009, Australia

Publication date: November 1, 2013

More about this publication?

Access Key

Free Content
Free content
New Content
New content
Open Access Content
Open access content
Subscribed Content
Subscribed content
Free Trial Content
Free trial content
Cookie Policy
Cookie Policy
ingentaconnect website makes use of cookies so as to keep track of data that you have filled in. I am Happy with this Find out more