Impact of reference datasets and autocorrelation on classification accuracy

Reference data and accuracy assessments via error matrices form the foundation for measuring the success of classifications. An error matrix is often based on the traditional holdout method, which uses only one training/test dataset. If the training/test dataset does not fully represent the variability in a population, accuracy may be over- or underestimated. Furthermore, reference data may be flawed by spatial errors or autocorrelation, which may lead to overoptimistic results. For a forest study, we first corrected spatially erroneous ground data and then used aerial photography to sample additional reference data around the field-sampled plots (Mannel et al. 2006). These reference data were used to classify forest cover and subsequently to determine classification success. Cross-validation randomly separates datasets into several training/test sets and is well documented to provide a more precise accuracy measure than the traditional holdout method. However, random cross-validation of autocorrelated data may overestimate accuracy; in our case the overestimate was between 5% and 8% for a 90% confidence interval. In addition, we observed accuracies differing by up to 35% for different land cover classes, depending on which training/test datasets were used. The observed discrepancies illustrate the need to pay attention to autocorrelation and to use more than one permanent training/test dataset, for example, through a k-fold holdout method.
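The contrast between random cross-validation and a split that respects autocorrelation can be illustrated with a minimal sketch. It is not the authors' procedure: the function names, the `plot_ids` grouping, and the choice of three folds are all assumptions made for illustration. The idea is that reference samples taken around the same field plot are likely autocorrelated, so a grouped split keeps each plot's samples together in one fold, whereas a random split may place near-duplicates in both the training and the test set.

```python
import random
from collections import defaultdict


def random_kfold(n_samples, k, seed=0):
    """Yield (train, test) index lists; samples are shuffled individually."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test


def grouped_kfold(groups, k, seed=0):
    """Yield (train, test) index lists; all samples sharing a group label
    (here a hypothetical field-plot ID) are kept in the same fold."""
    by_group = defaultdict(list)
    for i, g in enumerate(groups):
        by_group[g].append(i)
    labels = sorted(by_group)
    random.Random(seed).shuffle(labels)
    folds = [labels[i::k] for i in range(k)]
    for i in range(k):
        test = [j for g in folds[i] for j in by_group[g]]
        train = [j for f, fold in enumerate(folds) if f != i
                 for g in fold for j in by_group[g]]
        yield train, test


def shared_groups(groups, train, test):
    """Return the group labels that occur in both the training and test set."""
    return {groups[i] for i in train} & {groups[i] for i in test}


# Hypothetical layout: 6 field plots, 4 photo-interpreted samples around each.
plot_ids = [p for p in range(6) for _ in range(4)]

for name, splits in [("random", random_kfold(len(plot_ids), 3)),
                     ("grouped", grouped_kfold(plot_ids, 3))]:
    for train, test in splits:
        leaked = shared_groups(plot_ids, train, test)
        print(name, "plots in both train and test:", sorted(leaked))
```

In the grouped variant no plot contributes samples to both sides of a split, so the test accuracy is not inflated by neighbouring samples of the same plot. Averaging an error matrix over all folds, rather than relying on a single training/test split, also addresses the dependence on one particular holdout set.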

Now at: Cottey College, 6000 W. Austin, Nevada, MO 64772, USA.

Document Type: Research Article

Affiliations: 1: Department of Geosciences, Idaho State University, Pocatello, ID 83209, USA 2: Department of Geology and Geological Engineering, South Dakota School of Mines and Technology, 501 E St Joseph Street, Rapid City, SD 57701, USA 3: Department of Forest Resources, University of Minnesota, 1530 Cleveland Avenue North, St Paul, MN 55108-6112, USA

Publication date: 10 October 2011
