Model-based inference for k-nearest neighbours predictions using a canonical vine copula
Abstract: The k-nearest neighbours (k-NN) technique combines field data from forest inventories with auxiliary information to estimate forest resources at various geographical scales. In this study, auxiliary data consisting of Landsat 5 TM satellite imagery and terrain elevations were used to perform k-NN imputations of plot-level above-ground biomass. Following the model-based inference framework, a superpopulation model consisting of a canonical vine copula was constructed from the empirical data, and new samples were generated from the model and used for k-NN predictions. The method makes it possible to construct the sampling distribution of the imputation errors and to assess the statistical properties of the k-NN estimator. Using a data-splitting procedure, the copula-based approach was assessed against pair-bootstrap resampling. Imputations were performed with k = 1, where k is the number of neighbours, and with optimal k values selected according to a bias-minimizing criterion. The empirical coverage probabilities of the confidence intervals constructed using the copula-based approach were closer to the nominal coverages than those obtained with the bootstrap. The improvements were due to a significant reduction in bias, although the standard errors were higher than those of the bootstrap. Still, the root mean squared error was significantly reduced. The best results were obtained using the copula approach and k-NN imputations with k = 1.
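As a rough illustration of the pair-bootstrap baseline mentioned in the abstract, the following Python sketch performs k-NN imputation and resamples the reference (X, y) pairs to approximate the sampling distribution of the mean prediction. It is a minimal sketch under stated assumptions, not the study's implementation: the data are synthetic, the function names knn_impute and pair_bootstrap_means are hypothetical, Euclidean distance is assumed, and the canonical vine copula step is not reproduced here.

```python
import numpy as np

def knn_impute(X_ref, y_ref, X_target, k=1):
    """Predict y for each target row as the mean of its k nearest reference rows."""
    # Euclidean distance between every target and every reference observation
    d = np.linalg.norm(X_target[:, None, :] - X_ref[None, :, :], axis=2)
    nearest = np.argsort(d, axis=1)[:, :k]
    return y_ref[nearest].mean(axis=1)

def pair_bootstrap_means(X_ref, y_ref, X_target, k=1, n_boot=200, seed=0):
    """Resample (X, y) pairs with replacement; return the mean k-NN prediction per replicate."""
    rng = np.random.default_rng(seed)
    n = len(y_ref)
    means = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # indices drawn with replacement
        means[b] = knn_impute(X_ref[idx], y_ref[idx], X_target, k).mean()
    return means

# Synthetic stand-in data (assumption: not the Landsat/elevation data of the study)
rng = np.random.default_rng(42)
X = rng.normal(size=(150, 3))                              # auxiliary variables
y = 100 + 20 * X[:, 0] + rng.normal(scale=5, size=150)     # "biomass" response

# Data splitting: reference plots for imputation, held-out plots for assessment
X_ref, y_ref = X[:100], y[:100]
X_target, y_target = X[100:], y[100:]

boot = pair_bootstrap_means(X_ref, y_ref, X_target, k=1)
lo, hi = np.percentile(boot, [2.5, 97.5])   # percentile confidence interval
bias = boot.mean() - y_target.mean()        # estimated bias against held-out mean
std_err = boot.std(ddof=1)                  # bootstrap standard error
```

Under the copula-based approach described above, the resampling line would presumably be replaced by drawing new (X, y) samples from the fitted superpopulation model; the interval, bias, and standard error summaries would be computed in the same way.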
Document Type: Research Article
Affiliations: Department of Ecology and Natural Resource Management, Norwegian University of Life Sciences, Ås, Norway
Publication date: April 1, 2013