Nonparametric and non-linear models and data mining in time series: a case-study on the Canadian lynx data
Nonparametric regression methods are used as exploratory tools for formulating, identifying and estimating non-linear models for the Canadian lynx data, which have attained benchmark status in the time series literature since the work of Moran in 1953. To avoid the curse of dimensionality in the nonparametric analysis of this short series of 114 observations, we confine attention to the restricted class of additive and projection pursuit regression (PPR) models and rely on the estimated prediction error variance to compare the predictive performance of various linear and non-linear models. A PPR model is found to have the smallest in-sample estimated prediction error variance of all the models fitted to these data in the literature. We use a data perturbation procedure to assess and adjust for the effect of data mining on the estimated prediction error variances; after this adjustment, most models fitted to the lynx data are comparable and nearly equivalent. However, on the basis of out-of-sample mean-squared prediction error, the semiparametric model $X_t = 1.08 + 1.37 X_{t-1} + f(X_{t-2}) + e_t$ and Tong's self-exciting threshold autoregression model perform much better than the PPR and other models known for the lynx data.
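To make the form of the semiparametric model concrete, the following is a minimal sketch of how a model with a linear term in the first lag and a smooth function of the second lag could be fitted by backfitting. It is not the procedure used in the paper: the Nadaraya-Watson smoother, the rule-of-thumb bandwidth, the helper names `nw_smooth` and `fit_partially_linear`, and the simulated series are all assumptions made for illustration, and the lynx data themselves are not used.

```python
import numpy as np

def nw_smooth(x, r, h):
    """Nadaraya-Watson estimate of E[r | x] at the sample points (Gaussian kernel)."""
    w = np.exp(-0.5 * ((x[:, None] - x[None, :]) / h) ** 2)
    return (w @ r) / w.sum(axis=1)

def fit_partially_linear(series, n_iter=20):
    """Backfit X_t = c + b*X_{t-1} + f(X_{t-2}) + e_t (illustrative helper)."""
    y, x1, x2 = series[2:], series[1:-1], series[:-2]
    # Assumed bandwidth: normal-reference rule of thumb, not a tuned choice.
    h = 1.06 * x2.std() * len(x2) ** (-0.2)
    design = np.column_stack([np.ones_like(x1), x1])
    f_hat = np.zeros_like(y)
    for _ in range(n_iter):
        # Linear step: least squares for (c, b) given the current f estimate.
        coef, *_ = np.linalg.lstsq(design, y - f_hat, rcond=None)
        # Smooth step: regress the partial residuals on the second lag.
        f_hat = nw_smooth(x2, y - design @ coef, h)
        f_hat -= f_hat.mean()  # centre f so the intercept stays identifiable
    coef, *_ = np.linalg.lstsq(design, y - f_hat, rcond=None)
    resid = y - design @ coef - f_hat
    return coef, f_hat, resid

# Simulated series with a known structure (assumed, purely for illustration).
rng = np.random.default_rng(0)
n, burn = 400, 100
x = np.zeros(n + burn)
for t in range(2, n + burn):
    x[t] = 0.5 * x[t - 1] - 0.8 * np.tanh(x[t - 2]) + 0.1 * rng.standard_normal()
x = x[burn:]

coef, f_hat, resid = fit_partially_linear(x)
print("intercept, slope on first lag:", coef)
print("in-sample residual variance:", np.mean(resid ** 2))
```

The residual variance printed here is an in-sample quantity, so it is subject to the same optimism that the abstract attributes to data mining; a comparison in the spirit of the paper would also evaluate predictions on observations held out from the fit.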