Calibrated imputation in surveys under a quasi-model-assisted approach
We propose to use calibrated imputation to compensate for missing values. This technique consists of finding final imputed values that are as close as possible to preliminary imputed values and are calibrated to satisfy constraints. Preliminary imputed values, potentially justified by an imputation model, are obtained through deterministic single imputation. Using appropriate constraints, the resulting imputed estimator is asymptotically unbiased for estimation of linear population parameters such as domain totals. A quasi-model-assisted approach is considered in the sense that inferences do not depend on the validity of an imputation model and are made with respect to the sampling design and a non-response model. An imputation model may still be used to generate imputed values and thus to improve the efficiency of the imputed estimator. This approach has the characteristic of handling naturally the situation where more than one imputation method is used owing to missing values in the variables that are used to obtain imputed values. We use the Taylor linearization technique to obtain a variance estimator under a general non-response model. For the logistic non-response model, we show that ignoring the effect of estimating the non-response model parameters leads to overestimating the variance of the imputed estimator. In practice, the overestimation is expected to be moderate or even negligible, as shown in a simulation study.