Handling missing data in diaries of alcohol consumption
Missing data can rarely be avoided in large-scale studies in which subjects are requested to complete questionnaires with many items. Analyses of such surveys are often based on the records with no missing items, resulting in a loss of efficiency and, when data are missing not at random, in bias. This paper applies the method of multiple imputation to handle missing data in an analysis of alcohol consumption of the subjects in the Medical Research Council National Survey of Health and Development. The outcomes studied are derived from the entries in diaries of food and drink intake over seven designated days. Background variables and other responses related to alcohol consumption and associated problems are used as collateral information. In conventional analyses, subpopulation means of quantities of alcohol consumed are compared. Since we are interested in the harmful effects of alcohol, we make inferences about the percentages of those who consume more than a given quantity of net alcohol. We assess the contribution to the analyses made by the incomplete records and outline a more integrated way of applying multiple imputation in large-scale longitudinal surveys.
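As an illustration of how inferences about a percentage above a threshold can be combined across completed data sets, the following is a minimal sketch of the standard combining rules for multiple imputation (Rubin's rules). It is not the procedure used in the paper: the function name `pool_proportion`, the threshold of 21 units, the toy data and the simple binomial within-imputation variance (which assumes simple random sampling) are all assumptions made for the example.

```python
import math

def pool_proportion(completed_datasets, threshold):
    """Pool the proportion of values above `threshold` over m >= 2
    completed (imputed) data sets using Rubin's combining rules."""
    m = len(completed_datasets)
    estimates, variances = [], []
    for values in completed_datasets:
        n = len(values)
        p = sum(v > threshold for v in values) / n
        estimates.append(p)
        # Assumed binomial sampling variance under simple random sampling.
        variances.append(p * (1 - p) / n)
    pooled = sum(estimates) / m                      # MI point estimate
    within = sum(variances) / m                      # within-imputation variance
    between = sum((p - pooled) ** 2 for p in estimates) / (m - 1)
    total = within + (1 + 1 / m) * between           # total variance
    return pooled, math.sqrt(total)

# Hypothetical weekly alcohol totals for four subjects in three completed
# data sets; only the imputed entries differ between the sets.
datasets = [
    [10, 25, 30, 5],
    [10, 25, 18, 5],
    [10, 25, 30, 22],
]
pooled, se = pool_proportion(datasets, threshold=21)
```

The between-imputation component reflects the uncertainty due to the missing values; comparing it with the within-imputation component gives a rough indication of how much information the incomplete records contribute.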