Spatial variation in risk of disease: a nonparametric binary regression approach
A common problem in environmental epidemiology is the estimation and mapping of spatial variation in disease risk. In this paper we analyse data from the Walsall District Health Authority, UK, on the spatial distributions of cancer cases compared with controls sampled from the population register. We formulate risk estimation as a nonparametric binary regression problem and consider two methods of estimation. The first uses a standard kernel method, with a cross-validation criterion for choosing the associated bandwidth parameter. The second uses the framework of the generalized additive model (GAM), which has the advantage that it can allow for additional explanatory variables but is computationally more demanding. For the Walsall data, we obtain similar results using either the kernel method, with controls stratified by age and sex to match the age-sex distribution of the cases, or the GAM method, with random controls but incorporating age and sex as additional explanatory variables. For cancers of the lung or stomach, the analysis shows highly statistically significant spatial variation in risk. For the less common cancers of the pancreas, the spatial variation in risk is not statistically significant.
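To make the kernel formulation concrete, the following is a minimal sketch of nonparametric binary regression on case-control locations, with cases coded 1 and controls coded 0. It is an illustration under stated assumptions rather than the procedure used in the paper: the Gaussian kernel, the leave-one-out log-likelihood criterion, the candidate bandwidths, the function names, and the synthetic data are all choices made for this example, and the Walsall data are not reproduced.

```python
import numpy as np

def kernel_weights(points, centres, h):
    """Gaussian kernel weights between evaluation points and data locations.
    The Gaussian kernel is an assumption made for this sketch."""
    d2 = ((points[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-0.5 * d2 / h**2)

def risk_surface(points, xy, y, h):
    """Kernel (Nadaraya-Watson) estimate of P(case | location) at `points`."""
    w = kernel_weights(points, xy, h)
    return (w @ y) / w.sum(axis=1)

def loo_log_likelihood(xy, y, h, eps=1e-10):
    """Leave-one-out binomial log-likelihood for bandwidth h.
    This is one common cross-validation criterion, assumed here for illustration."""
    w = kernel_weights(xy, xy, h)
    np.fill_diagonal(w, 0.0)  # exclude each observation from its own fit
    p = (w @ y) / np.maximum(w.sum(axis=1), eps)
    p = np.clip(p, eps, 1 - eps)  # keep the logarithms finite
    return np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

def choose_bandwidth(xy, y, candidates):
    """Return the candidate bandwidth with the highest leave-one-out score."""
    scores = [loo_log_likelihood(xy, y, h) for h in candidates]
    return candidates[int(np.argmax(scores))]

# Synthetic data (hypothetical): risk is higher in the eastern half of a unit square.
rng = np.random.default_rng(0)
xy = rng.uniform(0, 1, size=(500, 2))
true_p = 0.2 + 0.6 * (xy[:, 0] > 0.5)
y = (rng.uniform(size=500) < true_p).astype(float)

candidates = [0.05, 0.1, 0.2, 0.4]
h = choose_bandwidth(xy, y, candidates)

# Estimated probability of being a case at one western and one eastern location.
grid = np.array([[0.25, 0.5], [0.75, 0.5]])
p_hat = risk_surface(grid, xy, y, h)
```

Under case-control sampling, the estimated probability of being a case reflects the ratio of cases to controls in the sample, so the fitted surface is best read as showing relative spatial variation in risk rather than absolute risk. Adjusting for age and sex, as the abstract describes, would require either stratified controls or a model that accepts covariates, such as a GAM.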