Evaluating the uncertainty caused by Post Office Box addresses in environmental health studies: A restricted Monte Carlo approach
Abstract: A Monte Carlo approach is used to evaluate the uncertainty caused by incorporating Post Office Box (PO Box) addresses in point-cluster detection for an environmental-health study. Conventional geocoding places PO Box addresses at the centroids of postcode polygons, which can introduce significant error into a cluster analysis of the resulting point data. In the restricted Monte Carlo method presented in this paper, an address that cannot be matched to a precise location is assigned a random location within the smallest polygon believed to contain that address. These random locations are then combined with the locations of precisely matched addresses, and cluster analysis is performed on the resulting dataset. After this randomization-and-analysis process has been repeated many times, the variance in the calculated cluster evaluation statistics can be used to estimate the uncertainty caused by the addresses that cannot be precisely matched. The method maximizes the use of the available spatial information while also providing a quantitative estimate of the uncertainty in that use. It is applied to lung-cancer data from Grafton County, New Hampshire, USA, in which PO Box addresses account for more than half of the address dataset. The results show that less than 50% of the detected cluster area can be considered to have high certainty.
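The randomization-and-analysis loop described in the abstract can be sketched roughly as follows. This is a minimal illustration in Python, not the paper's implementation. The function names, the use of rejection sampling to draw a uniform random location inside a polygon, and the simple count-within-radius statistic are all assumptions made for the example; the abstract does not say which cluster evaluation statistics or sampling scheme the paper uses.

```python
import random
import statistics


def point_in_polygon(x, y, polygon):
    """Ray-casting test; polygon is a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            # x-coordinate at which the edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def random_point_in_polygon(polygon, rng):
    """Assumed sampling scheme: rejection sampling from the bounding box."""
    xs = [p[0] for p in polygon]
    ys = [p[1] for p in polygon]
    while True:
        x = rng.uniform(min(xs), max(xs))
        y = rng.uniform(min(ys), max(ys))
        if point_in_polygon(x, y, polygon):
            return (x, y)


def count_within(points, center, radius):
    """Hypothetical stand-in statistic: number of cases within radius of center."""
    cx, cy = center
    return sum((x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2 for x, y in points)


def restricted_monte_carlo(matched, unmatched_polygons, statistic,
                           n_runs=1000, seed=0):
    """Repeat randomization and analysis; return mean and variance of the statistic.

    matched            -- (x, y) locations of precisely matched addresses
    unmatched_polygons -- one polygon per unmatched address (e.g. PO Box),
                          the smallest polygon believed to contain it
    statistic          -- function mapping a list of points to a number
    n_runs             -- number of repetitions (must be at least 2)
    """
    rng = random.Random(seed)
    values = []
    for _ in range(n_runs):
        # Draw a fresh random location for every imprecisely matched address.
        simulated = [random_point_in_polygon(poly, rng)
                     for poly in unmatched_polygons]
        # Analyze the combined dataset of fixed and simulated locations.
        values.append(statistic(matched + simulated))
    return statistics.mean(values), statistics.variance(values)
```

As a usage example under the same assumptions, calling `restricted_monte_carlo(matched, polygons, lambda pts: count_within(pts, (1.0, 1.0), 1.0))` returns the mean and variance of the count across runs. A variance near zero would suggest that the statistic is insensitive to where the imprecisely matched addresses actually fall, while a large variance would indicate high uncertainty.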
Document Type: Research Article
Affiliations: Department of Geography, Dartmouth College, Hanover, NH 03755
Publication date: January 1, 2007