Improving Geocoding Match Rates with Spatially‐Varying Block Metrics
Address ranges used in linear interpolation geocoding often contain errors and omissions, which cause input address numbers to fall outside known address ranges. Geocoding systems may match these input addresses to the closest available address range and assign low confidence values (match scores) in order to increase match rates, but little has been published describing the matching or scoring techniques these systems use. This article sheds light on these practices by investigating the need for, technical approaches to, and utility of nearby matching methods used to increase match rates in geocoded data. The scope of the problem is illustrated through an analysis of a commonly used health dataset. The technical approach of a geocoding system that includes nearby matching is described, along with a method for scoring candidates based on spatially‐varying neighborhoods. This method, termed dynamic nearby reference feature scoring, identifies, scores, ranks, and returns the most probable candidate to which the input address feature belongs or is spatially near. The approach is evaluated against commercial systems to assess its effectiveness and resulting spatial accuracy. Results indicate that it is viable for improving match rates while maintaining acceptable levels of spatial accuracy.
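The general idea of nearby matching can be illustrated with a minimal sketch. It is not the article's dynamic nearby reference feature scoring method: it assumes a fixed penalty per house number of gap, whereas the article scores candidates using spatially‐varying neighborhoods. The names `AddressRange`, `range_gap`, `match_nearby`, `max_gap`, and `penalty_per_unit`, along with the 0 to 100 score scale, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class AddressRange:
    """Hypothetical reference feature: a street segment with a known number range."""
    segment_id: str
    low: int
    high: int


def range_gap(number: int, rng: AddressRange) -> int:
    """Gap in house numbers between `number` and the range (0 if inside it)."""
    if number < rng.low:
        return rng.low - number
    if number > rng.high:
        return number - rng.high
    return 0


def match_nearby(
    number: int,
    ranges: List[AddressRange],
    max_gap: int = 200,              # assumed cutoff, not taken from the article
    penalty_per_unit: float = 0.25,  # assumed fixed penalty, not taken from the article
) -> Optional[Tuple[AddressRange, float]]:
    """Return the best-scoring candidate range and its match score, or None.

    An exact match (number inside a range) scores 100. A nearby match is
    penalized in proportion to how far the number falls outside the range.
    """
    best: Optional[Tuple[AddressRange, float]] = None
    for rng in ranges:
        gap = range_gap(number, rng)
        if gap > max_gap:
            continue  # too far away to be a plausible nearby match
        score = max(0.0, 100.0 - penalty_per_unit * gap)
        if best is None or score > best[1]:
            best = (rng, score)
    return best


if __name__ == "__main__":
    ranges = [AddressRange("A", 100, 198), AddressRange("B", 300, 398)]
    print(match_nearby(150, ranges))   # falls inside range A
    print(match_nearby(210, ranges))   # outside both ranges, nearest is A
    print(match_nearby(1000, ranges))  # beyond max_gap for every range
```

In this sketch, an address number just past the end of a known range is still matched to that range, but with a reduced score that downstream users can filter on. A fixed penalty treats every street the same way; replacing it with a value derived from local block characteristics is the kind of refinement the spatially‐varying approach addresses.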
Document Type: Research Article
Affiliations: Spatial Sciences Institute, University of Southern California
Publication date: 2011-12-01