Matching spatial data sets: a statistical approach
Although the acquisition and maintenance of spatial data are costly and time-consuming, the same real-world objects are very often captured in many different data models, at different acquisition times, with different quality characteristics, or at different scales. This situation will intensify as more and more digital spatial data are offered over the internet. Integration methods are needed to take advantage of the characteristics of more than one data set. These advantages could include, for example, new applications for which the data models were not originally designed, higher reusability, improved quality, or lower data acquisition costs. This paper introduces a relational matching approach for integrating spatial data from different sources. The research was performed on street centrelines captured in different data models. The approach is based on statistical investigations of the relationship between the data of two data models and can be applied in the same way to other data models. The matching problem is mapped onto a communication system, and measures derived from information theory are used to find an optimal solution. These measures can also be used to calculate local and global quality characteristics of the matching result.
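
To illustrate the kind of information-theoretic measure mentioned above, the following minimal sketch estimates the mutual information between a discrete attribute as observed in two data sets. It is not the paper's method: the abstract does not say which measures are used, and the function name `mutual_information`, the road-class labels, and the sample pairs are all hypothetical.

```python
import math
from collections import Counter


def mutual_information(pairs):
    """Estimate mutual information (in bits) between two discrete
    variables from a list of (a, b) observations.

    Hypothetical setting: each pair holds the attribute value of one
    object in data set A and the value of its candidate partner in
    data set B.
    """
    n = len(pairs)
    joint = Counter(pairs)                 # counts of (a, b) combinations
    left = Counter(a for a, _ in pairs)    # marginal counts for data set A
    right = Counter(b for _, b in pairs)   # marginal counts for data set B

    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        # p(a,b) / (p(a) * p(b)) rewritten in terms of raw counts
        ratio = count * n / (left[a] * right[b])
        mi += p_ab * math.log2(ratio)
    return mi


# Assumed sample data: attribute values agree perfectly between sets.
dependent = [("major", "highway"), ("minor", "local")] * 5

# Assumed sample data: attribute values are unrelated between sets.
independent = [
    ("major", "highway"), ("major", "local"),
    ("minor", "highway"), ("minor", "local"),
]

mi_dependent = mutual_information(dependent)
mi_independent = mutual_information(independent)
```

Under these assumed inputs, the perfectly corresponding attributes yield 1 bit of mutual information and the unrelated ones yield 0 bits. A higher value means that knowing the attribute in one data set tells more about the attribute in the other, which is the intuition behind treating matching as transmission over a communication channel.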