I was recently asked if we could provide historical risk information for an insurance application. Typically the GIS data we provide is a static representation of the real world. For example, every year we release a data set that represents US coastlines. Users of this data get a new set of coastlines that they typically just drop into their systems, overwriting the ones they previously had. Insurers may keep old coastlines in an archive for historical rating processes, but mortgage underwriters and others who use the coastline data just replace their current version with the most recent release.
What's missing from the replacement process I just described is that it doesn't allow us to capture and summarize changes in the data over time. If the coastline has moved inward 100 feet due to hurricane storm surge, we don't capture that change in the latest data set, and unless we keep the old coastline on the system and do some form of change detection we can miss it. Similarly, when a postal code is split into two new postal delivery areas, it can be hard to track which old polygon was split. We often deliver these types of changes in change log files and release notes, but it is hard for users to integrate all of the new information without going through each change feature by feature.
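To make the change-detection idea concrete, here is a minimal sketch that compares two hypothetical releases. It assumes each feature is just a tuple of (lon, lat) vertices and flags geometries that disappeared or appeared between releases; real GIS change detection would use proper spatial predicates, but the point is the same: without the old release on hand, there is nothing to compare against.

```python
# Naive release-to-release change detection on raw geometries.
# Feature data here is illustrative, not from any real data set.

def geometry_key(coords):
    """Reduce a geometry to a hashable key for exact comparison."""
    return tuple(round(c, 6) for point in coords for c in point)

def detect_changes(old_features, new_features):
    """Return geometry keys removed from and added to the new release."""
    old_keys = {geometry_key(f) for f in old_features}
    new_keys = {geometry_key(f) for f in new_features}
    return old_keys - new_keys, new_keys - old_keys

old = [[(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]]
new = [[(0.0, 0.0), (1.0, 0.0), (1.0, 0.9)]]  # one coastline vertex moved
removed, added = detect_changes(old, new)
print(len(removed), len(added))  # the moved vertex shows up as 1 removed, 1 added
```

Note the limitation this exposes: without stable feature identities, a small vertex shift looks like one feature deleted and an unrelated one created, which is why matching changes feature by feature is so laborious.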
The really big advantage that is missed by not handling old data is the identification of trends. If, for example, an insurer were able to determine that a particular part of the coastline is receding in a given direction, they could price policies accordingly. We have captured trends in demographic and geodemographic data, but not so much with physical features. In the demographic and geodemographic data the trends are captured as statistical attributes. Physical features are tougher because we don't tend to describe a coastline as moving 100 feet north-west as an attribute. Trend information can help companies plan for oncoming changes; knowing where the optical fiber is buried can help planners decide where to deploy 5G resources.
One solution is to just supply the old data sets and let people figure out how to identify changing trends in the data themselves. But that is a lot of data to keep around, and what methods will be used to identify changes? Another is to pick trends and capture them with each release of the data. This approach would save storage space and file management time. Some data sets that we deliver already include temporal information. Our weather data, for example, has the date and time of each weather event recorded as attributes and records events going back to 1995. Each cell tower record can have the date it was built or modified captured as an attribute. But changes to physical features are harder to identify and describe. Keeping records of the changes from one release to the next is fine, but what if we want to see changes over a period of years that might span several data releases? Our PSAP boundary data has shown a steady decrease in the number of boundaries over recent years. Are there easier ways to identify these trends and represent them in the data sets that we provide?
One part of the solution is to assign a unique ID to each geographic feature in the data. If we can make those IDs persistent across releases, then we can quickly identify features that have changed over time.
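With persistent IDs, the comparison becomes a dictionary diff rather than a geometric search. A minimal sketch, assuming features are keyed by ID and the geometry payloads here are placeholder strings (the ZIP-style IDs and split naming are hypothetical):

```python
# Release-to-release comparison once features carry persistent IDs.

def diff_releases(old, new):
    """Classify feature IDs as added, removed, or modified between releases."""
    added = set(new) - set(old)
    removed = set(old) - set(new)
    modified = {fid for fid in set(old) & set(new) if old[fid] != new[fid]}
    return added, removed, modified

old = {"ZIP-10001": "polygon-v1", "ZIP-10002": "polygon-v1"}
new = {"ZIP-10001": "polygon-v2",      # boundary redrawn in place
       "ZIP-10002-A": "polygon-v1",    # old delivery area split in two
       "ZIP-10002-B": "polygon-v1"}
added, removed, modified = diff_releases(old, new)
print(sorted(added), sorted(removed), sorted(modified))
# ['ZIP-10002-A', 'ZIP-10002-B'] ['ZIP-10002'] ['ZIP-10001']
```

The split case still needs a convention (for example, derived IDs that reference the parent feature) so that lineage survives the split, but the redrawn boundary is now trivially detectable.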