Data Sandbox

Layers and Layers of (Spatial Data) Goodness

  • 1.  Layers and Layers of (Spatial Data) Goodness

    Pitney Bowes
    Posted 06-21-2019 12:03


    Imagine your typical birthday cake: two cake layers, separated by a layer of cream filling and topped with icing, maybe some berries and candles. Now imagine spatial data: it's better for you than cake anyway (sugar is bad for you), and it has layers too.

    In data and mapping we think and work in layers all the time. We typically source, produce, and perform quality control based on individual data layers: a layer of POIs, another of streets, yet another of parcels. Entire products can be a single layer. We might source a 'layer' of property attributes from company X, then a 'layer' of consumer data from company Y.

    We source a homemade birthday cake similarly (we buy some cake mix, some icing ingredients, some berries from a local farm, and candles). But once the candles are blown out, we eat the cake as a whole, not layer by layer. Sure, everyone has a favorite layer, but the quality of the cake depends on all those layers being thoughtfully and properly integrated.

    Similarly, spatial data consumers rarely consume one layer of data at a time; instead they integrate multiple layers to solve business problems.  While we cannot ignore the sourcing, production and quality control of the data elements in an individual layer, we need to take the next (frankly more difficult) step and check the interplay between the layers as a consumer uses them.  To do this properly, we must think like the data consumer who uses multiple layers at a time, and that's where things tend to break down.  Organizational lines (or product lines) are frequently drawn along data layers, and you can bet that folks are going to focus on their part of the business. This means that over time quality gaps will develop in how the layers work together. You don't want the customer who tries to integrate the layers to be the one to discover that they are not properly integrated…

    I offer 3 types of checks to help detect cross-layer issues: 

    1. Spatial coherence between the layers. Do the data elements in the layers have acceptable absolute and relative accuracy between them? Relative accuracy issues become more critical as you add layers, because all the layers need to work well together.  For example, do your buildings, address points and POIs spatially align, or do they look like 3 disjointed layers?  What do your roads, water polygons and parcels reveal when combined?  Consider specifications and checks designed to test spatial relationships such as these.
    2. ID interoperability between the layers. Do you have shared IDs between the layers? Ideally you would, since that allows the layers to be easily linked to each other.  If you do, how well do the shared IDs match between data layers?  How complete is the match between them?  Do the mismatches reveal completeness issues in an individual layer?
    3. Correctness issues revealed by the combination of the layers. Both a lack of standardization and the duplication of data elements across layers become easy to see once the layers are combined.  For example, an address delivered on each of 3 layers (say, address points, POIs, and property attributes) may differ from layer to layer and reveal correctness issues.  A parcel with a data value indicating vacant land, combined with a layer showing a building and a POI associated with that parcel, shows that something is wrong. Consider specifications and checks designed to catch correctness issues such as these.
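
    To make the three checks above a bit more concrete, here is a minimal Python sketch. It is not how any particular product runs its QC; the function names (share_inside, id_match, vacant_but_occupied), the plain-tuple geometry, and the "vacant" land-use value are all assumptions made for illustration. A real pipeline would more likely use a spatial library or database, and would work in a projected coordinate system with tolerances.

```python
from typing import Dict, List, Set, Tuple

Point = Tuple[float, float]
Polygon = List[Point]  # ring of vertices, not repeated at the end (assumption)


def point_in_polygon(pt: Point, poly: Polygon) -> bool:
    """Ray-casting test; points exactly on an edge are not treated specially."""
    x, y = pt
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point count.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def share_inside(points: List[Point], polygons: List[Polygon]) -> float:
    """Check 1: fraction of points (e.g. address points) inside any polygon (e.g. buildings)."""
    if not points:
        return 0.0
    hits = sum(any(point_in_polygon(p, poly) for poly in polygons) for p in points)
    return hits / len(points)


def id_match(ids_a: Set[str], ids_b: Set[str]) -> Dict[str, Set[str]]:
    """Check 2: IDs shared by two layers, and IDs present in only one of them."""
    return {
        "shared": ids_a & ids_b,
        "only_a": ids_a - ids_b,
        "only_b": ids_b - ids_a,
    }


def vacant_but_occupied(parcel_land_use: Dict[str, str],
                        occupied_parcel_ids: Set[str]) -> Set[str]:
    """Check 3: parcels flagged vacant that another layer puts a building or POI on."""
    return {
        pid for pid, use in parcel_land_use.items()
        if use == "vacant" and pid in occupied_parcel_ids
    }


# Tiny made-up example data (hypothetical IDs and coordinates).
building = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
address_points = [(5.0, 5.0), (20.0, 20.0)]
print(share_inside(address_points, [building]))
print(id_match({"P1", "P2", "P3"}, {"P2", "P3", "P4"}))
print(vacant_but_occupied({"P1": "vacant", "P2": "residential", "P3": "vacant"},
                          {"P1", "P2"}))
```

    The IDs that land in only_a or only_b are the ones worth a second look: they may point to a completeness gap in one layer rather than a problem with the link itself.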

    In the end, both the cake eater and the data consumer care about the end product, not just the individual layers.  We should always keep that foremost in mind.

    What other checks do you think we should do between spatial data layers?

    ------------------------------
    Tom Gilligan
    Pitney Bowes Software, Inc.
    White River Junction, VT, USA
    ------------------------------


  • 2.  RE: Layers and Layers of (Spatial Data) Goodness

    Pitney Bowes
    Posted 06-23-2019 18:56
    Great article Tom!
    When dealing with data extracted from imagery, devices, etc. and then aligning it with existing datasets, it is incredibly hard to achieve spatial coherence, particularly when you can't move the spatial object you know is in the wrong place (such as legal parcel boundaries). I've seen smart processing assist in this, using ID interoperability to link between layers and quality codes to support a user's understanding of the relationship.
    I've also seen complete avoidance of using multiple layers to improve the quality of data; I agree that it does not provide a culinary delight...

    ------------------------------
    Gerry Stanley
    Pitney Bowes Australia Pty Ltd
    Macquarie Park
    ------------------------------



  • 3.  RE: Layers and Layers of (Spatial Data) Goodness

    Pitney Bowes
    Posted 06-24-2019 09:51
    Those are some great points, Tom.  Another item to add to the discussion is how critical (or not) the issues found by QC checks in these three areas are.  Your Correctness example is a good illustration of a critical error.  A parcel that is expected to be empty according to the parcel attribute but actually contains data from both the POI and building layers could have a pretty noticeable effect on consumers.  A difference that is likely less critical would be a phone number that differs between the POI and building layers.
    Ideally, the QC checks for each layer are divided into categories according to their importance to a consumer's use of the product.  Adding another QC category for "how the entire cake tastes" looks like a very helpful thing for the consumer.
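
    One way to picture that categorization is a small Python sketch that tags each cross-layer finding with a severity and then summarizes them. The two severity levels, the Finding fields, and the check names below are assumptions for illustration only, not an existing QC schema.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Dict, List


class Severity(IntEnum):
    """Hypothetical two-level scale; a real scheme may have more levels."""
    MINOR = 1
    CRITICAL = 2


@dataclass(frozen=True)
class Finding:
    check: str          # name of the cross-layer check (illustrative)
    feature_id: str     # ID of the affected feature (illustrative)
    severity: Severity


def summarize(findings: List[Finding]) -> Dict[str, int]:
    """Count findings per severity level, including levels with zero findings."""
    counts = {level.name: 0 for level in Severity}
    for finding in findings:
        counts[finding.severity.name] += 1
    return counts


def most_severe_first(findings: List[Finding]) -> List[Finding]:
    """Order findings so that the most critical ones are reviewed first."""
    return sorted(findings, key=lambda f: f.severity, reverse=True)


findings = [
    Finding("phone_mismatch_poi_vs_building", "POI-17", Severity.MINOR),
    Finding("vacant_parcel_with_building_and_poi", "P1", Severity.CRITICAL),
]
print(summarize(findings))
print([f.check for f in most_severe_first(findings)])
```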


    ------------------------------
    Jeffrey Howe
    Quality Assurance Lead
    Pitney Bowes
    White River Junction VT
    ------------------------------