Spectrum Spatial (SSA/LIM)

Using the Distance function in the LIM Spatial Calculator stage

  • 1.  Using the Distance function in the LIM Spatial Calculator stage

    Posted 03-19-2020 11:36
    I am trying to replicate the output of a MapInfo Pro origin-destination distance calculation in Enterprise Designer using the LIM and control stages.

    This flow fails to calculate any distances.

    Does anyone have any idea how to run a distance calculation in Enterprise Designer? Please reach out with any follow-up questions.

    Ross Owens


  • 2.  RE: Using the Distance function in the LIM Spatial Calculator stage

    Posted 03-20-2020 05:12

    How do the settings for your Distance stage look?

    Peter Horsbøll Møller
    Distinguished Engineer
    Pitney Bowes Software & Data

  • 3.  RE: Using the Distance function in the LIM Spatial Calculator stage

    Posted 03-20-2020 11:29
    Hey Peter,

    This is how the distance stage looks.


    Ross Owens
    Savannah GA

  • 4.  RE: Using the Distance function in the LIM Spatial Calculator stage

    Posted 03-25-2020 11:02

    If I see this correctly, you are trying to get the distances between 2 sets of points; that is, to output the distance from every point in table 1 to every point in table 2 (N x M). You also have a sorter in your picture, but I am not sure how you want to use it.
    You also mentioned that these points originated in MapInfo Pro. If these 2 assumptions are correct, this is how I would solve the problem. I can attach a dataflow example. It will be for version 2019.1.
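
    To make the N x M idea concrete outside of Spectrum, here is a minimal Python sketch. It assumes WGS84 longitude/latitude input and uses the haversine great-circle formula with a mean Earth radius; the Distance function in Spectrum Spatial may compute distances differently, so small differences are to be expected. The function names and the coordinates are illustrative only and are not taken from the sample tables.

```python
import math
from itertools import product

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an assumption of this sketch


def haversine_miles(lon1, lat1, lon2, lat2):
    """Great-circle distance in miles between two lon/lat points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    # min() guards against a tiny floating-point overshoot above 1.0
    return 2 * EARTH_RADIUS_MILES * math.asin(min(1.0, math.sqrt(a)))


def all_pairs(starts, ends):
    """Return one (start_name, end_name, miles) row per start/end pair (N x M rows)."""
    return [
        (s_name, e_name, haversine_miles(s_lon, s_lat, e_lon, e_lat))
        for (s_name, s_lon, s_lat), (e_name, e_lon, e_lat) in product(starts, ends)
    ]


# Approximate, illustrative coordinates as (name, longitude, latitude).
cities = [("NEW YORK", -74.006, 40.713), ("LOS ANGELES", -118.244, 34.052)]
capitals = [
    ("TRENTON", -74.760, 40.221),
    ("ALBANY", -73.756, 42.653),
    ("BOISE", -116.202, 43.615),
]

rows = all_pairs(cities, capitals)
for start, end, miles in rows:
    print(f"{miles:.3f},{end},{start}")
```

    With 2 start points and 3 end points this produces 6 rows, one per pair, which is the same shape of result the dataflow described in this post is meant to produce.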

    The first step is to take your 2 tables (.tab files, I assume) and register them with the Spatial repository in Spatial Manager. This is called "creating named tables" for the data. The 2 tables can be referenced where they live and will appear in the repository under a path and name of your choosing.

    I would then use 2 spatial stages, Read Spatial Data and Query Spatial Data. In my example, for which I will share the exported dataflow, I am using 2 tables from our sample data, so you should be able to load and run the same flow.

    Read Spatial Data will read each record from one table (Select * from "/DistanceProject/2020/Ross"), using whatever repository path you gave the named table. You can choose which fields to read out from the list and checkboxes below. Here's a picture of using * but selecting the fields you want. You can also use explicit column names in the SQL.


    Note that I have renamed the geometry from "Obj" to "StartPt". I also decided that while the name of the city and the state are interesting identifying information, the population is not, so I unchecked it.
    You may also want some sort of key or identifier so that when you output the distances you can identify which 2 points each distance represents. Spectrum Spatial supports a default key called "MI_Key", which is the .tab file row ID in this case. But it could be anything else you have in your data.
    Then you connect it to Query Spatial Data, where you do the next part and also calculate the distance.
    For example, here I have named the geometry EndPt for clarity (I could have just left it as Obj, since I have "StartPt" to distinguish it). I then saw that I would have a field name clash with "State" from the first table, so I renamed it "CapitalState". I also removed the population field but left FIPs_Code, for no real reason.

    I also renamed the Distance output to DistanceMiles.


    The output of this could go to a text file. I also could have written the results into a .tab file, possibly creating a line between the start and end points. Again, it depends on what you want to do, but in this flow I output to a text file.

    A sample of output. Note that the first set of results is all from New York and sorted by distance. That is because New York is the first record in the Ik city file.

    52.313,TRENTON,NJ,34,NEW YORK,NY
    134.007,DOVER,DE,10,NEW YORK,NY
    135.263,ALBANY,NY,36,NEW YORK,NY
    190.915,BOSTON,MA,25,NEW YORK,NY
    214.559,CONCORD,NH,33,NEW YORK,NY
    287.925,RICHMOND,VA,51,NEW YORK,NY
    331.361,AUGUSTA,ME,23,NEW YORK,NY
    421.696,RALEIGH,NC,37,NEW YORK,NY
    475.346,COLUMBUS,OH,39,NEW YORK,NY
    560.815,LANSING,MI,26,NEW YORK,NY
    595.728,COLUMBIA,SC,45,NEW YORK,NY

    Further down in the results are other cities also sorted by distance.

    668.942,BOISE,ID,16,LOS ANGELES,CA
    792.863,SALEM,OR,41,LOS ANGELES,CA

    I used one of the dataflow conversion options so that the distance only has 3 decimal places.

    I will attach the exported dataflow (.df) file in a zip file.

    I will try your approach, or you can share your flow. I think having 2 Read from File stages is confusing, and you would not be able to determine the order.



    Eric Blasenheim
    Spectrum Spatial Technical Product Manager
    Troy, NY


  • 5.  RE: Using the Distance function in the LIM Spatial Calculator stage

    Posted 03-25-2020 13:19


    Thanks for walking me through your method! This makes a lot of sense and answers my question.

    As a follow-up question, do you think there is a way to do this without using named tables in Spatial Manager? For example, starting from two geocoded .csv files, then creating points, then appending the closest distances.

    Thanks again.

    Ross Owens

  • 6.  RE: Using the Distance function in the LIM Spatial Calculator stage

    Posted 03-25-2020 14:15
    Clearly the first part of my flow, the Read Spatial Data, could easily be replaced by a Read from File and a Create Point. I would not be a spatial guy if I did not warn about coordsys mishaps. While they are less common now that most geocoders return values in WGS84 LL, I still see the reversed XY (the Latitude/Longitude vs Longitude/Latitude war) and cases where the values are in some other coordinate system completely! So there is a part of me that considers all text and csv files with coordinates the devil's work! :) Using a spatial system for spatial data always works well.
    That said, that first part is pretty easy to replicate.
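
    As a rough illustration of that first part outside of Spectrum, here is a small Python sketch that reads geocoded CSV rows and turns them into points, with a guard against swapped or non-WGS84 coordinates. The column names (Name, Longitude, Latitude) and the expected extent (a rough box around the contiguous US) are assumptions for the example; adjust both to your data. Note that a plain -180..180 / -90..90 range test would not catch a swapped New York, since both values are still legal, which is why the sketch checks against an expected extent instead.

```python
import csv
import io

# Hypothetical extent as (min_lon, min_lat, max_lon, max_lat): roughly the contiguous US.
DEFAULT_EXTENT = (-125.0, 24.0, -66.0, 50.0)


def read_points(csv_text, extent=DEFAULT_EXTENT,
                name_col="Name", lon_col="Longitude", lat_col="Latitude"):
    """Parse CSV text into (name, lon, lat) tuples, rejecting rows outside the extent."""
    min_lon, min_lat, max_lon, max_lat = extent
    points = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        lon, lat = float(row[lon_col]), float(row[lat_col])
        if not (min_lon <= lon <= max_lon and min_lat <= lat <= max_lat):
            raise ValueError(
                f"{row[name_col]}: ({lon}, {lat}) is outside the expected extent; "
                "check the column order and the coordinate system"
            )
        points.append((row[name_col], lon, lat))
    return points


good = "Name,Longitude,Latitude\nNEW YORK,-74.006,40.713\nLOS ANGELES,-118.244,34.052\n"
swapped = "Name,Longitude,Latitude\nNEW YORK,40.713,-74.006\n"

points = read_points(good)
print(points)

try:
    read_points(swapped)
except ValueError as err:
    print("rejected:", err)
```

    The extent check is deliberately strict: it is cheaper to reject a row up front than to discover later that every distance is off by thousands of miles.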
    The rest of it gets trickier. The spatial stage is not only iterating over every record in the second table and calculating the distance from the passed-in point, but it is also ordering the results, and doing all of that server side. That is much simpler to think about and much faster to run.
    Plus the sorted results make sense in that they are sorted for each start point rather than sorted for the entire run.
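
    The difference between sorting per start point and sorting the entire run can be sketched in a few lines of Python. This is only an illustration of the grouping logic, not of how the spatial stage is implemented; the function names are made up for the example, and the rows reuse a few values from the sample output earlier in this thread, deliberately shuffled.

```python
def sort_per_start(rows):
    """Group (start, end, miles) rows by start point, then sort each group by distance.

    Start points keep the order in which they first appear in the input.
    """
    groups = {}
    for start, end, miles in rows:
        groups.setdefault(start, []).append((start, end, miles))
    ordered = []
    for group in groups.values():
        ordered.extend(sorted(group, key=lambda r: r[2]))
    return ordered


def nearest_per_start(rows):
    """Return the closest end point for each start point as {start: (end, miles)}."""
    nearest = {}
    for start, end, miles in rows:
        if start not in nearest or miles < nearest[start][1]:
            nearest[start] = (end, miles)
    return nearest


# Rows in arbitrary arrival order.
rows = [
    ("NEW YORK", "DOVER", 134.007),
    ("LOS ANGELES", "SALEM", 792.863),
    ("NEW YORK", "TRENTON", 52.313),
    ("LOS ANGELES", "BOISE", 668.942),
]

for start, end, miles in sort_per_start(rows):
    print(f"{miles:.3f},{end},{start}")
print(nearest_per_start(rows))
```

    The start point acts as the grouping key here. Without one, a sort has nothing to go on except the distance itself, which would interleave the New York and Los Angeles rows.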
    I have not tried your idea of 2 Read from File stages. I have occasionally seen this in pictures, so I guess it works, but without trying it I will say that the order will be near impossible to control, and the sorter stage will need some kind of grouping key so that it knows what is a group and what is not. I think the sorter waits for all the data because it has no idea when to sort.
    I can give it a whirl a bit later and will let you know what I find.
    Doing these things in a database or in a spatial system is just much more intuitive to me.
    If automating is your concern, you could set up table 1 and 2 once, then create flows that read the .csv and write to an existing spatial table (replacing all the results).
    Then when those flows are done, use a flow like mine on the updated tables.  Have you looked at the job executor?

    Eric Blasenheim
    Spectrum Spatial Technical Product Manager
    Troy, NY