MapInfo Pro

Virtual Raster Deep Dive – Part 1 – What is a Virtual Raster?

  • 1.  Virtual Raster Deep Dive – Part 1 – What is a Virtual Raster?

    Pitney Bowes
    Posted 07-29-2019 02:16

    This is the first article in a series that will describe the MapInfo Virtual Raster. In this article, I explain the basic concepts, introduce some of the capabilities of virtual rasters, and explore some of the processing and visualisation possibilities the concept supports. These capabilities are supported by MapInfo Pro version 17.03 onwards.

    A "physical" raster is a dataset composed of a large number of rectangular cells arranged in a regular mesh. Each cell has one or more data values associated with it. For example, the data value might be a color, in which case the raster is referred to as an image. Or the data value might be a floating-point number representing an estimate of some physical property, in which case we tend to describe the raster as a grid. There are few limitations on what kinds of data can be associated with a raster cell.

    A "virtual" raster is just like a physical raster, except that instead of a huge file containing all the data values associated with all the cells, there is just a description of a process that the raster engine can use to obtain those data values on the fly – when they are asked for. So a virtual raster is a description of a raster and the actual values are obtained or fabricated in real-time as they are required – whether that is for rendering, querying or for processing.
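    The difference can be sketched in a few lines of Python. This is only an illustration of the "description instead of data" idea, not how the MapInfo raster engine is implemented; the class names, the `get_cell` method and the metres-to-feet recipe are all hypothetical.

```python
class PhysicalRaster:
    """Holds every cell value up front (stand-in for a raster file on disk)."""

    def __init__(self, rows):
        self.rows = [list(r) for r in rows]
        self.height = len(self.rows)
        self.width = len(self.rows[0])

    def get_cell(self, row, col):
        return self.rows[row][col]


class VirtualRaster:
    """Holds only a recipe; a value is produced when a cell is requested."""

    def __init__(self, width, height, recipe):
        self.width = width
        self.height = height
        self._recipe = recipe      # callable (row, col) -> value
        self.cells_computed = 0    # counter kept only for illustration

    def get_cell(self, row, col):
        self.cells_computed += 1
        return self._recipe(row, col)


# A physical grid of elevations in metres (made-up values).
elevation_m = PhysicalRaster([[10.0, 12.0], [11.0, 15.0]])

# A virtual grid in feet: nothing is computed until a cell is read.
elevation_ft = VirtualRaster(
    elevation_m.width,
    elevation_m.height,
    lambda r, c: elevation_m.get_cell(r, c) * 3.28084,
)
```

    Creating `elevation_ft` costs nothing; the conversion runs only for the cells that rendering, querying or processing actually asks for.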

    A physical raster has a coordinate system, a cell size and an extent. A virtual raster also has these definitions. A physical raster has bands (to allow multiple data values to be stored in each cell) and it may also have fields (to group bands) and events (to store data acquired at different times). A virtual raster also has these concepts. A physical raster will generally have some statistics associated with it that can record the physical extent of the valid cells, the number of valid and invalid cells, summary band statistics, band distribution statistics (histograms) and spatial statistics. For an MRR format raster, these statistics are stored in the raster and are computed from the "base level resolution" data. Virtual rasters also have statistics, but in general they will only be an estimate of the true "base level resolution" statistics.

    In MapInfo Pro, all rasters have a continuous series of resolution levels. The "base" resolution level is designated zero and this is the source data which is used to populate the raster. "Overview" resolution levels number from 1 upwards and each level has a cell size two times larger than the level below it. "Underview" levels, which are generated by interpolation from the base level, have resolution levels numbering from -1 downwards and each level has a cell size two times smaller than the level above it. Virtual rasters also have this resolution level structure.
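    The level numbering implies a simple power-of-two relationship between a level and its cell size. A minimal sketch, assuming a hypothetical helper name and a made-up base cell size:

```python
def cell_size_at_level(base_cell_size: float, level: int) -> float:
    """Cell size at a resolution level.

    Level 0 is the base level, positive levels are overviews (cells get
    twice as large per level) and negative levels are underviews (cells
    get twice as small per level).
    """
    return base_cell_size * 2.0 ** level


# Example with a 10 m base cell size (illustrative value only).
sizes = {level: cell_size_at_level(10.0, level) for level in (-2, -1, 0, 1, 2)}
```

    With a 10 m base level, overview level 2 has 40 m cells and underview level -2 has 2.5 m cells.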

    We currently support two kinds of virtual rasters. The first is the "VRT" format, which is part of the GDAL open source project and is closely aligned with the QGIS platform. MapInfo Pro uses the GDAL raster drivers and supports VRT via these drivers. The second is our own "MVR" format (MapInfo Virtual Raster), which is the subject of these articles.

    An MVR is an XML file. Like any other raster in MapInfo Pro, there will also be a TAB file associated with the MVR and there may be a GHX file. The GHX stores rendering information and (in some cases) statistics. You will not see any PPRC or PERC files associated with an MVR file. These overview caches are not used.

    An MVR file describes the structure of the raster and also describes the data sources that will be called upon to populate cell data in the virtual raster. There are three kinds of data source: Raster sources, Operation sources and Rendering sources.

    "Raster" sources are easy to understand: they are other rasters. Generally, they will be physical rasters, although they could be virtual rasters. A raster source is usually just a single raster but it can also be multiple rasters (which will have similar structures and content).

    "Operation" sources are raster processing operations. An operation source has its own raster sources, to which it applies one or more processing operations, and its output is piped onwards to populate the virtual raster.

    "Rendering" sources invoke the rendering engine to render an algorithm. A rendering source consumes other raster sources and produces imagery that is then piped onwards to populate the virtual raster.

    In MapInfo Pro 17.03 there are three ways to generate MVR files. Firstly, there is a new tool called "Virtual Raster" which provides a user interface mechanism to create an MVR that references multiple raster sources. Secondly, in the image warp tool, there is a new option to create an MVR as output from this tool. This MVR uses an operation source to execute the image warping on the fly. Thirdly, you can write the MVR file yourself and then load the file via the standard raster opening mechanism.

    For more information regarding the exposed capabilities of MVR in MapInfo Pro 17.03, please see this article by Sweta Shukla - Working with MapInfo Virtual Rasters.

    Virtual rasters have both advantages and disadvantages compared with physical rasters.

    An MVR that uses raster sources is typically used to create multi-banded rasters from single-banded physical rasters. A common application would be visualisation and processing of satellite imagery where the source data is supplied as single-banded raster files. There are compelling reasons to leave the original source data in its original form rather than convert it via some processing sequence. It saves time, storage space and money.

    An MVR that uses raster operation sources can be used to apply processing pipelines to huge rasters and to see the results in real-time. It can save you the effort and expense of processing huge rasters and it allows you to detect errors in the processing pipeline cheaply. Note that virtual rasters can be chained together to create a multi-stage processing pipeline that can execute highly complex workflows on the fly.
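    The chaining idea can be sketched as follows. This is a conceptual illustration only: the class names, the `read` method and the clamp/normalise operations are hypothetical and say nothing about the MVR file format or the engine's real interfaces.

```python
class RasterSource:
    """Stand-in for a physical raster: a small grid held in memory."""

    def __init__(self, rows):
        self._rows = rows

    def read(self, row, col):
        return self._rows[row][col]


class OperationSource:
    """Applies one operation to values pulled from an upstream source."""

    def __init__(self, upstream, operation):
        self._upstream = upstream
        self._operation = operation

    def read(self, row, col):
        # The request is passed upstream and the result is transformed
        # on the way back; nothing is stored.
        return self._operation(self._upstream.read(row, col))


# Stage 0: raw data (made-up values).
raw = RasterSource([[0, 50], [100, 300]])

# Stage 1: clamp values to an 8-bit range.
clamped = OperationSource(raw, lambda v: min(v, 255))

# Stage 2: consume stage 1 and normalise to the range 0..1.
normalised = OperationSource(clamped, lambda v: v / 255)
```

    Reading one cell from `normalised` runs both stages for that cell only, which is why a mistake in the pipeline shows up without processing the whole raster.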

    An MVR that uses rendering sources can be used as a standard product to make it easy to produce complex rendering algorithms for data such as satellite imagery. It can remove the requirement for the data scientist to create rendering algorithms and allow them to get straight on with interpretation and processing.

    The key disadvantage is performance – a physical raster will always be faster than a virtual raster where data must be acquired and prepared on the fly. I believe you will find this performance difference minor, but you can make your own assessment of the performance of MVR as you explore its capabilities. Note that a virtual raster can always be converted into a physical raster.

    In the next article, I will use an MVR to combine single-banded multi-spectral satellite imagery into a multi-banded virtual raster. Then, we will consume this virtual raster in other virtual rasters to illustrate how this makes rendering satellite imagery easy.



    ------------------------------
    Sam Roberts
    Engineer, MapInfo Pro Advanced (Raster)
    Australia
    ------------------------------


  • 2.  RE: Virtual Raster Deep Dive – Part 1 – What is a Virtual Raster?

    Posted 07-31-2019 06:48
    Hi Sam,

    Especially the support for .vrt files is great, since it provides a kind of "seamless table" that other software packages like QGIS can also read.
    But are there any size limits when working with .vrt files in MiPro?
    I just created a new .vrt file via "gdalbuildvrt.exe" from the GDAL distribution, and it displays without problems in QGIS, which also uses the GDAL libraries.
    When I try to load that file in MiPro 17.03, it either loads without problems or crashes MiPro, with roughly 50:50 odds.
    The library gdal19.dll in the MiPro directory suggests that a rather old GDAL version is used here; QGIS is using 2.04 and the current stable release is 3.01.
    Are there plans to update the MiPro GDAL version to a more recent one?
    The .vrt file I created references 3,898 8-bit GeoTIFF files, around 10 GB in total size.

    Any ideas about this issue?

    Thanks and regards
    Stefan


  • 3.  RE: Virtual Raster Deep Dive – Part 1 – What is a Virtual Raster?

    Pitney Bowes
    Posted 07-31-2019 20:33
    Hi Stefan,

    A VRT file will be mounted by the GDAL native driver (shipped as part of MapInfo Pro Advanced, but also in standard MapInfo Pro). This driver uses the GDAL raster libraries to acquire data and pipes it "as is" to MapInfo. So, if MapInfo crashes when you display a VRT file, it is likely that the underlying GDAL library is causing the failure. MapInfo is not imposing any size limits on the process, or any other constraints.

    The DLL gdal19 is not used by the Advanced driver - look in the "Raster" directory instead. It uses gdal202.dll. Our current GDAL version is behind QGIS and behind the latest release. We will endeavor to get this updated to the latest GDAL version for the October 2019 Pro release, especially if it improves stability.

    I do see crashes caused by GDAL when displaying VRTs. VRT supports image warping (as does our own MVR), but I see crashes related to this very frequently. In fact, we updated the GDAL version not so long ago in the hope of seeing improved stability in VRT rendering (but it made no difference).

    VRT has a "Merge" capability which allows you to emulate a seamless table and display many small rasters as a single raster. On Windows, you can only have a certain number of files open simultaneously. You cannot, for example, open all of your 3898 TIFF files at once. VRT lets you work with all of them by using a file handle pool. It actually only opens about 100 rasters at once, and closes and reopens them continuously to get data through the driver. This is inefficient. (Note that MVR also has a "Merge" capability which is currently undisclosed. We do not use a file handle pool, so the number of rasters you can merge is limited. But our merge works faster and better than VRT and doesn't crash.)
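    To make the cost of a file handle pool concrete, here is a minimal Python sketch. It is not GDAL's implementation: the class name, the least-recently-used eviction policy and the tiny pool size in the demo are assumptions chosen for illustration.

```python
import os
import tempfile
from collections import OrderedDict


class FileHandlePool:
    """Keeps at most max_open files open, closing the least recently used."""

    def __init__(self, max_open=100):
        self._max_open = max_open
        self._handles = OrderedDict()   # path -> open file, oldest first
        self.open_calls = 0             # how many times a file was (re)opened

    def get(self, path):
        handle = self._handles.pop(path, None)
        if handle is None:
            if len(self._handles) >= self._max_open:
                # Pool is full: close the least recently used file.
                _, oldest = self._handles.popitem(last=False)
                oldest.close()
            handle = open(path, "rb")
            self.open_calls += 1
        self._handles[path] = handle    # mark as most recently used
        return handle

    def close_all(self):
        while self._handles:
            _, handle = self._handles.popitem()
            handle.close()


def reopen_demo():
    """Cycle through 3 files with a pool of 2 and count open() calls."""
    with tempfile.TemporaryDirectory() as folder:
        paths = []
        for i in range(3):
            path = os.path.join(folder, f"tile_{i}.bin")
            with open(path, "wb") as f:
                f.write(bytes([i]))
            paths.append(path)

        pool = FileHandlePool(max_open=2)
        for path in paths + [paths[0]]:  # the last read revisits tile_0
            pool.get(path)
        pool.close_all()
        return pool.open_calls
```

    In the demo, four reads over three files cause four opens, because the first file has been evicted by the time it is needed again. When a display request touches more rasters than the pool holds, this close-and-reopen churn happens on every pass.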

    If you have MapInfo Pro Advanced, then one solution you can pursue is to use the "Merge" operation to join all your TIFF files back together and save them as an MRR format raster. Once done, you will have no display problems. The penalty you pay is the processing effort and storage cost but if your source data does not change frequently then this cost is only incurred once, or infrequently.

    Another solution we might pursue in the future is to use an MVR to merge all the thousands of source rasters on the fly, but use an MRR for the overviews at some lower level of resolution. So the MVR would serve data from the source TIFFs until you zoom out to some level (at which point the system would have to access a large number of source TIFFs) and at that point the MVR would start serving data from the pre-prepared MRR. The MRR would only be, perhaps, 1% of the size of the source TIFFs. This would provide an efficient and cheap solution, especially if your source rasters are updated frequently.
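    As a rough sketch of that idea (not an existing MVR feature), the switch could be a simple test on the requested resolution level. The function names and the switch level of 3 are hypothetical. The cell-count arithmetic only assumes that cell size doubles per overview level, so the cell count falls by a factor of four per level.

```python
def relative_cell_count(level: int) -> float:
    """Fraction of the base-level cell count present at an overview level.

    Cell size doubles in both x and y per level, so each level has a
    quarter of the cells of the level below it.
    """
    return 0.25 ** level


def choose_source(requested_level: int, switch_level: int = 3) -> str:
    """Pick where to read from for a given resolution level.

    Below the switch level only a few source tiles are in view, so they
    are read directly. At or above it, the pre-built overview is used.
    """
    if requested_level >= switch_level:
        return "overview_mrr"
    return "source_tiffs"
```

    Under these assumptions, an overview starting at level 3 would hold 1/64 (about 1.6%) of the base-level cells before compression, which is the same order of magnitude as the 1% mentioned above.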

    Thanks for getting in touch,
    Sam.


    ------------------------------
    Sam Roberts
    Engineer, MapInfo Pro Advanced (Raster)
    Australia
    ------------------------------



  • 4.  RE: Virtual Raster Deep Dive – Part 1 – What is a Virtual Raster?

    Posted 08-01-2019 03:53
    Good morning Sam,

    and thank you very much for this long and detailed explanation.

    Just to add some more information that might help: this VRT file doesn't use warping; all "tiles" are in the same CRS and align perfectly.
    I tried to open that file in different environments, and at first glance it seems as if I/O has some impact: on the fastest machine I have, where everything is stored on NVMe disks, MiPro crashes 100% of the time when trying to open that VRT file; on my 5-year-old laptop I got that 50:50 chance; and the chances are even higher when trying to open that file in MiPro running inside a VM, accessing the file from a USB stick.
    I couldn't get QGIS to crash in any of these environments, so either the slightly more recent GDAL library is more stable or something else is causing the crash.

    The merge and MRR conversion is a possible way, but as I wrote, the charm of VRT is that you could use that file in MiPro and in other software as well. That's not possible with MRR files, as far as I know. Or do you plan to release the file format spec to the public so that others are able to implement at least a reader?

    Thanks again and
    best regards
    Stefan


  • 5.  RE: Virtual Raster Deep Dive – Part 1 – What is a Virtual Raster?

    Pitney Bowes
    Posted 08-01-2019 19:32
    We are in the process of developing a GDAL driver for MRR which will provide read access for visualisation and processing. This work is very close to completion - it takes a lot of effort to get everything ready across Windows, different flavours of Linux etc. Once that work has been integrated into the GDAL code base it will start to become available in software like QGIS and ArcGIS, and other packages that use GDAL to access rasters. So you will be able to view MRR format rasters in other packages in the future.

    I can't explain why you would see different failures of VRT with different storage types. It sounds like we need to do some testing. We do not test VRT much because it is up to GDAL and we have no control over that. But it is always possible that something is going wrong in our code. We do have one fix in the upcoming 17.04 release for our GDAL driver, but I don't think that will impact on this particular problem. I saw that you opened a support case so we can continue the conversation through that channel.

    ------------------------------
    Sam Roberts
    Engineer, MapInfo Pro Advanced (Raster)
    Australia
    ------------------------------