This is the first article in a series that will describe the MapInfo Virtual Raster. In this article, I explain the basic concepts, introduce some of the capabilities of virtual rasters, and explore some of the processing and visualisation possibilities the concept supports. These capabilities are supported by MapInfo Pro version 17.03 onwards.
A "physical" raster is a dataset composed of a large number of rectangular cells arranged in a regular mesh. Each cell has one or more data values associated with it. For example, the data value might be a color, in which case the raster is referred to as an image. Or the data value might be a floating-point number representing an estimate of some physical property, in which case we tend to describe the raster as a grid. There are few limitations on what kinds of data can be associated with a raster cell.
A "virtual" raster is just like a physical raster, except that instead of a huge file containing all the data values associated with all the cells, there is just a description of a process that the raster engine can use to obtain those data values on the fly – when they are asked for. So a virtual raster is a description of a raster and the actual values are obtained or fabricated in real-time as they are required – whether that is for rendering, querying or for processing.
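To make the distinction concrete, here is a minimal Python sketch of the idea. It is not MapInfo Pro's API or the MVR format: the class names, the `recipe` callback and the metres-to-feet conversion are all hypothetical, chosen only to show that a virtual raster holds a description and produces values when they are requested.

```python
class PhysicalRaster:
    """Holds every cell value (a stand-in for a large file on disk)."""

    def __init__(self, rows):
        self.rows = [list(r) for r in rows]
        self.height = len(self.rows)
        self.width = len(self.rows[0])

    def read(self, row, col):
        return self.rows[row][col]


class VirtualRaster:
    """Holds only a description: a size and a recipe for producing a value."""

    def __init__(self, width, height, recipe):
        self.width = width
        self.height = height
        self._recipe = recipe      # hypothetical callback: (row, col) -> value
        self.cells_computed = 0    # counts how many values were actually produced

    def read(self, row, col):
        # Nothing is stored; the value is fabricated at the moment it is asked for.
        self.cells_computed += 1
        return self._recipe(row, col)


# A physical grid of elevations in metres, and a virtual grid in feet derived from it.
elevation_m = PhysicalRaster([[10.0, 12.0], [14.0, 16.0]])
elevation_ft = VirtualRaster(2, 2, lambda r, c: elevation_m.read(r, c) * 3.28084)
```

Creating `elevation_ft` computes nothing. A value is only produced when `elevation_ft.read(row, col)` is called, whether the caller is rendering, querying or processing.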
A physical raster has a coordinate system, a cell size and an extent. A virtual raster has these properties too. A physical raster has bands (to allow multiple data values to be stored in each cell) and it may also have fields (to group bands) and events (to store data acquired at different times). A virtual raster also has these concepts. A physical raster will generally have some statistics associated with it that record the physical extent of the valid cells, the number of valid and invalid cells, summary band statistics, band distribution statistics (histograms) and spatial statistics. For an MRR format raster, these statistics are stored in the raster and are computed from the "base level resolution" data. Virtual rasters also have statistics, but in general they will only be an estimate of the true "base level resolution" statistics.
In MapInfo Pro, all rasters have a continuous series of resolution levels. The "base" resolution level is designated zero and this is the source data which is used to populate the raster. "Overview" resolution levels number from 1 upwards and each level has a cell size twice that of the level below it. "Underview" levels, which are generated by interpolation from the base level, number from -1 downwards and each level has a cell size half that of the level above it. Virtual rasters also have this resolution level structure.
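The doubling and halving rule means the cell size at any level is the base cell size multiplied by two raised to the level number. A small Python sketch of that arithmetic follows. The function name is my own, not part of any MapInfo Pro API.

```python
def cell_size_at_level(base_cell_size: float, level: int) -> float:
    """Cell size at a resolution level, given the base (level 0) cell size.

    Positive levels are overviews (coarser), negative levels are underviews (finer).
    """
    return base_cell_size * 2.0 ** level


# With an illustrative 10 m base cell size:
for level in (-2, -1, 0, 1, 2, 3):
    print(level, cell_size_at_level(10.0, level))
```

For a 10 m base raster, level 2 has 40 m cells and level -1 has 5 m cells.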
We currently support two kinds of virtual rasters. The first is the "VRT" format, which is part of the GDAL open source project and is closely aligned with the QGIS platform. MapInfo Pro uses the GDAL raster drivers and supports VRT via these drivers. The second is our own "MVR" format (MapInfo Virtual Raster), which is the subject of these articles.
An MVR is an XML file. Like any other raster in MapInfo Pro, there will also be a TAB file associated with the MVR and there may be a GHX file. The GHX stores rendering information and (in some cases) statistics. You will not see any PPRC or PERC files associated with an MVR file. These overview caches are not used.
An MVR file describes the structure of the raster and also describes the data sources that will be called upon to populate cell data in the virtual raster. There are three kinds of data source: Raster sources, Operation sources and Rendering sources.
"Raster" sources are the easiest to understand: they are other rasters. Generally, they will be physical rasters, although they could be virtual rasters. A raster source is usually a single raster, but it can also be multiple rasters (which will have similar structures and content).
"Operation" sources are raster processing operations. An operation source has its own raster sources, to which it applies one or more processing operations, and its output is piped onwards to populate the virtual raster.
"Rendering" sources invoke the rendering engine to render an algorithm. They consume other raster sources and produce imagery that is then piped onwards to populate the virtual raster.
Virtual rasters have both advantages and disadvantages compared with physical rasters.
An MVR that uses raster sources is typically used to create multi-banded rasters from single banded physical rasters. A common application would be visualisation and processing of satellite imagery where the source data is supplied as single banded raster files. There are compelling reasons to want to leave the original source data in its original form rather than convert it via some processing sequence. It saves time, storage space and money.
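The band-stacking idea can be sketched in a few lines of Python. This is an illustration of the concept only, not how MVR is implemented: the `BandStack` class, its methods and the tiny 2 x 2 sample bands are all hypothetical. The point is that the single-banded sources are left untouched and a multi-banded view is assembled over them when a cell is requested.

```python
class BandStack:
    """A multi-banded view over separately stored single-banded rasters."""

    def __init__(self, bands):
        # bands: mapping of band name -> 2D list of cell values (one source raster each)
        shapes = {(len(rows), len(rows[0])) for rows in bands.values()}
        if len(shapes) != 1:
            raise ValueError("all source rasters must have the same dimensions")
        self._bands = dict(bands)
        self.height, self.width = shapes.pop()

    @property
    def band_names(self):
        return list(self._bands)

    def read_cell(self, row, col):
        # Values are fetched from the untouched sources at request time.
        return tuple(rows[row][col] for rows in self._bands.values())


# Three illustrative single-banded sources, as might be supplied for a satellite scene.
red = [[110, 120], [130, 140]]
green = [[60, 70], [80, 90]]
blue = [[10, 20], [30, 40]]

rgb = BandStack({"red": red, "green": green, "blue": blue})
print(rgb.band_names, rgb.read_cell(1, 0))
```

No values are copied into a new file. The stack only records which sources supply which band.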
An MVR that uses raster operation sources can be used to apply processing pipelines to huge rasters and to see the results in real-time. It can save you the effort and expense of processing huge rasters and it allows you to detect errors in the processing pipeline cheaply. Note that virtual rasters can be chained together to create a multi-stage processing pipeline that can execute highly complex workflows on the fly.
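As a rough sketch of what chaining means, the Python below composes two stages where the second consumes the output of the first, and nothing is calculated until a cell is requested. The helper names (`raster_source`, `operation`), the normalised-difference calculation and the 0.55 threshold are assumptions made for illustration. They are not MVR operations or syntax.

```python
def raster_source(rows):
    """Wrap stored cell values as a reader: (row, col) -> value."""
    return lambda row, col: rows[row][col]


def operation(fn, *sources):
    """Build a new reader that applies fn to the values read from its sources."""
    return lambda row, col: fn(*(src(row, col) for src in sources))


# Two illustrative single-banded inputs.
red = raster_source([[0.1, 0.2]])
nir = raster_source([[0.5, 0.6]])

# Stage 1: a normalised difference of the two bands.
norm_diff = operation(lambda n, r: (n - r) / (n + r), nir, red)

# Stage 2: consumes stage 1 and flags cells above an arbitrary threshold.
flagged = operation(lambda v: v > 0.55, norm_diff)

print(flagged(0, 0), flagged(0, 1))
```

Because each stage is only a description, a mistake in the pipeline shows up as soon as you look at a few cells, without first processing the whole raster.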
An MVR that uses rendering sources can be used as a standard product to make it easy to produce complex rendering algorithms for data such as satellite imagery. It can remove the requirement for the data scientist to create rendering algorithms and allow them to get straight on with interpretation and processing.
The key disadvantage is performance – a physical raster will always be faster than a virtual raster where data must be acquired and prepared on the fly. I believe you will find this performance difference minor, but you can make your own assessment of the performance of MVR as you explore its capabilities. Note that a virtual raster can always be converted into a physical raster.
In the next article, I will use an MVR to combine single banded multi-spectral satellite imagery into a multi-banded virtual raster. Then, we will consume this virtual raster in other virtual rasters to illustrate how this makes rendering satellite imagery easy.