In the previous article we outlined the steps that imagery needs to go through before it is usable by end users, and how doing this in near real time requires high performance computing. In this article we will go through each step to understand the complexity of the processing involved and why, in the past, this has been at best a semi-automated process.
The important breakthrough that Wolfgang Lück of Forest Sense cc was able to demonstrate at the PCI Geomatics Reseller conference is full automation of the entire process. This means that end user products, whether orthorectified images, digital surface models or classified vegetation maps, are available much faster and at lower cost.
From Wolfgang’s perspective, the key to enabling this is an image processing platform that provides a modular architecture, robust scripting, "big data" management, support for a broad range of sensor models (cameras), support for Linux as well as Windows, comprehensive image processing functionality, and the ability to incorporate custom leading-edge algorithms, all integrated with an accessible user interface and API. In addition, because this level of automation pushes the envelope, responsive support from the vendor is also essential.
In the example, Wolfgang used a scripting language, Python or EASI, to implement a workflow composed primarily of PCI Geomatica modules, together with custom components implementing leading-edge algorithms from current research.
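To make the structure of such a chain concrete, here is a minimal Python sketch in which each processing step is a callable applied in sequence. The step names and the scene object are placeholders for illustration, not the actual Forest Sense or Geomatica API.

```python
# Hypothetical sketch of an automated chain: each step is a callable that takes
# and returns a scene object (here just a placeholder dict). Real steps would
# wrap Geomatica modules or custom algorithms; the names are illustrative only.

def run_pipeline(scene, steps):
    for step in steps:
        scene = step(scene)
    return scene

# Placeholder stages standing in for the real processing steps described below.
steps = [
    lambda s: {**s, "format": "converted"},        # format conversion
    lambda s: {**s, "radiometry": "corrected"},    # artifact removal, flat field
    lambda s: {**s, "bands": "aligned"},           # band alignment
    lambda s: {**s, "units": "TOA reflectance"},   # reflectance conversion
    lambda s: {**s, "haze": "removed"},            # haze/water/cloud handling
    lambda s: {**s, "geometry": "orthorectified"}, # GCPs and orthorectification
]

print(run_pipeline({"id": "scene_001"}, steps))
```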
The first step is format conversion. The data often comes in some proprietary format which the processing engine may not support. A converter is required to bring it into a format that the processing engine supports.
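As an illustration, a format conversion step might look like the following sketch, which assumes GDAL is used for the conversion (the article does not specify which converter is actually used); the file paths are placeholders.

```python
# A minimal sketch of the format-conversion step using GDAL (an assumption;
# the actual converter used in the workflow is not specified in the article).
from osgeo import gdal

def convert_to_geotiff(src_path, dst_path):
    """Convert a vendor-specific raster into GeoTIFF so downstream tools can read it."""
    src = gdal.Open(src_path)
    if src is None:
        raise IOError(f"GDAL could not read {src_path}")
    # gdal.Translate copies the data, preserving bands and georeferencing metadata.
    gdal.Translate(dst_path, src, format="GTiff")

convert_to_geotiff("scene_raw.dat", "scene.tif")  # placeholder paths
```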
The next step is radiometric correction and artifact removal. Forest Sense supports a wide range of systems put together by emerging space nations, such as African or Middle Eastern countries. In the example, two of the image lines are artifacts: new gains and biases have to be calculated for those lines and applied to the image. Other examples are random noise detection and dropped line removal.
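A simple way to repair such line artifacts is to estimate a gain and bias for each flagged line from its neighbouring good lines, as in the sketch below. This is an illustrative stand-in for the radiometric repair described in the talk, not the actual implementation.

```python
# Sketch of a gain/bias repair for flagged detector lines: match each bad line's
# statistics to those of its neighbours and apply a linear correction.
import numpy as np

def repair_lines(band, bad_rows):
    band = band.astype(np.float64).copy()
    for r in bad_rows:
        # Reference statistics come from the adjacent good lines.
        ref = np.vstack([band[r - 1], band[r + 1]])
        ref_mean, ref_std = ref.mean(), ref.std()
        line_mean, line_std = band[r].mean(), band[r].std()
        gain = ref_std / line_std if line_std > 0 else 1.0
        bias = ref_mean - gain * line_mean
        band[r] = gain * band[r] + bias
    return band

band = np.random.randint(0, 255, (100, 100)).astype(float)
band[40] *= 0.5                      # simulate a low-gain detector line
fixed = repair_lines(band, bad_rows=[40])
```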
Another problem is systematic noise. A fast Fourier transform (FFT) reveals a certain pattern in the high frequency domain which is characteristic of systematic noise. We can detect that pattern automatically, create a mask to filter it out in the frequency domain, and then apply an inverse FFT to reconstruct the corrected image.
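A minimal sketch of this kind of frequency-domain filtering, assuming a simple threshold rule for detecting the noise peaks (the real detection logic is not described in the talk):

```python
# Sketch of FFT-based removal of periodic (systematic) noise: transform the
# image, mask the anomalous frequency peaks, and invert. The peak-detection
# rule (a threshold away from the spectrum centre) is purely illustrative.
import numpy as np

def remove_periodic_noise(img, threshold_sigma=4.0, keep_radius=10):
    f = np.fft.fftshift(np.fft.fft2(img))
    mag = np.log1p(np.abs(f))

    # Protect the low-frequency core (overall image structure) from masking.
    cy, cx = np.array(img.shape) // 2
    yy, xx = np.ogrid[:img.shape[0], :img.shape[1]]
    core = (yy - cy) ** 2 + (xx - cx) ** 2 <= keep_radius ** 2

    # Flag unusually strong peaks outside the core as systematic noise.
    noise_mask = (mag > mag.mean() + threshold_sigma * mag.std()) & ~core
    f[noise_mask] = 0

    return np.real(np.fft.ifft2(np.fft.ifftshift(f)))

img = np.random.rand(256, 256)
img += 0.5 * np.sin(np.arange(256) * 0.8)   # add periodic striping
clean = remove_periodic_noise(img)
```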
The next correction is the flat field correction, which ensures that the sensors are calibrated correctly relative to one another.
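As a rough illustration, a statistics-based flat field correction for a pushbroom-style sensor might equalize the mean response of each detector column; a real calibration would use laboratory or on-board reference measurements rather than image statistics.

```python
# Illustrative flat field correction: equalize each detector column's mean
# response to the band-wide average. A placeholder for the real calibration.
import numpy as np

def flat_field(band):
    band = band.astype(np.float64)
    col_means = band.mean(axis=0)       # average response per detector column
    target = col_means.mean()           # band-wide reference level
    gains = np.divide(target, col_means,
                      out=np.ones_like(col_means), where=col_means > 0)
    return band * gains                 # broadcast one gain per column
```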
The next step is band alignment. The different bands need to be aligned because they are acquired at slightly different angles by different sensors.
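One common way to estimate the misalignment between bands is phase correlation; the sketch below uses it to find and apply an integer pixel shift. This is an illustrative technique, not necessarily the matching method used in the actual workflow, which would also need sub-pixel resampling.

```python
# Sketch of band co-registration by phase correlation: estimate the integer
# pixel shift between a reference band and a misaligned band, then roll the
# band into place.
import numpy as np

def estimate_shift(ref, moving):
    # The cross-power spectrum peaks at the translation between the two images.
    cross = np.fft.fft2(ref) * np.conj(np.fft.fft2(moving))
    cross /= np.abs(cross) + 1e-12
    corr = np.real(np.fft.ifft2(cross))
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    # Wrap shifts larger than half the image size back to negative offsets.
    if dy > ref.shape[0] // 2:
        dy -= ref.shape[0]
    if dx > ref.shape[1] // 2:
        dx -= ref.shape[1]
    return dy, dx

def align(ref, moving):
    dy, dx = estimate_shift(ref, moving)
    return np.roll(moving, shift=(dy, dx), axis=(0, 1))
```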
The next step is conversion to top-of-atmosphere (TOA) reflectance. What the sensor actually records is a relative voltage, which is converted to relative radiance and then normalized to TOA reflectance. This allows images acquired at different times, or from different sensors, to be compared and used for quantitative analysis. The effect of the atmosphere is still present, however. Atmospheric correction is possible, but it is a bit of a black art and, applied incorrectly, can introduce artifacts. For this reason, many people in the domain prefer to use TOA reflectance data because they can trust it.
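The conversion itself follows the standard formula rho = pi * L * d^2 / (ESUN * cos(theta_s)), where L is the at-sensor radiance, d the Earth-Sun distance, ESUN the band's mean solar irradiance and theta_s the solar zenith angle. A sketch, with placeholder calibration values:

```python
# DN -> radiance -> top-of-atmosphere reflectance using the standard formula.
# Gain, offset, ESUN and the Earth-Sun distance are sensor- and date-specific
# metadata values; the numbers below are placeholders.
import numpy as np

def toa_reflectance(dn, gain, offset, esun, sun_elevation_deg, earth_sun_dist=1.0):
    radiance = gain * dn.astype(np.float64) + offset        # at-sensor radiance
    sun_zenith = np.deg2rad(90.0 - sun_elevation_deg)       # zenith = 90 - elevation
    return (np.pi * radiance * earth_sun_dist ** 2) / (esun * np.cos(sun_zenith))

dn = np.random.randint(0, 1024, (100, 100))
rho = toa_reflectance(dn, gain=0.05, offset=1.2, esun=1550.0, sun_elevation_deg=45.0)
```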
At this point the image can be classified.
The first step in classifying the image is haze, water and cloud classification. This requires both the blue and red bands. These bands correlate with one another very strongly, but the blue band is affected more by haze and water vapour. What is called a clear sky vector can be applied to the image to calculate the haze vector and remove the haze. The right-hand side of the slide shows the haze that is removed from the image.
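The clear sky vector described here resembles the haze optimized transform (HOT), which measures each pixel's departure from a clear-sky line fitted in blue/red space; the sketch below assumes that interpretation and may differ from the actual Forest Sense implementation.

```python
# HOT-style haze measure: fit the clear-sky line in blue/red space and use each
# pixel's perpendicular distance from it as a haze indicator. An assumption
# about how the "clear sky vector" is applied, not the confirmed method.
import numpy as np

def haze_optimized_transform(blue, red, clear_mask):
    # Fit the clear-sky line red = a * blue + b over known clear pixels.
    a, b = np.polyfit(blue[clear_mask], red[clear_mask], deg=1)
    theta = np.arctan(a)
    # Distance from the clear-sky line grows with haze thickness.
    return blue * np.sin(theta) - red * np.cos(theta)
    # Pixels with high values can then be corrected or flagged as hazy.
```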
The next step is the collection of ground control points (GCPs) so that accurate geolocations can be calculated. Unlike many systems, which collect GCPs even over water or cloud and thereby introduce errors, Wolfgang only collects ground control points over land.
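In practice this can be as simple as reusing the water and cloud masks from the previous step to reject candidate GCP chips, as in this illustrative sketch (the chip matching itself is not shown):

```python
# Sketch of restricting GCP candidates to land: drop any candidate chip whose
# footprint touches water or cloud pixels in the classification mask.
import numpy as np

def filter_gcp_candidates(candidates, invalid_mask, chip_size=32):
    """candidates: list of (row, col) chip centres; invalid_mask: True over water/cloud."""
    half = chip_size // 2
    kept = []
    for r, c in candidates:
        window = invalid_mask[max(r - half, 0):r + half, max(c - half, 0):c + half]
        if not window.any():            # keep only chips fully over clear land
            kept.append((r, c))
    return kept
```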
GCPs enable automated orthorectification as well as extracting digital surface models (DSMs) automatically from stereo pairs.
The next step is preclassification. Topographic normalization using a digital terrain model (DTM) requires a spectral preclassification: the image has to be preclassified into different surface types that have different BRDF (bidirectional reflectance distribution function) properties, because light reflects in different directions and at different intensities depending on the surface and the wavelength. This has to be done using bands that are not affected by topography, which makes this step quite complicated.
The slide shows a Landsat image of hilly terrain with a lot of shade and exposed slopes, which need to be normalized topographically. Traditionally, people have used a range of vegetation indices, such as the normalized difference vegetation index (NDVI). Comparison with the DTM shows that the NDVI is affected by topography: although it is a ratio between bands, shady and sunny slopes are still visible. The reason is that there is a lot of scattering in the blue band and at shorter wavelengths but very little at red and longer wavelengths, so the ratio between the red and infrared bands is not a pure ratio: the red band is affected by scattering more than the infrared band is.
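For reference, NDVI is the normalized ratio of the near-infrared and red bands:

```python
# NDVI as referenced above: (NIR - Red) / (NIR + Red).
import numpy as np

def ndvi(nir, red):
    nir = nir.astype(np.float64)
    red = red.astype(np.float64)
    return (nir - red) / (nir + red + 1e-12)   # small epsilon avoids divide-by-zero
```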
The TVI index was developed to be resistant to topography. Of course it doesn’t work in areas of full shade, so a DSM is used to calculate which areas are in complete shade and mask them out. At this point all of these techniques can be applied to generate an automatic spectral preclassification of the image, which can then be used for topographic normalization. The skylight adaptive topographic normalization is based on techniques that Wolfgang developed using all of these preprocessing steps. In the corrected image it is still possible to see some shady areas, but most of the shadow has been removed.
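A rough sketch of how a shade mask can be derived from the elevation model: compute the cosine of the local solar incidence angle from slope, aspect and the sun position, and flag surfaces facing away from the sun. This covers self-shading only; cast shadows from neighbouring terrain would need an additional ray-traced test, and the exact method used by Wolfgang is not described.

```python
# Illustrative shade mask from a DSM: flag cells whose local solar incidence
# angle exceeds 90 degrees (the surface faces away from the sun). Conventions
# (aspect origin, row order of the DSM) are assumptions of this sketch.
import numpy as np

def shade_mask(dsm, sun_azimuth_deg, sun_elevation_deg, cell_size=30.0):
    dz_dy, dz_dx = np.gradient(dsm.astype(np.float64), cell_size)
    slope = np.arctan(np.hypot(dz_dx, dz_dy))
    aspect = np.arctan2(-dz_dx, -dz_dy)        # downslope direction, clockwise from +y
    sun_zen = np.deg2rad(90.0 - sun_elevation_deg)
    sun_az = np.deg2rad(sun_azimuth_deg)
    # Cosine of the solar incidence angle on the tilted surface.
    cos_i = (np.cos(sun_zen) * np.cos(slope) +
             np.sin(sun_zen) * np.sin(slope) * np.cos(sun_az - aspect))
    return cos_i <= 0.0                        # True where no direct illumination
```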
The next step is image compositing and mosaicking, followed by the generation of Level 4 (geoinformation) products. These products might include vegetation indices, biophysical parameters, masks, pixel-based classifications or object-based classifications.
The final step is to package the data and deliver it to the customer, typically by transmitting the imagery to a pickup point where it is catalogued and made available for the customer to collect.
Wolfgang demonstrated that downloading raw data and transforming it into something usable by a farmer, construction contractor, or insurance adjuster requires many steps and a lot of processing. This example of a typical image processing workflow shows what the nearly one petabyte of raw imagery captured daily must go through to create something usable by end user customers (people or machines). Getting this to customers in near real time is incredibly computationally intensive and involves rapidly processing massive volumes of data. This is why algorithmic performance is critical and why multiple processors with hyperthreading and, most recently, GPUs (graphics processing units) are being harnessed to improve throughput.