In the previous installment, we examined some of the intermediate steps that can be used in "classifying" LIDAR data where the objects of interest are very similar to the surrounding points. We specifically looked at the problems associated with classifying at-grade (meaning same elevation as the surroundings) roads. In this edition, I will discuss a problem that seems a bit easier (but in fact, is not) -extracting objects that exhibit a "geometry" signature.

Automated Feature Extraction (AFE) has been the holy grail of image processing since digital imaging first became affordable (well, at least to the Government!) in the 1960’s. With the advent of both the digital computer (think IBM 360) and digital imaging, it was widely thought that feature extraction via the emerging field of Artificial Intelligence (AI) was just around the corner. Well, that was nearly 50 years ago and sadly, not a lot of progress has been made. When LIDAR came on the scene in the mid 1990’s, we in the industry believed that adding the third dimension to imagery would result in major breakthroughs in this field. Of course, we should have known better – 3D images had been around virtually since the beginning of digital imaging via 3D extraction using correlators from overlapping images.

All that said, steady progress is being made to automate the extraction of 3 dimensional features from LIDAR and correlated image data. In fact, a fairly new algorithm called Semi-Global Matching (SGM) has led to a resurgence in interest in AFE within the imaging community so I look forward to new algorithms on a much more frequent basis. As an example, we will examine the extraction of planar surfaces. Figure 1 depicts the results of a first run, non-optimized automated classification.

Figure 1: Roofs (red), Ground (brown) and "other stuff" (green)

Buildings are a popular target for AFE because they generally present a simple geometry – planar surfaces. Of course, the extraction rapidly becomes complex when these surfaces become complex but the average US home with minimum roof slope of 27 degrees should be easy to extract. Figure 2 illustrates an area containing several fairly simple roof structures. It is rendered as a Triangulated Irregular Network (TIN), colored by absolute elevation.

Figure 2: Unclassified area containing roofs (colored by height)

There are many different algorithms for extracting building roofs but all use some variant of detecting a planar surface. A brute force approach is to start at a place in the point cloud and select 3 points. Form a plane from these points. Now select a 4th point and see how will it fits the constructed plane. It turns out that this approach grows geometrically with the number of points to be tested and hence does not yield a practical algorithm.

(Warning – mathematical section!) – A better approach (and quite common) is to select a region of points and compute their covariance matrix in terms of X, Y and Z coordinates. Computing the eigenvalues of this matrix provides a measure of the "spread" of the variances in three mutually orthogonal (right angles to one another) directions. If the eigenvalues are all of the same order of magnitude, the points are fairly uniformly dispersed and probably do not represent a plane. If, however, one eigenvalue is close to zero and the other two are not, the data are exhibiting a planar distribution. Furthermore, the eigenvectors will provide the orientation axes of the plane. Those who have done work with classifying imagery data will recognize this approach as a variant of the common Principal Component Analysis (PCA) algorithm. Even though I understand what is going on in this algorithm, it is still never ceases to amaze me that this works! It is just one of those wonderfully elegant solutions.

Once a planar patch is detected, candidate points in the planar region are tested against the "seed" plane in an effort to classify all points that belong to the roof segment. This is the basic approach that we use in our own LP360 for ArcGIS LIDAR processing software. Of course, there are complications. Testing for the existence of a planar surface requires some level (actually a depressingly high level!) of heuristic tuning; in other words, guessing. The first and obvious problem is that the ground itself is a nice planar surface in many regions. Thus the algorithms generally proceeds after detecting the ground surface. We use a simple test that omits points that are less than the height we specify as the minimum height at which we expect to find planar roof surfaces. Another simple test is to limit the planar surfaces to the angle of roofs that you anticipate finding. Of course, this test is of limited utility when finding flat roofs!

Suppose I have a tree overhanging the roof? What part of the point cloud goes to what class (see Figure 3)? This is very easy for a human operator to differentiate but surprisingly difficult for an automated algorithm.

Figure 3: Canopy overhanging a roof structure (green = vegetation, red = planar surface, orange = ground)

We use a number of tests for sorting this out. One is a simple "threshold" test; how far is the candidate point, in the normal direction (perpendicular to), from the theoretical planar surface?

These sorts of tailored algorithms are developed for the various objects to be extracted. Objects that follow well known and understood rules are the easiest to model whereas those that occur in nature tend to be the most difficult. Hence it is quite easy to extract piping in a static LIDAR scanning project yet fairly complex to accurately extract "ground" in an airborne project.

As the acquisition of coincident imagery becomes more common in LIDAR projects, we will have the ability to use algorithms that combine the "signature" attributes of both imagery and LIDAR to improve extraction algorithms. My prediction is that we will continue to see incremental improvements in this arena but few major breakthroughs. Thus, for classified data outputs of very high classification integrity, much manual QC and cleanup will be required. That will be the subject of the next installment.