Bin N Grid: A simple program for statistical filtering of point cloud data

This month I will discuss a simple statistical filter that can be used on LIDAR data to remove vegetation, noise, etc. We developed code to do this as a project for our DTM class at Oregon State. I have posted an executable and sample dataset at: http://web.engr.oregonstate.edu/~olsen/BinandGrid.zip

Note that if you do not have MS Visual Studio installed on your computer, you will need to download the Microsoft Visual C++ Redistributable for the program to work. You can download that here:

http://www.microsoft.com/downloads/en/details.aspx?FamilyID=a7b7a05e-6de6-4d3a-a423-37bf0912db84

To run the program, you simply drag and drop a text file onto the executable and follow the prompts to set the cell size and the desired output file format. The following is a discussion of the overall methodology of the program:

1. Import data. The imported text file follows these rules:
a. "#" or "x" designates a comment line.
b. Each line should contain X Y Z values; everything after the Z value on a line is ignored. Most delimiters (tab, space, comma, etc.) should work. The sample file is in X Y Z R G B I format, so the R, G, B, and I values are ignored.

2. Calculate extents of data. For efficiency, the extents are computed at the same time the data are loaded, which avoids looping through the file twice: reading data off the hard drive and parsing it into a data structure is slow, so evaluating the extents as each point is read adds essentially no cost (see the sketch after this list).

3. Create a grid. The grid is defined by the coordinates of its lower-left corner, the number of rows and columns, and a cell size. The user can specify the cell size.

4. Loop through each point and assign it to the grid cell that it falls in.

5. For each grid cell, calculate statistics (min, max, mean, standard deviation, etc) and assign the desired value to the grid cell.

6. Export the data as a grid (floating-point grid [.flt] for ArcGIS) or point (X,Y,Z text file) dataset. The floating-point grid (.flt and .hdr files) can be imported into ArcGIS by going to Toolbox->Conversion->To Raster->Float to Raster, which converts it into an ESRI grid.
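To make the methodology concrete, here is a minimal C++ sketch of steps 1 through 5. The structure and names (CellStats, the hard-coded cell size, the single-file command-line layout) are my own illustration of the approach, not the actual Bin N Grid source code:

    #include <algorithm>
    #include <cmath>
    #include <fstream>
    #include <limits>
    #include <sstream>
    #include <string>
    #include <vector>

    // Per-cell accumulators: enough to recover min, max, mean, and standard deviation.
    struct CellStats {
        double minZ  =  std::numeric_limits<double>::max();
        double maxZ  = -std::numeric_limits<double>::max();
        double sum   = 0.0;
        double sumSq = 0.0;
        long   count = 0;

        void add(double z) {
            minZ = std::min(minZ, z);
            maxZ = std::max(maxZ, z);
            sum += z;  sumSq += z * z;  ++count;
        }
        double mean()  const { return count ? sum / count : 0.0; }
        double stdev() const {
            if (count < 2) return 0.0;
            double m = mean();
            return std::sqrt(std::max(0.0, (sumSq - count * m * m) / (count - 1)));
        }
    };

    int main(int argc, char* argv[]) {
        if (argc < 2) return 1;
        std::ifstream in(argv[1]);
        const double cellSize = 1.0;          // in the real program this comes from a user prompt

        // Steps 1 and 2: read X Y Z (skipping comment lines) and track extents while loading.
        std::vector<double> xs, ys, zs;
        double minX = 1e308, minY = 1e308, maxX = -1e308, maxY = -1e308;
        std::string line;
        while (std::getline(in, line)) {
            if (line.empty() || line[0] == '#' || line[0] == 'x') continue;       // comment line
            for (char& c : line) if (c == ',' || c == ';' || c == '\t') c = ' ';  // normalize delimiters
            std::istringstream ss(line);
            double x, y, z;
            if (!(ss >> x >> y >> z)) continue;   // anything after Z on the line is ignored
            xs.push_back(x);  ys.push_back(y);  zs.push_back(z);
            minX = std::min(minX, x);  maxX = std::max(maxX, x);
            minY = std::min(minY, y);  maxY = std::max(maxY, y);
        }

        // Step 3: grid defined by the lower-left corner, cell size, and row/column counts.
        const int nCols = static_cast<int>(std::ceil((maxX - minX) / cellSize)) + 1;
        const int nRows = static_cast<int>(std::ceil((maxY - minY) / cellSize)) + 1;
        std::vector<CellStats> grid(static_cast<size_t>(nRows) * nCols);

        // Step 4: assign each point to the cell it falls in.
        for (size_t i = 0; i < xs.size(); ++i) {
            int col = static_cast<int>((xs[i] - minX) / cellSize);
            int row = static_cast<int>((ys[i] - minY) / cellSize);
            grid[static_cast<size_t>(row) * nCols + col].add(zs[i]);
        }

        // Step 5: each cell can now report min, max, mean, and stdev for whichever mode is chosen.
        return 0;
    }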
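The floating-point grid export in step 6 follows ESRI's simple .flt/.hdr convention: a raw file of 32-bit floats plus a short ASCII header. The writer below is likewise only a sketch of that convention (it assumes a little-endian machine, which covers typical Windows PCs), not the program's actual export code:

    #include <fstream>
    #include <string>
    #include <vector>

    // Hypothetical writer for an ESRI floating-point grid: a binary .flt file of
    // 32-bit floats (rows written from north to south) plus an ASCII .hdr file.
    void writeFloatGrid(const std::string& baseName,
                        const std::vector<float>& values,   // nRows * nCols cell values, row 0 = south
                        int nRows, int nCols,
                        double xll, double yll, double cellSize, float noData = -9999.0f)
    {
        std::ofstream hdr(baseName + ".hdr");
        hdr << "ncols " << nCols << "\n"
            << "nrows " << nRows << "\n"
            << "xllcorner " << xll << "\n"
            << "yllcorner " << yll << "\n"
            << "cellsize " << cellSize << "\n"
            << "NODATA_value " << noData << "\n"
            << "byteorder LSBFIRST\n";

        std::ofstream flt(baseName + ".flt", std::ios::binary);
        // The .flt body is written starting at the top (north) row.
        for (int row = nRows - 1; row >= 0; --row)
            flt.write(reinterpret_cast<const char*>(&values[static_cast<size_t>(row) * nCols]),
                      static_cast<std::streamsize>(nCols) * sizeof(float));
    }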

The program gives you several options for gridding. For example, if you want a digital surface model, you can run the max mode or the mean plus a multiple of the standard deviation mode. If you want to smooth noisy data, run the mean mode. If you want to filter out vegetation, etc., run the min mode or the mean minus a multiple of the standard deviation mode. Regardless of the mode selected, all output values are constrained to lie within the min and max of the current grid cell. (Thus, if the mean minus n standard deviations is less than the min, the min value is used as the final value.)
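Building on the CellStats sketch above, the mode selection and clamping might look something like the following; the Mode enumeration and cellValue helper are hypothetical names used only for illustration:

    enum class Mode { Min, Max, Mean, MeanPlusK, MeanMinusK };

    // Hypothetical helper: pick the value for a cell under the chosen mode, then
    // constrain it to the cell's observed [min, max] range as described above.
    double cellValue(const CellStats& c, Mode mode, double k)
    {
        double v = c.mean();
        switch (mode) {
            case Mode::Min:        v = c.minZ;                     break;
            case Mode::Max:        v = c.maxZ;                     break;   // digital surface model
            case Mode::Mean:       v = c.mean();                   break;   // smoothing noisy data
            case Mode::MeanPlusK:  v = c.mean() + k * c.stdev();   break;
            case Mode::MeanMinusK: v = c.mean() - k * c.stdev();   break;   // vegetation filtering
        }
        if (v < c.minZ) v = c.minZ;   // e.g. mean minus k stdev below the cell min -> use the min
        if (v > c.maxZ) v = c.maxZ;
        return v;
    }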

For fun, in this version I added a mode that uses the mean minus two standard deviations when the cell's standard deviation is above a threshold input by the user (e.g., in vegetation) and the mean when the standard deviation is below the threshold. The mean minus two standard deviations generally works well at filtering vegetation to get to the ground; however, in areas without vegetation it would artificially place the ground lower than it actually is.
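A sketch of that threshold mode, again using the hypothetical CellStats accumulator from above:

    // Use mean - 2*stdev only where the cell's standard deviation exceeds a
    // user-supplied threshold (likely vegetation or other clutter); otherwise keep
    // the mean. The result is clamped to the cell's min/max as before.
    double adaptiveValue(const CellStats& c, double stdevThreshold)
    {
        double v = (c.stdev() > stdevThreshold) ? c.mean() - 2.0 * c.stdev() : c.mean();
        if (v < c.minZ) v = c.minZ;
        if (v > c.maxZ) v = c.maxZ;
        return v;
    }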

The figures show the sample dataset of a road intersection on the OSU campus before and after the filter. The data are colored by elevation from lowest (blue) to highest (red). Note the red and yellow streaks from cars passing by during the scan. These are eliminated with the statistical filter in the second image, which shows only ground points.

Feel free to download the program and test it out. I would be happy to hear any comments on how it works (or does not!). As I stated earlier in this article, this is a simple filter and may not do everything that you want. Improved results can be achieved by adding additional components (see Chapter 4 of Airborne and Terrestrial Laser Scanning, edited by Vosselman and Maas). There are also several commercial software packages that implement advanced ground-filtering algorithms, which are worth checking out if this is something you do routinely. Open Topography also has a ground filtering program:

http://www.opentopography.org/index.php/blog/detail/tools_for_lidar_point_cloud_filtering_classification

Now for my soapbox speech: I feel strongly that programming knowledge is very important for today's geospatial professional. Datasets keep getting larger, technology changes, and software sometimes cannot keep up, which leaves a big burden on whoever is processing the data. In many cases, you can write code to do a tedious data-processing task in less time than it would take to do it manually once. Even when the time savings are modest the first time, the payoff grows each time you have to repeat the same tedious work on another dataset.
