Accuracy, Precision and All That Jazz


I recently attended a Transportation Research Board subcommittee meeting where we were lamenting the fact that data that looks "real" (such as imagery and high-density LIDAR data) leads the observer to erroneous conclusions about its accuracy. That led the discussion toward what accuracy actually is and how we can convey these ideas to folks who do not normally deal in this arena. Thus, this inaugural Random Points column will address some of the terminology surrounding accuracy and precision. In the next column, we will take these concepts and map them to kinematic LIDAR systems, but first things first.

In most cases it is important to have some idea of the accuracy of the data you are using. If you intend to do highway engineering, you had better know a lot about this aspect of your data! If, on the other hand, you simply want to look at a route from one city to another, engineering-grade accuracy is not going to be required in your analysis.

There is a lot of argument in the data acquisition community surrounding these topics. It nearly always comes up in debates about why a particular vendor’s aerial data is better than that stuff on Google Earth. There is also the old saw–"bad data is better than no data." Of course, depending on your use, you had better be able to quantify how bad and in what ways.

It’s no wonder that a lot of confusion exists over quantifying accuracy. Every time I have been in a room full of experts, we argue about the specific meaning of the terms. Since I have the floor here for the moment, we’ll go with my descriptions! A caveat however–this article is meant to provide a bit of insight. It is not a vetted technical article and thus you should use my descriptions and analogies with a lot of caution.

We will specifically look at geopositional accuracy as opposed to other accuracy issues such as attributes (e.g. is the 'color' attribute correct?). For a more detailed look at geopositional measurement, I think the Washington State DOT "Highway Surveying Manual" is an excellent read. On the other hand, I find the FGDC standards very dry and light on explanation.

To discuss the geopositional quality of data, I think you need to fully understand the following terms:
Network Accuracy (often called "Absolute Accuracy")
Local Accuracy
Precision
Resolution
Density

There is no other way to do this than to just jump in–so here we go!

Figure 1 provides a nice physical representation of these terms. Note that resolution would be the width of the target rings and density would be the number of hits per target area. Here we are imagining that the bull's-eye represents a known location in our datum (maybe we placed some rings around a National Geodetic Survey monument and are taking pot shots with the old Winchester!).

Pick up a ruler and look at it. The fineness (spacing) of the tick marks is the resolution. Similarly, if you have a digital voltmeter, the number of digits in the display determines the resolution (it’s a bit more detailed than this, but this is close enough for our purposes). Note that this parameter has nothing to do with 'precision' or 'accuracy.'

Precision is a measure of the repeatability of a measurement under identical environmental circumstances (meaning, for example, that if you made repeated length measurements with a steel tape over a number of days where the temperature varied, you would violate the 'identical environmental conditions' restriction). It always speaks to repeating the same measurement multiple times. Since we very seldom do repeated measurements in LIDAR and imaging work, this is perhaps the most misrepresented term in our field.

Here is a simple experiment that illustrates precision. Take a tape measure and measure the height of a door. Now, using the exact same measurement spot, tape and technique, repeat this 9 more times. Write down your readings to the highest level of resolution supported by your tape (remember resolution). The range of your readings gives you a measure of the precision, not only of your device (the tape) but of your system (where you place your eye each time, how closely you hit the same spot, how hard you pull on the tape and so on). The assumption here is that you are measuring some constant object so that variation is due solely to you and your device, not the object being measured. In reality, this may or may not be the case!

Now those of you who have studied basic statistics know that if you repeat these measurements enough times (say 30), a histogram of the results will approximate the ubiquitous Normal (Gaussian, bell, etc.) curve. Precision is statistically quantified as variance (or standard deviation). Now notice that a tape made of steel and a tape made of rubber with identical tick marks will have the same resolution but radically different precision.

I hope that you notice that we still have not touched on accuracy. For example, suppose we did our experiment of making 30 repeated measurements of the height of a specific spot on a door with an uncalibrated (more on this later) tape having a resolution of 0.001 meters. Suppose we came up with an average measurement value of 2.000 m, a largest reading of 2.002 m and a smallest reading of 1.997 m (for you statisticians, let’s say we have a standard deviation of 1.8 mm). What can we say at this point? Well, the resolution of our tape is simply a given (we will ignore fudging resolution by linear interpolation). Our measurement precision is quite "good," with a maximum deviation of only 3 times the resolution of our tape. However, we cannot say anything at all about accuracy!
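
If you like to see the numbers worked out, here is a minimal sketch in Python of how the precision of this experiment could be quantified. The readings are invented for illustration (chosen to be consistent with the figures above); only the arithmetic is the point.

    from statistics import mean, stdev

    # Invented readings (meters) from 30 repeats with a 0.001 m resolution tape.
    readings = [2.000, 1.999, 2.001, 2.002, 1.998, 2.000, 1.997, 2.001,
                2.000, 1.999, 2.002, 1.998, 2.000, 2.001, 1.999, 2.000,
                1.998, 2.002, 2.000, 1.999, 2.001, 1.997, 2.000, 2.000,
                1.999, 2.001, 1.998, 2.000, 2.002, 1.999]

    avg = mean(readings)                 # average measurement
    spread = stdev(readings)             # precision, expressed as standard deviation
    rng = max(readings) - min(readings)  # total spread of the readings

    print(f"mean {avg:.3f} m, std dev {spread*1000:.1f} mm, range {rng*1000:.0f} mm")
    # Note: nothing here says anything about accuracy; that requires a known standard.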

Here’s the problem. I can just go into my workshop and whack off a piece of electrician’s steel fish tape. I can mark it with measurements (OK, this would be pretty tedious, I agree!) by just eyeballing it and, voilà, I have a steel tape! It would be quite precise if I did not subject it to temperature variations during my sequence of 30 measurements. However, it would, no doubt, be quite inaccurate when compared to a known length. And this is key–you cannot make a judgment about accuracy without having a 'standard' to which you are making comparisons.

In geopositional work, we are concerned with two types of accuracy. Network Accuracy (which I usually call absolute accuracy, but this is a really loose term) speaks to how closely your measurements match a known external reference system (what we call a 'datum'). Local Accuracy (also often called relative accuracy) deals with the accuracy of measured quantities (lengths, areas and so on) with respect to a standard. By this we mean, if you measure a length or an area, how 'close' are you to the true value? Note that you can very accurately measure the distance between two fixed points yet be clueless as to the location of the points relative to some outside reference system (again, the 'datum').

This is the case with our measurement of the door. If I calibrated my tape by comparing it to a 'standard' meter, I could then use it to very accurately and precisely (the precision coming from my construction of the tape, as verified in my repeated measurements experiment) measure the height of the door. Yet I still would have no idea of where in the 'world' the two end points of my measurements were located. This is an example of very good relative (local) accuracy yet very poor network (absolute) accuracy.

A figure that I lifted directly from Wikipedia, Figure 2, provides a more statistical view of accuracy versus precision. Note here that the distance of the mean (average) of the repeated measurements from the true value speaks to the accuracy, whereas the 'spread' of the measurements (variance) speaks to the precision.
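
In rough formula terms (a sketch with invented numbers, not a real calibration): if a calibrated standard tells us the 'true' height, accuracy is how far our mean sits from that truth, while precision remains the scatter of the readings.

    from statistics import mean, stdev

    # Invented numbers for illustration: a few repeated readings and a hypothetical
    # 'true' height taken from a calibrated standard.
    readings = [2.000, 1.999, 2.002, 1.997, 2.001, 2.000, 1.998, 2.000]
    true_height = 2.0005

    bias = mean(readings) - true_height  # accuracy: offset of the mean from the truth
    spread = stdev(readings)             # precision: scatter of the readings

    print(f"bias {bias*1000:+.1f} mm (accuracy), std dev {spread*1000:.1f} mm (precision)")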

So finally we are left with the term density. This parameter is not related to accuracy, precision or resolution. In LIDAR work, it would be the number of points per unit area. In imagery work, it would be the number of pixels per unit area. Note that this is often called 'resolution.' In imagery work, if using an array sensor, it may be roughly synonymous with resolution. When using scanning LIDAR systems, it is seldom synonymous with resolution.

Am I splitting hairs here? No, not at all. If you followed the above discussion, you will realize that precision comes from repeating a measurement and resolution is a property of the measuring device. Neither has anything to do with area, density or point spacing. (These terms are easy to tangle; in looking back over this paragraph, I nearly confused even myself!) Basically, what I am saying is that it is entirely possible to limit the scanning density of a LIDAR system to roughly 1 point per square meter yet have an available horizontal resolution of a few centimeters.
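
As a back-of-the-envelope sketch (the numbers are hypothetical), here is the relationship between density and point spacing, which is entirely separate from the instrument's resolution:

    import math

    # Hypothetical LIDAR collection: density and instrument resolution are separate things.
    density = 1.0      # points per square meter (a collection planning choice)
    resolution = 0.03  # meters: the fineness of the ranging/angular measurement

    spacing = 1.0 / math.sqrt(density)  # nominal ground spacing between points

    print(f"point spacing ~{spacing:.2f} m, yet each point is resolved to {resolution*100:.0f} cm")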

With this foundation in terminology, we will address how these factors play into LIDAR data in the next issue. Stay tuned!

Lewis Graham is the President and CTO of GeoCue Corporation. GeoCue is North America’s largest supplier of LIDAR production and workflow tools and consulting services for airborne and mobile laser scanning.
