Awash in Data (Part II)

This article is the second in a series examining the rapid explosion of data brought on by the widespread adoption of laser scanning. Part I looked at the reasons behind the data explosion; this part focuses on current and future methods for managing and working with large data sets.

Raw data from a scanning system is a collection of 3D points, possibly with additional attributes such as color attached. Consider a simple scan of a straight segment of pipe. What may look to the end user like a simple cylinder is really a collection of thousands (or even millions) of unconnected points. Converting this "dumb" data into useful knowledge typically means having a modeler or drafter create 3D models from the raw data using specialized software.
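To give a rough sense of scale for the pipe example, the sketch below compares the storage needed for raw points with that of a simple cylinder description. The record layouts are assumptions made purely for illustration (three 32-bit coordinates plus 8-bit red, green, and blue per point; two axis end points and a radius for the cylinder); real scanner and CAD formats differ.

```python
import struct

# Assumed point record (hypothetical layout): x, y, z as 32-bit floats
# followed by 8-bit R, G, B color values, with no padding.
POINT_FORMAT = "<fffBBB"

# Assumed cylinder model (hypothetical layout): two axis end points
# (x, y, z each) and a radius, stored as seven 64-bit floats.
CYLINDER_FORMAT = "<7d"


def raw_size_bytes(point_count):
    """Bytes needed to store point_count raw points in the assumed layout."""
    return point_count * struct.calcsize(POINT_FORMAT)


def cylinder_size_bytes():
    """Bytes needed to store one cylinder in the assumed layout."""
    return struct.calcsize(CYLINDER_FORMAT)


if __name__ == "__main__":
    points_on_pipe = 1_000_000  # illustrative count for one pipe segment
    print("raw points:", raw_size_bytes(points_on_pipe), "bytes")
    print("cylinder model:", cylinder_size_bytes(), "bytes")
```

Under these assumptions, a million points on one pipe segment occupy about 15 megabytes, while the cylinder that replaces them needs only a few dozen bytes. The same ratio explains why a modeled scene is so much smaller than the scan it came from.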

Subsequent interaction takes place in CAD (or similar) software with the models, not the raw data. The advantage of this approach is that it reduces the gigabytes of raw data down to megabytes of model information. The smaller and more familiar CAD files are then much easier to work with. However, this approach does not scale effectively. As scanned scenes become larger and denser, manual methods become more cumbersome and expensive. Due to increased complexity, costs for manual processing and quality control increase rapidly and can quickly spiral out of control. Furthermore, the models produced are only as good as the effort that went into creating them. End users are often lulled into a false sense of security by pretty 3D models, but these models are only approximations of reality; relying on them in all cases can lead to trouble.

A few trends are emerging that will address these shortcomings. The first is the steady advance in storage, computing power, and distributed processing. What seems today like an astronomical amount of data will in a few years be commonplace. In particular, keep an eye on solid-state drives. These devices are much faster than mechanical drives and promise to dramatically cut the latency of accessing data. This in turn will spur novel software applications and work processes.

Second, manual methods are becoming increasingly machine-assisted. Consider, for example, scans of a processing plant filled with piping. Previously, an operator would take cross-sections through the pipes, fit circles to determine locations and diameters, then enter the information into a CAD package to create a pipe element. Many of today’s products achieve the same results with just a few mouse clicks. They can even create connections and follow lines from scan to scan. Expect software products to grow more powerful and useful.
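The circle-fitting step in the manual workflow can be sketched in a few lines. The example below uses a standard algebraic least-squares circle fit (often called the Kasa fit) on the 2D points of one cross-section; it is a minimal illustration, not the method of any particular product, and the function name and the degeneracy threshold are assumptions.

```python
import math


def fit_circle(points):
    """Algebraic (Kasa) least-squares circle fit to 2D cross-section points.

    points: sequence of (x, y) tuples.
    Returns (center_x, center_y, radius).
    """
    n = len(points)
    if n < 3:
        raise ValueError("need at least three points")

    # Shift to the centroid to improve numerical conditioning.
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n

    suu = svv = suv = suuu = svvv = suvv = svuu = 0.0
    for x, y in points:
        u, v = x - mx, y - my
        suu += u * u
        svv += v * v
        suv += u * v
        suuu += u * u * u
        svvv += v * v * v
        suvv += u * v * v
        svuu += v * u * u

    # Solve the 2x2 normal equations for the center offset (uc, vc).
    det = suu * svv - suv * suv
    if abs(det) < 1e-12:  # crude, scale-dependent degeneracy check (assumed)
        raise ValueError("points are collinear or degenerate")
    rhs_u = 0.5 * (suuu + suvv)
    rhs_v = 0.5 * (svvv + svuu)
    uc = (rhs_u * svv - rhs_v * suv) / det
    vc = (rhs_v * suu - rhs_u * suv) / det

    radius = math.sqrt(uc * uc + vc * vc + (suu + svv) / n)
    return mx + uc, my + vc, radius


if __name__ == "__main__":
    # Synthetic cross-section: 12 points on a pipe of radius 0.15 at (2, -1).
    section = [
        (2.0 + 0.15 * math.cos(2 * math.pi * k / 12),
         -1.0 + 0.15 * math.sin(2 * math.pi * k / 12))
        for k in range(12)
    ]
    cx, cy, r = fit_circle(section)
    print("center:", (cx, cy), "diameter:", 2 * r)
```

The fitted center gives the pipe's location in the cross-section plane and twice the radius gives its diameter. Real scan data is noisy and usually covers only part of the circumference, so production software would add outlier rejection and checks on the quality of the fit.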

Some vendors are taking a third approach: attempting to eliminate the user altogether and touting software that completely automates the points-to-model process. It sounds good, but be very skeptical of claims of full automation. A computer may be able to beat a human at chess, but interpreting 3D data is much more difficult. It is highly unlikely that the algorithms will work as desired in all circumstances. Even when they do, it’s often not enough. It will always be important to verify, tweak, or audit results. If the software is not designed properly, it may take longer to verify the automated results than it takes to process the data manually.

The fourth trend is the elimination of modeling altogether by allowing the user to work with large amounts of point data natively. This not only saves time but provides accurate as-built information quickly, repeatably, and with traceability. Return to the piping example and assume that a new line is to be added. In this case, there’s little need for a complete 3D model of the existing conditions. Rather, a designer or engineer may query the points to determine the precise locations of the tie-in points at each end. Then the existing plant’s CAD model can be used to select a tentative route for the new installation. The proposed new line can be clash-checked against the point data directly to verify that there are no conflicts. Each of these steps is quick, reproducible, and traceable, and there’s a plethora of similar use cases. Today’s applications use compression, visualization, and other clever techniques to handle upwards of a billion points directly, and this number can be expected to increase significantly.
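A clash check against raw points can be sketched as a distance test between each scanned point and the centerline of the proposed pipe. The example below is a minimal brute-force illustration under stated assumptions: the function names, the clearance parameter, and the treatment of the pipe as a straight segment with rounded ends are all hypothetical, and a real application handling a billion points would rely on a spatial index rather than a linear pass.

```python
import math


def distance_to_segment(p, a, b):
    """Shortest distance from 3D point p to the segment from a to b."""
    ab = (b[0] - a[0], b[1] - a[1], b[2] - a[2])
    ap = (p[0] - a[0], p[1] - a[1], p[2] - a[2])
    ab_len2 = ab[0] ** 2 + ab[1] ** 2 + ab[2] ** 2
    if ab_len2 == 0.0:
        t = 0.0  # degenerate segment: a and b coincide
    else:
        dot = ap[0] * ab[0] + ap[1] * ab[1] + ap[2] * ab[2]
        t = max(0.0, min(1.0, dot / ab_len2))  # clamp to the segment
    closest = (a[0] + t * ab[0], a[1] + t * ab[1], a[2] + t * ab[2])
    return math.sqrt(sum((p[i] - closest[i]) ** 2 for i in range(3)))


def find_clashes(points, start, end, pipe_radius, clearance=0.0):
    """Return scanned points that fall inside the proposed pipe envelope.

    The envelope is the set of locations within pipe_radius + clearance
    of the centerline segment (a capsule shape, an assumed simplification).
    """
    limit = pipe_radius + clearance
    return [p for p in points if distance_to_segment(p, start, end) < limit]


if __name__ == "__main__":
    scan = [(5.0, 0.12, 0.0), (5.0, 1.0, 0.0), (11.0, 0.0, 0.0)]
    hits = find_clashes(scan, (0.0, 0.0, 0.0), (10.0, 0.0, 0.0),
                        pipe_radius=0.1, clearance=0.05)
    print("clashing points:", hits)
```

Because the check runs on the measured points themselves, any reported conflict traces directly back to specific scan data rather than to an interpretation of it.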

So, where is it all headed? These trends will continue, but it is doubtful that any one of them will dominate. The future is likely an integration of different techniques that make user tasks quicker, more reliable, and easier to fit into a solid work process. The pitfall to watch out for is software built around large amounts of batch processing in the name of automation. It may sound great, but in practice one often spends a lot of clock or calendar time waiting for results, then repeats the process when issues surface.

Put bluntly, the user should not be eliminated, but rather must become more powerful and efficient. Vendors and practitioners should keep this simple premise in mind when developing and evaluating new products. Those who do will stay ahead of the competition and ultimately provide the best value for their end clients.