LIDAR Magazine

USDA Exceeds Petabyte Milestone for Processing Lidar Data

How did the United States Department of Agriculture (USDA) address the challenge of processing an enormous volume of lidar data, modernize the output, then manage and catalog the data for its stakeholders?

The answer lies in powerful software for geospatial data processing, robust enough to QA/QC lidar data, extract information, and derive products from large datasets. The solution: LP360 lidar and photogrammetry 3D point-cloud software from GeoCue.

The team at GeoCue was instrumental in helping USDA design and implement a streamlined production workflow with LP360. This collaboration enables the USDA lidar catalog to be efficiently reviewed, updated to modern standards, and then published through a lidar server web portal for easy access by USDA stakeholders. Furthermore, this initiative is supported by an education and training program designed for USDA staff, which facilitates the quick and easy use of the lidar data catalog in their daily operations.

A trove of geospatial data

The program had its genesis in late 2015. GeoCue was approached by a branch of USDA to help with managing and updating a large volume of lidar data. The government organization maintains a geospatial data library for all its field offices and practitioners. Included in the library is a diverse collection of data essential for agricultural management and crop planning, plus a wealth of geospatial information for agricultural analysis.

A significant portion of the library’s resources is dedicated to elevation data. USDA maintains, in its native point-cloud format, an extensive catalog of traditional lidar data collected with manned aircraft. When it approached GeoCue, USDA had received a copy of the USGS Center for Lidar Information Coordination and Knowledge (CLICK) back catalog of lidar data, which was a comprehensive collection of all the lidar data collected by US government agencies dating back to 2002.

Martin Flood, Vice President of Special Projects at GeoCue, was one of the initial team members who helped spearhead the program. According to Flood, “All of this data was not formatted in a manner that the USDA field offices could use. The challenge for us was to help them utilize LP360 to organize, format, and manage these very, very large data sets.”

At that time, USDA was already using LP360 to process some geospatial data. It sought help, however, to update and modernize the data, some of which is now more than 20 years old. The aims were to enable various classifications, address overlap classes, model key points, and manage the files efficiently.

USDA Sample Point Cloud

Lidar Server map view of a site in Blount County, Alabama

The data processing solution

The GeoCue team dedicated several weeks to working on-site alongside USDA experts, examining their data catalog, evaluating workflows, and pinpointing tasks for updates and modernization of older datasets. They encountered numerous challenges, including outdated LAS file formats, incorrect coordinate systems and a sizable portion of the data that required additional processing to meet USDA requirements.

“That’s where we really utilized our LP360 processing software,” explained Flood. “We developed a workflow that would allow someone to take older data sets and modernize them. At the same time, USDA was interested in adding some specific value-added work, including height segmentation and building classification. Again, this is where LP360 shines. The software can classify points above the ground based on both height and specific elevation bands set by the USDA. For building classification, what they really were interested in was locating buildings above a certain size on a property.”
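To make the idea of height segmentation concrete, the following is a minimal sketch of classifying non-ground points into bands by height above ground. It is not LP360's implementation. The band edges, the function name, and the mapping of bands to the standard LAS vegetation class codes are all assumptions chosen for illustration; the article does not state the bands USDA actually uses.

```python
from bisect import bisect_right

# Hypothetical band edges in metres above ground (illustrative only;
# the actual USDA elevation bands are not given in the article).
BAND_EDGES_M = [0.5, 2.0, 5.0]

# Standard LAS class codes: 1 = unclassified, 2 = ground,
# 3/4/5 = low/medium/high vegetation. Mapping bands to these
# codes is an assumption made for this sketch.
BAND_CLASSES = [1, 3, 4, 5]
GROUND = 2


def classify_by_height(z, ground_z, classification):
    """Return new class codes, assigning non-ground points to height bands.

    z              -- point elevations
    ground_z       -- ground elevation beneath each point
    classification -- existing class code of each point
    """
    result = []
    for point_z, surface_z, code in zip(z, ground_z, classification):
        if code == GROUND:
            # Ground points keep their class.
            result.append(code)
            continue
        height = point_z - surface_z
        # Index of the band this height falls into (0 = below first edge).
        band = bisect_right(BAND_EDGES_M, height)
        result.append(BAND_CLASSES[band])
    return result


if __name__ == "__main__":
    z = [100.0, 100.3, 101.0, 103.0, 110.0]
    ground = [100.0] * 5
    codes = [2, 1, 1, 1, 1]
    print(classify_by_height(z, ground, codes))
```

In practice the ground elevation under each point would come from a ground surface model built from the already classified ground points; here it is simply passed in.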

LP360 was the perfect processing software for the job as it has the tools and features to analyze the data. Essentially, that was the scope of work and the challenge for GeoCue: demonstrate that LP360 can process the old catalog of valuable lidar data and update it to modern standards so it could be used more effectively and processed specifically for the USDA stakeholders.

“As a local software developer, we’ve fine-tuned LP360 throughout the years, specifically crafting an efficient workflow tailored to USDA’s processing needs,” remarked Frank Darmayan, CEO of GeoCue. “We’re proud to stand as the chosen software partner for this critical work, and we are committed to relentlessly improving our software to further boost efficiency.”

The USDA Lidar Dashboard.

Managing a mountain of data

The program has evolved over the nine years that LP360 has been used to manage the data. Initially, it focused on managing and updating the legacy data sets, which form an important archive. That archive represents an invaluable historical record of geospatial data, the terrain, the elevation, and the technology used in early lidar mapping efforts dating back to 2002.

“Over the years, LP360 has been able to modernize much of the older data sets,” Flood noted. “But now USDA has partnered with USGS through its 3D Elevation Program, which conforms to modern data standards. This data has been collected more recently, but still needs height and building classifications to meet USDA stakeholder needs. So, LP360 continues to be the best tool to provide the value-added for the USDA-specific classifications.”

The result of this work is a catalog of data sets that are published on a website for easy access by stakeholders. Using LP360 was instrumental in leveraging and processing the massive amounts of data. David Glenn, the Program Manager at GeoCue, helped facilitate and manage a workflow to accommodate the size of the data.

“From an operations point of view, the average size of the early data sets was at the 300 or 400 gigabyte level,” explained Glenn. “Over time that has grown significantly and today the average size is closer to a terabyte. So, we had to develop and organize a workflow in LP360 to manage and process over thirty terabytes of data every month. The challenge for us is utilizing hardware that will behave well, because some of the processes run for not only hours, but days and weeks, and this software must stay up and continuously run without failure.”

Crossing the petabyte milestone

The setup includes several devices connected on a 10-gigabit network to facilitate the transfer of large datasets between machines. These powerful machines can run multiple processes simultaneously for days on end, ideal for handling extensive data processing tasks.
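A rough back-of-the-envelope calculation gives a sense of scale for these figures. The sketch below assumes the network runs at 10 gigabits per second at full line rate with no protocol overhead, and that a terabyte is the decimal 10^12 bytes; neither assumption is confirmed by the article, and the function names are illustrative.

```python
TB = 10**12                      # decimal terabyte in bytes (assumption)
LINK_BITS_PER_S = 10 * 10**9     # assumed 10 gigabits per second


def transfer_seconds(n_bytes, link_bits_per_s=LINK_BITS_PER_S):
    """Ideal time to move n_bytes over the link, ignoring overhead."""
    return n_bytes * 8 / link_bits_per_s


def months_to_reach(total_bytes, bytes_per_month):
    """Months needed to accumulate total_bytes at a constant monthly rate."""
    return total_bytes / bytes_per_month


# One typical present-day data set of about a terabyte.
one_tb_minutes = transfer_seconds(1 * TB) / 60

# A full month's output of thirty terabytes.
monthly_hours = transfer_seconds(30 * TB) / 3600

# A petabyte (1000 TB) at a constant thirty terabytes per month.
months_per_petabyte = months_to_reach(1000 * TB, 30 * TB)

if __name__ == "__main__":
    print(f"1 TB over the link: about {one_tb_minutes:.1f} minutes")
    print(f"30 TB over the link: about {monthly_hours:.1f} hours")
    print(f"1 PB at 30 TB/month: about {months_per_petabyte:.1f} months")
```

Under these idealized assumptions, moving a one-terabyte data set takes on the order of minutes rather than hours, which suggests that the processing itself, not the transfer between machines, accounts for the runs of days and weeks.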

Furthermore, over the past five or six years, the hardware configuration has adapted not only to manage the increasing volume of data but also to process datasets that are not just larger in size but contain much denser point clouds.

As the complexity of the data escalated, Glenn had to help refine methods for USDA to manage the data. “Often, this meant dividing large datasets into smaller segments, to be processed separately, and then reassembled,” he said. “This evolution in the process and setup reflects our ongoing adaptation to the changing landscape of data volume and quality.”
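The divide, process, and reassemble approach Glenn describes can be sketched in a few lines. This is a simplified illustration, not the GeoCue workflow: it assumes points are plain (x, y, z) tuples held in memory, splits them on a square grid, and uses hypothetical function names. Real lidar tiles are files on disk, and neighboring tiles usually need a buffer of overlapping points so that results agree at the edges.

```python
def split_into_tiles(points, tile_size):
    """Group (x, y, z) points by the square tile that contains them."""
    tiles = {}
    for point in points:
        key = (int(point[0] // tile_size), int(point[1] // tile_size))
        tiles.setdefault(key, []).append(point)
    return tiles


def process_in_tiles(points, tile_size, process):
    """Split points into tiles, process each tile alone, then reassemble.

    `process` is any function that takes a list of points and returns
    a list of points; each call only ever sees one tile's worth of data.
    """
    tiles = split_into_tiles(points, tile_size)
    reassembled = []
    for key in sorted(tiles):
        reassembled.extend(process(tiles[key]))
    return reassembled


if __name__ == "__main__":
    cloud = [
        (0.5, 0.5, 1.0),
        (1500.0, 200.0, 2.0),
        (10.0, 20.0, 3.0),
        (999.9, 999.9, 4.0),
    ]

    # Placeholder processing step: sort each tile's points by elevation.
    def sort_by_elevation(tile_points):
        return sorted(tile_points, key=lambda p: p[2])

    print(process_in_tiles(cloud, 1000.0, sort_by_elevation))
```

Because each tile is processed independently, the tiles can be distributed across several machines and a failure partway through a long run costs only the tile that was in progress.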

LP360 recently passed a remarkable milestone: processing more than a petabyte of data for USDA. Over the past five years, the pace of processing has consistently increased, and thirty terabytes of LP360-processed data are now transferred to USDA each month, with very few quality control issues.

The software’s efficiency and reliability have significantly strengthened the partnership with USDA, building a deep trust in LP360’s capability to manage and process large volumes of data smoothly and effectively, with minimal disruptions or setbacks.

Darmayan acknowledges that LP360 software is the key to success. “USDA has established a bold objective to make lidar data available to its users across the United States. We are honored that LP360 was selected to handle the quality assurance, quality control, and classification of this data. The milestone of processing over one petabyte of data underscores the scalability of our software for handling extensive aerial lidar data processing tasks.”

The LP360 interface.

Continued performance and reliability

A key achievement beyond the sheer volume of data processed is that all the data has been consistently formatted through a standardized workflow, resulting in a reliable format readily usable by USDA stakeholders. This underscores the program’s success in not only managing substantial amounts of data but also in ensuring its utility and accessibility for those who rely on it.

“From the start, what we’ve really focused on for the USDA is a very consistent, well-understood end product,” explained Flood. “It’s getting this whole petabyte of data into a very well-defined, consistent state. By processing in LP360, you can pick up any of these data sets or go to any part of that whole petabyte of data, and you’ve got a USDA lidar data set that meets its requirements and standards.”

GeoCue’s involvement in the program has proven immensely valuable, particularly in managing and developing workflows for large volumes of data within LP360. The experience has provided the development team with deep insights into enhancing LP360, tailoring it more effectively for geospatial users engaged in similar tasks.

According to Flood, “The process has been highly educational, enriching our understanding and significantly influencing our product development and planning strategies. By facing the same challenges and issues as our users, we’ve gained a firsthand perspective on their needs, making this experience not only beneficial but also instrumental in aligning our efforts with customer requirements. This journey has been a rewarding one, contributing substantially to our growth and the improvement of our services. To be able to do so while also helping to preserve the historical record of lidar data collection in the USA is very satisfying to the entire team.”

Acknowledgement

Special thanks and consideration to the GeoCue team that supports USDA’s lidar technical services, including Martin Flood, David Glenn, Blaine Bartow, Katie Duncan, Becky Glenn, Brett Henderson, Jack Filipovic, Santhi Jakka, Mark Toland, and Elton Gladstone.
