Cloud-hosted Web-based Lidar Data Storage and Dissemination Solutions—Part 1

Figure 1: Sanborn GeoData Cache™ application workflow: Users can log in from tablets/laptops, search for datasets, view them online in web-based interfaces and download them, if needed.

The increasing use of lidar in multiple applications has resulted in a flood of temporal lidar data and has created the logistical challenge of storing, managing and retrieving data for distributed end users. The Sanborn Map Company, Inc. (Sanborn), has developed a cloud-based solution for managing the discovery and dissemination of lidar data from various sources, covering data storage, data discovery, feature extraction and data visualization over the web. End users can easily access and manipulate the data using a standard web browser and no longer have to worry about storing, accessing, and processing it. Since the entire pipeline is hosted and managed on the public cloud, it scales on demand, responds in real time to increased user loads or processing, and is backed by a Service Level Agreement (SLA) of 99.9% uptime.

In this article, we present the complete lidar data pipeline as managed in our spatial database framework and hosted in the public cloud, using a typical statewide customer as an example. This is a two-part paper: in this article, the framework and use case for lidar data storage and dissemination are discussed; the technical challenges regarding implementation, along with the recommended best practices, will be presented in Part 2.

Figure 2: Architecture Diagram for Sanborn GeoData Cache™, as deployed in Google Cloud Platform (the application itself is cloud-agnostic)

Current Bottlenecks
Data management is a challenge for most enterprise customers. A landmark study by the National Institute of Standards and Technology (NIST) found that 40% of engineering time is spent locating and validating information. Lidar data management for enterprise customers can become complicated when different data sources are used to generate new products or when multiple teams share data creation. For example, if “Team A” creates a digital elevation model (DEM) from a quality level 2 (QL2) dataset without documenting its source metadata, “Team B” working downstream can quickly lose track of the limitations of that product. Many corporations solve such problems by strictly documenting the processes and workflows through control process documents. However, that approach has limitations, as maintaining the documentation itself becomes a significant overhead. Making the data more accessible can have positive cascading effects across multiple departments within an organization.

The smartphone app industry has shown that making dedicated applications for specific tasks, without requiring the end user to learn proprietary software, can have enormous advantages. For example, a photo-editing app that allows users to perform a set of useful actions such as mirroring images, cropping out faces or even changing specific colors can be empowering for a casual user who would not know how to do this in Photoshop®, does not have access to Photoshop®, and/or is not necessarily interested in obtaining Photoshop®. Using the same paradigm, allowing end users to perform actions on lidar data, such as elevation profiling and point density calculations, without needing access to a specialized desktop/workstation with the necessary software installed is game-changing.

Historically, data storage has been a stubborn bottleneck for enterprise customers interested in data management. The advent of the public cloud has been revolutionary, as organizations can now deploy solutions in the cloud, thus eliminating concerns about future scalability and SLA uptime. Global load-balancing technology can be implemented to distribute incoming requests across pools of instances in multiple regions, to achieve maximum performance, throughput and availability. These load-balanced virtual machine clusters can automatically scale capacity up and down based on end user traffic, providing reliable performance for the organization. Object-level permissions and encryption in transit and at rest protect the data’s confidentiality and guard against potential information security concerns.
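
The scaling behavior described above can be illustrated with a small sketch. It assumes a simple target-tracking rule, in which the cluster is sized so that each instance handles roughly a target request rate; the function name, parameters and limits are hypothetical and do not describe any particular cloud provider's autoscaler.

```python
import math

def desired_instances(requests_per_sec: float,
                      target_per_instance: float,
                      min_instances: int = 1,
                      max_instances: int = 10) -> int:
    """Size a pool so each instance serves about target_per_instance req/s.

    All names and default limits are illustrative assumptions.
    """
    if target_per_instance <= 0:
        raise ValueError("target_per_instance must be positive")
    needed = math.ceil(requests_per_sec / target_per_instance)
    # Clamp to the configured floor and ceiling of the pool.
    return max(min_instances, min(max_instances, needed))

print(desired_instances(450, 100))   # moderate traffic
print(desired_instances(0, 100))     # idle: stays at the floor
print(desired_instances(5000, 100))  # spike: capped at the ceiling
```

The floor keeps the service responsive when idle, and the ceiling bounds cost during traffic spikes.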

Use Case
Consider the following scenario: An end user needs lidar data for a given area of interest (AOI). The user logs into the Sanborn GeoData Cache™ website, sees the different datasets available and narrows down the search to the desired AOI, based on geographic and attribute filters. Using the Sanborn Lidar Web-Viewer™, the user can analyze the quality of the data and perform useful geoprocessing tasks such as elevation profiling, measurements, vector overlays, or changing the point cloud display method. Once satisfied, the user can clip the data and download it for immediate offline consumption. The workflow available to the end user is summarized in Figure 1.

Sanborn’s Solution
Sanborn has a long history of developing and implementing technological innovations to solve real-world geospatial challenges. To reduce the overhead required for hosting and maintaining the data, Sanborn has created an end-to-end solution to facilitate data discovery and dissemination for our customers. The Sanborn GeoData Cache™ product allows users to search through lidar datasets to identify and access the data in which they are interested. The end user is then able to (a) download the data; (b) view a raster version of the data served as a WMS/WMTS in any OGC-compliant software; and/or (c) view it online in Sanborn Lidar Web-Viewer™. Sanborn Lidar Web-Viewer™ can integrate vector layers (point, polyline or polygon shapefiles/geodatabases) and allows users to manipulate the point cloud in 3D, measure features, add annotations and export screenshots. An architecture diagram of the Sanborn GeoData Cache™ application is shown in Figure 2.

Using Sanborn GeoData Cache™, users can query for AOIs that meet their specific conditions. Once a selection is made, users are presented with a list of attributes for all polygons selected. Features can be filtered by geographic boundaries or attributes, and users can set the geographic boundaries by entering criteria into the search text box, such as city, county, or other tags; latitude/longitude coordinates; or bounding box coordinates. They can also apply the bounding box of the current map view by selecting it on the map. The data can also be searched through other GIS layers, using layer attributes, or through a spatial query using route indexing, coordinates, project-site polygons, etc. Each of the AOI footprints is displayed in a list on the results sidebar and as a wireframe on the map interface. Restrictions can be applied to individual users, such as a cap on the number of tiles per download, monthly download limits, or a requirement for administrator approval before bulk downloads.
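
At its simplest, a bounding-box search of this kind is an intersection test between the query box and each dataset footprint. The following is a minimal sketch, assuming footprints are stored as axis-aligned longitude/latitude boxes; the `Footprint` type, the `search_by_bbox` function and the sample records are hypothetical and are not the GeoData Cache™ API. A production system would more likely delegate this to a spatial index in the database.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple

# (min_lon, min_lat, max_lon, max_lat): a convention assumed for this sketch
BBox = Tuple[float, float, float, float]

@dataclass(frozen=True)
class Footprint:
    name: str
    bbox: BBox

def intersects(a: BBox, b: BBox) -> bool:
    """True when two axis-aligned boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def search_by_bbox(footprints: Iterable[Footprint], query: BBox) -> List[Footprint]:
    """Return the footprints whose bounding box intersects the query box."""
    return [f for f in footprints if intersects(f.bbox, query)]

# Made-up catalog entries for illustration only.
catalog = [
    Footprint("county_a_2018", (-80.2, 40.2, -79.7, 40.7)),
    Footprint("county_b_2019", (-78.0, 39.8, -77.5, 40.3)),
]

hits = search_by_bbox(catalog, (-80.0, 40.4, -79.9, 40.5))
print([f.name for f in hits])
```

Only the first footprint overlaps the query box, so it is the only one returned.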

The attribute filters applied on top of a boundary filter can include:

  • Agency Selection: Users can select the agency that collected the data
  • Data Quality: Users can select data by specifying the quality parameters
  • Collection Method: Users can select data by entering the collection method used for acquiring the data (terrestrial or aerial)
  • Cloud Cover Percentage: Users can filter the data by entering a maximum amount of cloud coverage allowed in a feature
  • File Format: Users can select the format in which the downloadable files will be provided
  • ISO Category: Users can select the ISO category desired for a selected data type
  • Licensing: Users can filter by the licensing permissions attached to the data
  • Acquisition Date Range: Users can select features based on a date range (all features outside the range will be excluded)
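
Attribute filters layered on top of a boundary filter can be thought of as a list of predicates that every returned dataset must satisfy. The sketch below illustrates the idea with three of the filters listed above; the `Dataset` fields, the helper names and the sample records are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, Iterable, List

@dataclass(frozen=True)
class Dataset:
    # Hypothetical metadata fields, named after the filters listed above.
    name: str
    agency: str
    quality_level: str
    collection_method: str  # "aerial" or "terrestrial"
    acquired: date

Predicate = Callable[[Dataset], bool]

def by_agency(agency: str) -> Predicate:
    return lambda d: d.agency == agency

def by_collection_method(method: str) -> Predicate:
    return lambda d: d.collection_method == method

def by_date_range(start: date, end: date) -> Predicate:
    # Datasets acquired outside [start, end] are excluded.
    return lambda d: start <= d.acquired <= end

def apply_filters(datasets: Iterable[Dataset],
                  filters: Iterable[Predicate]) -> List[Dataset]:
    """Keep only the datasets that satisfy every active filter."""
    active = list(filters)
    return [d for d in datasets if all(f(d) for f in active)]

# Made-up records for illustration only.
catalog = [
    Dataset("north_2017", "Agency X", "QL2", "aerial", date(2017, 4, 12)),
    Dataset("north_2019", "Agency X", "QL1", "aerial", date(2019, 3, 30)),
    Dataset("bridge_scan", "Agency Y", "QL1", "terrestrial", date(2019, 6, 2)),
]

selected = apply_filters(
    catalog,
    [by_agency("Agency X"), by_date_range(date(2018, 1, 1), date(2019, 12, 31))],
)
print([d.name for d in selected])
```

Because each filter is an independent predicate, new criteria can be added without changing the search routine.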

Authentication & Authorization
The Sanborn GeoData Cache™ solution allows end users to access data through browser-based authentication. Multi-factor authentication (MFA) can be offered as an additional factor on top of conventional logins to prevent unauthorized access. MFA strengthens authentication (the action of verifying a user’s identity), which in turn underpins authorization (the action of verifying a user has permission to do something). The additional factor can be delivered as a one-time password sent by SMS or email, or generated by apps such as Google Authenticator and Duo.
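
For readers unfamiliar with app-generated one-time passwords, the sketch below shows the time-based one-time password (TOTP) scheme standardized in RFC 6238, which builds on HOTP from RFC 4226 and is commonly used by authenticator apps. It is a generic illustration using only the Python standard library; it is an assumption that this scheme is relevant here, and it does not describe how Sanborn GeoData Cache™ implements MFA. The `verify` helper and its `window` parameter are hypothetical.

```python
import hashlib
import hmac
import struct
import time
from typing import Optional

def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """HMAC-based one-time password (RFC 4226)."""
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation offset
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)

def totp(secret: bytes, at: Optional[float] = None,
         step: int = 30, digits: int = 6) -> str:
    """Time-based one-time password (RFC 6238): HOTP over a time counter."""
    now = time.time() if at is None else at
    return hotp(secret, int(now // step), digits)

def verify(secret: bytes, submitted: str, at: Optional[float] = None,
           step: int = 30, window: int = 1) -> bool:
    """Accept codes from adjacent time steps to tolerate small clock drift."""
    now = time.time() if at is None else at
    counter = int(now // step)
    return any(
        hmac.compare_digest(hotp(secret, counter + drift), submitted)
        for drift in range(-window, window + 1)
        if counter + drift >= 0
    )

# Shared secret taken from the RFC 6238 test vectors, not a real credential.
secret = b"12345678901234567890"
code = totp(secret)
print(code, verify(secret, code))
```

In practice the shared secret is provisioned once per user and stored securely on the server; the code is checked only after the conventional login has succeeded.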

The Sanborn GeoData Cache™ application has been designed to provide different levels of authenticated access based on user credentials, e.g., state employees could be given access to see more data than the general public. The data is also encrypted in transit and at rest, which ensures the data can be accessed only by the authorized users. Sanborn GeoData Cache™ has a sophisticated backend dashboard that allows administrators to create/remove users, grant users privileges (such as admin rights), reset passwords, and monitor user history.
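
Tiered access of this kind can be modeled by assigning each dataset a minimum access level and comparing it with the level of the logged-in user. The sketch below is a simplified illustration; the level names and dataset names are hypothetical, and a real deployment would typically rely on the cloud provider's identity and object-level permission services.

```python
from enum import IntEnum
from typing import Dict, List

class AccessLevel(IntEnum):
    # Hypothetical tiers; higher values can see everything lower tiers can.
    PUBLIC = 0
    STATE_EMPLOYEE = 1
    ADMINISTRATOR = 2

def visible_datasets(user_level: AccessLevel,
                     required: Dict[str, AccessLevel]) -> List[str]:
    """Return the dataset names a user at user_level is allowed to see."""
    return sorted(name for name, level in required.items() if user_level >= level)

# Made-up catalog mapping each dataset to its minimum access level.
catalog = {
    "statewide_dem": AccessLevel.PUBLIC,
    "statewide_las_tiles": AccessLevel.STATE_EMPLOYEE,
    "restricted_site_scan": AccessLevel.ADMINISTRATOR,
}

print(visible_datasets(AccessLevel.PUBLIC, catalog))
print(visible_datasets(AccessLevel.STATE_EMPLOYEE, catalog))
```

A general-public login sees only the first dataset, while a state employee sees the first two.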

Sanborn Lidar Web-Viewer™
Sanborn Lidar Web-Viewer™ allows end users to view and manipulate the datasets hosted on Sanborn GeoData Cache™, and the platform supports all the typical lidar deliverable formats. It allows elevation profiling; area, height and volume measurements; and point density calculations. Users can even change the point cloud display based on the elevation, intensity, RGB or classification.
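
Two of these operations are easy to state precisely. Point density is the number of points inside an area divided by that area, and an elevation profile collects the points lying within a narrow corridor around a line, ordered by distance along it. The sketch below shows both in plain Python, assuming coordinates in a projected system with units of meters; the function names and sample points are hypothetical, and this is not the Web-Viewer™ implementation.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]  # x, y, z in meters (projected CRS assumed)

def point_density(points: Sequence[Point], x_min: float, y_min: float,
                  x_max: float, y_max: float) -> float:
    """Points per square meter inside an axis-aligned rectangle."""
    area = (x_max - x_min) * (y_max - y_min)
    if area <= 0:
        raise ValueError("rectangle must have positive area")
    inside = sum(1 for x, y, _ in points
                 if x_min <= x <= x_max and y_min <= y <= y_max)
    return inside / area

def elevation_profile(points: Sequence[Point], start: Tuple[float, float],
                      end: Tuple[float, float],
                      half_width: float) -> List[Tuple[float, float]]:
    """(distance along line, z) for points within half_width of the segment."""
    (x0, y0), (x1, y1) = start, end
    length = math.hypot(x1 - x0, y1 - y0)
    if length == 0:
        raise ValueError("start and end must differ")
    ux, uy = (x1 - x0) / length, (y1 - y0) / length  # unit vector along line
    profile = []
    for x, y, z in points:
        along = (x - x0) * ux + (y - y0) * uy         # projection onto the line
        across = abs((x - x0) * uy - (y - y0) * ux)   # perpendicular offset
        if 0 <= along <= length and across <= half_width:
            profile.append((along, z))
    return sorted(profile)

# Made-up points for illustration only.
cloud = [(0.0, 0.0, 100.0), (1.0, 0.2, 101.0), (2.0, -0.1, 103.0),
         (2.0, 5.0, 200.0), (3.0, 0.0, 102.0)]

print(point_density(cloud, 0.0, -1.0, 4.0, 1.0))
print(elevation_profile(cloud, (0.0, 0.0), (3.0, 0.0), half_width=0.5))
```

The point at y = 5.0 falls outside both the density rectangle and the profile corridor, so it is ignored in each result.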

Vector layers (point, polyline or polygon shapefiles/geodatabases) can be overlaid on the lidar data to allow users to compare the alignment of the vector data relative to the lidar datasets, manipulate the point cloud in 3D, or annotate individual points or features. Standard deliverable products such as DEMs or intensity rasters can also be shown in the web-application viewer. The end user is then able to download the data of interest (including any applied analytics or manipulations), or stream a raster version of the data (served as a WMS/WMTS) for later consumption in any OGC-compliant software.
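
To show what consuming a WMS involves on the client side, the sketch below assembles a GetMap request using the standard WMS 1.3.0 parameters. The endpoint URL, layer name and bounding box are placeholders invented for illustration; they are not actual Sanborn service addresses. Note that WMS 1.3.0 follows the axis order defined by the coordinate reference system, so the sketch uses EPSG:3857 (x, y) to avoid the latitude-first ordering of EPSG:4326.

```python
from typing import Sequence
from urllib.parse import urlencode

def wms_getmap_url(base_url: str, layer: str, bbox: Sequence[float],
                   width: int, height: int, crs: str = "EPSG:3857",
                   image_format: str = "image/png") -> str:
    """Build a WMS 1.3.0 GetMap URL for a single layer."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",  # empty value requests the default style
        "CRS": crs,
        "BBOX": ",".join(str(v) for v in bbox),  # minx,miny,maxx,maxy
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": image_format,
    }
    return f"{base_url}?{urlencode(params)}"

# Placeholder endpoint, layer and extent; all three are assumptions.
url = wms_getmap_url(
    "https://example.com/wms",
    "dem_hillshade",
    (-8930000, 4920000, -8880000, 4960000),
    width=1024,
    height=819,
)
print(url)
```

Desktop GIS packages that support OGC services build equivalent requests automatically once the service URL is registered.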

Summary
The native lidar format files (LAS) are just one of the many deliverables that are part of a typical lidar project delivery. There are other deliverables as well, such as DEM files, vector layers, intensity images, ortho-imagery (in some cases), metadata, survey files, etc. Sanborn GeoData Cache™ can ingest all these file formats for storage and distribution, and Sanborn Lidar Web-Viewer™ can support the same lidar data file formats for viewing, analysis, and manipulation. Together, they can also serve as a quality control engine for the lidar deliverables, where the quality controller can flag specific issues in the lidar data that can then be addressed by the vendor.

The combined solution of the Sanborn GeoData Cache™ and the Sanborn Lidar Web-Viewer™ offers the following advantages to the end user:

  • No need for different plugins
  • No need for a powerful desktop/workstation, nor specialized software
  • No need for large local storage
  • No need for diverse feature extraction tools

With the advent of new-generation lidar sensors, the appetite for denser lidar datasets is increasing exponentially. This trend will significantly increase the data storage needs and the bandwidth requirements for data dissemination. In light of this trend, solutions such as Sanborn GeoData Cache™ and Sanborn Lidar Web-Viewer™ allow for greater capacity for handling multiple, large datasets. The challenges involved in implementing such a solution, along with a few examples of recommended best practices, will be discussed in Part 2 of this article series.
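
A rough calculation shows how quickly point density drives storage. The sketch below assumes a hypothetical 100,000 km² project area, uncompressed LAS 1.4 files using point data record format 6 (30 bytes per point before any extra bytes), and one stored point per nominal pulse; real collections contain multiple returns per pulse, overlap and other deliverables, so actual volumes will be higher. The densities of 2 and 8 points per square meter correspond to the nominal minimums usually quoted for QL2 and QL1 data.

```python
def estimate_storage_tb(area_km2: float, points_per_m2: float,
                        bytes_per_point: int = 30) -> float:
    """Approximate uncompressed point storage in decimal terabytes.

    The 30-byte default assumes LAS 1.4 point data record format 6
    with no extra bytes; headers and VLRs are ignored.
    """
    area_m2 = area_km2 * 1_000_000
    return area_m2 * points_per_m2 * bytes_per_point / 1e12

# Hypothetical project area; densities are nominal QL2 and QL1 values.
for label, density in (("QL2", 2), ("QL1", 8)):
    print(label, estimate_storage_tb(100_000, density), "TB")
```

Under these assumptions, a fourfold increase in density produces a fourfold increase in storage, from about 6 TB to about 24 TB for the point data alone.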

References
Gallaher, M.P., O’Connor, A.C., Dettbarn, Jr., J.L., and Gilday, L.T., Cost Analysis of Inadequate Interoperability in the U.S. Capital Facilities Industry, NIST GCR 04-867, National Institute of Standards and Technology, Gaithersburg MD, 20899, https://doi.org/10.6028/NIST.GCR.04-867, (retrieved April 25, 2019), pp. 3-2.

Dr. Sharad Oberoi leads Sanborn’s software development, IT and 3D visualization teams. He has more than 6 years of progressive expertise in the design of robotic mobile mapping systems, decision support systems and the implementation of new technologies to improve end user productivity. Dr. Oberoi holds a PhD from Carnegie Mellon University, and Master’s degrees from The University of Chicago and Carnegie Mellon University. He is also a Google Cloud Certified Professional Cloud Architect.

Dr. Srini Dharmapuri, CP, CMS, PMP is with Sanborn Map Company in Pittsburgh, Pennsylvania as VP/Chief Scientist. Dr. Dharmapuri has Master of Science (Physics), Master of Technology (Remote Sensing), and Doctorate (Satellite Photogrammetry) degrees, with more than 30 years of wide-ranging experience within the geospatial industry, most notably with lidar, photogrammetry, GIS and UAS.