The PDAL Project

Six months ago in this column, I described libLAS, an open source C++ library that provides classes for reading and writing the various versions and point formats of the LAS standard. At the time, we noted that it wasnt sufficient for everyones needs: it was designed around the LAS format specifically, it didnt offer a general processing pipeline mechanism, and it wasnt quite as fast as it could have be. As the market for lidar and point clouds grow, more processing functionality will be needed and the number of file formats will grow, but the architecture underlying libLAS wont scale to these needs.

Over the past year, Howard Butler (and I) have been developing a new C++ library which directly addresses those concerns. Partly in homage to GDAL [1], a very widely used library for (2D, geospatial) raster image support, this new library is named PDAL [2], the Point Data Abstraction Library. Like GDAL, PDAL provides a format-agnostic, vendor-neutral framework for processing data. (And also like GDAL, PDAL is aimed at geospatial formats and operations; for point cloud work outside the geo realm, Id suggest PCL [3], which we mentioned in an earlier column.)

In this months column, well take a quick look at the basic API and features of PDAL. PDAL is a C++ library, so this column is going to be more aimed at the developers in the audience.

Reading points

We begin with the model for reading points from a point cloud source, a reader. This code reads the first 100 points from an LAS file into a buffer object:

pdal::drivers::las::Reader reader("sample.las");

reader.initialize();

const pdal::Schema& schema = reader.getSchema();

pdal::PointBuffer data(schema, 100);

pdal::StageSequentialIterator* iterator =

reader.createSequentialIterator(data);

iterator->read(data);

This is shown pictorially in Figure 1.

The Schema object defines the types of fields in each point, e.g. three floats for X, Y, Z and a byte for the classification, and the PointBuffer object holds a set of points of the given schema. The Iterator is used to access the data within the Reader.

Filtering Points

In PDAL, data processing is done by handing points through a pipeline of processing units, each of which is a Stage. A Reader is one type of Stage; a Filter, such as cropping operation, is another type of Stage. Stages are easily composed, so a pipeline which removes points outside of a given bounding box requires only a small change to the previous example:

pdal::drivers::las::Reader reader("sample.las");

pdal::Bounds<double>bbox(0.0,0.0,0.0,100.0,100.0,100.0);

pdal::filters::Cropfilter(reader,bbox);

filter.initialize();

const pdal::Schema& schema = filter.getSchema();

pdal::PointBuffer data(schema, 100);

pdal::StageSequentialIterator* iterator =

filter.createSequentialIterator(data);

iterator->read(data);

See Figure 2.

Writing Points

As a Reader stage is the source end of a processing pipeline, a Writer stage is the sink end of the pipeline. To write 100 cropped points to a new LAS file, we would just add one more stage:

pdal::drivers::las::Reader reader("sample.las");

pdal::Bounds<double>bbox(0.0,0.0,0.0,100.0,100.0,100.0);

pdal::filters::Cropfilter(reader,bbox);

pdal::drivers::las::Writer writer(filter, stream);

writer.initialize();

// set any LAS-specific things here, such as LAS format version

writer.write(100);

See Figure 3.

Note that since were not interested in accessing the points directly ourselves in this workflow, we dont need the code which iterates to extract the point data into a PointBuffer: the Writer::write() method handles that logic for us.

With just this handful of lines, weve written a LAS-to-LAS copy operation which filters out points by spatial extent. All the data management, including buffering, is handled for you.

Schemas and Buffers

If you are writing your own processing stages, you will need to understand how point data is represented and stored in the point buffers.

In PDAL, a point cloud has an associated Schema which defines the set of Dimension objects that make up a point. You can think of the Schema as a C struct and its Dimension set as the fields of the struct. When reading from a point cloud source, the Schema is given to us, but when making up our own points to write to a file, however, we need to set up our own Schema:

Dimension d1("X", dimension:: Float, 8);

Dimension d2("Y", dimension::Float, 8);

Dimension d3("Z", dimension::Float, 8);

Dimension d4("Class", dimension::UnsignedByte, 1);

Schema schema;

schema.appendDimension(d1);

schema.appendDimension(d2);

schema.appendDimension(d3);

schema.appendDimension(d4);

The actual point data is stored in a PointBuffer:

PointBuffer*data=newPointBuffer(schema,100);

Dimensionconst&dimC=data->getSchema().getDimension("Class");

Dimensionconst&dimX=data->getSchema().getDimension("X");

Dimensionconst&dimY=data->getSchema().getDimension("Y");

Dimensionconst&dimZ=data->getSchema().getDimension("Z");

for(i=0;i<100;i++)

{

data->setField(dimX,i,my_data[i].x);

data->setField(dimY,i,my_data[i].y);

data->setField(dimZ,i,my_data[i].z);

data->setField(dimC,i,my_data[i].class);

}

Reading points from a PointBuffer is handled similarly:

my_data[i].x = data->getField<double>(dimX,i);

XML Pipelines

For situations where the library already provides the file format and filtering stages you require, PDAL includes a mechanism for creating processing pipelines from an XML description. This example performs a coordinate system reprojection:

<?xmlversion="1.0"encoding="utf-8"?>

<Pipelineversion="1.0">

<Writertype="drivers.las.writer">

<Optionname="filename">output.las</Option>

<Filtertype="filters.inplacereprojection">

<Optionname="out_srs">EPSG:4326+4326</Option>

<Optionname="scale_x">0.0000001</Option>

<Optionname="scale_y">0.0000001</Option>

<Readertype="drivers.las.reader">

<Optionname="filename">input.las</Option>

<Optionname="spatialreference">input.wkt</Option>

</Reader>

</Filter>

</Writer>

</Pipeline>

A command line tool is included to execute XML pipelines for you, so you might not need to use any C++ at all.

Custom Filtering

PDAL also provides a simple language for expressing the operations of a Filter stage, so you need not resort to C++. This can be used as a predicate, to filter out points:

# assumes input stage has field names ReturnNumber

# PDAL uses field result to determine if point should be kept or not

uint8ReturnNumber;

boolresult;

result=(ReturnNumber>0);

It also can be used to write point fields directly:

# assumesinputstagehasafieldnamedDegrees

#assumesoutputstagehasafieldnamedRadians

float32pi;

float32Degrees;

float32Radians;

pi=22.0f/7.0f;

Radians=Degrees*pi/180.0f

While not as efficient as straight C++, this mini-language allows for rapid prototyping and user-level customization without the C++ development cost.

And more

PDAL also has:

Plug-in support, allowing 3rd-party developers to implement and distribute DLLs for readers and writers for proprietary file formats.

Multiple file formats, including LAS/LAZ, QFIT, TerraSolid, and text/XYZ

Support for storing point clouds in an Oracle database

Multiple filters, including cropping, raster overlay, decimation, reprojection, mosaicking, and statistics collection

Command-line tools, similar to libLAS (but not yet as functional, however)

Many of these features are already being used in production environments.

PDAL is available under a liberal open source license (BSD) and can be downloaded from GitHub.

Credits

In addition to volunteer efforts, PDAL is supported in part by external funding sources, most notably the U.S. Army Corps / Cold Regions Research and Engineering Laboratory [4]. We thank them for their support.

Links

[1] GDAL: http://gdal.org/

[2] PDAL: http://pointcloud.org/

[3] PCL: http://pointclouds.org/

[4] US Army Corps: http://www.crrel.usace.army.mil/