Object Removal from High-Precision Elevation Data

Full Paper
A Digital Terrain Model (DTM) is a representation of the bare earth with
elevations at regularly spaced intervals. The underlying data is captured via
aerial imagery or airborne laser scanning. Prior to use, all above-ground
natural (trees, bushes, etc.) and man-made (houses, cars, etc.) structures need
to be identified and removed so that the surface of the earth can be
interpolated from the remaining points. Elevation data that includes
above-ground objects is called a Digital Surface Model (DSM). A DTM is mostly
generated by cleaning the objects from a DSM with the help of a human operator.
Automating this workflow is an opportunity to reduce manual work, and this
study aims to solve the problem using conditional adversarial networks.

In theory, a sufficient number of raw and cleaned (DSM and DTM) data pairs is
good input for a machine learning system that translates raw (DSM) data into
its cleaned (DTM) counterpart. Recent progress in topics such as
'Image-to-Image Translation with Conditional Adversarial Networks' makes a
solution to this problem possible. In this study, a specific conditional
adversarial network (CAN) implementation, "pix2pix", is adapted to this domain.

Data for "elevations at regularly spaced intervals" is similar to image data:
both can be represented as two-dimensional arrays (in other words, matrices).
Every elevation point maps to exactly one image pixel, and even with 1
millimeter precision on the z-axis, any real-world elevation value can be
safely stored in a data cell that holds 24-bit RGB pixel data, since 2^24
millimeter steps span roughly 16.7 km, more than the range of elevations on the
earth's surface. The total pixel count of the image therefore equals the total
count of elevation points in the elevation data. Thus, elevation data for large
areas results in sub-optimal input for "pix2pix" and requires tiling.
Consequently, the challenge becomes finding the most appropriate image
representation of elevation data to feed into the pix2pix training cycle. This
involves iterating over elevation-to-pixel-value mapping functions and dividing
elevation data into sub-regions that yield better-performing images in pix2pix.
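
To make the idea of an elevation-to-pixel-value mapping function concrete, the
following is a minimal sketch of one possible mapping together with a simple
tiling helper. It is not the mapping used in this study. The function names, the
-1000 m offset (OFFSET_M) and the row-major tiling scheme are all assumptions
chosen for illustration; only the 1 mm step and the 24-bit RGB cell come from
the text above.

```python
from typing import List, Sequence, Tuple

MM_PER_M = 1000        # 1 mm quantisation step, as stated in the text
OFFSET_M = -1000.0     # assumed lowest representable elevation (hypothetical)
MAX_CODE = 2**24 - 1   # largest integer a 24-bit RGB pixel can hold


def elevation_to_rgb(z_m: float) -> Tuple[int, int, int]:
    """Quantise an elevation in metres to 1 mm and pack it into (R, G, B)."""
    code = round((z_m - OFFSET_M) * MM_PER_M)
    if not 0 <= code <= MAX_CODE:
        raise ValueError(f"elevation {z_m} m is outside the 24-bit range")
    return (code >> 16) & 0xFF, (code >> 8) & 0xFF, code & 0xFF


def rgb_to_elevation(rgb: Sequence[int]) -> float:
    """Inverse of elevation_to_rgb: unpack (R, G, B) back to metres."""
    r, g, b = rgb
    code = (r << 16) | (g << 8) | b
    return code / MM_PER_M + OFFSET_M


def split_into_tiles(grid: List[List[float]], tile: int) -> List[List[List[float]]]:
    """Split a 2-D grid (list of rows) into tile x tile blocks in row-major
    order. Blocks on the right and bottom edges may be smaller."""
    rows, cols = len(grid), len(grid[0])
    return [
        [row[c:c + tile] for row in grid[r:r + tile]]
        for r in range(0, rows, tile)
        for c in range(0, cols, tile)
    ]


if __name__ == "__main__":
    pixel = elevation_to_rgb(8848.86)
    print(pixel, rgb_to_elevation(pixel))

    # A 4 x 6 toy grid split into 4 x 4 tiles yields one full and one narrow tile.
    toy = [[float(r * 6 + c) for c in range(6)] for r in range(4)]
    print([(len(t), len(t[0])) for t in split_into_tiles(toy, 4)])
```

Packing the quantised value across the three channels keeps the mapping
lossless at 1 mm, but neighbouring elevations can differ sharply in the
high-order channel. That discontinuity is one reason to iterate over
alternative mappings, as described above.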