Professional software
for classifying LiDAR data, as well as
the literature on feature extraction and classification,
show that the process has to follow a certain hierarchy.
Before proceeding to extract the features present on the terrain
through classification, we should recall that a
typical terrain contains ground, buildings, trees, roads and
vehicles. Errors in capturing the data often result in
outliers also being present in the dataset.
One typically begins
with the elimination of outliers. Outliers are points that lie
isolated from the rest of the data. To find them, each point in the
dataset is taken as the center of a sphere of radius $\varepsilon$,
and the number of points within the sphere is counted. If the number
is less than $n$, the point is labelled as an outlier. Both
$\varepsilon$ and $n$ are thresholds provided by the user of the
software or the classification algorithm.
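The sphere test above can be sketched in a few lines of Python. This is only a brute-force illustration, not the routine of any particular software: the function name `find_outliers` is hypothetical, and we assume that the count includes the center point itself, which the description leaves open. For a real dataset with millions of points one would use a spatial index such as a k-d tree instead of comparing every pair.

```python
import math

def find_outliers(points, eps, n):
    """Return indices of points whose eps-sphere holds fewer than n points.

    Assumption: the count includes the center point itself.
    """
    outliers = []
    for i, p in enumerate(points):
        # Count every point (including p) within distance eps of p.
        count = sum(1 for q in points if math.dist(p, q) <= eps)
        if count < n:
            outliers.append(i)
    return outliers


points = [
    (0.0, 0.0, 0.0),
    (0.5, 0.0, 0.0),
    (0.0, 0.5, 0.0),
    (0.5, 0.5, 0.1),
    (10.0, 10.0, 50.0),  # isolated point far from the rest
]
print(find_outliers(points, eps=1.0, n=3))  # → [4]
```

Raising $n$ or shrinking $\varepsilon$ makes the test stricter, so more points get labelled as outliers.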
The popular software
TerraSolid (which runs on the CAD platform MicroStation) also
looks for low-lying points in addition to the first
step of finding outliers. This routine is part of the utility
named TerraScan. To find a low-lying point, it considers a cylindrical
neighbourhood of the point, with the axis of the cylinder along the
z-direction. If, for a given threshold $h$ that forms the height of
the cylinder, no other point is contained in the cylinder, then
the point is labelled as low-lying.
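One possible reading of the cylinder test is sketched below. It is our own interpretation and not TerraScan's actual routine. We assume that the cylinder rises from the point up to height $h$, that it has a user-supplied horizontal `radius` (a parameter the description above does not mention), and that a point counts as low-lying only if it has at least one horizontal neighbour and all such neighbours lie more than $h$ above it. Without that last condition, the topmost point of every roof or tree would be flagged too.

```python
import math

def find_low_points(points, radius, h):
    """Return indices of points lying more than h below all their neighbours.

    Neighbours are the other points within `radius` in the xy-plane
    (hypothetical parameter, not taken from the text).
    """
    low = []
    for i, (x, y, z) in enumerate(points):
        neighbour_heights = [
            qz
            for j, (qx, qy, qz) in enumerate(points)
            if j != i and math.hypot(qx - x, qy - y) <= radius
        ]
        # Low-lying: the cylinder from z up to z + h is empty and no
        # neighbour lies below the point either.
        if neighbour_heights and min(neighbour_heights) > z + h:
            low.append(i)
    return low


points = [
    (0.0, 0.0, 100.0),
    (1.0, 0.0, 100.2),
    (0.0, 1.0, 99.9),
    (1.0, 1.0, 100.1),
    (0.5, 0.5, 95.0),  # well below its neighbours
]
print(find_low_points(points, radius=2.0, h=1.0))  # → [4]
```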
The next step, after the
classification of the outliers and low-lying points, is to classify
the ground points. Since 1999, several algorithms have been
proposed for labelling the ground points. Sithole and Vosselman
(2004) have provided an interesting review of the algorithms for
ground classification. In their paper, the authors review and test
the performance of several algorithms on the ISPRS test dataset. It
has been reported later in the literature that these algorithms were
not suitable for all the terrains, and therefore some interesting
additional algorithms have been developed. The issues with ground
point classification have been reviewed and addressed in a paper by
Meng, Currit and Zhao (2010). The classification of the
ground points is referred to in the literature as “filtering”.
In TerraSolid, in addition to the slope of the ground, the length of
the longest edge of a building is also requested as input. This
information is required in order to avoid classifying a very long
building as a ground cluster.
After the ground points
have been “filtered out”, there are trees and buildings to be
detected. Some urban areas contain no trees at all, but many
do. Sometimes the trees stand very close to the buildings. If the
intensity information is not used, the tree points and the building
points then appear to belong to the same cluster.
There could be multiple
strategies for building extraction from unclassified datasets.
TerraSolid first classifies the low-vegetation and
high-vegetation points just by their height above the ground. The
remaining points are then classified as buildings using its own set
of algorithms. Although this sounds pretty crude, it does help. The
buildings can then be reconstructed into a CAD model, as TerraSolid runs
in a MicroStation environment. We shall deal with building extraction
from LiDAR data in a separate post.
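The height-based vegetation step can be illustrated with a small sketch. It is not TerraSolid's implementation. The function names and the thresholds `low_min` and `high_min` are hypothetical, and we assume that the ground elevation under a point can be approximated by the elevation of the horizontally nearest ground point (a real workflow would interpolate a ground surface instead).

```python
import math

def height_above_ground(point, ground_points):
    """Height of `point` above the horizontally nearest ground point."""
    x, y, z = point
    nearest = min(ground_points, key=lambda g: math.hypot(g[0] - x, g[1] - y))
    return z - nearest[2]

def classify_by_height(points, ground_points, low_min=0.2, high_min=2.0):
    """Label non-ground points purely by height above the ground.

    low_min and high_min are illustrative thresholds in metres.
    """
    labels = []
    for p in points:
        h = height_above_ground(p, ground_points)
        if h >= high_min:
            labels.append("high vegetation")
        elif h >= low_min:
            labels.append("low vegetation")
        else:
            labels.append("unclassified")
    return labels


ground = [(0.0, 0.0, 10.0), (5.0, 0.0, 10.5), (10.0, 0.0, 11.0)]
points = [(0.2, 0.1, 10.05), (4.8, 0.3, 11.2), (9.7, -0.2, 19.0)]
print(classify_by_height(points, ground))
# → ['unclassified', 'low vegetation', 'high vegetation']
```

Note that a rooftop at 8 m would also land in "high vegetation" here, which is exactly why a separate building classification has to follow.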
Road points can be
classified from the ground points themselves. Researchers have
reported the use of intensity values from LiDAR data to separate
road points from other points. However, the problem changes when
bridges and flyovers have to be detected: apart from the intensity
values, the height values also need to be used.
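As a rough illustration of the intensity idea, the sketch below splits already-classified ground points by a single intensity threshold. It assumes that road surfaces such as asphalt return weaker intensities than the surrounding ground, which depends on the sensor and the surface and would need calibration on real data. The function name and the threshold value are hypothetical, and the height check needed for bridges and flyovers is not covered.

```python
def split_road_points(ground_points, intensity_max):
    """Split ground points into (road, other) by return intensity.

    Each ground point is (x, y, z, intensity). Points with intensity at
    or below `intensity_max` are treated as road candidates.
    """
    road, other = [], []
    for x, y, z, intensity in ground_points:
        if intensity <= intensity_max:
            road.append((x, y, z))
        else:
            other.append((x, y, z))
    return road, other


ground = [
    (0.0, 0.0, 10.0, 12),
    (1.0, 0.0, 10.0, 15),
    (2.0, 0.0, 10.1, 48),
    (3.0, 0.0, 10.2, 55),
]
road, other = split_road_points(ground, intensity_max=20)
print(len(road), len(other))  # → 2 2
```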
Trees could be detected
using a template matching procedure. A botanist usually knows the
shape of a tree. The fact that LiDAR data can capture multiple
storeys of a tree comes in handy here. Tree templates are
available as RPC (Rich Photorealistic Content) models for
purchase. However, this database is fairly limited. There is a
research opportunity to create these RPC models by scanning different
forests. India's biodiversity is high, and an initiative to
create tree models for the different species of trees found in
India (at least) would be an excellent direction for research and
development.
- Meng, Currit and Zhao (2010), Ground Filtering Algorithms for Airborne LiDAR Data: A Review of Critical Issues, doi:10.3390/rs2030833