
Polyline Decimation (any Dim)

by Dan Sunday

 

 

Often a polyline has too much resolution for an application, such as visual displays of geographic map boundaries or detailed animated figures in games or movies. That is, the points on the polylines representing the object boundaries are too close together for the resolution of the application. For example, in a computer display, successive vertexes of the polyline may be displayed at the same screen pixel so that successive edge segments start, stay at, and end at the same displayed point. The whole polyline may even have all its vertexes mapped to the same pixel, so that it appears simply as a single point in the display!

For the sake of efficiency, one doesn't want to draw all these degenerate lines, since drawing a single point would be enough. To achieve this, one wants to reduce the vertexes and edges of the polyline to essential ones that suffice for the resolution of one's application. There are several algorithms for doing this. In effect, these algorithms approximate a high-resolution polyline with a low-resolution reduced polyline having fewer vertexes. This also speeds up subsequent algorithms, such as area fill or intersection, that may be applied to the reduced polyline or polygon.


Overview

We will consider several different algorithms for reducing the points in a polyline to produce a “decimated” polyline that approximates the original within a specified tolerance. Most of these algorithms work in any dimension since they only depend on computing the distance between points and lines. Thus, they can be applied to arbitrary 2D or 3D curves that have been approximated by a polyline, for example, by sampling a parametric curve at regular small intervals.

The first decimation algorithm, vertex reduction (VR), is a fast O(n) algorithm. It is the fastest and least complicated algorithm, but gives the coarsest result. However, it can be used as a preprocessing stage before applying other algorithms. This results in a faster combined algorithm, since vertex reduction can significantly decrease the number of vertexes that remain as input to other decimation algorithms.

The second algorithm is the classical Douglas-Peucker (DP) approximation algorithm that is used extensively for both computer graphics and geographic information systems. There are two variants of this algorithm, the original O(n²) method [Douglas & Peucker, 1973] and a more recent O(n log n) one [Hershberger & Snoeyink, 1992]. Unfortunately, as is often the case, the faster algorithm is more complicated to implement. Additionally, it is not as general: it works only for simple 2D planar polylines, not in higher dimensions.

Finally, we combine these algorithms, VR followed by DP, in our C++ implementation of poly_decimate(), giving a fast, practical, high-quality polyline approximation algorithm.


Vertex Cluster Reduction

In vertex cluster reduction, successive vertexes that are clustered too closely are reduced to a single vertex. For example, if a polyline is being drawn in a computer display, successive polygon vertexes may be drawn at the same pixel if they are closer than some fixed application tolerance. In a large-range geographic map display, for instance, two successive vertexes of a boundary polyline may be separated by many miles and still be drawn at the same pixel on the display screen; and the edge segments joining them are then all drawn from this pixel to itself. One would like to discard these redundant vertexes so that successive displayed vertexes are separated by several pixels, and edge segments are not just drawn as points.

Vertex cluster reduction is a brute-force algorithm for polyline simplification. For this algorithm, a polyline vertex is discarded when its distance from a prior initial vertex is less than some minimum tolerance δ > 0. Specifically, after fixing an initial vertex V0, successive vertexes Vi are tested and rejected if they are less than δ away from V0. But when a vertex is found that is farther away than δ, it is accepted as part of the new simplified polyline, and it also becomes the new initial vertex for further simplification of the polyline. Thus, the resulting edge segments between accepted vertexes are longer than the δ tolerance. This procedure is easily visualized as follows:

[Figure: vertex cluster reduction, where vertexes within tolerance δ of the current initial vertex are discarded]

 

Pseudo-Code: Vertex Reduction

Here is the straightforward pseudo-code for this algorithm:

Input:  tol = the approximation tolerance
        V = {V0,V1,...,Vn-1} is an n-vertex polyline

Set start = 0;
Set k = 0;
Set W0 = V0;
for each vertex Vi (i=1,n-1)
{
    if Vi is within tol of Vstart
    then ignore it, and continue with the next vertex

    else Vi is farther than tol away from Vstart {
        add it as a new vertex of the reduced polyline:
        Increment k;
        Set Wk = Vi;
        Set start = i; as the new initial vertex
    }
}

Output: W = {W0,W1,...,Wk} = the (k+1)-vertex reduced polyline

 

This is a fast O(n) algorithm. It should be implemented by comparing squared distances with the squared tolerance, which avoids expensive square root calculations. This algorithm is the first preprocessing stage in our implementation of poly_decimate() given below.
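
For illustration, here is a minimal standalone sketch of the vertex reduction stage using squared-distance comparisons. It assumes a bare-bones 2D Pt struct (hypothetical, not the Point class of the full implementation below) and a non-empty input:

#include <vector>

struct Pt { double x, y; };

// squared distance between two points (avoids sqrt)
static double dist2(const Pt& a, const Pt& b) {
    double dx = a.x - b.x, dy = a.y - b.y;
    return dx * dx + dy * dy;
}

// vertexReduce(): keep only vertexes farther than tol from the
// last accepted vertex; the first vertex is always accepted.
std::vector<Pt> vertexReduce(const std::vector<Pt>& V, double tol) {
    std::vector<Pt> W;
    double tol2 = tol * tol;            // compare squared distances
    W.push_back(V[0]);                  // V[0] is always accepted
    for (size_t i = 1; i < V.size(); i++) {
        if (dist2(V[i], W.back()) < tol2)
            continue;                   // cluster vertex: discard it
        W.push_back(V[i]);              // accept; becomes new initial vertex
    }
    return W;
}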

 
Douglas-Peucker Algorithm

Whereas vertex reduction uses closeness of vertexes as a rejection criterion, the Douglas-Peucker (DP) algorithm uses the closeness of a vertex to an edge segment. This algorithm works from the top down by starting with a crude initial guess at a simplified polyline, namely the single edge joining the first and last vertexes of the polyline. Then the remaining vertexes are tested for closeness to that edge. If there are vertexes farther than a specified tolerance, δ > 0, away from the edge, then the vertex farthest from it is added to the simplification. This creates a new approximation for the simplified polyline. Using recursion, this process continues for each edge of the current approximation until all vertexes of the original polyline are within tolerance of the simplification.

More specifically, in the DP algorithm, the two extreme endpoints of a polyline are connected with a straight line as the initial rough approximation of the polyline. Then, how well it approximates the whole polyline is determined by computing the distances from all intermediate polyline vertexes to that (finite) line segment. If all these distances are less than the specified tolerance δ, then the approximation is good, the endpoints are retained, and the other vertexes are eliminated. However, if any of these distances exceeds the δ tolerance, then the approximation is not good enough. In this case, we choose the point that is farthest away as a new vertex subdividing the original polyline into two (shorter) polylines, as illustrated in the following diagram.

[Figure: the polyline is split at the vertex farthest from the approximating segment]

This procedure is repeated recursively on these two shorter polylines. If, at any stage, all of the intermediate distances are less than the δ threshold, then all the intermediate points are eliminated. The routine continues until all possible points have been eliminated. Successive stages of this process are shown in the following example.

[Figure: successive stages of the Douglas-Peucker approximation]


Pseudo-Code: Douglas-Peucker

Here is pseudo-code for this algorithm:

Input:  tol = the approximation tolerance
        L = {V0,V1,...,Vn-1} is any n-vertex polyline

// Mark vertexes that will be in the reduced polyline
Initially Mark V0 and Vn-1

// Recursively decimate by selecting the vertex farthest away
decimate(tol, L, 0, n-1);

// Copy Marked vertexes to the reduced polyline
for (i=m=0; i<n; i++) {
    if (Vi is marked) {    // add it to the output polyline W
        Wm = Vi;
        m++;
    }
}

Output: W = {W0,W1,...,Wm-1} = the reduced m-vertex polyline

//-------------------------------------------------------------
// This is the recursive decimation routine
decimate(tol, L,  j, k)
{
    if (k <= j+1) there is nothing to decimate
        return immediately
    otherwise
    test distance of intermediate vertexes from segment Vj to Vk

    Put Sjk  = the segment from Vj to  Vk
    Put maxd = 0 is the distance of farthest vertex from Sjk
    Put maxi = j is the index of the vertex farthest from Sjk

    for each intermediate vertex Vi (i=j+1,k-1)
    {
        Put dv = distance from Vi to segment Sjk
        if (dv <= maxd) then Vi is  not farther away, so
            continue to  the next vertex
        otherwise Vi is a new max  vertex
        Put maxd = dv and maxi = i to remember the farthest vertex
    }
    if (maxd > tol) a vertex is further than tol from Sjk
    {
        // split the polyline at the farthest  vertex
        Mark Vmaxi as part of the reduced polyline
        and recursively decimate the two  subpolylines
        decimate(tol, L, j, maxi);
        decimate(tol, L, maxi, k);
    }
}
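
The distance test inside the loop is the standard point-to-finite-segment computation, which the implementation below spells out inline. Here is a minimal 2D sketch of it, reusing the hypothetical Pt struct and dist2() helper from the vertex reduction sketch above:

// distToSegment2(): squared distance from point P to the finite
// segment AB; the projection parameter is clamped to [0,1] so an
// endpoint is used when the perpendicular foot falls off the segment
double distToSegment2(const Pt& P, const Pt& A, const Pt& B) {
    double ux = B.x - A.x, uy = B.y - A.y;   // u = segment direction
    double wx = P.x - A.x, wy = P.y - A.y;   // w = P relative to A
    double cw = wx * ux + wy * uy;           // dot(w,u)
    if (cw <= 0)
        return dist2(P, A);                  // P projects before A
    double cu = ux * ux + uy * uy;           // dot(u,u)
    if (cu <= cw)
        return dist2(P, B);                  // P projects beyond B
    double b = cw / cu;                      // projection parameter in (0,1)
    Pt Pb = { A.x + b * ux, A.y + b * uy };  // foot of the perpendicular
    return dist2(P, Pb);
}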

 

This algorithm has O(mn) worst-case time and O(n log m) expected time, where m is the size of the reduced polyline. Note that this is an output-dependent algorithm, and it will be very fast when m is small. This algorithm is implemented as the second stage of our poly_decimate() routine given below. The first stage of our implementation does vertex reduction on the polyline before invoking the DP algorithm. This results in a fast, high-quality polyline approximation/simplification algorithm.

 
Convex Hull Speed-Up

[Hershberger & Snoeyink, 1992] describe an interesting improvement for speeding up the Douglas-Peucker algorithm, making it a worst-case O(n log n) time algorithm. They do this by speeding up the selection of the farthest intermediate vertex from the segment Sij = ViVj. This is achieved by noting that the farthest vertex must be on the convex hull of the polyline chain from Vi to Vj. Then, they compute this hull in O(n) time using Melkman's algorithm for the hull of a simple polyline (see Algorithm 12), and find the vertex farthest from Sij using an O(log n) binary search on the hull vertexes (see Algorithm 14). They next show how to reuse the hull information when a polyline chain is split at the farthest vertex. The details for putting the whole algorithm together are a bit delicate, but in the final analysis it is shown to have worst-case O(n log n) time.
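
The full Hershberger-Snoeyink machinery (the hull binary search and hull reuse on splits) is beyond the scope of this page, but its main ingredient, Melkman's O(n) hull of a simple polyline, is compact enough to sketch. This is a minimal sketch, not their code: it assumes the same hypothetical Pt struct as above and that the first three input points are not collinear.

#include <deque>
#include <vector>

// isLeft(): > 0 when c is left of the directed line a->b
static double isLeft(const Pt& a, const Pt& b, const Pt& c) {
    return (b.x - a.x) * (c.y - a.y) - (c.x - a.x) * (b.y - a.y);
}

// melkmanHull(): convex hull of a *simple* polyline P (n >= 3) in O(n).
// The deque D always holds the current hull in ccw order, with the
// same vertex at both its front and back.
std::vector<Pt> melkmanHull(const std::vector<Pt>& P) {
    int n = (int)P.size();
    std::deque<Pt> D;
    if (isLeft(P[0], P[1], P[2]) > 0)        // seed with a ccw triangle
        D.assign({ P[2], P[0], P[1], P[2] });
    else
        D.assign({ P[2], P[1], P[0], P[2] });

    for (int i = 3; i < n; i++) {
        // a point inside the current hull cannot change it
        if (isLeft(D[0], D[1], P[i]) > 0 &&
            isLeft(D[D.size()-2], D.back(), P[i]) > 0)
            continue;
        // restore convexity at the bottom of the deque, then add P[i]
        while (isLeft(D[0], D[1], P[i]) <= 0)
            D.pop_front();
        D.push_front(P[i]);
        // restore convexity at the top of the deque, then add P[i]
        while (isLeft(D[D.size()-2], D.back(), P[i]) <= 0)
            D.pop_back();
        D.push_back(P[i]);
    }
    return std::vector<Pt>(D.begin(), D.end() - 1);  // drop duplicate endpoint
}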

There are a few caveats about this algorithm. First, it depends on using the Melkman O(n) time hull algorithm, and thus the polyline must be simple, which is the case in many applications. Second, since this algorithm depends on a 2D convex hull algorithm, it only applies to planar polylines, whereas the VR and DP algorithms can be used in any dimension. Next, this algorithm is more complicated than the original DP algorithm, and is more difficult to code and debug. Fortunately, the [Hershberger & Snoeyink, 1992] paper has an Appendix with a complete "C" code implementation. Finally, although they improve on the worst-case behavior, the original DP algorithm is usually O(n log m) and can do better in the best cases where m is small. So, altogether it's a stalemate over which algorithm is best, and using the easier-to-code, more general DP algorithm is usually preferred.


Implementation

Here is a sample "C++" implementation combining both algorithms.

// Copyright 2002 softSurfer, 2012 Dan Sunday
// This code may be freely used and modified for any purpose
// providing that this copyright notice is included with it.
// SoftSurfer makes no warranty for this code, and cannot be held
// liable for any real or imagined damage resulting from its use.
// Users of this code must verify correctness for their application.
 

// Assume that classes are already given for the objects:
//    Point and Vector with
//        coordinates {float x, y, z;}     // as many as are needed
//        operators for:
//            == to test equality
//            != to test inequality
//            (Vector)0 = (0,0,0)         (null vector)
//            Point  = Point ± Vector
//            Vector = Point - Point
//            Vector = Vector ± Vector
//            Vector = Scalar * Vector    (scalar product)
//            Vector = Vector * Vector    (cross product)
//    Segment with defining endpoints {Point P0, P1;}
//===================================================================
 

#include <math.h>    // for sqrt() used by norm()

// dot product (3D) which allows vector operations in arguments
#define dot(u,v)  ((u).x * (v).x + (u).y * (v).y + (u).z * (v).z)
#define norm2(v)  dot(v,v)         // norm2 = squared length of vector
#define norm(v)   sqrt(norm2(v))   // norm = length of vector
#define d2(u,v)   norm2(u-v)       // distance squared = norm2 of difference
#define d(u,v)    norm(u-v)        // distance = norm of difference
 


// poly_decimate(): - remove vertexes to get a smaller approximate polyline
//    Input:  tol = approximation tolerance
//            V[] = polyline array of vertex points
//            n   = the number of points in V[]
//    Output: sV[]= reduced polyline vertexes (max is n)
//    Return: m   = the number of points in sV[]
// forward declaration of the recursive DP routine defined below
void poly_decimateDP( float tol, Point* v, int j, int k, int* mk );

int
poly_decimate( float tol, Point* V, int n, Point* sV )
{
    int    i, k, m, pv;             // misc counters
    float  tol2 = tol * tol;        // tolerance squared
    Point* vt = new Point[n];       // vertex buffer
    int*   mk = new int[n]();       // marker buffer (zero-initialized)

    // STAGE 1.  Vertex Reduction within tolerance of  prior vertex cluster
    vt[0] = V[0];               // start at the beginning
    for (i=k=1, pv=0; i<n; i++) {
        if (d2(V[i], V[pv]) < tol2)
            continue;
        vt[k++] = V[i];
        pv = i;
    }
    if (pv < n-1)
        vt[k++] = V[n-1];       // finish at the end

    // STAGE 2.  Douglas-Peucker polyline reduction
    mk[0] = mk[k-1] = 1;       //  mark the first and last vertexes
    poly_decimateDP( tol, vt, 0, k-1, mk  );

    // copy marked vertices to the reduced polyline
    for (i=m=0; i<k; i++) {
        if (mk[i])
            sV[m++] =  vt[i];
    }
    delete [] vt;
    delete [] mk;
    return m;         //  m vertices in reduced polyline
}


// poly_decimateDP():
//  This is the Douglas-Peucker recursive reduction routine
//  It marks vertexes that are part of the reduced polyline
//  for approximating the polyline subchain v[j] to v[k].
//    Input:  tol  = approximation tolerance
//            v[]  = polyline array of vertex points
//            j,k  = indices for the subchain v[j] to v[k]
//    Output: mk[] = array of markers matching vertex array v[]
void
poly_decimateDP( float tol, Point* v, int j, int k, int* mk )
{
    if (k <= j+1) // there is nothing to decimate
        return;

    // check for adequate approximation by segment S from v[j] to v[k]
    int     maxi = j;           // index of vertex farthest from S
    float   maxd2 = 0;          // distance squared of farthest vertex
    float   tol2 = tol * tol;   // tolerance squared
    Segment S = {v[j], v[k]};   // segment from v[j] to v[k]
    Vector  u = S.P1 - S.P0;    // segment direction vector
    double  cu = dot(u,u);      // segment length squared

    // test each vertex v[i] for max distance from S
    // compute using the Algorithm dist_Point_to_Segment()
    // Note: this works in any dimension (2D, 3D, ...)
    Vector  w;
    Point   Pb;                 // base of perpendicular from v[i] to S
    double  b, cw, dv2;         // dv2 = distance v[i] to S squared

    for (int i=j+1; i<k; i++)
    {
        // compute distance squared
        w = v[i] - S.P0;
        cw = dot(w,u);
        if ( cw <= 0 )
            dv2 = d2(v[i], S.P0);
        else if ( cu <= cw )
            dv2 = d2(v[i], S.P1);
        else {
            b = cw / cu;
            Pb = S.P0 + b * u;
            dv2 = d2(v[i], Pb);
        }
        // test with current max distance  squared
        if (dv2 <= maxd2)
            continue;
        // v[i] is a new max vertex
        maxi = i;
        maxd2 = dv2;
    }
    if (maxd2 > tol2)         // error is worse than the tolerance
    {
        // split the polyline at the farthest vertex from S
        mk[maxi] = 1;       // mark v[maxi] for the reduced polyline
        // recursively decimate the two subpolylines at v[maxi]
        poly_decimateDP( tol, v, j, maxi, mk );  // polyline v[j] to v[maxi]
        poly_decimateDP( tol, v, maxi, k, mk );  // polyline v[maxi] to v[k]
    }
    // else the approximation is OK, so ignore intermediate vertexes
    return;
}
//===================================================================
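
For completeness, here is a short hypothetical usage sketch. It assumes the Point class above is an aggregate that can be brace-initialized from its coordinates, which the class outline does not show:

#include <cstdio>

int main()
{
    // a noisy, nearly flat polyline: the small ±0.1 wiggles should be
    // removed within tolerance 1.0, while the spike at (4,2,0) is kept
    Point V[] = { {0,0,0}, {0.5f,0.1f,0}, {1,0,0}, {1.5f,-0.1f,0},
                  {2,0,0}, {2.5f,0.1f,0}, {3,0,0}, {4,2,0}, {5,0,0} };
    int    n  = sizeof(V) / sizeof(V[0]);
    Point* sV = new Point[n];           // output buffer (at most n vertexes)

    int m = poly_decimate( 1.0f, V, n, sV );
    printf("reduced the polyline from %d to %d vertexes\n", n, m);

    delete [] sV;
    return 0;
}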
 


References

David Douglas & Thomas Peucker, "Algorithms for the reduction of the number of points required to represent a digitized line or its caricature", The Canadian Cartographer 10(2), 112-122 (1973)

John Hershberger & Jack Snoeyink, "Speeding Up the Douglas-Peucker Line-Simplification Algorithm", Proc. 5th Intl. Symp. on Spatial Data Handling, 134-143 (1992). UBC Tech Report available online from CiteSeer.

 

© Copyright 2012 Dan Sunday, 2001 softSurfer