Distance computations are fundamental in computational geometry, and there are wellknown formulas for them. Nevertheless, due to differences in object representations, there are alternative solutions to choose from. We will give some of these and indicate the situations they apply to.
Throughout, we need to have a metric for calculating the distance between two points. We assume this is the standard Euclidean metric “L_{2} norm” based on the Pythagorean theorem. That is, for an ndimensional vector , its length v is given by:
and for two points and , the distance between them is:
Lines
A point is that which has no part. [Book I, Definition 1] A line is breadthless length. [Book I, Definition 2] The extremities of a line are points. [Book I, Definition 3] A straight line is a line which lies evenly with the points on itself. [Book I, Definition 4]
To draw a straight line from any point to any point. [Book I, Postulate 1] To produce a finite straight line continuously in a straight line. [Book I, Postulate 2] [Euclid, 300 BC]
The primal way to specify a line L is by giving two distinct points, P_{0 }and P_{1}, on it. In fact, this defines a finite line segment S going from P_{0 }to P_{1} which are the endpoints of S. This is how the Greeks understood straight lines, and it coincides with our natural intuition for the most direct and shortest path between the two endpoints. This line can then be extended indefinitely beyond either endpoint producing infinite rays in both directions. When extended simultaneously beyond both ends, one gets the concept of an infinite line which is how we often think of it today. However, for the Greeks, the line was the finite segment which could be extended indefinitely using a straight edge. One nice thing about defining lines this way is that it works for all dimensions: 2D, 3D, or any ndimensional space. Further, it is common in applications to have lines specified as segments given by their endpoints since finite segments often occur as an edge of a polygon, polyhedron, or an embedded graph.
A line L can also be defined by a point and a direction. Let P_{0} be a point on L and v_{L} be a nonzero vector giving the direction of the line. This is equivalent to the two point definition, since we could just put v_{L}_{ }= (P^{1 }– P_{0}). Or, given P_{0} and v_{L}, we could select P_{1 }= P_{0 }+ v_{L} as a second point on the line. If v_{L} is “normalized” to be the unit direction vector, u_{L }= v_{L }/ v_{L}, then its components are the direction cosines of L. That is, in ndimensions, let (i = 1,n) be the angle that L makes with the ith coordinate axis a_{i} (for example, in 2D, a_{1} is the xaxis and a_{2} is the yaxis). Then, the vector v_{L} = (v_{i}), with , is a direction vector for L. In 2D, as shown in the diagram, if is the angle L makes with the xaxis, then , and is a unit direction vector for L, since . Similarly in any dimension n, the squares of the direction cosines sum to 1, that is .
Line Equations
Lines can also be defined by equations using the coordinates of points on the line as unknowns. The types of equations that one encounters in practice are:
Type


Equation


Usage

Explicit 2D




a nonvertical 2D line

Implicit 2D




any 2D line

Parametric




any line in any dimension


The 2D explicit equation is the one most people are first taught in school, but it is not the most flexible one to use in computational software. The implicit equation is a bit more useful, and the conversion of an explicit to an implicit equation is easy to do. Note that the first two implicit coefficients always define a vector n_{L}_{ }= (a,b) which is perpendicular to the line L. This is because for any two points and on L, we have . We say that n_{L} is a normal vector for L to mean that it is perpendicular to the line. Further, given any normal vector n_{L}_{ }= (a, b) to the line L and a point P_{0} on it, the normal form of the implicit equation is:
This equation is said to be normalized if , and then n_{L} is called a unit normal vector, which we often denote as u_{L} to emphasize that it has unit 1 length. I know this overuse of the word "normal" may be confusing, but that's the terminology in current use.
Unfortunately, a single implicit (or explicit) equation only defines a line in 2D, whereas in 3D a single linear equation defines a plane and in ndimensions it defines an (n1)dimensional hyperplane [Hanson, 1994]. This, of course, is useful in its own right, but it is not our interest here. For further information about 3D planes, see the Algorithm 4 discussion about Planes.
On the other hand, in any ndimensional space, the parametric equation for the line is valid and is the most versatile one to use. For a line defined by two points P_{0 }and P_{1} with a direction vector v_{L}, the equation can be written several ways; namely:
where t is a real number. With this representation P(0) = P_{0} , P(1) = P_{1} , and P(t) with 0 < t < 1 is a point on the finite segment between P_{0} and P_{1} where t is the fraction of P(t)'s distance along the whole P_{0}P_{1} line segment. That is, . And, is the midpoint of the segment. Further, if t < 0 then P(t) is outside the segment on the P_{0} side, and if t > 1 then P(t) is outside on the P_{1} side.
One can convert from any of these representations to another when convenient. For example, given two points and on a 2D line, one can derive an implicit equation for it as follows. With for the line direction vector, we have is a normal vector perpendicular to L, since n_{L}·v_{L}= 0. Then, an implicit equation for L is:
where the coefficients of x and y are the coordinates of n_{L}.
For another example, in 2D, if a line L makes an angle with the xaxis, recall that is a unit direction vector, and thus is a unit normal vector. So, if is a point on L, then a normalized implicit equation for L is:
Further, the parametric line equation is:
Distance from a Point to a Line
Given a line L and any point P, let d(P,L) denote the distance from P to L. This is the shortest distance separating P and L. If L is an infinite line, then this is the length of a perpendicular dropped from P to L. However if L is a finite segment S, then the base of the perpendicular to the extended line may be outside the segment, and a different determination of the shortest distance needs to be made. We first consider perpendicular distance to an infinite line.
The 2Point Line (2D and 3D)
In 2D and 3D, when L is given by two points P_{0 }and P_{1}, one can use the crossproduct to directly compute the distance from any point P to L. The 2D case is handled by embedding it in 3D with a third zcoordinate = 0. The key observation to make is that the magnitude of the crossproduct of two 3D vectors is equal to the area of the parallelogram spanned by them, since where is the angle between the two vectors v and w. However, this area is also equal to the magnitude of the base times the height of the parallelogram, and we can arrange the geometry so that the height is the distance d(P,L). Let and as in the diagram:
Then, v_{L}_{ }x w = Area( parallelogram(v_{L}, w) ) = v_{L} d(P, L) which results in the easy formula:
where u_{L}= v_{L} / v_{L} is the unit direction vector of L. If one is computing the distances of many points to a fixed line, then it is most efficient to first calculate u_{L}.
For the embedded 2D case with P = (x, y, 0), the crossproduct becomes:
and the distance formula is:
We did not take the absolute value of the numerator here making this is a signed distance with positive values on one side of L and negative distances on the other. This can sometimes be useful. Other times one may want to take the absolute value. Also note the similarity of the numerator and the implicit line equation.
The 2D Implicit Line
In 2D, there are applications where a line L is most easily defined by an implicit equation . For any 2D point P = (x, y), the distance d(P,L) can be computed directly from this equation.
Recall that the vector n_{L }= (a, b) is perpendicular to the line L. Using n_{L}, we can compute the distance of an arbitrary point P to L by first selecting any specific point P_{0} on L and then projecting the vector P_{0}P onto n_{L}. as shown in the diagram:
Writing out the details, (1) since not both a and b are zero, assume a 0 and select P_{0 }= (c / a, 0) which is on the line [Otherwise, if a = 0 then b 0, and select P_{0 }= (0, c / b) instead, which yields the same final result] (2) for any P_{0} on L we have: (3) also for our specific P_{0}: and equating (2) and (3) yields the formula:
Further, one can divide the coefficients of f(x,y) by n_{L} to prenormalize the implicit equation so that n_{L} = 1. This results in the very efficient formula:
which has only 2 multiplications and 2 additions for each distance calculation. So, if in 2D one needs to compute the distances of many points to the same infinite line L, then one should derive the unit normalized implicit equation and use this formula. Also, if one is just comparing distances (say, to find the closest or farthest point to the line), then normalizing is not even needed since it just changes all computed distances by a constant factor.
Recall that when L makes an angle with the xaxis and is any point on L, then the normalized implicit equation has: a = –sin(), b = cos(), and c = x_{0 }sin() – y_{0} cos().
The Parametric Line (any Dimension n)
To compute the distance d(P,L) (in any ndimensional space) from an arbitrary point P to a line L given by a parametric equation, suppose that P(b) is the base of the perpendicular dropped from P to L. Let the parametric line equation be given as: P(t) = P_{0} + t (P_{1 }– P_{0}). Then, the vector P_{0}P(b) is the projection of the vector P_{0}P onto the segment P_{0}P_{1}, as shown in the diagram:
So, with v_{L}_{ }= (P_{1 }– P_{0}) and w = (P – P_{0}), we get that:
and thus:
where u_{L} is our friend the unit direction vector of L.
This computation has the advantage of working for any dimension n and of also computing the base point P(b) which is sometimes useful. In 3D, it is just as efficient as the cross product formula. But in 2D, when P(b) is not needed, the implicit method is better, especially if one is computing the distances of many points to the same line.
Distance from a Point to a Ray or Segment (any Dimension n)
A ray R is a half line originating at a point P_{0} and extending indefinitely in some direction. It can be expressed parametrically as P(t) for all with P(0) = P_{0} as the starting point. A finite segment S consists of the points of a line that are between two endpoints P_{0 }and P_{1}. Again, it can be represented by a parametric equation with P(0) = P_{0} and P(1) = P_{1} as the endpoints and the points P(t) for as the segment points.
The thing that is different about computing distances of a point P to a ray or a segment is that the base P(b) of the perpendicular from P to the extended line L may be outside the range of the ray or segment. In this case, the actual shortest distance is from the point P to the start point of the ray or one of the endpoints of a finite segment.
For a ray there is only one choice, but for a segment one must determine which end of the segment is closest to P. One could just compute both distances and use the shortest, but this is not very efficient. Also, one must first determine that P's perpendicular base is actually outside the segment's range. An easy way to do this is to consider the angles between the segment P_{0}P_{1} and the vectors P_{0}P and P_{1}P from the segment endpoints to P. If either of these angles is 90º, then the corresponding endpoint is the perpendicular base P(b). If the angle is not a right angle, then the base lies to one side or the other of the endpoint according to whether the angle is acute or obtuse. These conditions are easily tested by computing the dot product of the vectors involved and testing whether it is positive, negative, or zero. The result determines if the distance should be computed to one of the points P_{0 }or P_{1}, or as the perpendicular distance to the line L itself. This technique, which works in any ndimensional space, is illustrated in the diagrams:
Since w_{0} = v + w_{1}, the two tests can be done just using the two dot products w_{0}·v and v·v which are also the numerator and denominator of the formula to find the parametric base of the perpendicular from P to the extended line L of the segment S. This lets us streamline the algorithm as shown in the pseudo code:
distance( Point P, Segment P0:P1 ) { v = P1  P0 w = P  P0
if ( (c1 = w·v) <= 0 ) // before P0 return d(P, P0) if ( (c2 = v·v) <= c1 ) // after P1 return d(P, P1)
b = c1 / c2 Pb = P0 + bv return d(P, Pb) }


Implementations
Here are a few sample "C++" applications using these algorithms. We assume that the low level classes and functions are already given.
// Copyright 2001 softSurfer, 2012 Dan Sunday // This code may be freely used and modified for any purpose // providing that this copyright notice is included with it. // SoftSurfer makes no warranty for this code, and cannot be held // liable for any real or imagined damage resulting from its use. // Users of this code must verify correctness for their application.
// Assume that classes are already given for the objects: // Point and Vector with // coordinates {float x, y, z;} (z=0 for 2D) // appropriate operators for: // Point = Point ± Vector // Vector = Point  Point // Vector = Scalar * Vector // Line with defining endpoints {Point P0, P1;} // Segment with defining endpoints {Point P0, P1;} //===================================================================
// dot product (3D) which allows vector operations in arguments #define dot(u,v) ((u).x * (v).x + (u).y * (v).y + (u).z * (v).z) #define norm(v) sqrt(dot(v,v)) // norm = length of vector #define d(u,v) norm(uv) // distance = norm of difference
// closest2D_Point_to_Line(): find the closest 2D Point to a Line // Input: an array P[] of n points, and a Line L // Return: the index i of the Point P[i] closest to L int closest2D_Point_to_Line( Point P[], int n, Line L) { // Get coefficients of the implicit line equation. // Do NOT normalize since scaling by a constant // is irrelevant for just comparing distances. float a = L.P0.y  L.P1.y; float b = L.P1.x  L.P0.x; float c = L.P0.x * L.P1.y  L.P1.x * L.P0.y;
// initialize min index and distance to P[0] int mi = 0; float min = a * P[0].x + b * P[0].y + c; if (min < 0) min = min; // absolute value
// loop through Point array testing for min distance to L for (i=1; i<n; i++) { // just use dist squared (sqrt not needed for comparison) float dist = a * P[i].x + b * P[i].y + c; if (dist < 0) dist = dist; // absolute value if (dist < min) { // this point is closer mi = i; // so have a new minimum min = dist; } } return mi; // the index of the closest Point P[mi] } //===================================================================
// dist_Point_to_Line(): get the distance of a point to a line // Input: a Point P and a Line L (in any dimension) // Return: the shortest distance from P to L float dist_Point_to_Line( Point P, Line L) { Vector v = L.P1  L.P0; Vector w = P  L.P0;
double c1 = dot(w,v); double c2 = dot(v,v); double b = c1 / c2;
Point Pb = L.P0 + b * v; return d(P, Pb); } //===================================================================
// dist_Point_to_Segment(): get the distance of a point to a segment // Input: a Point P and a Segment S (in any dimension) // Return: the shortest distance from P to S float dist_Point_to_Segment( Point P, Segment S) { Vector v = S.P1  S.P0; Vector w = P  S.P0;
double c1 = dot(w,v); if ( c1 <= 0 ) return d(P, S.P0);
double c2 = dot(v,v); if ( c2 <= c1 ) return d(P, S.P1);
double b = c1 / c2; Point Pb = S.P0 + b * v; return d(P, Pb); } //===================================================================
References
David Eberly, "Distance Between Point and Line, Ray, or Line Segment", Geometric Tools (2002)
Euclid, The Elements, Alexandria (300 BC)
Andrew Hanson, "Geometry for NDimensional Graphics" in Graphics Gems IV (1994)
Thomas Heath, The Thirteen Books of Euclid's Elements, Vol 1 (Books I and II) (1956)
Jack Morrison, "The Distance from a Point to a Line", in Graphics Gems II (1994)
