\input CS445Hdr
\def\file{CS445N8.tex of November 23, 1982 7:06 PM}
\def\header{8. Closest-point problems}
\def\title{Lectures 17--18}
\chap{\header.}

The basic {\it post-office problem} can be described
as follows: given a {\it query point} $x$ and a finite
set $P$ of distinct {\it data points}, find $p\in P$ such
that $\dist(p,x)$ is minimum. We are interested mainly
in the special case where $\dist$ is the usual Euclidean
measure, and the points belong to the some $k$-dimensional
real space $\Sscr$; in fact, we will be concerned almost exclusively with the planar case ($k=2$).\par

The post-office problem has been studied by many
famous mathematicians in the last two centuries
{\bf ??? names and dates.} It has
inspired many important concepts of Computational
Geometry, and has a surprisingly large (and still
growing) set of applications. {\bf ??? from Astronomy
to Zoology.}\par

The trivial algorithm for finding the closest point
consist in computing all distances
$\rset{\dist(p,x)\relv p\in P}$ and selecting the minimum.
This requires $\Theta(n)$ time for points in the
plane ($\Theta(kn)$ if the points belong to a $k$-dimensional
space). Since every coordinate of every point must be
examined, this algorithm is optimal.\par

However, it is often the case that we have a great many
different query points $x$ to search against a fixed data set
$P$. As we might expect, by pre-processing the set $P$ we
can reduce significantly the time required to answer each
query.\par

Although several algorithms and data structures have been
proposed for solving this problem, the most useful
and versatile (at least for the planar version) seems
to be the ones based on the Voronoi diagram, whose study
constitutes the bulk of this chapter. As we shall see, the
Voronoi algorithms for the plane generally
require $O(n\log n)$ preprocessing time, $O(n)$ space,
and $O(\log n)$ time per query, in the worst case.\par

Although their efficiency is significantly better than
the exhaustive search, Voronoi algorithms and
data structures are moderately complex, and their
use may be hard to justify in some applications.\par

An alternative that should be considered is the
``bucket’’ algorithm (8.???), which
is simpler, works for spaces of any dimension $k$,
and is quite efficient on the average:
it takes only $O(kn)$ preprocessing time, plus
$O(2↑k)$ time per query. Unfortunately, to derive these
bounds we have to make some strong assumptions
on the the probability distribution of $P$,
and these assumptions hold only in very few cases.
In the worst case, algorithm 8.???
performs no better than the trivial exhaustive search.

\sect {8.1. The Voronoi diagram.}

As a warm-up exercise, let’s consider first the
case where the space $\Sscr$ has dimension 1,
i.e. is a straight line. We can think of each point as a real
number (its coordinate from some fixed origin on the line),
so the Euclidean distance between two points $p$ and $q$ is
$\leftv p-q \rightv$.\par

Let $p$ and $q$ be two data points with $p<q$.
A query point $x$ on the line will be closer to
$p$ than to $q$ if and only if it is to the left of the
midpoint $u=(p+q)/2$; if $x=u$, $x$ will be
equidistant from both.\fig2\par

For each $p\in P$, let $V(p)$ denote the set of all query
points $x$ for which $p$ is the closest data point, i.e.
$\dist(x, p)\leq\dist(x,q)$ for all $q\in P-\{p\}$. Let
$p1̌ < p2̌ < \cdots < pň$ be the elements of $P$; it
is easy to see that
$V1̌=(u0̌, u1̌]$, $Vň=[uň-1, uň)$, and
$V(pǐ)=(u{̌i-1}, uǐ)$ for $1<i<n$, where
$u0̌=-\infty$, $uň=+\infty$, and $uǐ=(pǐ+p{̌i+1})/2$.\fig2\par

The sets $V(pǐ)$ can be represented in $O(n)$ space, by a
sorted table of the midpoints $uǐ$. For each query $x$, we
can use binary search to find, in $O(\log n)$ time, an $i$
such that $u{̌i-1} < x < uǐ$, meaning $x\in V(pǐ)$ and
$pǐ$ is the data point closest to $x$ (if $x=uǐ$ for some $i$,
then $x\in V(p1̌)\inte V(p{̌i+1})$; i.e., the points $pǐ$ and $p{̌i+1}$ are equidistant from $x$
and closer to it than any other data points).
The computation of the
midpoints $uǐ$ requires the data points to be sorted, so the
preprocessing may take $\Omega(n\log n)$ time in the worst
case.\par

These ideas can be easily generalized for an arbitrary
space $\Sscr$ with dimension $k$. As above, for each
$p\in P$ we denote by $V(p)$ the set of all possible queries
for which $p$ is the closest data point, i.e.
$$V(p)=\rset{x\in\Sscr\relv (\forall q\in P)
\dist(x,p)\leq\dist(x,q)}.\eqno(1)$$
The set $V(p)$ is called the {\it Voronoi region} of $p$, and
the function $V$ is the {\it Voronoi diagram} for the data set $P$\foot1{The names {\it Dirichlet region} and {\it Dirichlet tesselation} are also quite common.}.\par

Obviously, the region $V(p)$ depends not only on $p$ but on the set $P$ as a whole. Therefore, when $P$ is
not clear from the context, we will denote its Voronoi diagram by $VP̌$.\par

\sect{8.2. Properties of the Voronoi diagram.}

It is clear from the definition that every point of $\Sscr$ belongs to one or more Voronoi regions. Furthermore, if the closest pair of data points is $d$ apart, then the ball of radius $d\over 2$ centered at any data point $p$ is in $V(p)$; this proves that $p$ is always an interior point of $V(p)$.\fig2\par

The {\it bissector} $B(p,q)$ of two distinct
points $p,q$ is defined as the $(k-1)$-dimensional
hyperplane of $\Sscr$ that is orthogonal
to the segment $pq$ and passes through its middle point.
When $\Sscr$ is the Euclidean plane, the bissector is just the
perpendicular drawn through the midpoint of $pq$.\fig3\par

Let $C(p,q)$ be the set of all points $x$
of $\Sscr$ for which $\dist(x,p)\leq\dist(x,q)$; that is, $C(p, q)$ is the region corresponding to $p$ in the two-point Voronoi diagram $V{̌\set{p, q}}$. The structure
of $C(p,q)$ is actually quite simple:\par

\lemma{8.1}{If $p\neq q$, the set $C(p,q)$ is
the closed half-space of $\Sscr$ that contains $p$ and is
bounded by the bissector $B(p,q)$.}
\proof{Let $x$ be a point of $\Sscr$, $\lscr$ be
the line through $p$ and $q$, and $o$ be the
midpoint of $pq$. Let $y$ be
the orthogonal projection of $x$ onto $\lscr$.\fig3
Since $xy$ is perpendicular to $\lscr$, we have
\def\ifff{\mathrel{\hbox{\ iff \ }}}
$$\eqalign{\bar{xp}\leq\bar{xq}|
\ifff\bar{xp}↑2\leq\bar{xq}↑2\cr
\null|\ifff\bar{xy}↑2 + \bar{yp}↑2\leq\bar{xy}↑2+\bar{yq}↑2\cr
\null|\ifff\bar{yp}↑2\leq\bar{yq}↑2\cr
\null|\ifff\bar{yp}\leq\bar{yq}.\cr}$$
The last condition holds if and only if $y$ is on the
ray of $\lscr$ with origin $o$ that contains $p$. Since $xy$
is parallel to the bissector, the latter cannot separate $y$
from $x$ or from $p$; so $x\in C(p,q)$ if and only if $x$ is in the half-space of $\Sscr$ bounded by $B(p,q)$ and containing $p$.}

This fact has an immediate and important consequence:

\theorem{8.2}{The Voronoi region $VP̌(p)$ is the
closed convex polytope $$V(p)=\inter{̌{q\in P-\set{p}} C(p,q). \eqno{(2)}$$}
\proof{By definition, a point $x$ is in $V(p)$ if and
only if $\dist(x, p)\leq\dist(x,q)$, i.e. $x\in C(p,q)$, for all $q\in P-\set{p}$. Equation (2) then follows; therefore, $V(p)$ is a convex polytope as defined in chapter 3 {\bf define convex polytope in ch.3 as the intersection of a finite number of half-spaces}, and is closed because each $C(p,q)$ is closed.}

As we saw before, if the dimension $k$ of $\Sscr$ is 1 the Voronoi regions are either closed intervals, half-rays, or the whole real line. If $k=2$, they are convex polygons; if $k=3$, convex polyhedra.\par

\sect{8.3. The Voronoi diagram in the plane.}

For the time being, let’s restrict ourselves to the planar case.
The figure below shows a set of
ten points and its Voronoi diagram:\fig5\par

Given a set $X$ of points, we say a circle is {\it $X$-free}
if there are no points of $X$ in its interior. Such circles will play an important role throughout the rest of this chapter; in particular, the properties below are worth mentioning:

\lemma{8.3}{A point $x\in \Sscr$ belongs to $m$
distinct regions of $VP̌$ if and only if it is the center
of a $P$-free circle passing through $m$ points of $P$.}
\proof{By definition, a point $x$ belongs to $m$ distinct
regions if and only if it is equidistant from the corresponding
$m$ data points, and no other data point is closer
to $x$ than they are. From this the lemma follows
immediately.}

\lemma{8.4}{If the finite, non-empty set $X=\set{x1̌, x2̌, \ldots, xň}$ is contained in the interior of an half-plane $H$ defined by the line $pq$, then there is a $X$-free circle passing through $p, q$, and some $xǐ$.}
\proof{For each point $xǐ$, consider the set $Dǐ$ defined as the intersection of $H$ and the circumcircle $Cǐ$ of $pqxǐ$.\fig4
The circular segments $Dǐ$ have $pq$ as their common chord and lie on the same side of it, so they can be linearly ordered by set inclusion. Let $Dǩ$ be a minimum of that ordering; by lemma 2.???D, $Dǩ$ wil contain none of the points $xǐ$ in its interior. Since $X\subset H$, we conclude that $Cǩ$ (that passes through $xǩ$) is $X$-free.}

\lemma{8.5 {\rm the ``bubble’’ lemma)}}{If there is a $P$-free circle $C$ through the points $p$ and $q$, and the arc $paq$ contains no points of $P-\set{p,q}$, then $a$ is in the interior of a $P$-free circle $C\pr$ passing through $p$ and $q$.}
\proof{Let $H$ be the half-plane determined by $pq$ and containing $a$. Let $X$ be the set of all points of $P$ that are interior to $H$. If $X$ is not empty, the preceding lemma gives an $X-free circle $C\pr$ passing through $p$, $q$ and some $x\in X$. If $X$ is empty, take any point $x$ in $H-C$, and let $C\pr$ be the circumcenter of $pqx$.\fig3
In either case, lemma 2.???D implies $C\inte H\subset C\pr\inte H$ and $C\pr-H \subset C-H$. We cnclude $C\pr$ is $P$-free and, since $x$ is not in $C\inte H$, the latter must be a proper subset of $C\pr\inte H$, {\bf ???} $a$ must be interior to $C\pr$.}

An informal way of stating this lemma is by saying that a $P$-free circle $C$ can be ``squeezed’’ between any two consecutive points of $P$ on its periphery, so that it will engulf the arc of $C$ that lies between those two points.

\digress{These results can be generalized to spaces of any dimension $k\geq 1$, by taking $k$-balls instead of circles, and taking $k$ points intead of two in lemmata 8.4 and 8.5.}

The interior of a region is called (for reasons that will be clearer later on) a {\it face} of the Voronoi. If $V(p)$ and $V(q)$ meet at a point $x$, then $x$ is equidistant from $p$ and $q$. So, $V(p)\inte V(q)$ is a (possibly empty) subset of the bisector $B(p,q)$. Since $V(p)$ and $V(q)$ are closed and convex, the same is true of their intersection; i.e. the latter is either the empty set,
a single point, a line, a closed ray, or a closed segment. The endpoints of $V(p)\inte V(q)$, if any, are called {\it Voronoi vertices}; the rest of $V(p)\inte V(q)$ is denoted by $e(p,q)$, and if not empty, is called the {\it Voronoi edge between $V(p)$ and $V(q)$}. There is a simple characterization of Voronoi vertices, edges and faces:\par

\theorem{8.6}{A point of the plane is a Voronoi vertex iff it belongs to three or more regions.}
\proof{First, the ``only if’’ part. Let $v$ be a Voronoi vertex, i.e. an endpoint (or the only point) of $V(p)\inte V(q)$, for some $p$ and $q$. Any point of the bisector $B(p,q)$ is equidistant from $p$ and $q$, and therefore is either in both regions or in neither of them. Any ball $S$ centered at $v$ will meet a point $x$ of $B(p,q)$ that is not in $V(p)\inte V(q)$; the point $x$ will be neither in $V(p)$ nor in $V(q)$, and therefore will be in some other region.\fig3
So, any ball centered at $v$ meets some region other than $V(p)$ and $V(q)$. Since the set of regions met by one of those balls is a monotonically decreasing function of its radius, we can conclude that some region $V(r)$ will be met by all such balls. Therefore, $v$ is on the boundary of $V(r)$; since regions are closed sets, we conclude that $v$ is in three distinct regions, namely $V(p)$, $V(q)$ and $V(r)$.

\hp Conversely, let $v$ be a point that belongs to $m\geq 3$ regions. By lemma 8.3, $v$ is the center $C$ of a $P$-free circle passing through $m$ points.
Let $p$ and $q$ be two consecutive points on that circle. There is an half-plane $H$ determined by $pq$ that does not contain any of the remaining $m-2$ points. Let $r$ be the open ray of $B(p,q)$ with origin $v$ and pointing into $H$.\fig3
By lemmata 8.5 and 2.???D, there is a $P$-free circle $C\pr$ passing through $p$ and $q$ whose center $c\pr$ lies in $r$. The point $c$ will belong to (at least) $V(p)$ and $V(q)$; by the convexity of the regions, the whole segment $vc\pr$ will also be in $V(p)\inte V(q)$, i.e. it will be part of a Voronoi edge.\par

\hp On the other hand, the remaining $m-2$ points will be interior to any circle $C\prr$ passing through $pq$ and with center $c\prr$ on $B(p, q)-r-\set{v}$. Therefore, $v$ is an endpoint of the Voronoi edge $e(p, q)$.}

\theorem{8.7}{A point of the plane belongs to a Voronoi edge iff it belongs to exactly two regions.}
\proof{By definition, apoint of the plane belongs to a Voronoi edge iff it belongs to two regions but is not a Voronoi vertex. By Theorem 8.7, the conclusion is immediate.}

\theorem{8.8}{A point belongs to a Voronoi face iff it belongs to exactly one Voronoi region}
\proof{The intersection of two Voronoi regions is contained in the bisector of the corresponding data points, and since the latter is a supporting line for either region, it follows that a point that is common to two regions is on the boundary of both. Therefore, a point interior to a region does not belong to any other region.

\hp Conversely, let $x$ belong to exactly one region $V(p)$. Suppose $x$ is on the boundary of $V(p)$; then any ball centered at $x$ would contain a point not in $V(p)$. Since every point of the plane belongs to some region, and the set of regions met by such a ball is a monotone function of its radius, there is some region $V(q)\neq V(p)$ that is met by all such balls. But then $x$ would be on the boundary of $V(q)$, and therefore in $V(q)$ itself, a contradiction. It follows that $x$ is interior to $V(p), i.e. belongs to a Voronoi face.}

The above theorems allow us to conclude that Voronoi edges, vertices and faces constitute a convex
subdivision of the plane, in the sense defined in chapter 5.
Therefore, $V$ has only
$ně\leq 3n-6$ edges and $nv̌\leq2n-4$ Voronoi vertices,
i.e. its complexity is only $O(n)$. This result is not so trivial
as it might seem at first glance: with $n$ data points we can
construct $O(n↑2)$ bissectors, so {\it a priori} we could have
$O(n↑2)$ edges and vertices.\par

As in chapter 5, $nv̌$ includes a fictitious {\it vertex
at infinity} $\omega$, whose purpose is to
ensure that that every edge, including infinite lines
and rays, has exactly two endpoints.
Theorem 8.5, like some of the results that follow,
cannot be applied to the vertex at infinity, at least not without extending the definitions in some way.\par

{\bf ??? digression: size of k-dimensional Voronoi}

Since the Voronoi diagram covers the whole plane with a finite number of regions, one or more of them must be infinite. These regions are easily characterized:\par

\theorem{8.9}{A Voronoi region $VP̌(p)$ is infinite iff $p$ is on the boundary of $\hull(P)$.}
\proof{Let $p$ be on the boundary of $\hull(P)$; by theorem
3.???SC, there is a supporting line
$\lscr$ of $P$ passing through $p$. Let
$r$ be the ray perpendicular to $\lscr$, with origin $p$, and
on the side of $\lscr$ opposite to $P$.\fig4 For any point
$x\in r$, and any point $q\in P-\set p$, we obviously have $\dist(x,p)\leq\dist(x,q)$, so $x\in Vp̌$. This implies
$r\subset V(p)$, so $V(p)$ is infinite.\par

\hp Conversely, let’s assume $V(p)$ is infinite. By
theorem 3.???R there must be some ray $r$
emanating from $p$ that is entirely in $V(p)$.
Let $q$ be any point of $P-\set{p}$,
and let $q1̌$ be the projection of $q$ onto the line of
$r$.\fig3
Suppose $q1̌$ is on the ray $r$ and
distinct from $p$; then, by taking a point $x$ on $r$
sufficiently far away from $p$, we can make
$\dist(x,p)>\dist(x,q)$, contradicting the fact that
$r\subset V(p)$. We conclude that $q1̌$ is on the ray
opposite to $r$, i.e. $q$ is in the half-plane bounded by the
perpendicular $\lscr$ to $r$ at $p$ and opposite to $r$.
Since this holds for all $q\in P-\set{p}$, $\lscr$ is a supporting line of $P$, and by lemma 3.???H
$p$ is on the boundary of $\hull(p)$.}

It is instructive to analyze the structure of the Voronoi
diagram for a small number of points. If $P$ has a single
point, the diagram is trivial: its single region covers the whole
plane. If $P=\set{p,q}$ the diagram consists of the two
half-planes $C(p,q)$ and $C(q,p)$.
For $n=3$, we will have either of the two cases below,
depending on whether the points of $P$ are colinear or
not.\fig3 Note that in case $b$ the single Voronoi vertex
(which is the circumcenter of the three points) may
lie inside or outside the triangle determined by them,
depending on whether the latter is acute or obtuse.\par

When $n=4$, the Voronoi diagram
will be equivalent (as a subdivision) to one of the cases below:\fig4
We have cases $a$ or $b$, respectively, when all four or
only three of the points are colinear.
We have case $c$ when the four points lie on the same
circle, case $d$ when one of the points falls inside the
triangle determined by the other three,
and case $e$ otherwise (i.e., when the four points
determine a strictly convex quadrilateral). Note that $b$ and $d$ are actually equivalent. Cases $a$ through
$c$ correspond to degenerate situations,
which occur with probability zero if the points are
random samples from a continuous
two-dimensional distribution.\par

\vfill\eject\end