\input CS445Hdr \def\file{CS445N8.tex of November 23, 1982 7:06 PM} \def\header{8. Closest-point problems} \def\title{Lectures 17--18} \chap{\header.} The basic {\it post-office problem} can be described as follows: given a {\it query point} $x$ and a finite set $P$ of distinct {\it data points}, find $p\in P$ such that $\dist(p,x)$ is minimum. We are interested mainly in the special case where $\dist$ is the usual Euclidean measure, and the points belong to the some $k$-dimensional real space $\Sscr$; in fact, we will be concerned almost exclusively with the planar case ($k=2$).\par The post-office problem has been studied by many famous mathematicians in the last two centuries {\bf ??? names and dates.} It has inspired many important concepts of Computational Geometry, and has a surprisingly large (and still growing) set of applications. {\bf ??? from Astronomy to Zoology.}\par The trivial algorithm for finding the closest point consist in computing all distances $\rset{\dist(p,x)\relv p\in P}$ and selecting the minimum. This requires $\Theta(n)$ time for points in the plane ($\Theta(kn)$ if the points belong to a $k$-dimensional space). Since every coordinate of every point must be examined, this algorithm is optimal.\par However, it is often the case that we have a great many different query points $x$ to search against a fixed data set $P$. As we might expect, by pre-processing the set $P$ we can reduce significantly the time required to answer each query.\par Although several algorithms and data structures have been proposed for solving this problem, the most useful and versatile (at least for the planar version) seems to be the ones based on the Voronoi diagram, whose study constitutes the bulk of this chapter. As we shall see, the Voronoi algorithms for the plane generally require $O(n\log n)$ preprocessing time, $O(n)$ space, and $O(\log n)$ time per query, in the worst case.\par Although their efficiency is significantly better than the exhaustive search, Voronoi algorithms and data structures are moderately complex, and their use may be hard to justify in some applications.\par An alternative that should be considered is the ``bucket'' algorithm (8.???), which is simpler, works for spaces of any dimension $k$, and is quite efficient on the average: it takes only $O(kn)$ preprocessing time, plus $O(2^k)$ time per query. Unfortunately, to derive these bounds we have to make some strong assumptions on the the probability distribution of $P$, and these assumptions hold only in very few cases. In the worst case, algorithm 8.??? performs no better than the trivial exhaustive search. \sect {8.1. The Voronoi diagram.} As a warm-up exercise, let's consider first the case where the space $\Sscr$ has dimension 1, i.e. is a straight line. We can think of each point as a real number (its coordinate from some fixed origin on the line), so the Euclidean distance between two points $p$ and $q$ is $\leftv p-q \rightv$.\par Let $p$ and $q$ be two data points with $p\dist(x,q)$, contradicting the fact that $r\subset V(p)$. We conclude that $q1$ is on the ray opposite to $r$, i.e. $q$ is in the half-plane bounded by the perpendicular $\lscr$ to $r$ at $p$ and opposite to $r$. Since this holds for all $q\in P-\set{p}$, $\lscr$ is a supporting line of $P$, and by lemma 3.???H $p$ is on the boundary of $\hull(p)$.} It is instructive to analyze the structure of the Voronoi diagram for a small number of points. If $P$ has a single point, the diagram is trivial: its single region covers the whole plane. If $P=\set{p,q}$ the diagram consists of the two half-planes $C(p,q)$ and $C(q,p)$. For $n=3$, we will have either of the two cases below, depending on whether the points of $P$ are colinear or not.\fig3 Note that in case $b$ the single Voronoi vertex (which is the circumcenter of the three points) may lie inside or outside the triangle determined by them, depending on whether the latter is acute or obtuse.\par When $n=4$, the Voronoi diagram will be equivalent (as a subdivision) to one of the cases below:\fig4 We have cases $a$ or $b$, respectively, when all four or only three of the points are colinear. We have case $c$ when the four points lie on the same circle, case $d$ when one of the points falls inside the triangle determined by the other three, and case $e$ otherwise (i.e., when the four points determine a strictly convex quadrilateral). Note that $b$ and $d$ are actually equivalent. Cases $a$ through $c$ correspond to degenerate situations, which occur with probability zero if the points are random samples from a continuous two-dimensional distribution.\par \vfill\eject\end (635)\f5