\input TR10WideFormat \input GeoMacros
\begfile{Rounding.TeX of March 23, 1984 1:51:01 am PST --- Stolfi}
\titlestuff{
\title{Computation Model, Rounding, and related topics}
\shorttitle{Rounding}
\author{Jorge Stolfi}
\abstract{Computational model, rounding, and related topics.}
}
% ADDITIONAL MACROS
% reference tags
\def\GeoText{GeoText}
% search macros
\def\Prob{}
% miscellaneous
\def\footprobwidth{432} % width of text in foot problems
\def\footprob#1#2{
\botinsert{\ctrpar\pagewidth pt{\framebox{\small
\hbox par \footprobwidth pt{{\bf Exercise #1}: #2}}} }}
\def\^#1{\vbox to 1em{}^#1\!}
\def\naturals{\hbox{\bi N\/}}
\section{\Sec0. Computational model}
\subsection{\Sbsec{\Sec2.7}. Discrete {\it vs.} continuous problems}
\capital Concrete applications of geometric algorithms tend to fall into two broad classes. In the class of {\it discrete geometry}, all dimensional parameters (such as coordinates, lengths, slopes, and so forth) that occur as inputs, outputs, or intermediate results of the algorithm are integers, and ``small'' in a sense soon to be explained. Problems of this class are rather common, for instance, in the design of large integrated circuits. The geometry of the latter can usually be described as a large collection of rectangles whose edges are parallel to a pair of orthogonal axes, and whose coordinates and dimensions are all integer multiples of a common unit, about $1/10000$ of the maximum dimension of the circuit. Discrete geometry problems are also common in digital image processing, and in ???.

The balance of the applications falls into the class of {\it continuous} geometry problems, where some (or all) of the geometric data is most naturally interpreted as real numbers. This definition includes most of the geometric problems occurring in solid modeling, cartography, statistics, numerical analysis, and many other disciplines.

The distinction between the two classes is of great practical importance. In discrete problems, the limited range of the data can frequently be exploited to reduce greatly the cost of the solution. For example, it is well known that any algorithm to sort $n$ real numbers requires $\Omega(n\log n)$ time in the worst case, under quite generous accounting schemes. However, if the numbers are known to be integers in the range $0$ to $m$, the problem can be solved in $O(n+m)$ time, by using each number as an index into a table of $m+1$ elements.
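
For concreteness, here is a minimal sketch of this table technique in Python (illustrative only; the name {\tt tablesort} is ours, not from the text):

    def tablesort(a, m):
        # Sort a list of integers known to lie in the range 0..m.
        # Time O(n + m): one pass over the input, one over the table.
        count = [0] * (m + 1)              # table of m+1 counters
        for x in a:                        # tally each input number
            count[x] += 1
        result = []
        for v in range(m + 1):             # emit each value as often as it occurred
            result.extend([v] * count[v])
        return result

    print(tablesort([7, 2, 7, 0, 3], 9))   # prints [0, 2, 3, 7, 7]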

This shortcut is not advantageous if $m$ is much greater than $n\log n$, or if the table is too large to fit in the available storage. In fact, it is primarily the existence and applicability of shortcuts of this type that determines whether a problem should be classified as continuous or discrete.

Were it not for these considerations, it would be hard to draw a line between the two classes. In the applications we cited above as examples of the continuous class, dimensional parameters are almost always approximated by floating-point numbers of finite precision. Such numbers have a maximum magnitude and are all integer multiples of some small unit; we could in principle do a simple mental scaling and think of those problems as being ``discrete'' after all. However, the huge integers produced by such a scaling will almost certainly rule out the use of any discrete technique. For example, consider again the sorting problem discussed above: if the inputs are eight-digit floating-point numbers between 1 and 2, the scaling mentioned above would map them to integers between $10{,}000{,}000$ and $20{,}000{,}000$. The table technique would hardly be useful here.
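
The arithmetic of the scaling is worth seeing once (a sketch; the numbers are those of the example above):

    # Eight-digit floating point inputs between 1 and 2 are all integer
    # multiples of the unit 1e-7; scaling by 10**7 maps them to integers
    # between 10,000,000 and 20,000,000.
    inputs = [1.2345678, 1.0000001, 1.9999999]
    scaled = [round(x * 10**7) for x in inputs]
    print(scaled)                  # [12345678, 10000001, 19999999]

    # The table technique would now need about 10**7 table entries even
    # to sort a handful of numbers, so the discrete shortcut buys nothing.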

<Mumble: The same observation applies to the rational and integer/homogeneous representations we will introduce later on (even though we argue that rational numbers generally don't get big, they are not small either), and to discretely-posed problems where discrete techniques don't help. Such problems should be viewed as continuous: think of the dimensions and coordinates as real numbers, since real geometry is much nicer and more intuitive (more visual). Questions of rounding (if floating-point) or number size (if exact rational) are better handled separately. Also, in the analysis it is more sensible to count real operations as unit cost.>
\subsection{\Sbsec{\Sec1.2}. The ``mathematician's dream machine''}
\capital A ``mathematician's dream machine'' (MDA) is an ideal computing device consisting of a {\it processor} and three {\it memory units}, respectively the {\it input}, {\it working}, and {\it output} memories. Each memory unit is an infinite array of registers, each identified by a non-negative integer address and capable of containing an arbitrary integer value. The processor can perform in unit time a fetch or store operation (direct or indirect), or any arithmetic or relational operation from some finite set, all of these on arbitrary integer operands. An MDA is thus equivalent to the log-RAM of [Ullman], except in its cost accounting. \note{what about cost of input/output intensive tasks? what about quick programs that read only the last word of a long input, or that modify a random element in a big array? How do we define ``the function computed by such programs''? what about quick programs that use big uninitialized arrays, like Tarjan's element-uniqueness algorithm?}

It is convenient to think of our algorithms as programs for such a machine. Such an algorithm defines a partial function from $\integers^\star$ to $\integers^\star$, as follows: ...
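
Though the formal definition is left incomplete above, the flavor of the model can be conveyed by a toy simulation (a minimal sketch in Python, whose arbitrary-precision integers stand in for the unbounded registers; all the names are ours, not part of any formal definition):

    from collections import defaultdict

    class MDA:
        # Toy model: three infinite register arrays, each register holding
        # an arbitrary integer, and a counter charging one unit per operation.
        def __init__(self, words):
            self.input = defaultdict(int, enumerate(words))
            self.work = defaultdict(int)        # working memory
            self.output = defaultdict(int)      # output memory
            self.steps = 0                      # number of unit-cost operations

        def fetch(self, mem, addr):             # fetch, unit cost
            self.steps += 1
            return mem[addr]

        def store(self, mem, addr, value):      # store, unit cost
            self.steps += 1
            mem[addr] = value

        def op(self, f, x, y):                  # arithmetic/relational op, unit cost
            self.steps += 1
            return f(x, y)

    # Sum the first three input registers into output register 0; note that
    # a 100-digit operand costs no more than a 1-digit one.
    m = MDA([10**100, 2, 3])
    s = 0
    for i in range(3):
        s = m.op(int.__add__, s, m.fetch(m.input, i))
    m.store(m.output, 0, s)
    print(m.steps)                              # 7 unit-cost operations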

It may seem that the MDA is somewhat unrealistic, because of its unlimited ``word size'' and its unit operation cost. For example, the sum of $n$ numbers of $m$ bits each can be computed in $O(n)$ time on the MDA, but costs at least $\Omega(n(\log m+\log n))$ in the log-cost RAM model ($\log m$ average cost to add each number, $\log n$ cost to fetch it from the input store). The MDA can raise $m$ to the $n$th power at cost $O(\log n)$, whereas the log-RAM spends at least $\Omega(n\log n\log m)$. The MDA thus seems unrealistically powerful, and it may appear unlikely that efficiency measures under that model can say much about the practical efficiency of an algorithm.
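
The $O(\log n)$ figure for $m^n$ comes from repeated squaring: only $O(\log n)$ multiplications are needed, and the MDA charges one unit for each, no matter how long the operands grow. A sketch:

    def mdapower(m, n):
        # Compute m**n by repeated squaring, counting multiplications.
        # The MDA charges one unit per multiplication; a log-cost RAM
        # would charge by operand length, which grows to about n*log2(m) bits.
        result, base, muls = 1, m, 0
        while n > 0:
            if n & 1:
                result, muls = result * base, muls + 1
            base, muls = base * base, muls + 1
            n >>= 1
        return result, muls

    val, muls = mdapower(3, 1000)
    print(muls, val.bit_length())    # 16 multiplications; 3**1000 has 1585 bits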

However, for the algorithms we are going to see, quite the opposite is true: in most cases, not only do the MDA efficiency measures make practical sense, but they usually make more sense than the log-RAM and other similar measures. Actually, as we shall see, for the algorithms in this work the MDA and the log-RAM costs differ by at most a few [one?] $\log$ factors.

We will measure the efficiency of an MDA algorithm $A$ by three functions $m_A(n, l)$, $w_A(n, l)$ and $t_A(n, l)$ (we will drop the subscript $A$ when it is clear from the context). These are respectively the maximum register address referenced, the maximum length of any address, operand, or result (in bits), and the maximum number of operations executed by the algorithm, for all inputs consisting of $n$ integers of $l$ bits each. Note that the computations of $m$ and $w$ should also take into account the inputs, the outputs, and the program itself (which should be thought of as being stored in the working memory at the start of the execution). However, we are not going to consider infinite sets of programs, and we are not interested in $O(1)$ terms in the measure functions $m$, $w$ and $t$, so we may ignore the program altogether.

It is clear that $A$ will run without modification on a log-RAM in time $O(tw\log m)$ and space $O(mw)$. \note{check!} More importantly, these measures say that the algorithm as given will work for any $(n, l)$ input on a ``real'' computer that has at least $m+O(1)$ words of $w+O(1)$ bits each, and will run in time $ct+O(1)$ for some constant $c$ that depends on the machine. For example, if $w=2l$ then the intermediate integer results can grow to at most twice the length of the input data.
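
In practice, $w$ and $t$ can be estimated for a given run by routing every operation through a small meter that records the operation count and the bit-length of each operand and result (a sketch; all the names are ours):

    class Meter:
        # Records t (operation count) and w (max bit-length of any
        # operand or result) for the integer operations routed through it.
        def __init__(self):
            self.t = 0    # number of unit-cost operations
            self.w = 0    # maximum operand/result length, in bits

        def apply(self, f, x, y):
            self.t += 1
            r = f(x, y)
            for v in (x, y, r):
                self.w = max(self.w, int(v).bit_length())
            return r

    # Square 3 five times: t = 5 operations, w = 51 bits (3**32 has 51 bits).
    meter = Meter()
    x = 3
    for i in range(5):
        x = meter.apply(int.__mul__, x, x)
    print(meter.t, meter.w)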

<inputs, outputs, and intermediates will usually be rationals or homogeneous coordinates with integer components. Then $l$ will define the length of their numerators/denominators/coordinates.>

<talk about specifying $w$ separately for each variable. Think this through better.>

A somewhat more realistic model is the following. Consider a class of RAM machines dependent on two parameters $M$ (``memory size'') and $W$ (``word size''). A particular machine in that class has $M$ memory elements, each capable of storing integers up to $W$, and is able to do fetches and stores (direct or indirect), comparisons, and arithmetic operations (including integer division) on numbers of that size at unit cost.

A sorting algorithm for this class of machines is an algorithm (in the classical sense) that takes as input a finite sequence of natural numbers, and outputs those numbers in increasing order. Let $m$, $w$, and $t$ be respectively the maximum memory address, the maximum integer operand or result, and the maximum number of operations that occur in the execution of the algorithm for any sequence of $n$ numbers not greater than $K$. Any algorithm in this class that correctly sorts all sequences of $n$ integers less than $K$, where $n\leq K\leq M$, must take $\Omega(\ldots)$