\oh{Eric,}
I had trouble typesetting the remarks, too; it seems TSetter can't handle some of the weirder looks I used, or perhaps there were too many of them in the same document. Mostly for my own benefit, I converted the file to TeX, and in the process added a few more details. I also switched to the post-superscript notation ($p^G$ instead of $\^Gp$), under the assumption that I may end up using this stuff in Leo's notes; sorry for that.
I did a first, fast pass through the rest of your paper, but not in much depth. The only problem I have noticed is in the organization of the material. As Leo keeps telling me, it is important to give the reader some sort of ``advance warnings'', for example by stating the goals of each section at its beginning.
I would also suggest you make more explicit the structure of SolidViews trees, in particular the distinction between the {\bf frame} and the {\bf transformation matrix} of an object. As I understand it, the matrix is what gets actually stored and manipulated, and it describes the object's frame relative to its parent. The object's frame is defined {\bf implicitly}, with respect to WORLD, by the product of all the matrices in the path from the root to the object. This is crucial for the understanding of the paper, and it doesn't seem to be sufficiently obvious (true, solid modeling is not precisely my area...)
Also, it is not clear to me what is the central point of section 6: is it the mathematics (how to define the operation on Boat that causes TugBoat to map into Dock), is it the application (making two surfaces tangent to each other), or is it the user interface (Dock and Tugboat defined by pointing)?
This now is a matter of taste, but I would try to describe the mathematics of homogeneous coordinates independently from their use in SolidViews, perhaps even in a separate section. That is, I would have an additional section 3$1\over2$ containing the topics suggested in the previous paragraph, and an overview of sections 4,5,6. Something like this:
\hangindent40pt \hangafter0\noindent We will now describe three ways of specifying frame transformations we have found particularly useful in SolidViews. In {\it relative positioning} we modify a frame $G$ by directly specifying a ``primitive'' transformation relative to another frame $H$, that is, a rotation, translation or scaling relative to $H$. In {\it absolute positioning}, the transformation is specified implicitly, as ``the (most natural) mapping that makes frame $G$ lie in a specific position with respect to another frame $H$''; for example, we may want to make $G=H$, or the axes of $G$ parallel to those of $H$ while preserving the origin of $G$. Finally, in {\it indirect positioning}, the transformation on $G$ is specified as the one that takes a frame $H$ to a given position with respect to a third frame $K$. Such operations arise naturally in the process of moving one object so that its surface becomes tangent to some other object, as discussed in detail in section 6$1\over2$.
Maybe the above is not a correct interpretation of your aims. If it is, then I would object mildly to the names of the second and third cases. ``Absolute positioning'' seems misleading, since the new position of $G$ is in general defined relative to some other frame $H$. I thought it may be called ``implicit positioning'', but that may be even more confusing. Also, the third class (``indirect positioning'') seems to be an extension of the second one.
Perhaps the problem is that I am right now (October 19, 1983 6:26:47 pm) thinking of ``positioning'' as ``defining a transformation'' rather than ``defining a new frame''. In the latter interpretation, the names do make more sense, and the difference between class 2 and class 3 is substantial. The question then becomes how to coach the reader to think the ``right'' way from the start.
Attached you will find my original comments, a bit expanded and edited. Again, don't take them too seriously. In particular, I DO NOT suggest you put that much math detail in your paper, and certainly not any of the philosophical stuff; that is pure hex dump.
Enjoy (if you can). See you,
\oh{Eric,}
I had to give my Dealer this week, so I couldn't read the ``business'' part of your paper yet. However, I have read the ``formalities'' sections (1 to 3) and have a lot to say about them. Please forgive me; I fear I have become more of a mathematician than a computergraphician, and therefore I have got rather strong opinions about such vital matters as the location of superscripts and the order of terms in a sum.
Moreover, I am myself working on the ``formalities'' section of the computational geometry book, and in particular on homogeneous coordinates. I can give you a copy of my scribblings if you wish, but there isn't much in them: I only got so far as to define points, lines, planes, etc; no transformations yet, not even vectors. In fact, by reading your paper and trying to write my comments I came to understand those concepts a lot better (the differences between vectors and points, for example). Many of the comments below I wrote mostly for my own benefit, to get the things straightened out in my mind (possibly for use in the book).
Here are my suggestions/comments/ramblings, section by section. The notation 3.2/4/1--3 means section 3.2, paragraph 4, lines 1 to 3.
\parindent 0pt
\vskip20pt
{\bf Sections 1 and 2:}
I believe these two sections would be better if combined into a single ``Introduction". Section 2 could go after 1./2.
Paragraphs 1./3 and 1./4 should probably be transposed.
1./3/1: ...schemes, --> ...schemes.
1./4/6--7: ... their surfaces relatively ... better: ...their surfaces relative to each other ...
2./2-3: what is the difference between 1 and 2? Isn't item 2 relative to B, too?
\vskip20pt
{\bf Section 3:}
\vskip20pt
{\bf $***$ WARNING: READING THE NONSENSE BELOW MAY BE HAZARDOUS TO
YOUR MENTAL HEALTH $***$}
Section 3 apparently suffers from a kind of ``chicken and egg'' dilemma:
Who comes first, points or coordinates? In other words, which of the views of ``space'' are you taking as the starting point:
The ``Cartesian'' view:
\item Space is simply $R^3$.
\item There is a natural WORLD frame: position (0,0,0), vectors (1,0,0), (0,1,0), (0,0,1).
\item A point, a vector, and the coordinates of a point wrt WORLD are the same thing.
\item There is a notion of ``true'' orthogonality (without reference to a frame): xX+yY+zZ=0. As a consequence you can compare non-parallel distances, and talk of a ``rotation", also without reference to a specific frame.
\item There is a natural unit of distance: dist((000),(100))=1.
The ``Classical Physics'' or ``Euclidean'' view:
\item Space is physical space, or anything ``isomorphic'' to it.
\item Points are locations in this space.
\item Vectors are the same as translations: you can add (compose) two vectors, or add (apply) a vector to a point, or subtract two points and get a vector, but not add two points, or talk of the ``tip of a vector".
\item There is no distinguished origin or direction: we may think of the WORLD frame as being chosen arbitrarily by the user.
\item SolidViews knows about and manipulates only the coordinates of points and vectors and frames wrt some frame (ultimately, WORLD), but the points and frames themselves are physical objects external to SolidViews.
\item There is a notion of ``true'' orthogonality, so e.g. you may ask the user to choose an orthogonal basis to be the WORLD. This also implies one can compare lengths even of non-parallel segments, so you can ask the user to pick WORLD with axes of equal length. You can also talk of a ``rotation'' without reference to a specific frame.
\item There is no ``absolute'' unit of length, however, so you cannot ask the user to take WORLD with ``unit length'' vectors. Length and distance are always relative to some frame (see below).
The ``Linear Algebra'' (``Einsteinian?'') view:
\item Same as ``classical physics'', except that there is no notion of ``true'' orthogonality: you cannot even ask the user to choose WORLD orthogonal.
There are probably many other viewpoints you can think of, but my impression is that most of the time geometers/mathematicians/physicists/computergraphicians are talking about one of these. The Cartesian view is more economical and simpler in many ways, but the Euclidean view seems to be the most natural for non-mathematicians (You seem to be assuming this view, except in a couple of places.)
Of course which one you pick makes no real difference from the practical or mathematical point of view, but it determines the language you are allowed to use
in the ``basic notions'' section. For example, in the Cartesian viewpoint you define WORLD as being the frame $\la(000),(100),\ldots\ra$; in the Euclidean and Einsteinian views you say WORLD is some arbitrary frame chosen by the user, and (in the Euclidean view only) you may require it to be orthogonal and with axes of equal length. Also (to be really nit-picking) in these two viewpoints the ``coordinates of a vector'' must be defined independently of the ``coordinates of a point'', since vectors and points are very different beasts.
PS: If this seems nonsense to you, don't worry: it seems nonsense to me, too, but I hoped that by writing it up I could see things more clearly (I still don't).
{\bf $***$ END OF conscious NONSENSE $***$}
My preferred way of looking at frames and coordinates, according to the ``Euclidean'' viewpoint and your current notation (slightly extended), would be something like this:
Begin
----------------------------------------------------------------------------------
A coordinate frame $G$ consists of a point $\^oG$ of three-space (the reference point or origin of the frame), and three linearly independent vectors $\^xG,$ $\^yG,$ $\^zG.$ It can be visualized as three lines (the x,y, and z coordinate axes of $G$) passing through $\^oG$ and parallel to those vectors.
Let $p$ be a point of three-space. The x, y, and z coordinates of $p$ in the frame $G$ are the real numbers $p^Gx, p^Gy, p^Gz$ such that $p = p^Gx\; \^xG + p^Gy \;\^yG + p^Gz\; \^zG + \^oG$ ($+$ here is the ``physical'' operation of composing two vectors (=translations), or applying a vector (=translation) to a point of space). We will write these coordinates in homogeneous form, that is, in the form of a 1x4 array $[p^Gx\;\; p^Gy\;\; p^Gz \;\; 1]$, that will be denoted by $p^G$.
The coordinates of a vector $v$ in the frame $G$ are the three real numbers $v^Gx, v^Gy, v^Gz$ such that $v = v^Gx\; \^xG + v^Gy\; \^yG + v^Gz \;\^zG$ (note: without $\^oG$). In homogeneous form, the coordinates of a vector are written as the 1x4 array $v^G = [v^Gx\;\; v^Gy \;\; v^Gz \;\; 0]$ (note: last element is 0, not 1).
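The two definitions above can be tried out numerically. The following is only a sketch: it assumes that the origin and axis vectors of $G$ are themselves given as number triples in some fixed background coordinate system (which the Euclidean view does not actually provide), and the helper name combine is made up for the illustration.

```python
# Hypothetical helper; a frame is stored as (x, y, z, o): three axis vectors
# and an origin, each a 3-tuple in assumed background coordinates.

def combine(coords, frame):
    """Evaluate c[0]*x + c[1]*y + c[2]*z + c[3]*o for a homogeneous row c."""
    return tuple(sum(coords[k] * frame[k][i] for k in range(4))
                 for i in range(3))

# Frame G: the axes only need to be linearly independent, not orthogonal.
G = ((1, 0, 0), (1, 1, 0), (0, 0, 2), (5, 5, 5))   # xG, yG, zG, oG

p_G = [2, 3, 1, 1]   # a point: the last entry 1 brings in the origin
v_G = [2, 3, 1, 0]   # a vector: the last entry 0 leaves the origin out

p = combine(p_G, G)
v = combine(v_G, G)
```

With the same first three coordinates, the point and the vector differ exactly by the origin of $G$, which is what the final 1 or 0 encodes.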
To obtain the coordinates of the point $p$ and of the vector $v$ in a different frame $H$, we post-multiply $p^G$ and $v^G$ by the 4x4 array
$$G^H=\left[\matrix4{
(\^xG)^Hx (\^xG)^Hy (\^xG)^Hz 0\cr
(\^yG)^Hx (\^yG)^Hy (\^yG)^Hz 0\cr
(\^zG)^Hx (\^zG)^Hy (\^zG)^Hz 0\cr
(\^oG)^Hx (\^oG)^Hy (\^oG)^Hz 1\cr}\right]$$
where $(\^xG)^Hy$ (as defined in the previous paragraphs) is the y-coordinate of the vector $\^xG$ in the frame $H$, and so forth. That is, the rows of $G^H$ are the homogeneous coordinates of $\^xG, \^yG, \^zG$, and $\^oG$ in the frame $H$. The array $G^H$ is called the matrix of $G$ in the frame $H$.
Proof that the array multiplication really does what we claim: by carrying it out for a point $p$, we get the 1x4 array
$$\eqalign{
\left[\right. p^Gx\; (\^xG)^Hx + p^Gy\; (\^yG)^Hx + p^Gz\; (\^zG)^Hx +(\^oG)^Hx,\cr\vsk3
\null p^Gx\; (\^xG)^Hy + p^Gy\; (\^yG)^Hy + p^Gz\; (\^zG)^Hy + (\^oG)^Hy,\cr\vsk3
\null p^Gx\; (\^xG)^Hz + p^Gy\; (\^yG)^Hz + p^Gz\; (\^zG)^Hz + (\^oG)^Hz, 1]\cr}$$
We must check that
$$\eqalign{
\null (p^Gx\;(\^xG)^Hx + p^Gy\;(\^yG)^Hx + p^Gz\;(\^zG)^Hx + (\^oG)^Hx) \^xH\cr\vsk3
+ \; (p^Gx\;(\^xG)^Hy + p^Gy\;(\^yG)^Hy + p^Gz\;(\^zG)^Hy + (\^oG)^Hy) \^yH\cr\vsk3
+ \; (p^Gx\;(\^xG)^Hz + p^Gy\;(\^yG)^Hz + p^Gz\;(\^zG)^Hz + (\^oG)^Hz) \^zH + \^oH = p \cr}$$
This is just a matter of expanding the expression above and using the identities below (from the definitions):
$$\eqalign{
(\^xG)^Hx\; \^xH + (\^xG)^Hy\; \^yH + (\^xG)^Hz\; \^zH = \^xG,\cr\vsk3
(\^yG)^Hx\; \^xH + (\^yG)^Hy\; \^yH + (\^yG)^Hz\; \^zH = \^yG,\cr\vsk3
(\^zG)^Hx\; \^xH + (\^zG)^Hy\; \^yH + (\^zG)^Hz\; \^zH = \^zG,\cr\vsk3
(\^oG)^Hx\; \^xH + (\^oG)^Hy\; \^yH + (\^oG)^Hz\; \^zH + \^oH = \^oG,\cr}$$
and finally
$$p^Gx \^xG + p^Gy\;\^yG + p^Gz\;\^zG + \^oG = p.$$
We have proved our claim, namely $p^G\; G^H = p^H$.
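The claim can also be checked on a numerical instance. This is a sketch under the same assumption as before (frames expressed in some background coordinates, used only to compare results); the names row_times and combine and all the numbers are illustrative.

```python
def row_times(r, m):
    """Post-multiply a 1x4 row r by a 4x4 array m."""
    return [sum(r[k] * m[k][j] for k in range(4)) for j in range(4)]

def combine(c, frame):
    """c[0]*x + c[1]*y + c[2]*z + c[3]*o, with frame = (x, y, z, o)."""
    return tuple(sum(c[k] * frame[k][i] for k in range(4)) for i in range(3))

# Frame H in background coordinates (an assumption made only for checking).
H = ((2, 0, 0), (0, 1, 0), (1, 0, 1), (1, 1, 1))   # xH, yH, zH, oH

# G^H: rows are the homogeneous coordinates of xG, yG, zG, oG in H.
G_H = [[1, 1, 0, 0],
       [0, 2, 0, 0],
       [0, 0, 1, 0],
       [3, 0, 1, 1]]

# Recover the frame G itself from its description in H.
G = tuple(combine(row, H) for row in G_H)

p_G = [1, 2, 3, 1]
p_H = row_times(p_G, G_H)          # the claimed coordinates of p in H

# Both coordinate rows must describe the same point of space.
same_point = combine(p_G, G) == combine(p_H, H)
```

Here the check succeeds because each row of $G^H$ reproduces one of the identities used in the proof.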
Similarly, to obtain the matrix $K^H$ of an arbitrary frame $K$ in the frame $H$, we post-multiply $K^G$ by the matrix $G^H$:
$$K^G G^H = K^H$$
(We can easily prove this by considering the effect of post-multiplying separately each row of $K^G$ by $G^H$.)
Most of the matrix products we will consider are of the form $p^A A^B B^C C^D$... or $A^B B^C C^D$..., where the superscript of each term is the frame represented by the following term. For succinctness, we will omit the superscripts in such products, and write just $pABC^D$ and $ABC^D$, respectively.
The {\it length} of a vector $v$ {\it relative to a frame} $G$ is by definition the real number $\sqrt{(v^Gx)^2 + (v^Gy)^2 + (v^Gz)^2}$. The {\it distance} between two points $p, q$ {\it relative to $G$} is the length (rel. to $G$) of the vector $p - q$. Two vectors $u,v$ are said to be {\it orthogonal relative to $G$} if $u^Gx v^Gx + u^Gy v^Gy +u^Gz v^Gz = 0$.
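To see that length and orthogonality really are relative to the frame, here is a small sketch; the function names and the particular matrix $G^H$ are assumptions chosen for the example.

```python
import math

def row_times(r, m):
    """Post-multiply a 1x4 row r by a 4x4 array m."""
    return [sum(r[k] * m[k][j] for k in range(4)) for j in range(4)]

def length_rel(v):
    """Length relative to a frame, from the coordinates [x, y, z, 0] in it."""
    return math.sqrt(v[0] ** 2 + v[1] ** 2 + v[2] ** 2)

def orthogonal_rel(u, v):
    """Orthogonality relative to a frame, from coordinates in that frame."""
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2] == 0

# G^H for a frame G whose x axis looks stretched and whose y axis looks
# sheared when described in H (illustrative numbers).
G_H = [[2, 0, 0, 0],
       [1, 1, 0, 0],
       [0, 0, 1, 0],
       [0, 0, 0, 1]]

u_G = [1, 0, 0, 0]              # the x axis vector of G, in G
v_G = [0, 1, 0, 0]              # the y axis vector of G, in G
u_H = row_times(u_G, G_H)       # the same two vectors, in H
v_H = row_times(v_G, G_H)
```

The two vectors are orthogonal unit vectors relative to $G$, but relative to $H$ the first has length 2 and they are no longer orthogonal.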
An {\it affine transformation} is a mapping of three space into itself that preserves parallelism (and therefore collinearity, ratios of areas, ratios of lengths along the same direction, etc.) We denote by $pF$ the image of a point $p$ under an affine map $F$. The image of a vector $v$ under $F$ is defined as being the vector $vF$ such that $(p+v)F = pF + vF$ for all points $p$. The image of a frame $H$ under a mapping $F$ results by applying $F$ to the origin and vectors of $H$. The coordinates of $pF$ or $vF$ in a given frame $H$ can be computed by post-multiplying $p^H$ or $v^H$, respectively, by the 4x4 array
$$\left[\matrix4{
(\^xHF)^Hx (\^xHF)^Hy (\^xHF)^Hz 0\cr\vsk3
(\^yHF)^Hx (\^yHF)^Hy (\^yHF)^Hz 0\cr\vsk3
(\^zHF)^Hx (\^zHF)^Hy (\^zHF)^Hz 0\cr\vsk3
(\^oHF)^Hx (\^oHF)^Hy (\^oHF)^Hz 1\cr}\right]$$
where $(\^xHF)^Hy$ (as usual) is the y-coordinate of the vector $(\^xH)F$ in the frame $H$, and so forth. That is, the rows of this matrix are the homogeneous coordinates of $\^xHF, \^yHF, \^zHF$, and $\^oHF$ in the frame $H$ itself. This matrix is precisely the matrix that describes the frame $HF$ in the frame $H$, that is, $(HF)^H$. Conversely, every pair of frames $G$, $H$ defines an affine map that takes $G$ into $H$, and whose matrix in the frame $G$ is precisely $H^G$.
Furthermore, any $4\times 4$ matrix whose last column is $[0,0,0,1]^T$ is the matrix of an affine map in any given frame. It is worth pointing out that {\it the map depends on the frame, as well as on the matrix}; and, conversely, {\it the matrix depends on the frame, as well as on the map}.
The {\it composition} of two affine mappings $F$ and $E$ is the mapping $FE$ that satisfies
$$ p (FE) = (pF)E $$
for all points $p$. It is easy to prove that such a mapping exists, and also that its matrix in an arbitrary frame $G$ satisfies the equation
$$(FE)^G = F^G E^G.$$
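A quick numerical sketch of this rule, with two made-up matrices standing for $F^G$ and $E^G$ (one translation-like, one scaling-like). The helper names are hypothetical.

```python
def matmul(a, b):
    """Product of two 4x4 arrays."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def row_times(r, m):
    """Post-multiply a 1x4 row r by a 4x4 array m."""
    return [sum(r[k] * m[k][j] for k in range(4)) for j in range(4)]

# Matrices of two affine maps F and E in some frame G (illustrative values).
F_G = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [1, 0, 0, 1]]
E_G = [[2, 0, 0, 0], [0, 2, 0, 0], [0, 0, 2, 0], [0, 0, 0, 1]]

FE_G = matmul(F_G, E_G)     # F first, then E: the product reads left to right
EF_G = matmul(E_G, F_G)     # E first, then F

p_G = [1, 1, 1, 1]
step_by_step = row_times(row_times(p_G, F_G), E_G)   # (pF)E
in_one_go = row_times(p_G, FE_G)                     # p(FE)
other_order = row_times(p_G, EF_G)                   # p(EF)
```

With row arrays the maps are applied in the order in which they are written, and swapping the two factors gives a different map in general.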
A {\it translation} is an affine map that corresponds to adding a fixed vector $v$ to every point of space. Its matrix in a frame $G$ is
$$ \left[\matrix4{
1 0 0 0\cr\vsk3
0 1 0 0\cr\vsk3
0 0 1 0\cr\vsk3
vx^G vy^G vz^G 1\cr}\right] $$
We denote the matrix above by $\Tran(vx^G, vy^G, vz^G)$. We remark again that an expression such as $\Tran(0, 10, 10)$ by itself denotes only a {\it matrix}, not a transformation. In order to interpret that matrix as a map, we must specify also a reference frame.
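The remark that a matrix alone does not determine a map can be illustrated as follows. This sketch assumes a frame $H$ with the same origin as WORLD and axes twice as long, so that $H^{WORLD}$ is a uniform scaling by 2 and its inverse a scaling by $1/2$; the function names tran and scale simply mirror the matrices of the text.

```python
def matmul(a, b):
    """Product of two 4x4 arrays."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def row_times(r, m):
    """Post-multiply a 1x4 row r by a 4x4 array m."""
    return [sum(r[k] * m[k][j] for k in range(4)) for j in range(4)]

def tran(x, y, z):
    return [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [x, y, z, 1]]

def scale(s):
    return [[s, 0, 0, 0], [0, s, 0, 0], [0, 0, s, 0], [0, 0, 0, 1]]

M = tran(0, 10, 10)          # just a matrix, no map yet

H_W = scale(2)               # assumed H^WORLD: axes of H are twice as long
W_H = scale(0.5)             # WORLD^H, the inverse of H^WORLD

# Map 1: the map whose matrix in WORLD is M.
map1_in_world = M
# Map 2: the map whose matrix in H is M, re-expressed in WORLD:
# go to H coordinates, apply M, come back to WORLD coordinates.
map2_in_world = matmul(matmul(W_H, M), H_W)

p_W = [1, 1, 1, 1]
q1 = row_times(p_W, map1_in_world)
q2 = row_times(p_W, map2_in_world)
```

The same matrix moves the point by 10 WORLD units in the first reading and by 10 units of $H$, that is 20 WORLD units, in the second.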
The image of a vector $v$ by an affine mapping $F$ is by definition the vector $vF$ such that for any point $p$ we have $(p+v)F=pF+vF$. The coordinates of that vector in a frame $G$ are given by
$$(vF)^G = v^G F^G.$$
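As a sanity check of the defining property $(p+v)F = pF + vF$, here is a sketch with an arbitrary matrix standing for $F^G$; the numbers and helper name are illustrative only.

```python
def row_times(r, m):
    """Post-multiply a 1x4 row r by a 4x4 array m."""
    return [sum(r[k] * m[k][j] for k in range(4)) for j in range(4)]

# Matrix of some affine map F in a frame G: a linear part plus a
# translation in the last row (illustrative values).
F_G = [[0, 1, 0, 0],
       [-1, 0, 0, 0],
       [0, 0, 1, 0],
       [5, 6, 7, 1]]

p_G = [1, 2, 3, 1]                               # a point (last entry 1)
v_G = [4, 0, 1, 0]                               # a vector (last entry 0)
pv_G = [a + b for a, b in zip(p_G, v_G)]         # the point p + v

vF_G = row_times(v_G, F_G)                       # image of the vector
lhs = row_times(pv_G, F_G)                       # (p + v)F
rhs = [a + b for a, b in zip(row_times(p_G, F_G), vF_G)]   # pF + vF
```

Because the last entry of $v^G$ is 0, the translation row of the matrix never contributes to $vF$, and the image is again a vector.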
An affine map is said to be a {\it scaling relative to a frame $G$} if its matrix in that frame has the form
$$ \Scale(\lambdax, \lambday, \lambdaz) = \left[\matrix4{
\lambdax 0 0 0\cr\vsk3
0 \lambday 0 0\cr\vsk3
0 0 \lambdaz 0\cr\vsk3
0 0 0 1\cr}\right] $$
Such a transformation maps the point with coordinates $p^G = [px^G\;\; py^G\;\; pz^G\;\; 1]$ to the point with coordinates $p^G \Scale(\lambdax, \lambday, \lambdaz) = [\lambdax\cdot px^G\;\; \lambday\cdot py^G\;\; \lambdaz\cdot pz^G\;\; 1]$, hence its name. The scaling is said to be {\it uniform} if $\lambdax=\lambday=\lambdaz=\lambda$; in that case, we write the matrix above simply as $\Scale(\lambda)$. A uniform scaling with respect to $G$ simply ``enlarges'' or ``shrinks'' all distances and lengths (relative to $G$) by a factor $\lambda$, preserving directions and orthogonality (relative to $G$). If the axes of $G$ are pairwise orthogonal with respect to a frame $H$, then a uniform scaling by $\lambda$ relative to $G$ is also a uniform scaling by $\lambda$ relative to $H$.
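A short sketch of both statements, with made-up coordinates: a non-uniform scaling multiplies each coordinate by its own factor, and a uniform scaling multiplies lengths relative to $G$ by $\lambda$. The function names are hypothetical.

```python
import math

def row_times(r, m):
    """Post-multiply a 1x4 row r by a 4x4 array m."""
    return [sum(r[k] * m[k][j] for k in range(4)) for j in range(4)]

def scale(lx, ly, lz):
    return [[lx, 0, 0, 0], [0, ly, 0, 0], [0, 0, lz, 0], [0, 0, 0, 1]]

def length_rel(v):
    """Length relative to a frame, from the coordinates [x, y, z, 0] in it."""
    return math.sqrt(v[0] ** 2 + v[1] ** 2 + v[2] ** 2)

p_G = [1, 2, 3, 1]
q_G = row_times(p_G, scale(2, 3, 4))     # coordinate-wise scaling of a point

v_G = [3, 4, 0, 0]
w_G = row_times(v_G, scale(2, 2, 2))     # uniform scaling with lambda = 2
ratio = length_rel(w_G) / length_rel(v_G)
```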
An affine transformation that keeps the origin of $G$ fixed and preserves all distances between points (rel. to $G$) is a {\it rotation relative} to $G$. Of particular interest are the {\it rotations around the $x, y$, or $z$ axis of $G$}, that keep fixed all points in the specified axis. These rotations have the matrices
$$ \XRot(\theta)=\ldots ,$$
$$ \YRot(\theta)=\ldots ,$$
and
$$ \ZRot(\theta)=\ldots ,$$
Note that an affine map $F$ that is a rotation relative to a frame $G$ may not be a rotation relative to another frame $H$. A sufficient condition for $F$ to be a rotation relative to $H$ as well is that the axes of $G$ be all of the same length and pairwise orthogonal relative to $H$; that is, that $G$ be obtained from $H$ by a translation + rotation + uniform scaling (relative to $H$).
End
----------------------------------------------------------------------------------
Other observations/suggestions not included in the above:
3.2/1: move this to the very end of ``Terms and Notations'' section (until then, introduce each new variable in full: ``Let q be another point, H be another frame...").
3.2/2: (Since we will...) I have trouble understanding this; it seems a rather ambiguous notation.
3.2/3: I guess the pre-superscript convention is a carry-over from Paul's notation. Since you do array products in reverse order, it would be more consistent to use post-superscripts for the reference frame. The formulas
$$\^Hp = \^Gp \^HG \quad\rmand \quad \^HK = \^GK \^HG$$
would become
$$p^H = p^G G^H\quad\rmand\quad K^H = K^G G^H$$
which are easier to memorize and to vocalize. Don't you agree that the aesthetic pleasure you will get contemplating those equations is worth the trouble of transposing those few thousand superscripts in your paper? True, you will also have to fix Cedar.Style to make sub+superscripts like $p^Hx$ work, but that will not take longer than a month or two... {\small Let not the bosses read this, but have you considered using TeX?}
3.2/4: I would try to avoid using ``vector'' as a synonym of ``$1\times N$ array", and reserve ``vector'' for the physical/geometrical concept (=translation). ``Row vector'' and ``column vector'' are slightly less ambiguous, but still disturbing. (Perhaps use just ``column'' and ``row"? yich!) Also, it seems to be superfluous to point out the notational differences between your paper and Paul's, except where it really matters (like the switch from column to rows). Also ``is assumed be a displacement from the coordinate frame to its left'' is confusing; better say ``denotes the coordinates of $p$ in the frame to its left".
3.3/1: Are all frames orthogonal (with respect to WORLD)? Is WORLD orthogonal? (if in the Euclidean view: are its vectors of same length?)
3.3/1, 3.3/last, and others: It would be better to collect all references to the way SolidViews uses frames into a new (sub?)section, separate from the ``pure math'' sections. The WORLD frame should be defined there, too. Other points that should be moved to such a section:
--- Each node $\Gscr$ in the hierarchy has an associated frame $G$
--- SolidViews represents the frame $G$ by storing in the node $\Gscr$ the matrix $G^H$ of $G$, {\it relative to the frame $H$ of the parent node $\Hscr$}.
--- Therefore, if the frame $H$ is replaced by some other frame $H\pr=HF$, where $F$ is any affine map, $G$ automatically becomes $GF$, and the same happens to the frames of all other descendants of $\Hscr$
--- Leaf objects (spheres, cubes, etc) are defined in terms of coordinates relative to the object's frame, so if the frame is scaled, rotated, distorted the object suffers the same changes. For example, if $\Gscr$ is a ``cube'' object, the coordinates of its vertices {\em in its own frame $G$} are by definition always $(\pm 1, \pm 1, \pm 1)$. Relative to the WORLD frame, however, $\Gscr$ will be a cube distorted by an arbitrary affine map: a ``parallelohedron'', a solid with six parallelograms for faces. In particular, if $G$ is orthogonal with respect to WORLD, $\Gscr$ will be a parallelepiped or ``block'', whose edges will be twice the lengths of $\^xG, \^yG$ and $\^zG$ (rel. to WORLD). In the same way, a ``sphere'' object (which has radius $\null =1$ in its own frame) can be an ellipsoid of arbitrary semi-axes relative to another frame.
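The points in this list can be mimicked by a toy data structure. This is only a sketch of my reading of the hierarchy, not the actual SolidViews representation: the class Node, its fields, and the assumption that the root's matrix is taken relative to WORLD are all made up for the illustration.

```python
def matmul(a, b):
    """Product of two 4x4 arrays."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def row_times(r, m):
    """Post-multiply a 1x4 row r by a 4x4 array m."""
    return [sum(r[k] * m[k][j] for k in range(4)) for j in range(4)]

def tran(x, y, z):
    return [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [x, y, z, 1]]

def scale(lx, ly, lz):
    return [[lx, 0, 0, 0], [0, ly, 0, 0], [0, 0, lz, 0], [0, 0, 0, 1]]

class Node:
    """Hypothetical tree node: stores only its matrix relative to the parent."""
    def __init__(self, matrix, parent=None):
        self.matrix = matrix      # G^H, with H the frame of the parent node
        self.parent = parent      # None for the root (assumed relative to WORLD)

    def world_matrix(self):
        """Product of the matrices on the path from this node up to the root."""
        m, node = self.matrix, self.parent
        while node is not None:
            m = matmul(m, node.matrix)
            node = node.parent
        return m

parent = Node(tran(10, 0, 0))              # H^WORLD
cube = Node(scale(2, 1, 1), parent)        # G^H; a cube stretched along x

corner = [1, 1, 1, 1]                      # a cube vertex in its own frame G
before = row_times(corner, cube.world_matrix())

# Replace H by HF, where F is a translation given by its matrix in WORLD.
parent.matrix = matmul(parent.matrix, tran(0, 5, 0))
after = row_times(corner, cube.world_matrix())
```

Only the parent's matrix was changed; the stored matrix of the cube is untouched, yet the cube's frame (and hence its vertices in WORLD) moved along with the parent.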
WELL, this is enough rambling for now. Please see if you can find something useful in all this trash (Good luck...)
See you tomorrow (maybe)
\signed{Jorge}