MessyCode.Mesa
Copyright © 1986 by Xerox Corporation. All rights reserved.
Spreitzer, January 17, 1986 12:12:21 pm PST
DIRECTORY Rope;
MessyCode: CEDAR DEFINITIONS = {
What's going on
We are concerned with establishing a 1-to-1 association between the strings of two languages. One language, call it Ll, consists of all the finite non-empty sequences of characters from some alphabet S. The other language, call it Ls, consists of all the finite non-empty sequences of characters drawn from S' (where S' S) that begin with a character from S'' (S'' S'). We want some restrictions on the association. We want the association to basically have the form of a bunch of (literal) pattern replacements --- that is, where pattern a appears in the string in Ll, pattern b appears in the string in Ls, for a bunch of client-specified a and b.
The technique used is as follows. We restrict the a's to be single characters. Ignore for the moment the restriction on the initial character in Ls. With each c (character from S) we associate some non-empty string b from S', where more than one c can be associated with the same string. We insist that there be some edge-labeled tree T, where every parent node has |S'| childward edges, each labeled with a different character in S', such that the set of paths is the set of bs.
To map from a string sl in Ll to a string ss in Ls:
Initialize ss to be the concatenation of the bs associated with the cs in sl.
Compute the correction of ss relative to sl.
Tag ss by the correction.
To invert:
Extract the correction and an initial sl from ss.
Correct sl by the correction.
The correction tells which c is actually associated with an instance of a b. The correction is a number that is almost a base n encoding of what to do for each c (but n varies from position to position). The correction is encoded in ss by a method similar to Tioga's method of representing formatting information in a file: as "trashy stuff" to the right of an "escape string".
We deal with the restriction on the initial character in Ls by using a different association from cs to bs until after some non-empty b is reached, and taking care of the possibility that no non-empty b is reached.
Now we say it again, precisely. We are defining a 1-to-1 relation, call it Code, between strings in Ls and Ll. This relation is determined by a client-specified function t: S Bool b S'*, which must obey certain restrictions. They are:
t(c, true)  Ls
Btrue = {b |  c t(c, true)=b}
Bfalse = {b |  c t(c, false)=b}
UniquePrefix(Btrue, S'', S')
UniquePrefix(Bfalse, S', S')
UniquePrefix(B, S0, Sr) {
B S0"Sr*
' s  S0"Sr : ! b  B, s'  Sr s = b"s'
(" is the concatenation operator)
We define Code in terms of two other 1-to-1 relations:
Code(ss, sl) {  sm  Lm, d  N S(sm, sl, d, true) ' T(ss, sm, d)
Lm = B0"Br*
N is the set of non-negative integers
The relation S describes the substitution of bs for cs and the computation of the correction. The relation T describes how to tag a string by a correction.
S(sm, sl, d, i) =
sl=e ' sm=e ' d=0 (where e is the empty string)
(  c  S, slr  Ll, b  Bi, smr  Bfalse*, dr  N
sm=b"smr ' sl=c"slr ' t(c, i)=b
' d = idx(c, i) + nc(b, i) * dr
' S(smr, slr, dr, false)
idx(c, i) = |{c'  S | c' < c ' t(c, i)=t(c', i)}|
(using any total order < that's convenient)
nc(b, i) = |{c  S | t(c, i)=b}|
Auf Cedar
ROPE: TYPE = Rope.ROPE;
Alphabet: TYPE = REF AlphabetPrivate;
AlphabetPrivate: TYPE = RECORD [
chars: ROPE,
includes: ARRAY CHAR OF BOOL
];
Code: TYPE = REF CodePrivate;
CodePrivate: TYPE = RECORD [
to: ARRAY Language OF DecisionTree,
dangles: ARRAY Language OF INT,
size: INT
];
Language: TYPE = {A, B};
DecisionTree: TYPE = REF DecisionTreePrivatePrivate;
DecisionTreePrivate: TYPE = RECORD [
variant: SELECT kind: * FROM
branch => [a: ARRAY CHAR OF DecisionTree ← ALL[NIL]],
leaf => [r: Relation],
ENDCASE];
Relation: TYPE = REF RelationPrivate;
RelationPrivate: TYPE = RECORD [
codeword: ARRAY Language OF ROPE,
index: INT];
OtherLanguage: ARRAY Language OF Language
= [A: B, B: A];
Translate: PROC [code: Code, source: ROPE, to: Language] RETURNS [translated: ROPE];
}.