-- []<>Users>sturgis.pa>cirio>typeNotes.tioga
-- Sturgis, March 4, 1989 12:41:05 pm PST
remark
There is a folder amongst the Cirio stuff that contains assorted corrections to the contents of this note. They must be installed to move towards correctness.
intro
This is an attempt to justify the type conformance checking algorithm I am about to install in Cirio. The issue is to get REF INT, REF VariantRecord, REF ANY, REF REF INT, REF REF VariantRecord, and REF REF ANY to behave correctly.
We use a fairly naive model of types. A type t defines a set of values. However, this set is dependent on the run-time state. If we let D stand for this dependency, then D(t) is the set of values defined by type t.
(One should remember that Cedar types contain more information than that contained in t and D(t). In this note we are ignoring most of this other information.)
Values can be divided into two broad classes: atomic and compound. Examples of atomic values are integers and pointers. Examples of compound values are records, arrays, and procedure descriptors. The atomic values can be further divided into pure and indirects. Integers and Booleans are pure. Pointers and REFs are indirects. In addition, handles held by Cirio on target world state are indirects. Given an indirect to a compound value, one can obtain an indirect to a component of the value.
Given a type t, there is an associated type I(t). Informally, D(I(t)) is the collection of indirects to target world data structures which must hold values of type t. That is, one would not be surprised if, in looking at such a data structure, one found a value of type t. Further, as time passes, one would expect to find different values, but all of type t.
The purpose of this note is to address a particular subtlety. Let us introduce the binary, associative, commutative operation, *, that applies to types. We require that D(t1*t2) = D(t1) * D(t2), for all D. We also introduce the relation , where t1 t2 requires that D(t1) D(t2) for all D.
Suppose that t = t1*t2, and let us consider I(t), I(t1), and I(t2).
The important fact is that, if D(ti) ` (D(t1) * D(t2)) for both i, then I(t) ` I(t1) * I(t2). For simplicity, we shall assume in the sequal that each D(ti) is non empty and that they are disjoint.
If p  D(I(t)), and one examines the data structure to which it points, one would not be surprised to find values either D(t1) or D(t2). Over time, examining this same data structure, one would not be surprised to find values in D(t1) at some times and values of D(t2) at other times.
On the other hand, if p  D(t1) * D(t2) and one examines the data structure to which it points, one would never expect to find values in D(t1) at some times and values of D(t2) at other times. The values would always be in D(t1) or always be in D(t2).
This distinction is important in Cedar because the REF type constructor sometimes produces I(t) and sometimes produces I(t1) * I(t2).
Cedar Types
We assume that a Cedar type can be described as a family of types: T = {t1, t2, ...}. Let D(T) be the (configuration dependent) set of values of this type. D(T) = D(t1) * D(t2) * ... .
We define *T = *{t | t  T}. Thus, *T is a regular type.
The intention is that when an indirect is formed, it will contain distinct indirect types for each of the types in the family.
Cedar Type Constructors
We consider three type constructors: Record, Indirect, and REF.
We assume given the record type constructor for ordinary types: <id1: t1, id2: t2, ...>. For Cedar types we define <id1: T1, id2: T2, ...> by:
 <id1: T1, id2: T2, ...> = {<id1: t1, id2: t2, ...> | ti  Ti for all i }.
We define the indirect Cedar type I(T) by I(T) = {*tTI(t)}
(We define REF later.)
Assignment Safety
We take the view that one always stores through an indirect. Thus, the assignment
aexp
is implemented by obtaining an indirect to the variable a, a value for the expression exp, and then storing the value through the indirect. Further
r^ ← exp
(where r is a REF) is implemented by obtaining the indirect contained in r, a value for the expression exp, and storing the value through the indirect.
Now we come to the crucial point. Given the Cedar type of an indirect, the set of values which one may safely store through that indirect may be considerably smaller than the set of values that one might find when loading from it. Let T = {I(t1), I(t2), ...}. Then the set of values that one can safely store through an indirect of type T is contained in D(t1) ) D(t2) ) ... . However, the set of values which one might load through such an indirect includes D(t1) * D(t2) * ... .
We define two type operators. LTarget(I) is intended to be the Cedar type of the values which may be safely stored through an indirect of Cedar type I, and RTarget(I) is intended to be the Cedar type of the values which might be obtained when loading from an indirect of Cedar type I. We define these operators for certain indirect types by
 RTarget(I(T)) = T
 if |T| > 1, then LTarget(I(T)) = {}
 if |T| = 1, then LTarget(I(T)) = T.
We can show that these definitions are suitable as follows. Suppose that u has been loaded thorough an indirect of Cedar type I(T), where T = {I(t1), I(t2), ...}. Then u  (D(t1) * D(t2) * ... ) = D(T) = D(RTarget(I(T)). Suppose that |T| = 1. In this case, the values u that are safe to store thorugh the indirect include D(t1), and one can see that D(LTarget(I(T))) is contained in this set. On the other hand, if |T| > 1, then D(LTarget(I(T))) = D{} = , which is certainly contained in D(t1) ) D(t2) ) ... .
Conformance
Let Conforms(t2, t1) be a previously defined primitive notion. In fact, let it mean D(t2) D(t1), for all D. (i.e., t1 t2.)
For Cedar types, define Conforms(T2, T1) by Conforms(*T2, *T1).
Observe that Conforms(T2, T1) is equivalent to D(T2) D(T1), for all D. This follows from the following argument:
D(T) = *{D(t) | t  T} = D(*{t | t  T}) = D(*T)
Hence
Conforms(T2, T1) <=> Conforms(*T2, *T1) <=> D(*T2) D(*T1) for all D
<=> D(T2) D(T1) for all D
Further, we assume that Conforms satisfies:
If Conforms(I(t1), I(t2)), then Conforms(t1, t2) and Conforms(t2, t1).
[This follows from an argument that legal configurations must lead to legal configurations .... ???] [Is it even possible that it is a definition of Conforms(I(t1), I(t2))?]
[Do Conforms(t1, t2) and Conforms(t2, t1) together imply Conforms(I(t1), I(t2))??]
Given these definitions and assumptions, let us compute Conforms(I(T1), I(T2)).
Conforms(I(T1), I(T2))
 <=> Conforms({*tT1I(t)}, {*tT2I(t)})
 <=> Conforms(*{*tT1I(t)}, *{*tT2I(t)})
 <=> Conforms(*tT1I(t), *tT2I(t))
We offer two special cases:
(1) if both T1 and T2 are singletons (i.e., T1 = {t1} and T2 = {t2}), then this reduces to:
<=> Conforms(I(t1), I(t2))
<=> Conforms(t1, t2) and Conforms(t2, t1)
<=> Conforms(T1, T2) and Conforms(T2, T1)
(2) if T1 T2 then
*tT1I(t) *tT2I(t)
*tT1D(I(t)) *tT2D(I(t))
=> D(*tT1I(t)) D(*tT1I(t))
=> Conforms(*tT1I(t), *tT1I(t))
=> Conforms(I(T1), I(T2))
Ref Target Types
REF T is really a pointer to a record of two fields: <CODE(T), T>, where CODE(T) is a type containing a single integer value, the type code for T. We define RTT(T) to be this record type. RTT stands for Ref Target Type.
Variant Records
We consider exactly one class of variant records. It contains the individual types VR, red VR, big red VR, small red VR, and blue VR.
Some Cedar Types
There is some confusion in the following between the ordinary type INT and the Cedar type INT. Perhaps I should fix that.
INT = {INT}
VR = {big red VR, small red VR, blue VR}
red VR = {big red VR, small red VR}
big red VR = {big red VR}
RTT(INT) = {<CODE(INT), INT>}
RTT(VR) = {<CODE(big red VR), big red VR>, <CODE(small red VR), small red VR>,
<CODE(blue VR), blue VR>}
RTT(red VR) = {<CODE(big red VR), big red VR>, <CODE(small red VR), small red VR>}
RTT(big red VR) = {<CODE(big red VR), big red VR>}
RTT(ANY) = {<CODE(T), T> | any Cedar type T}
REF INT = I(RTT(INT))
= I({<CODE(INT), INT>}) = {I(<CODE(INT), INT>)}
REF VR = I(RTT(VR))
= {I(<CODE(big red VR), big red VR> * <CODE(small red VR), small red VR> *
<CODE(blue VR), blue VR>)}
REF red VR = I(RTT(VR))
= {I(<CODE(big red VR), big red VR>) * I(<CODE(small red VR), small red VR>)}
REF big red VR = I(RTT(big red VR))
={I(<CODE(big red VR), big red VR>)}
REF ANY = I(RTT(ANY))
= {*Cedar type TI<CODE(T), T>}
REF REF INT = I(RTT(REF INT))
REF REF ANY = I(RTT(REF ANY))
Some assignments
Given the above definitions we can consider the legality of some assignments. Remember that
aexp is safe when Conforms(type(exp), LTarget(I(type(a)))) and
r^ ← exp is safe when Conforms(type(exp), LTarget(type(r))).
Assume the following declarations
i: INT;
ra: REF ANY;
ri: REF INT;
rvr: REF VR;
rbrvr: REF big red VR;
rra: REF REF ANY;
rri: REF REF INT;
ra^ ← i could be shown safe if
Conforms(REF INT, LTarget(REF ANY))
<=> Conforms(REF INT, LTarget(I(RTT(ANY))))
<=> Conforms(REF INT, {}) {because |RTT(ANY)| > 1}
is false (should I work it out?)
ra ← ri is safe if
Conforms(REF INT, LTarget(I(REF ANY)))
<=> Conforms(REF INT, REF ANY) {because |REF ANY| = 1}
<=> Conforms(I(RTT(INT)), I(RTT(ANY))
which follows from
{<CODE(INT), INT>} {<CODE(T), T> | any Cedar type T}
=> RTT(INT) RTT(ANY)
=> Conforms(I(RTT(INT)), I(RTT(ANY))
rra ← rri could be shown safe if
Conforms(REF REF INT, LTarget(I(REF REF ANY)))
<=> Conforms(REF REF INT, REF REF ANY) {because |REF REF ANY| = 1}
<=> Conforms(I(RTT(REF INT)), I(RTT(REF ANY)))
<=> [Conforms(RTT(REF INT), RTT(REF ANY))
and Conforms(RTT(REF ANY), RTT(REF INT))]
{because RTT(REF INT) and RTT(REF ANY) are both singletons}
which requires at least
Conforms(RTT(REF ANY), RTT(REF INT))
which requires at least
Conforms(CODE(REF ANY), CODE(REF INT))
which fails.