Chapter 1. IntroductionThis manual describes the Cedar language. It is organized into three major parts:Chapter 2: A description of a much simpler kernel language, in terms of which the currentCedar language is explained. This description includes a precise definition (32.2), and aless formal explanation of the ideas of the kernel and the restrictions imposed by currentCedar (32.3-2.9). 32.1 contains an overview or glossary, in which the major technicalterms used in the kernel are briefly defined.Chapter 3: The syntax and semantics of the current Cedar language. The semantics is givenprecisely by a desugaring into the kernel. It is also given more informally by English text.This chapter also contains a number of examples to illustrate the syntax.Chapter 4: The primitive types and procedures of Cedar. For each one, its type is given aswell as an English definition of its meaning. This chapter is organized according to the classhierarchy of the primitive types (34.1).In addition, there is a one-page grammar for the full language, a shorter grammar for the safelanguage, and a two-page language summary which includes the grammar, the desugaring, and theexamples from 3, The document you are reading is nearly complete. A fe missing sections containparagraphs in the style of this one.!_pj \jqMZ$rqX?W '/U8T(Q1!P&r q:N#rqLJrq6 JTrIBq! FI DE C ?sC =  TVk(THE KERNEL LANGUAGE22Chapter 2. The kernel languageThis document describes the Cedar language in terms of a much smaller language, which we willusually call the kernel or the Cedar kernel. Cedar differs from the kernel in two ways:It has a more elaborate syntax (3). The meaning of each construct in Cedar is explainedby giving an equivalent kernel program. Often the kernel program is longer or less readable; the Cedar construct can be thought of as an idiom whichconveniently expresses a common operation. Sometimes the Cedar construct has no real advantage, and thedifference is the result of backward compatibility with the ten-year history of Mesa and Cedar.It has a large number of built-in types and procedures (4). In the kernel language all ofthese could in principle be programmed by the user, though in fact most are provided byspecial code in the Cedar compiler. In general, you can view these built-in facilities muchlike a library, selecting the ones most useful for your work and ignoring the others.Unfortunately, the current Cedar language is not a superset of the kernel language. Many importantobjects (notably types, declarations and bindings) which are ordinary values in the kernel that canbe freely passed as arguments or bound to variables, are subject to various restrictions in Cedar:they can only be written in literal form, cannot be arguments or results of procedures, or whatever.The long-term goal for evolution of the Cedar language is to make it a superset of the kerneldefined here. In the meantime, however, you should view the kernel as a concise and hopefullyclear way of describing the meaning of Cedar programs. To help in keeping the kernel and current Cedar separate, reserved words and primitives of thekernel which are not available in current Cedar are written in SANS-SERIF SMALL CAPITALS, ratherthan the SERIF SMALL CAPITALS used for these symbols in current Cedar. Operator symbols of thekernel which are not in current Cedar are not on the keyboard.The kernel is a distillation of the essential properties of the Cedar language, not an entirely separateinvention. Most Cedar constructs have simple translations into the kernel. Those which do not (e.g.,some of the features of OPEN) are considered to be mistakes, and should be avoided in newprograms.2.2 defines the syntax and semantics of the Cedar kernel language, the former with a grammar,and the latter by explaining how to take a program and deduce the function it computes and thestate changes it causes. The remainder of the chapter is a commentary which explains the conceptsbehind the kernel. It also gives the restrictions imposed by the current Cedar language on the fullgenerality described here; for more on this subject, see 3. The meaning of the various built-inprimitives is given in 4. 2.9 describes the incompatibilities between the kernel language andcurrent Cedar, i.e., the constructs in Cedar which would have a different meaning in a kernelprogram. For the most part, these are bits of syntax which do not have consistent meanings incurrent Cedar; future evolution of the language will replace them with their kernel equivalents.Usually, terms are defined and explained before they are used, but some circularity seems to beunavoidable. 2.1 gives a brief summary of each major idea which may be helpful as a reminder.Both this and the explanations in 2.3-2.7 are given under five major headings, as follows:Values and computations: Application Value Variable Group Binding Argument The type system: Type Type-checking Mark Cluster Declaration ClassPrograms: Name Expression Scope Constructors RecursionConveniences: Coercion Exception Finalization Safety ProcessMiscellaneous: Allocation Static Pragmaf2tGGN q%_pj Y?qA Wrq@UcAS&QtgP <OR ULq MKv8I/,HnQ EC $0 C5' B;2. @ T ?3G =P <+2 9W 7|8uvuvuvuq 5tqtqtq< 4t8 1IS / rq .A tq= , )Z ( M &!; %N # W ! S zJ 9 rY GW  G ?Yrq4rq37rq. r q0 r qTVk(f2THE KERNEL LANGUAGE3The kernel definition in 2.2 follows the ordering of the kernel grammar given there.2.1 OverviewThis section gives a brief summary of the essential concepts on which the Cedar language is based.The explanations are informal and incomplete. For more precise but more formal definitions, see2.2; for more explanation, see 2.3-2.8.2.1.1 Values and computationsApplication: The basic mechanism for computing in Cedar is applying a procedure (proc for short) toarguments. When the proc is finished, it returns some results, which can be discarded or passed asarguments to other procs. The application may also change the values of some variables. In theprogram an application is denoted by (the denotation of) the proc followed by square bracketsenclosing (the denotation of) the arguments: f [first~x, last~y]. There are special ways of writingmany kinds of application: x+1, person.salary, IF x<3 THEN red ELSE green, x: INT_7.Value: An entity which takes part in the computation (i.e., acts as a proc, argument or result) iscalled a value. Values are immutable: they are not changed by the computation. Examples: 3, TRUE,"Hello", l x IN x+3; actually these are all expressions which denote values in an obvious way.Variable: Certain values, called variables, can contain other values. The value contained by avariable (usually called the value of the variable) can change when a new value is assigned to thevariable. In addition to its results, a proc may have side-effects by changing the values of variables.Nearly every type T has a corresponding variable type VAR T; values of type VAR T contain valuesof type T. Every VAR type has a NEW proc which creates a variable of the type. A variable is usuallyrepresented by a single block of storage; the bits in this block hold the representation of its value.Group: A group is an ordered set of values, often denoted by a constructor like this: [3, x+1,"Hello"]. Like everything else, a group is itself a value.Binding: A binding is an ordered set of [name, value] pairs, often denoted by a constructor like this:[x: INT~3, y: BOOL~TRUE], or simply [x~3, y~TRUE]. If b is a binding, b.n denotes the value of thename n in b. Argument: A group or binding constructor written explicitly after an expression denotes applicationof the value P denoted by the expression to the value a denoted by the constructor, called theargument. P is usually a proc, and a is a group or binding, which is bound to its domain declarationD to get the argument which is passed. In making this binding a is coerced, if necessary, to matchthe declaration: If it is a group, the names from D are attached to its elements to turn it into a binding.If a name in D is missing from a, a default value is supplied.If a value in a doesn't have the type required by D, it is coerced into another value which does.2.1.2 The type systemType: A type defines a set of values by specifying certain properties of each value in the set (e.g.,integer between 0 and 10); these properties are so simple that the compiler can make sure that procarguments have the specified properties. A value may have many types; i.e., it may be in many ofthese sets. A type also collects together some procs for computing with the value (e.g., add andmultiply).More precisely, a type is a value which is a pair:Its predicate, a function from values to the distinguished type BOOL. A value has type T ifT's predicate returns TRUE when applied to the value. f2t3EGGq _S [wX Wq!= VYB T% PrX Mw q %rqrqrq L'rq rq% J), ID G$xqxqxqxqxq$ Fxqx qtqxqtqxqtqxqxqtq Bwq> Ahrq rq *tq ?yqtqO  X/ =h4); 7 :`6 75w qr q(rq 5xqtqxqtqxqxqxq) 4-xq xq xq xqrqxqxqr 2qxzqxq /~r q5 -M ,vxqXutGqutqtqXxqutGqutqtqX )KwqrqExquq 'tqxqxqzqtqxqrqxq xq' &Cxq "DrX wqrq @rq  rq%$  .rq -xq bw q(r q Xrq Zrqxqxq r qxq Rx qxqx qxq yr qyqxqtqtqtqxqtqxqtqxq Jr qxqtqxqtq Erq B/0TVk(22.1OVERVIEW5Scope: A scope is a region of the program in which the value bound to a name does not change(although the value might be a variable, whose contents can change). For each scope there is abinding called ENV (for environment) which determines these values. A new scope is introduced (inthe kernel) by IN (after LET or l) or by a REC [...] constructor for a declaration or binding; e.g., LET x~3 IN x+5;LET fact~l [n: INT] IN IF n=0 THEN 1 ELSE n*fact[n1].Constructors: Brackets delimit explicit constructors for group, declaration or binding values. Theyall have the form [x1, x2, ...], and are distinguished by the form of the xi: an expression for a group;n: e for a declaration;n~e or n: e~e for a binding.Recursion: When names are introduced in a constructor in Cedar, this is done recursively: If v is bound to n in a binding constructor, then in expressions in the constructor n has the value v,rather than its value in the enclosing scope. Exception: argument bindings are non-recursive.If n is declared in a declaration constructor, then it may not be used in the constructor, unless there isan ordering of the declarations in the constructor such that a name is used only by later declarations.Exception: declared names may be used in the bodies of l-expressions in the constructor (see3.3.4)In the kernel, however, constructors are non-recursive unless preceded by RECDot notation: The form e.n looks up n in some binding associated with e, and does something withthe result. There are three cases:If e is a binding, e.n is just the value paired with n in e.If e is a type, e.n is e.Cluster.n.Otherwise, e.n is (De.n)[e], and e.n[more args] is usually (De.n)[e, more args].In all cases you are supposed to think of n as some property or behavior associated with e; e.ndenotes that property or evokes that behavior.2.1.4 ConveniencesCoercion: Each type cluster contains To and From procs for converting between values of the typeand values of other types (e.g., Float: PROC[INT]_[REAL]; this would be a To proc in REAL and afrom proc in INT). One of these procs is applied automatically if necessary to convert or coerce anargument value to the domain type of a proc; this application is a coercion. Each coercion has anassociated atom called its tag (e.g., $widen for INT_REAL or $output for INT_ROPE); severalcoercions may be composed into a single one if they have the same tag. The tags thus serve toprevent unexpected composition of coercions.Exception: There is a set of exception values. An expression e denotes a value which is either oftype De or is an exception. Whenever an exception value turns up in evaluating an expression, itimmediately becomes the value of the whole expression, unless (in the kernel) the expression hasthe form e BUT {...}. The {...} tests for exception values and can supply an ordinary value, oranother exception, as the value of the BUT expression. An exception value may contain an ordinaryvalue, called the argument of the exception, so that arbitrary information can be passed along withan exception.Finalization: When a variable is no longer accessible, the storage it occupies is freed (automaticallyin the safe language). Before this is done, a finalization proc in the cluster of the variable's type iscalled to do any other appropriate resource deallocation. The local variables of a proc or otherscope may also be finalized (using UNWIND).Safe: The safety invariant says that all references are legal, i.e., each REF T value is NIL or refers toa variable of type T. A proc is safe if it maintains the safety invariant whenever it is applied toarguments of the proper types. If a proc body (l-expression) is f2t;Gq _wq I ^Wrq' \uqr q> [O uqtqyq uq7 YuqXxquqxq XGuqxqyqxqtquqtqxqtqtqxqxqxq Uw qI SxS }SqxS }Sq1xS {Sq QX P&xqxq Nxqxqxqxqxq KwwqCr q IXxq xqBxqxq Ho] Fxqf Egg C7yq$ B_ @Hu =wq xq xq!xq <, :Xxqxqxqxq 9$xq xqxqxqxq 7 xqxqzxqxqxqxqxqzxqxqxq 6(xqxqx 4q' 0rX -nwqxqrq r q + xqtqtqzqtqxqtq *fxqtq*rq ()rq '^ rqxqtztqxqtztq %D $V% !+wqrqxq  zxqV # < xquq@ uq7  rqF  w q-, d,r q+ Z \tq 1wqrq0tqxq tq xq rq< )&yq`TVk(,THE KERNEL LANGUAGE26checked, the compiler guarantees that the proc value is safe;trusted, the programmer asserts that it is safe (but the compiler makes no checks), and the proc valueis treated as safe;unchecked, the compiler makes no checks and the proc value is unsafe. It is best to write checked code whenever possible. However, checked code cannot call unsafe procs(since the compiler then cannot guarantee safety). Process: Concurrency is obtained by creating a number of processes. Each process executes a singlesequential computation, one step at a time. They all share the same address space. Shared data(touched by more than one process) can be protected by a monitor; only one process can executewithin the procs of the monitor at a time. So that each process can know what to rely on, it isnecessary to establish an invariant for the monitored data which is established whenever a monitorproc returns or waits. A process can wait on a condition variable within a monitor; other processescan then enter the monitor. The waiting process runs again when the condition is notified, or after atimeout.2.1.5 MiscellaneousAllocation: Cedar has standard facilities for allocating new variables of any type (the NEWprimitive); related variables can be allocated in the same zone. Normally, variables are deallocatedautomatically by the garbage collector when they can no longer be referenced; such variables canonly be referred to by REFs. It is also possible to have variables which are deallocated explicitly byFREE, but this is unsafe.Static: An expression whose value is computed without executing the program is called static.Literals are static, as are names bound to literals, and any expression with static operands. Procbodies are never static unless they are inlinePragma: Some language constructs do not affect the meaning of the program (except possible tomake a legal program illegal), but only its time and space costs; these are called pragmas. Examplesare INLINE for proc bodies and PACKED for arrays.2.2 Kernel definitionThis section gives the syntax and semantics of the Cedar kernel language. Motivation, and anexplanation of the relation between the kernel and the current Cedar language, can be found in2.3-2.7. The kernel is subdivided into A rather austere core; everything can be desugared into this, but it isn't very readable.A set of conveniences; with these, readable programs can be written.Imperative constructs: statements and loops.Exception handling.The format of this section interleaves grammar rules which give the syntax of the language with textwhich gives the meaning. The meaning of the core is given in English. For other parts of thekernel, it is given by desugaring rules which show how to rewrite each construct in terms of others;if rewriting is done repeatedly, the result is a core program, which may invoke some primitives. Themeaning of these is also given in English. There is also some English explanation of the desugaring,but this is only a commentary and does not have the force of law.See 3.1 for the notation used in the grammar and desugaring.f2tGGN q _rqX5 ^Wrq_ \ [Orq= Y1/ XG- Uwq$ rq S 9 R.rq P L O rq? M!rqrq LNrq Jrq FrX CVw q/t Aq 0rq @N rq: >tqI =Ftq :wqD rq 8 N 7( 3wq3# 2dOrq 0tqtq , wX (q; '^ 9 % #rqD!*r q/r q"zrq OR "5 GZ .4 ?J > > *TVk(V2.2.1THE CORE72.2.1 The coreIn the core, there is syntax only for names, literals, application, l-expressions, a basic and arecursive binding construction, and syntactic type; everything else is done with primitives. We neverwrite anything in the core, however, except to show the desugaring of a kernel construct. Thus thereader need not struggle with programs in the ugly core syntax.SyntaxSyntactic typeMeaningexpression ::=n |DnENV[$n].valliteral |Dliterale1 Z e2 |(De1.RANGE)[e2]-- Standard application. |l d1=> d2 IN e |d1_d2-- Standard proc constructor. |L d1=> d2 IN e |d1_d2-- Unchecked standard proc constructor. |[(n~e), ...][(n: De), ...]-- Vanilla binding constructor. |FIX d ~ e |d-- Recursive binding constructor. |D e TYPE-- Syntactic type. type ::= eDe --gTYPE---- A type is syntactically just an expression.decl ::= eDe --gDECL---- A decl is syntactically just an expression.name ::= letter (letter | digit)...(DENV).n-- Appears as an e or in a pattern.literal ::= $ n |ATOM-- ATOM literal. |primitiveDprimitiveprimitive ::= ARROW |[d: DECL, p: (d_DECL)]_[a: --arrow--TYPE] DOMAIN | RANGE |*[a: --arrow--TYPE]_[t: TYPE]MKPAIR | [t1:: TYPE, first: t1, t2:: TYPE, rest: t2]_[v: t1Xt2]GROUP |[t1: TYPE]_[t: TYPE] --tgTYPEMKCROSS | [g: GROUP[TYPE]]_[c: --cross--TYPE] CDOTG |*[t: --cross--TYPE]_[g: GROUP[TYPE]]MKBINDD |[d: DECL, v: d.T]_[b: d] BDOTD | BDOTV |[b: BINDING]_[d: DECL] | [d:: DECL, b: d]_[v: d.T ]MKBINDP |[p: PATTERN, t:: TYPE, v: t]_[b: MKDECL[p, t]] --=MKBINDD[d~MKDECL[p, t], v~v]LOOKUP | [d:: DECL, b: d, n: ATOM]_[v: DTOB[d].n]THEN | [d1:: DECL, b1: d1, d2:: DECL, b2: d2]_[v: d1 THEND d2 ]ENV | *BINDINGMKDECL |*[p: PATTERN, t: TYPE]_[d: DECL] DDOTP |*[d: DECL]_[p: PATTERN] DDOTT |*[d: DECL]_[t: TYPE]DTOB | *[d: DECL]_[b: BINDING]--=MKBINDP[p~d.P, v~d.T.G] ]BTOD | *[b: BINDING]_[d: DECL]--=MKDECL[p~b.D.P, t~MKCROSS[b.V] ]THEND | [d1: DECL, d2: DECL]_[v: DECL] --=BTOD[DTOB[d1] THEN DTOB[d2]] BOOL | ATOM | TYPETRUE | FALSE | BOOL TYPE | DECL | BINDING |TYPE-- DECLgTYPE, BINDINGgTYPEPATTERN |TYPE--=GROUP[ATOM]ANYTYPE -- TgANY for any type TIn the kernel we dress up the primitives as follows: A primitive denoted xDOTy is in the cluster of the type of its argument under the name y. A parameter to a primitive declared with :: is the type of some other argument; theargument for this parameter may be omitted in an application of the primitive. f2t;GGq _rX \qByq [,= Y./ X$9 TrX* Qwtw PJqwzq*uqwq Nwzq MBL}MBqzqL}MBqwqzqL}MBquqL}MBq*w KTyqJ}KTqJ}KTquqwqJ}KTzqJ}*KTqw IfyqH}IfqH}IfquqwqH}IfzqH}*Ifq(w Gxqwqwqwqzqwq*w Euqwq*"w Dpzqt*q Bwtwqzqztq*. Ahwtwqzqzuq*. >=wtwq <wqwqwqzuq*# ;5w q 9wt*qtq w 8-qzq 6w q 5%uqwqxquqxqwzuqzqxq tq 3uqwquqwqxq tqzqxqtq 2uqwqx1}2qtqxqw1}2qx1}2qtqxqw1}2qzqxqw1}2zw1}2q 0/uqwqx/}0/qtqzqxqtq*xzt .Auqwqxquqtqzqxq tq ,uqwqxq tqzqxquqtq +9uqwqxquqxqwquqzqxqwq )uqwquqwqxquqzqxquqwqxquqxqwqzqxqwquq (1uqwqxquqxqtqxqwqzqxquqwqwquqxquqwqwqxqxq &uqwqxquqxqwqxqtqzqxquqwqwq %)uqwqx$}%)quqx$}%)qw$}%)qx$}%)quqx$}%)qw$}%)qzqxqw$}%)wuqw$}%)q #;uqwqu !qwqxquqxqtqzqxquq  3uqwqxquqzqxquq uqwqxquqzqxqtq +uqwqqxquqzqxquq*uqxqxquqxqxuquq uqwqqxquqzqxquq*uqxqxquqxquqxquq #uqwqx}#quqx}#quqzqxquququqx}#qwtquqx}#q 5tqwqtqwqt qwqtqwqtq -tqwuqwquGwt*qXuztquzt uqwt*quqtq %ttq*xztq x q3 xuxq>xq J"0F fTVk(KERNEL DEFINITION2.28A name not in a literal (or pattern, in the kernel) denotes the value to which it is bound in thecurrent environment ENV (A below). An ATOM literal is a value which stands for a name in theprimitives which deal with declarations and bindings. A literal denotes a value according to a rule which depends on its syntax. The core has onlynumeric and ATOM literals, and the primitives enumerated above.An expression denotes a value according to a rule which depends on its syntax. If the expression isa name or literal, the value is the value of the name or literal. The remaining cases are discussed inthe following sub-sections. Most of these cases define the value of the expression in terms of thevalue of its sub-expressions. The sub-expressions may be evaluated in any order. A. The current environment ENVThe current environment ENV is a binding. The value of the expression n is ENV.n. ENV for a sub-expression is the same as ENV for its containing expression, except that:For the b of a closure being applied, ENV is computed according to B below.For the e of a FIX, ENV is computed according to E below.Thus applying a closure and evaluating a FIX are the only ways to change ENV. B. ApplicationThe value of a standard application is obtained by evaluating e1 and e2 to obtain v1 and v2, andapplying v1 to v2. There are two cases for application:v1 is a primitive. The value of the application is a function of v2 given in the definition ofthe primitive. v1 is a closure c (C below), with domain declaration d, body b and environment E. Thevalue of the application is the value of the expression b in the environment MKBINDD[d, v2]THEN E (D below). Note that if the closure was made with L, the body must be type-checked when it is applied; a closure made with l was type-checked when it was made (Cbelow).An application type-checks if De2 implies De1.DOMAIN (G below).C. LambdaThe value of a l-expression is a closure, which has three parts:A domain declaration d, equal to the value of d1.A body b, which is the expression e (not the value of e).An environment E, equal to the current environment ENV.A l-expression type-checks if d1 evaluates to a declaration d.For any x of type d.T, De implies d2.T in the environment MKBINDD[d, x] THEN E.A L-expression type-checks if d1 evaluates to a declaration; type-checking of the body is deferreduntil the closure is applied.f2tG F{ q _S ^W uqtq& \ , Y'4 X$tq/ T%< Su] Q=" PmL LnrX| ICquq+xquqxquq G uq,Egxquq"Cxququq" @)uquq . Bigger structures aremade, as in Lisp, by making pairs of pairs. When we are interested in the leaves of such a structure,we call it a group and call the leaves its elements. A group has type GROUP[T] if all its elementshave type T or are NIL. A flat group is a pair in which first is not a group, and rest is a flat group orNIL. The type of a pair is a cross type: MKPAIR[x, y] has type TXU iff x has type T and y has type U.Cross types are made with MKCROSS, which turns a GROUP[TYPE] into a cross type in the obviousway:MKCROSS[NIL]=???MKCROSS[T]=T if T is a type.MKCROSS[ MKPAIR[x, y] ]=MKCROSS[x]XMKCROSS[y]Note that MKCROSS of a flat group is flat. CDOTG goes the other way, turning a cross type into aGROUP[TYPE] in which no element is a cross type. Thus MKCROSS is the inverse of CDOTG, but notnecessarily the other way around.E. BindingsA binding is either NIL, or an tuple, or a tuple. The primitiveMKBINDD constructs a binding from a declaration d and a matching value v, i.e. (as the type ofMKBINDD indicates), one with the type d.T. The resulting binding has type d, and consists of thenames from d paired with the corresponding values from v. Example:MKBINDD[ [x: INT, b: BOOL], [3, TRUE] ] = [x~3, b~TRUE]= < <$x, 3>, < <$b, TRUE>, NIL > >In this example, d.T is INTXBOOL. The declaration and group in this example is written using the syntax of 2.2.2; in the core they would beMKDECL[p~[$x, $b], t~MKCROSS[[INT, BOOL]] ] and MKPAIR[first~3, rest~MKPAIR[first~TRUE, rest~NIL]], where we have writtenthe arguments of these primitives in the kernel syntax.The primitives BTOD and BTOV return the arguments of the MKBINDD primitive that made thebinding. MKBINDP is redundant; it is like MKBINDD, but takes a type instead of a declaration, andhence accepts any v with the right structure.LOOKUP returns the value of the name n in the binding. THEN combines two bindings, givingpriority to the first one in case of duplicate names. It works only for flat bindings, in which the firstelement of each tuple is an tuple, and the second element isanother tuple or NIL. The value of b1 THEN b2 is another flat binding, obtainedby first replacing any tuple <, b> in b2 where a is equal to an atom in b1 by b, and then usingthis binding to replace the final NIL in b1. The binding constructor [(n~e), ...] has the value MKBINDP[p~[n, ...], v~[e, ...] ]. FIX makes a recursive binding: the value of FIX d1~e is MKBINDD[d, v], where d is the value of d1 inENV and v is the value of e in the environment (LET FIX d~e IN d~e) THEN ENV. Of course in generalthis computation may not terminate; normally the names in d occur in e only with l-expressions,and in this case it does terminate. Is this really right? The FIX typechecks if De in the latterenvironment implies DTOT[d]. f2t;GGq _rX \q*uqxqxqxqxq [,N Y rqrquqxq X$xqtqrq xqxq Vtq Su!uqxqxq xzxqxq xqxqxq Ququqtq PmNuqtqKuqxqxqxq Ieuquqxqxquqxqzuqxq G uquq$ Euqtq*uquq D  @rX tuple, or a tuple. Theprimitive MKDECL constructs a decl from a pattern p and a value t of type GROUP[TYPE]. A pattern isa GROUP[ATOM], i.e., either NIL, or an atom, or a pair of patterns; the ATOM elements must all bedifferent. An application of MKDECL typechecks if t matches p, i.e., ifboth p and t are NIL, orp is an atom and t has type TYPE, orp is a pair [p1, p2] and t is a cross type t1Xt2 and p1 matches t1 and p2 matches t2.The resulting declaration consists of the names from p paired with matching type values from t. The primitives DDOTP and DDOTT return the arguments of the MKDECL primitive that made thedeclaration. ThusDDOTT[NIL]=???; DDOTT[<$n, T>]=T; DDOTT[]=DDOTT[d1]XDDOTT[d2]DTOB is redundant; it converts a declaration to a binding in which each name has the correspondingtype as its value. Thus DTOB[[x: INT, y: REAL]]=[x~INT, y~REAL]. The inverse is BTOD, alsoredundant; it is defined only if all the values in the binding are types. THEND combines twodeclarations just as THEN combines two bindings: D(b1 THEN b2)=Db1 THEND Db2G. Types and type-checkingA type is a value consisting of a pair:the predicate, a function from values to BOOL.the cluster, a binding.A value v has type T if T's predicate applied to v is TRUE. T implies U iff (Ax) T.Predicate[x]gU.Predicate[x].Typechecking consists of ensuring that the argument of an application has the type specified by thedomain of the proc (B above). The body of a l-expression can then be type-checked (or theimplementation of a primitive constructed) independently, assuming that the parameter satisfies thedomain predicate. Symmetrically, the result of an application can be assumed to have the typespecified by the range of the proc.To complete the induction, it is also necessary to check that the value of the body of a l-expressionhas the range type (C above).The primitive types in the kernel are:BOOL, with two values TRUE and FALSE.ATOM, with values denoted by literals of the form$n.TYPE, a predicate satisfied by any type value.ANY, a precidate satisfied by any value.DECL, the type of a declaration (F above).BINDING, the type of any binding.f2tG F{ q _rX \qtq2 [,uq"xqxquqtq Yuqtqtq)tq  X$ uqxqxq UxqxqtqStxqxq tqQxq xP}QqxP}QqxqxP}QzxP}QqxP}QqxP}QqxP}QqxP}Qq N5xq'xq Ll uququq J HuqtqF8uqxqxqxCuqxCS}CqxCS}CquqxCS}CqzuqxCS}Cq @uq(6 ?1uqxqtqxqtqxqtqxqtq uq = 4 uq <) uqzqx;}<)quqx;}<)qzx;}<)quqzx;} 8*rX 4q&2rqtq0Orq -xqrqxqxqxqtq *xqrxqzxqxqxqxqzxqxqxq ' W &&yq, $ Q #R ! f Jyq  &_tqtqtqtq.xqtq*Wtq% uq& uqTVk(2.2.1THE CORE11Arrow types, the types of procs (C above). An arrow type has a domain type and a rangetype.Cross types, the types of pairs (D above). GROUP[T], the type of any pair in which all the elements have type T.Declarations, the types of bindings (E and F above). There are no non-trivial implications among any of these types, except as follows:DECLgTYPE; BINDINGgTYPE; GROUP[T]gTYPE.ANYgT for any type T.T1XT2gU1XU2 iff T1gU1 and T2gU2.GROUP[T]gGROUP[U] iff TgU.T1_T2gU1_U2 iff U1gT1 and (Ax: U1) (l T1 IN T2)[x]g(l U1 IN U2)[x]. Note thereversal of the domains.d1gd2 for declarations iff d1.P=d2.P and DTOB[d1].ngDTOB[d2].n for each n in d1.P. f2t;GG?q_:rq r^Wq[&YuqxqDECL IN d2] l ( | e1) ( | => e2) IN e3 |-- The domain defaults to [], the range to x: De3 |LET e1 IN e2 |( l De1 IN e2 ) Z e1.V -- e1 a binding |LET b, ... IN e |LET [b, ...] IN e|IF e1 THEN e2 ELSE e3 |( IFPROC[De2, e1, l IN e2, l IN e3] ) [] |e . n |IF DegBINDING THEN LOOKUP Z [De, $n] ELSE IF DegTYPE THEN LOOKUP Z [De.cluster, $n] ELSE ( LOOKUPC Z [De.cluster, $n] ) Z [e] |e1 [ b, ... ] | e1 [ e2, ... ] |e1 . APPLY Z [b, ...] | e1 . APPLY Z MKBINDD[De1.DOMAIN, [e2, ...] |e1 infixOp e2 |e1 . infixOp[e2] |e1 AND e2 | e1 OR e2 |IF e1 THEN e2 ELSE FALSE | IF e1 THEN TRUE ELSE e2 |[ ] | [ e1 ( | e2, !.. )] |NIL | MKPAIR[e1, [ ( | e2, !.. ) ]-- Group constructor. |PATT p |-- Pattern constructor; see the rule for p below. |[ b, ... ] |b PLUS ... PLUS NIL |REC [ (p : t ~ e), ... ] |FIX [p, ...] : MKCROSS[[t, ...]]~[e, ...] |[ d, ... ] |d PLUS ... PLUS NIL |XX |xxxxxx | --Also recursive d maps into this?statements | simpleLoop | -- See 2.2.3but-- See 2.2.4.infixOp ::=XMKCROSSPLUSTHENliteral ::= coreLiteral |digit digit ... |INT-- Numeric literal, giving the decimal representation.declaration ::=-- A d is not an e; a d must be before ~ or after LET or DECL.p: t |MKDECL[ PATT p, t] |[(p: t), ... ][p, ...]: MKCROSS[[t , ...]]-- to separate names and typesbinding ::=-- Only the [...] form is an e; a b must be written after LET. |p ~ e |MKBINDP[PATT p, e] |d ~ e |MKBINDD[d, e] pattern ::= -- Note: a pattern is not an e; it can appear only before ~ or :, or after PATT in the kernel.n |-- PATT n=$n [p1, ...] -- PATT [p1, ...]=[PATT p1, ...]primitive ::= corePrimitive |LOOKUP | LOOKUPC |-- Fill in typesPLUS |IFPROC |The precedence of operators in e is: (highest) [], Z, infixOps (all the same), BUT, IN (lowest). All areleft associative.2.2.2.1 Expression syntaxMost of this is straightforward sugar. LET adds the binding e1 to ENV in evaluating e2. The separatecase for b, ... simply allows the [] which normally enclose a binding constructor to be omitted in thiscase; see 2.2.2.2. IF wraps e2 and e3 in l's so that they don't get evaluated; the IFPROC primitivechooses the one to evaluate and applies it.The dot notation has three cases. f2tG F{ q _rX \wtwqw [,qZ}[,qzqZ}[,qw*uqzqZ}[,qyqZ}[,ququqZ}[,q Y>yqwqX}Y>wqX}Y>wquqX}Y>qw*q+xqzqX}Y>qw WPuqV}WPquqV}WPqw*q*Wy*WPyqzwV}WPqtqwV}WPq1Wy2eWPzqwV}WPquqV}WPq w Ubuqwquqw*uqwtqwquqw2Mw StqSQ}SqtqSQ}SqtqSQ}Sqw*q*Sy*SuqzwSQ}SqwSQ}SqyquqwSQ}SqyquqwSQ}Sq=Sy>Sw Qqw*tqzqzuqtquqzqzwqwq*PltGzqztqXtquqzqzwqxqwq*NtGqX-qNy-Nuqzqzwqxqwq>Ny>yNzqwqw MdqL}MdqwqwqL}MdqL}Mdqwqw*wL}MdquqzqwqwqwqwL}MdquqzquqzwL}MdquqwL}Mdqwqw KvqJ}Kvq J}Kvqw*wJ}Kvqw,/K>z0KvqwJ}Kvqw IqH}IqtqH}IqwqH}IqtqH}Iqw*tqH}IqtqH}IqtGqXwqtqH}IqtqtqtGqH}IqXw GqwqG }GqwqwqG }Gqwqwqw*tqwquqwG }GqwqwqwG }Gqwqwq:w Euqw*q2w D(q w*wquqwquqtqw BuGqXwqwqwqw*uqwquqwqwq w A q w*wquqwquqtqw ?zqw*qwq# > wq wq* <* ;w 9z*u 8 6 5w q w 3|q wqw*t2Mq6 1wt w*q2uquq 0tw*uquqwqwqw .qwqwqwq*wqwquqwqwq: -lwtw*q wq*uqw +qw*uquqwqwqw *dqw*uqw/*,0>*dqwq (wqwq*rq(*'\uq %w*quqwqwq $T#}$Tqwq*uqw#}$Tqwquqw#}$Twqwq "fwqwqw  uqwquqw*q ^uqw uqw q+zququq + ,rX q#uqxt}quq xt}q xq3* tqx}qx}qyq)uq $ v"TVk(-2.2.2CONVENIENCES13For a binding it just looks up n in the binding. For a type it looks up n in the type's cluster.For anything else, it looks up n in the cluster of De and applies the result to e. The specialLOOPUPC primitive does something special if it finds a proc which takes more than oneargument: it splits the proc into one which takes the first argument and returns a proctaking the remaining arguments. This ensures that if De.n is such a proc P, the expressione.n[a, b] will desugar into something equivalent to P[e, a, b]. The usual syntax for application is a proc e1 followed by an explicit binding constructor. The kindof application may depend on the type of e1, via the APPLY element of its type; for a proc appliedby the standard apply operator Z, APPLY is the identity. If e1 is followed by an group rather than abinding constructor, the argument is obtained by binding the group to the declaration which is e1'sdomain.Infix operators desugar straightforwardly into application; note that the choice of proc is determinedby the type of the first operand only. AND and OR are not ordinary infix operators, since theyevaluate no more than necessary; this is expressed by the desugaring into IF.The remaining expression syntax is various constructors, described in the next section, andimperative and exception features described in later sections.2.2.2.2 Group, binding and declaration constructorsA bracketted sequence of expressions (e.g., [1, 2, 3]) denotes a flat group with its elements in thesame order (e.g., MKPAIR[1, MKPAIR[2, MKPAIR[3, NIL]]]. Thus a group constructor is just like theLIST function in Lisp. A pattern is a similar construct, except that it contains names which standfor the corresponding ATOM literals; PATT yields the group obtained by replacing each name n bythe literal $n. After desugaring a pattern always appears after PATT and hence is always desugaredinto an atom or a GROUP[ATOM].Brackets are also used to delimit binding and declaration constructors. They are distinguished fromeach other, and from group constructors, by the presence of ~ in each element of a bindingconstructor, and : in each element of a declaration constructor. The elements of a binding ordeclaration constructor are sugar for applications of the MKDECL, MKBINDP and MKBINDD primitives.The constructor itself strings the resulting declarations into a big one using the PLUS operator,which is just like THEN except that it does not allow duplicate atoms; the motivation for this is toallow the names and corresponding types or values to be written together, instead of factored as theprimitives require. As a result, values made from constructors are always flat.Note that these constructors do not nest, except that a d can be [(p: t), ... ]. This is intended for thed~e form of binding; e.g., if DivRem returns two INTs, you can write [d: INT, r, INT]~DivRem[...]instead of [d, r]: INTXINT~DivRem[...].The REC binding constructor is sugar for FIX which exactly parallels the non-recursive one.2.2.3 ImperativesThese constructs are generally used together with non-functional procs.statements ::= { e; ... }IF (ISVOID[e]) AND ... THEN [] ELSE ERROR -- Ordering by non-prompt evaluation.simpleLoop ::= SIMPLELOOP statementsLET REC [loop(~(l IN { statements; loop([] } ) ] IN loop([]-- Only an exception (such as EXIT) will terminate the loop. f2t8 G?q_xq]xq[+xqzxqxqYuqJX#",V/zxqxqUxqxq,xqxq Q(xQc}Qq6 P&xOu}Pq uq( Nzquq xM}Nq& L&BxK}L&q J8 G a E tqtq- DBtq @B ?V 4 ;WrX, 8,q12 6 uququqtq. 5$C 3tq uq' xq 2 xq2uq  0 uqtq -mH +7 *e 8 ( uququq ']6uq %uqM $U Q " E >wxqxwqwq "xqxqxq tqxqtqxqtqxq xqxqtztqxq suq"uq/ trX IqB w qXwqwq*tqwuqwqwqtqwqtqtqtq* % w qu q *uquqzq3} y3 yquqw qzqDO yD uqzq* tq ,TVk(KERNEL DEFINITION2.214Each e in the statements must evaluate to VOID; this is to catch mistakes like writing x+1 as astatement. The definition of AND ensures that the e's are evaluated left-to-right.The simpleLoop is the standard way to express a loop in terms of recursion. You are supposed touse an exception to get out of this loop; Cedar provides a number of convenient ways to do this,such as EXIT and RETURN.2.2.4 ExceptionsAn exception is treated as a special value returned from an application. The exception valuecontains an exception code and an args value which may be of any type. When an application seesan exception value, it immediately abandons the application and returns the exception value; thusapplication is strict. There has to be some way to stop this, or the first exception would be the valueof the program. The HIDE primitive takes any value and returns a variant record of type HEX. Itturns:a normal value into the normal variant, with the value in its v field;an exception into the exception variant, with the code in its code field and the arguments inits args field.UNHIDE takes a HEX value and returns the original unhidden value.An exception code has the type EXCEPTION[T], where T is a declaration which is the type of theargs; it is the domain of the exception, and (DEXCEPTION[T]).DOMAIN=T. An exception value isconstructed by the primitiveRAISE: [T:: TYPE, code: EXCEPTION[T], args: T]Thus the args always has the type demanded by the code.This is dressed up with the following syntax. but ::= e BUT { butChoice; ... }LET v(~HIDE[e] IN ( IF ISTYPE[v(, HEX.normal] THEN UNHIDE[v(] ELSE IF ISTYPE[v(, HEX.exception] THEN LET h(~NARROW[v(, HEX.exception ] IN LET selector(~h(.code IN butChoice ELSE ... ELSE UNHIDE[v(] ELSE ERROR )butChoice ::= e1 => e2 |IF selector(=e1 THEN LET MKBINDD[De1.DOMAIN, h(.args] IN e2 |e1 , e1, !.. => e2 |IF (selector(=e1) OR ... THEN e2 |ANY => e2 IF TRUE THEN e2A BUT expression evaluates e. If it is a normal value, that is the value of the BUT. If it is anexception, each butChoice in turn gets a look at it. If one of them likes it, then it supplies the valueof the BUT; otherwise the exception is the value.The e1 in a butChoice must evaluate to an exception code. If there is just one, and it matches codein the exception, then args in the exception is bound to the domain of the code, and e2 is evaluatedin that environment. If there is more than one, then e2 is just evaluated in the current environment.An ANY butChoice matches any exception, but of course doesn't bind the arguments.f2tG F{ q _xq$uq(xq ^W tqxq [,,0 Y P X$tqtq T%rX Pq M Ov rqxq9 MI Ln rq G Juq@uq IfGxqxqDxq xqC2xq @uquq/ >uqxqxq* < rqzuqxquqxq ;z  9uqXxqtqxquqxqxqxq 8rxq%xq 5G. 2wXquq wq*uqzquqwquq41y*0tqtqzquqtquqzq*/tG qzqXuq t*-quqzqtqzquq uG*, qXzqzquqw:+@, qtqwqtquqzq**qtqtq2*Py )w q(w})q(w})qw*tqzqw(w})qtququqzq(w})quqzquqw(w})qw 'q&}'q&}'qwq&}'qw*tqwqzqw&}'wqtqwqtqw&}'qw %(tGqX$}%(q*tqtqtqw$} !quqxq4uq y H uq' x=}qDx qxq9xO}q #xa}q. tqK TVk(@2.3DOING COMPUTATIONS152.3 Doing computationsThis section describes the basic mechanisms for doing computations, and the kinds of values whichcan be manipulated by Cedar programs.2.3.1 ApplicationThe basic mechanism for computing in Cedar is applying a proc to argument values. A proc is amapping from argument values and the state of the computation, to result values, and a new state of the computation.The state is the values of all the variables.A proc is implemented in one of two ways:By a primitive supplied as part of the language (whose inner workings are not open toinspection).By a closure, which is the value of a l-expression whose body in turn consists of anexpression, which may contain further applications of procs to arguments, e.g., l [x: INT] INx+3. When a closure is applied, the parameters declared after the l are bound to thearguments, and then the body after IN is evaluated in the new environment thus obtained.In Cedar, each parameter value thus obtained is used to initialize a variable, which is the objectnamed by the parameter in the body. Thus the body can assign to the parameters. Use of thisfeature is not recommended.Note that when a l-expression is evaluated to obtain a closure its body is not evaluated, but issaved in the closure, to be evaluated when the closure is applied. Some constructs (IF, SELECT, AND,OR) are defined (see 2.2.2 and 3.8) by wrapping l-expressions around some arguments, and thenapplying them only when certain conditions hold; e.g., IF b THEN f[x] ELSE g[y] evaluates f[x] iff bis TRUE and g[y] iff b is FALSE.Application is denoted in programs by expressions of the form f[arg, arg, ...]. If the value of f is aclosure, this expression is evaluated by evaluating f and all the arg's, and then evaluating the bodyof the closure with the formal parameters bound to the arguments (unless an exception value turnsup; see 2.6.2). Thus to evaluate (l [x: INT] IN x+3)[4]:evaluate the l-expression to obtain a closure;evaluate the argument 4 to obtain the number 4;evaluate x+3 with x bound to 4 to obtain the number 7.The first two evaluations can be done in either order.To evaluate a primitive application such as x+3, evaluate the arguments, and then invoke theprimitive on those arguments to obtain the result and any state change. With a few exceptions (e.g.,assignment and dereferencing or following references), primitives are functions and can be thoughtof as tables which enumerate a result value for each possible combination of arguments. Invoking aprimitive can therefore be viewed as a simple table lookup using the arguments as the table index.Actually there may be one more step in an application. If an argument doesn't have the typeexpected by the proc, the argument is coerced to the proc's domain type if possible. If no coercioncan be found, there is a type error. Coercion is discussed further in 2.6.1 and 4.13. Most procs take a binding as argument, in which the various parts of the argument are named. E.g.,OpenFile: PROC[name: ROPE, mode: Files.Mode] takes a binding with two values named name and mode.It might be applied like this: P[name~"Budget.memo", mode~$read]. If the names are missing, there f2t4G G?q _wX \qM [," W-rX Tq'3 R~P&rqrqMrq, Kv- I)Frq7EB Brqyq-Af +yqxqtqt?xqr q yq>^ rquq3 <D :'/ 8 5 yq9rq 4O4tqtqtq 2tq1yq, 1G/tqxqtqxqxqtqxqxq xqxqx /qtqxqxqxqtq , 0xqxqxqxq +,xq xq )_ ( !yqxqtqtqxq% yq#\/!xqxq# 6 )xq/ X y X #= qY F8 rq3 >W X xqtqxqtqxqx q(xqxq xqxqxqxq TVk(THE KERNEL LANGUAGE2.316is a positional coercion which supplies them left-to-right, see 2.3.6. There is also a defaultingcoercion that supplies missing parts of the binding; see 4.11. If f is neither a primitive nor a closure, the meaning of applying it is defined by the APPLY proc forits type; this case is discussed further in 4.4.There are many ways of writing applications other than f[x]. In fact, many Cedar primitives cannotbe the values of expressions, and can only be applied by writing some other construct. Thedesugaring rules show how large parts of the Cedar syntax denote various special kinds ofapplication. In each case, the meaning is defined by the standard meaning of application and thespecific meaning of the primitives involved; see 4.1.This is partly because of history, and partly because specialized syntax makes the program more readable. Futureevolution of the language will improve the situation. Functions and order of evaluationAn expression is functional ifits value does not depend on the state, but only on the values bound to its free variables,andevaluating it does not change the state.As a consequence of this definition,Two identical functional expressions in the same scope will always have the same value. A proc is a function if every application of it is functional. It doesn't matter when or how manytimes a function is applied; the order of evaluation doesn't matter for functions. Thus Cedarfunctions can be thought of as mathematical functions for many purposes. Note that a constant canbe regarded as an application of a function of no arguments.Non-functional procs, on the other hand, are more complicated objects. Cedar makes no formaldistinction, either in syntax or in the type system, between functions and procs. However, it doesnot define the order of evaluation in an expression, except that: all arguments are evaluated before a proc is applied;because of the desugaring of IF, SELECT, AND and OR into l-expression, the order ofevaluation for these expressions is determined by the first rule;statements separated by semi-colons are evaluated in the order they are written.As a consequence, two applications of non-functions should not be written in the same statementunless they don't affect each other; if this is done the effect of the program is unpredictable. An expression is guaranteed to be functional if it only applies functions; thus if f is a function, p anon-functional proc, and x a variable, f[3] is functional and p[3] and p[x] may not be. Furthermore,f[x] may not be functional, because it is sugar for f[x.VALUEOF], and VALUEOF is not a function. Thevalue of a l-expression is a function if its body is functional. There are more complicated ways ofguaranteeing that an expression is functional, just as for any other interesting property.Because the values of variables constitute the state, it is only the existence of variables that allowsnon-functional procs to exist. In particular, the VALUEOF proc which returns the value of a variableis non-functional (because its result depends on the state), and the ASSIGN proc which changes thevalue of a variable is non-functional (because it changes the state). f2tGF{ q _r qHr ^Wq9 [,xqDtq Y/ V}2xqxq  T M Su 4 Q = Pm/ Mt b LJ- HKrX E qr qB$4AD>  <$: ,R?)5'tqtqtqtqyq& 7#P !n] [ *'xqxq ; xq xqxqxqxq xqxq0xqxququq 3yqG  N H  $uq+ |+uq A TVk(a2.3DOING COMPUTATIONS172.3.2 ValuesA Cedar program manipulates values. Anything which can be denoted by a name or expression inthe program is a value. Thus numbers, arrays, variables, procedures, interfaces, and types are allvalues. In the kernel language, all values are treated uniformly, in the sense that each can be:passed as an argument, bound to a name, or returned as a result. These operations must work on all values so that application can be used as the basis forcomputation and l-expressions as the basis for program structure. In addition, each particular kindor type of value has its own primitive operations. Some of these (like assignment and equality) aredefined for most types. Others (like addition or subscripting) exist only for certain specific types(numbers or arrays). None of these operations, however, is fundamental to the language. Formally,assignment or equality has the same status as any operation on an abstract type supplied by itsimplementor; thus INTEGER.ASSIGN has the same status as IO.GetInt. In practice, of course, specialsyntax is usually used to invoke these operations, and the implementations are not Cedar programsopen to inspection by the editor or debugger. A complete description of the primitives supplied bythe language can be found in Chapter 4, organized by the type of the main operand. Table 45 isan alphabetized index of these descriptions.Restrictions: In current Cedar, however, there are restrictions on values which are types, declarationsor bindings: they can only be arguments or results of modules, and hence are first-class values onlyin the modelling language, and not within a module. Also, declarations and bindings cannot beconstructed or bound to identifiers within a module. Unions are also restricted: they can onlyappear inside records. Nonetheless, it is simplest to emphasize the uniform treatment of all values,and consider separately the restrictions on types, declarations, bindings and unions. Future evolutionwill improve this situation.Restriction: In current Cedar you can only use dot notation for some operations of built-in clusters:the procs which access record fields, and others as noted in Table 45. As a substitute, there arevarious syntactic forms which are sugar for dot notation: infix, prefix and postfix operators, built-infunctions, and funny applications. These desugarings are given in rules 20-24 of the Cedargrammar in 3.2.3.3 VariablesCertain values, called variables, can contain other values. A variable containing a value of type Thas type VAR T. If the variable doesn't allow the value to be changed, the type is READONLY T; thisis not the same as T, because there may be a VAR T value which is the same container. The valuecontained by a variable (usually called the value of the variable) can be changed by assigning a newvalue to the variable. The set of all variables accessible from the process array constitutes the stateof the computation; these are all the variables which can be reached from any process, and avariable which cannot be reached cannot affect the computation. Note that a variable value is acontainer, which like all values is immutable; it may help to think of it as (the address of) a blockof storage. The contents of a variable can be changed by assignment. Thus the value of a variablecan change, even though the value that is the variable is immutable.A suitable abstract representation for a VAR T is a value of type [Get: []_T, Set: T_[]]. Thisrepresentation is not used in Cedar, but it clarifies the way in which variables fit into the typesystem: VAR TgVAR U only if T and U have the same predicate, because the Get proc requiresTgU and the Set proc requires UgT. READONLY T corresponds to [Get: []_T] and a write-onlyvariable type would be [Set: T_[]].There is a coercion (an automatically applied conversion; see 2.6.1) from VAR T to T, so that avariable can be passed without fuss as an argument to a proc which expects a value. f2t4G G?q _rX \qU [, R YYWPTR OuP M yq5 Lmrq\ JF IeY G U F] tquqxq! DH CU[ A!; @M* ="r q? ;J :1* 8 ; 7W 5K 4  0r q3& /[9& -F ,S 9 * &rX #qAx "!quqxq)tqxq xq uvxq- )rq .r q #7 ?  rq?  rq U?'tqtq RL Puq/xqrq O xqxqxquq MK0xqxqtqxq xqxqxq Hrq xqxquqxq xqxqxq GT>xqxq E& BL >rX ;{q"4 99r qrq 8sxq;xq 6:! 5kr q2 3R 2cxqxqtqxqtqzqxqxqxqtqxqtq 0xqxqxq -r qxqxqxq xq ,0xq xzxqzqx+},0qx+},0qzx+},0zx+},0zqzq *B Sx){ (Tq#uqx'}(Tqx'}(Tq x'}(Tzx'}(Tzq &f" #;r q/' !-tqtqtqzqtqtq 3C ] + &zq uqW rX }qY ))xq tqxqtqxqtqxqxqtq xqxqxqtq,rq  uq9uqTVk(Y2.3DOING COMPUTATIONS19LET i~3, b~TRUE IN (IF b THEN i+5 ELSE 0)has the value 8. Current Cedar doesn't have LET expressions, but a binding at the beginningof a block has the same effect. See 2.5.4 on scopes for details.As a way of collecting and naming a set of related values. A value can be extracted fromthe set using dot notation. Thus if b is the binding [i~3, b~TRUE], the value of b.i is 3. Incurrent Cedar this only works for interfaces; see 3.3.4 and 4.14 for details.A binding is usually denoted by a constructor, which takes the form[i~3, b~TRUE]or redundantly (if there are no coercions)[i: INT~3, b: BOOL~TRUE]in which the types are specified explicitly (but you can't write the second form as the argument ofan application). See 2.5.5 on constructors for details.2.3.6 ArgumentsWhen a group or binding is bound to a declaration (d~v), there are various conversions calledcoercions which may be applied to the values. This usually happens when the arguments of a procapplication are bound to the parameter declaration.First, if v is a group rather than a binding, it is coerced to a binding by attaching the names from dto the elements of v in order. Thus in[a: INT, b: REAL]~[2, 3.14]the group constructor is coerced to [a~2, b~3.14].Next, if v is shorter than d, elements of the form n~OMITTED are appended, where n is thecorresponding name from the declaration. Thus in[a: INT, b: REAL]~[2]the group constructor is coerced to [a~2, b~OMITTED].Now the items of the binding are matched by name with the items of the declaration. There is anerror unless the names match exactly. The remaining coercions are done on individual items, n: tfrom the declaration and the corresponding n~v from the binding. If v has type t, all is well.Otherwise, if there is a sequence of coercions from the type of v to t, these are applied to v. If nosuch sequence exists, there is an error. In particular, there is a coercion from OMITTED to the defaultvalue for t, if any. Thus in[a: INT_0, b: REAL_1.1]~[b~3.14]the group constructor is coerced to [a~0, b~3.14], and in[a: INT_0, b: REAL_1.1]~[]it is coerced to [a~0, b~1.1]. Coercions are discussed in 2.6.1 and 4.13, defaulting in 4.14.An important special case is constructors for record and array values. A record type has aconstruction proc; e.g., R: TYPE=RECORD[a: INT, b: REAL_0.]has a proc R.CONS of type PROC[a: INT, b: REAL_0.]_[R]. Thus R.CONS[a~2, b~3.1416] constructs arecord value. There is also a coercion from BINDING to the particular binding type RB which is thedomain type of R.CONS, so that r1: R_[a~2, b~3.1416]is short forr1: R_R.CONS[a~2, b~3.1416].Composing the positional coercion from GROUP to RB with R.CONS makes r1: R_[2, 3.1416]also short for the previous line.The same scheme works for arrays, but only an array indexed by an enumeration has acorresponding binding which can be written; the elements of an array indexed by numbers don'thave names which can be written in a binding. However, the group constructor still works. f2t4G G?q_uqXxqxqtquqtqxqtqxqtq^W)uq,\@Z{@X!xqxqxqtqxqxqWsJ THC RxqXxqtq Q@( OxqXtqxqtqtq N8C L7 HrX Eq/xqxq% Drq> B ( ?WxqZx =qxq )rq%<C;z:9 K 7>5FY3 ;2> rq /?.b?rq , 7% *xq" )xqXtqx '~qxq xqxqxqxq*xq %@ $v A " R !n Crqxq$xqxq  xq/ ; r q:tq rX" q5 b I F  ZxqXtqtqtqtq  xqtqxqxq  Rxqxqxqxq LTVk(2.4THE TYPE SYSTEM21there must be a check that i>0 and i<10 just before the expression a[i] is evaluated. This is calleda bounds check; if it fails there is an exception called Runtime.BoundsFault. Where did this checkcome from? Note that a[i] is short for Da.APPLY[a, i], and Da.APPLY is SUBSCRIPT, the subscriptprocedure for ARRAY [0..10] OF INT. The type of SUBSCRIPT is PROC[array: ARRAY [0..10] OF INT,index: [0..10]]_[VAR INT]. So when i is passed as the index argument, the declaration of SUBSCRIPTsays it must have the type [0..10]. The predicate for this type is l [x: ANY] IN HASMARK[x, INT] AND LET y: INT~x IN y>=0 AND y<=10.Leaving the HASMARK term for later discussion, we see that the rest of the predicate is the same asthe bounds check.The type system is designed, however, so that most assertions can be checked statically (i.e., proved),by examining the text of the program without running it. Static checking has three obviousadvantages:It reports any errors after a single examination of the programming, leaving none (of thiskind) to be discovered later in Peoria.It introduce no cost in time or space for run-time checking.The compiler can take advantage of the assertions to generate better code.Of course, there is a corresponding drawback: the assertions made by parameter declarations mustbe simple enough that the compiler can reliably prove or disprove them. The proofs done for typechecking have exactly the same form as program correctness proofs basedon preconditions and postconditions. Consider a proc whose value is the l-expression l [x: T]=>[y: U] IN e. The domain declaration [x: T] is a precondition for the body e. This means that any application ofthe proc must satisfy this condition. As a consequence, the body e can be analysed on theassumption that the precondition holds, i.e., that x has type T. Similarly, the range declaration [y: U]is a postcondition for the body. This means that given the precondition, any evaluation of e mustproduce a value y which has type U. In summary, for the body we assume the precondition andmust establish the postcondition.To make this hang together, each application must establish the precondition; this means that theargument must have the domain type. In return, the application can assume the postcondition; thismeans that the result of the application has the range type. Thus we have a linkage:argumentgdomaingrangegresultThe result in turn will be the argument of another application. In this way the proof is extended tolarger and larger expressions, and finally to the whole program. In summary:Applicationestablish pre-condition: arguments have the domain type;rely on post-condition: results have the range type.Bodyrely on pre-condition: parameters have the domain type;establish post-condition: returns have the range type.These proofs require showing that an expression always has a particular type T. This is done byobserving that every expression has a unique syntactic type U, which is the type of every evaluationof that expression; e.g., an application always has the range type of its proc (see below for a moredetailed discussion of syntactic type). If every value of type U has type T, we are done. Hence theusefulness of type implication. One type implies another, TgU, iff (Ax) T[x]gU[x]. If two typesare equal, each implies the other. However, there are many other useful cases of implication. Forinstance, VAR INT implies READONLY INT. The type implications in current Cedar are given in4.12.Of course, not all arguments are applications. The kernel grammar gives the other possible forms ofargument expressions, and we enumerate the proof rules for each:A literal is like a zero-argument proc: it has a known range (e.g., 3 has type INT, 'A hastype CHAR). f2t7UG G?q _xzqxzqxqxq ^Wr qxq \xqxqzxqtqxqxqzxqtquq [Otqtqtq uqtqxqtqtqtq Yxqzquvtq xqxqu XGq? VyqXxqtqtquqxqtqtquqxqtqxquqxqtqxq U?uq2 S PIr qrq O J M K0EI"GT<DJ B// A F = P [y: U] IN e has the type [x: T]_[y: U]. This works for the reasondiscussed in the next paragraph.A binding constructor [x~e, y~f] has the type of the corresponding declaration, [x: De, y:Df].There is one more link in the chain. An application f[x] has an arbitrary expression for f, notnecessarily a l-expression. The requirement is that f must have a proc type, say D_R; D is thedomain type and R the range type. Since the type of l D=>R IN e is D_R, satisfying theprecondition D for the application is the same as satisfying the precondition D for the l-expression,and similarly in reverse for the postcondition. The value of f may be a primitive rather than aclosure obtained from a l-expression. In this case, the implementation of the primitive can stilldepend on the precondition and must still establish the postcondition, but since the implementationcannot be examined (within the framework of Cedar) we can say nothing about how this isaccomplished. Example: INT.PLUS, which is implemented by the machines 32-bit add instruction.In a proc type D_R, D and R may be declarations which provide names for the arguments andresults. In general, the expression R may include names declared in D. The range type of anapplication then depends on the argument values. Restriction: In current Cedar only modules have such types; the type returned by an interface, or the interfaces exported by an implementation, may depend on the interface and implementationparameters.As a by-product of the type-checking proof rules just given, a syntactic type is derived for everyexpression e in the program. It is denoted by De, and computed as follows:for a name, the declared type; for a literal, its type;for an application, the range type (which may depend on the argument); for a l the obvious proc type;for a binding constructor, the declaration obtained by pairing the names with the syntactictypes of the value expressions.Typechecking ensures that whenever e is evaluated, the resulting value will have type De (though itmay have other types as well, i.e., it may satisfy other predicates). The main use of syntactic types isin connection with dot notation (see 2.4.4).In order to carry out the proofs described above, the compiler must either compute the values of alltypes, including those denoted by complex expressions such as ARRAY [i..j] OF INT, or it must beable to prove the equality of unevaluated type expressions. For the most part, current Cedarrequires the former approach; hence a type expression must have value which the compiler cancompute. Such a value is called static; the rules for static values are given in 3.9.1. f2tGF{ q_<]xqxqxqtq$\Fyq=xq Zxq%XxqxqxqxqtqxqW-xqTyq yqxqxqxqxqtqxqxqxqzqxqxqSQPxqxqxqxqxqzxqxqOuzxq LJ.xqxq"xq J yqxqxzxqxq IB xqyqxqxqtqxqxzxq G xqxqyq F:9xq! Dyq < C2@ A3 @* tquq> < xzxqxqxq$ ;{xqxq 9 rq 6r q? 5H8! 3 0.r q / xq"zxq,*e( G%yq#]E!  xq,zxq e y, N3/ tqxqxqtqtq F7! "2 >rq4 TVk(2.4THE TYPE SYSTEM232.4.3 MarksBy this point you may have thought of asking why the assertions provided by type predicates areworth all this fuss. The reason is simple: they are the basis for authenticating values of an abstracttype, so the implementation can be sure that it is working on properly formed values. Suppose youare the implementer of an abstraction, e.g., Table. You provide operations to Lookup a key in thetable, to Insert a [key, value] pair, and to Enumerate the items in the table. A Table is implemented asa REF to a record containing a sorted array a of items and an INT n which gives the number ofitems. Lookup is implemented by binary search. All three operations are programmed on theassumption that elements 0 through n1 of a are sorted, and that n is smaller than the size of thearray. They will not work properly if these assumptions are not satisfied, and indeed they may tryto subscript the array with an out-of-bounds index or to violate other requirements of theabstractions they depend on.Here is a lower level, but perhaps more dramatic example. The dereferencing operation ^ for a REFREAL returns a VAR REAL, which can, for instance, be assigned to, as in the program fragmentr: REF REAL_NEW[REAL_1.0];. . .r^ _ 3.14159A REF REAL is represented by the address of a four-byte block of storage which holds a REAL, andthe assignment to r^ stores the four bytes which represent 3.14159 into that block. If somehow aREF BOOL finds its way into r, the assignment will still store four bytes, since it doesn't know anybetter. But the REF BOOL points to a two-byte block; the other two bytes that will be modifiedbelong to some unrelated variable, which will be clobbered without warning. The second example is scarier because the consequences of the bug seem more unpredictable. Inboth cases, however, the fundamental problem is the same: even if the implementation is correct,the wrong thing happens because it is given an improper value to work on. Or to make the samepoint in different words, the implementation cannot be held responsible for bad results from one ofits operations, if it has no control over the validity of the arguments it receives.So that the implementation of an abstraction can take responsibility for correct operation, theremust be a way to authenticate a value of the abstract type. In Cedar this is done by placing a markon the value; think of it as a little flag stuck into the value. The mark uniquely identifies theabstract type, and authority to affix it is under the control of the implementation. A correctimplementation will mark only values which have the properties needed for a representation of anabstract value, and if no one else can affix the mark, the implementation can be sure that everyvalue with the mark has the desired properties.A mark can be thought of as an abbreviation for an assertion or type invariant which characterizes aproper abstract value, such as Table or REF REAL. Such an assertion can be quite complex. In theTable example, it would say that the representation is a record of the proper form, that n is less thanthe array size, and that the first n array elements are sorted. In the REF REAL example, it would saythat the address points to a block of storage such that at least the first four bytes don't overlap anyother blocks. Such assertions are not easy to write down formally, and proving them is certainlybeyond the power of any existing program. So the abbreviations are not a mere convenience, but anecessity.A new mark can be created on demand by the primitive CREATEMARK: PROC[Rep: TYPE, tag: UNIQUEID]_[m: MARK, Affix: [Rep]_[TYPEFROMMARK[m]] ]The primitive HASMARK tests a value for the presence of a mark, so HASMARK[x, m] tests x for thepresence of the mark m. Affix adds the mark to a Rep value. Restriction: MARK, UNIQUEID, CREATEMARK, HASMARK and TYPEFROMMARK are not accessible incurrent Cedar. Record and array type constructors provide some access to CREATEMARK, asdescribed below. The ISTYPE primitive, also described below, is closely related to HASMARK. f2t7UG G?q _rX \qJ [,R YB X$*xqxq Vxqxqxq Utq'xqtqxq SxqL R xqxqxq P5' O T M  J],.t Hq uqtqEyGUxqXtGqtqtqyEXyDMxq Btqtq/tq AExq9 ?tqtq xqG >=tqtq0 <F 93' 8 S 6Z 5J 3~Q 0SZ . r q/r -Kq(7 +L *C ; (X ';* $;r q "xqtqtq0 !xq1#xq xq#tqtq P |? B t I3yXu qtqxqtqxquqzqxquqxOqXxqzqu qxq A uquqxqxq  xqxqxq r quququ ququ q 2u q  tq8uqTVk(THE KERNEL LANGUAGE2.324With these facilities, it is easy to create a new abstract type. Choose its representation type, andobtain a new mark m. TYPEFROMMARK[m] with an appropriate cluster added is the new abstract type.The implementation must use Affix to mark only values which satisfy the properties it demands.The type returned by TYPEFROMMARK[m] has the predicate l [x: ANY]=>[BOOL] IN HASMARK[x, m]and an empty cluster. Except for subranges and bound unions, all types in current Cedar have apredicate of this form. The built-in types (INT, BOOL etc.) come with such predicates, and the built-in type constructor procs (ARRAY, RECORD etc.) obtain a mark from CREATEMARK. So that twoinvocations of ARRAY [0..10] OF INT will produce the same type, ARRAY and most of the otherconstructors use a canonical encoding of the constructor and its arguments for the UNIQUEID, andhence are functional. RECORD and ENUMERATION produce a different type each time they areinvoked, so they obtain fresh unique identifiers. Since the program cannot invoke CREATEMARKdirectly, we need not explain how to prevent forgery of UNIQUEIDs. Future versions of Cedar will addressthis problem.In current Cedar you make a new abstract type by declaring in as an opaque type in an interface:T: TYPE[ANY]This generates a new mark, and declares T to be a type which has that mark. You get such a typeby explicitly painting some other type, normally in an implementation which exports T to theinterface which declared it:T: PUBLIC TYPE~Interface.T PAINTED RECORD [...].See 4.3.4 for more details.The implementation actually stores a mark with each variable allocated by NEW. Such a variable canbe referenced by a REF, and in particular by a REF ANY value. The type of a REF ANY value can betested at runtime using the primitive ISTYPE: PROC[x: ANY, U: TYPE]_[BOOL]If De is REF ANY and RT=REF T, then the value of ISTYPE[e, RT] is TRUE iff the predicate for Tjust tests for mark m, and x^ has the mark VAR m. ISTYPE is described in detail in 4.3.1, along withthe WITH ... SELECT construct and the NARROW primitive, which are more powerful operations builtup from ISTYPE.For other values, there is no mark actually stored; instead, types must be computable staticallyusing the methods described in the last section. The AMTypes interface, however, gives a way torefer to any value in a uniform way, and to test its type at runtime.There is only room for one mark on a variable, and this must encode all the marks that the valueactually carries. We arrange for this by imposing a partial order on the marks, and requiring that:The set of marks on a value must have a maximal element.Every mark smaller than the maximal one must be on the value.With these rules, a single mark stored on the value is enough to code all the others.In current Cedar, a value actually has only one mark, since:The only way to create a new mark is with the record or enumeration type constructors, orby declaring an opaque type.When you paint a type T with the mark of an opaque type, T must be a record orenumeration type, and the opaque type mark replaces the mark it had before.Note that VAR T, READONLY T and T are different types with different marks, although VARTgREADONLY T, and there is a coercion VALUEOF from either one to T.f2tGF{ q _C ^W xqu qxq7 \xq= Yu qxqyX$yqXxqtqtququqxqxq VT U#tqtq % Stqtqu q R tqtqtqtq P +uq O tqt q M5u Lq/uqt% J GUq` ExqXtqtq DM$xq# B.$xq AE ?xqXtGqx qXtGqX >= ;Gtq 9tqtqtqtqtq 8 y6tqXtqxqtqxqtqzqtq 5zxqtqtqxqtqxqtqxqxqtqx 3~qxqxquqxqtq) 1tqtqtq 0vtq -KG +0xq *C@ '> %[#<8 rq8 U a< 8-xq!xq  rq ~tqxqtqxqxqt xztqxquqxq TVk(2.4THE TYPE SYSTEM252.4.4 Clusters and dot notationIt is convenient to associate with a type the procs supplied by its implementor for dealing withvalues of the type. This is done by putting these procs into the type's cluster. The cluster is simply abinding which is part of the type value (the predicate is the other part). There are no rules enforcedabout what goes into the cluster. However, there is a special dot notation which makes it desirableto populate T's cluster with procs which take a T as their first argument. The usual effect is likethis: t.n is sugar for Dt.n[t], and t.n[other args] is sugar for Dt.n[t, other args]. For example, if t has type T, and a proc [T, INT]_[BOOL] is in T's cluster under the name P, thenthe proc can be applied by an expression like t.P[3], which is sugar for Dt.P[t, 3]. The name P islooked up only in T's cluster, not in the current scope. If Q: [T]_[INT] is also in the cluster, it canbe applied with t.Q, which is sugar for Dt.Q[t].The general rule that makes this work is the following: t.n is sugar for LOOKUPC[Dt, $n][t].LOOKUPC[Dt, $n] is just Dt.n, except that if Dt.n is a proc that takes several arguments, it is splitinto a proc that takes the first argument and returns a proc taking the remaining ones. ThusLOOKUPC[Dt, $n][t] will be a proc taking the remaining arguments, and t.n[other args]=LOOKUPC[Dt,$n][t][other args] will be the same as Dt.n[t, other args].Dot notation can also be used to obtain values from a binding or from the cluster of a type withoutany application: T.P would be the proc named P in the previous example. The possible cases of dotnotation in current Cedar are described in detail in 4.14.Restriction: There is currently no way to explicitly construct clusters. The built-in types and typeconstructors have clusters; they are described in detail in 4. In addition, there is a clumsy way toprovide a cluster for an opaque or record type in an interface: every proc name in the interface isput into the type's cluster. For a record, the procedures supplied by the record constructor are alsoin the cluster, and they win if there are name conflicts. There is one of these clusters for each typein each imported interface value; if a module imports more than one value of the same interface,however, there are severe restrictions (see 3.3.3).2.4.5 DeclarationsA declaration is the type of a binding. Thus, the binding [x~3, y~3.14] has the type of the decl [x:INT, y: REAL]. All the relationships among types, and between types and values, are carried overelementwise to decls and bindings; the elements are matched up by name rather than by position.A decl itself simply has the type DECL.A decl is made up of two parts: the names or pattern, and the types. The basic operation formaking decls, MKDECL, takes a pattern and a type. Thus MKDECL[ PATT[x, y], INTXREAL]=[x: INT, y:REAL]. In general, a pattern is one of NIL, a simple name, or a pair of patterns, just like a Lisp S-expression. Similarly, a type argument to MKDECL is one of NIL, a type, or a cross type. The typemust decompose in a way which matches the pattern. Normally, as in Lisp, we deal only in flatpatterns, where the first element of a pattern is always a name. Such flat patterns are convenientlydenoted by constructors of the form [x, y, ...]. The reason for defining things in terms of pairs isthat it makes it much simple to write down precise rules for the semantics, using structuralinduction on the values.The main use of a decl is to type-check a binding. The basic binding constructor is MKBIND[d, e],where d is a decl and e is matching group or binding. If e is a binding, then its structure and namesmust match the structure and names of d, and each element of e must have the type demanded bythe corresponding component of d, after a possible coercion. Thus MKBINDD[[x: INT, y: REAL], [x~3,y~3.14]]=[x~3, y~3.14]. This may seem pointless, but it has two important uses:Such a binding is used to bind the argument of a proc to the domain declaration. Eventhough the resulting binding is the same as the argument, the type-checking is essential. f2t7UG G?q _rX \qM [,:( Y%: X$ T V xq xq2 Uxq zxqxqxqxqzxqxqxq Q xq xq xqtqzqtqxqxq Pm+xqxqzxqxqxqxq N xq)xqxqzqtq Me xqzxqxqxq J:5xq uqzxqxqxq Huqzxqxq zxqxqzxqxq G2)/ Euqzxqxqxq/xq vuqzxq D*xqxqzxqxq @F ?{ xqxqxq3 =4 :r qO 9H "8 7"rq 6@B 4J 38rq< 1- -rX *q+xqxq!xq )tqxqtq J ' = %!uq " rq( !Ouq#uquqxqxqtztqxqtqxq tq tq; G uq tqr q !8 ?> xqxq; 7.*   Euqxqxq xqxq!xq+  xqxq xquqxqtqxqtqxq xxqxqxq? ? STVk(VTHE KERNEL LANGUAGE2.326There may be coercions involved, so that the resulting binding is not the same. Coercionson the component values are discussed in 2.6.1. There are also coercions on the bindingitself, which can default missing elements; these are discussed in 2.3.6.If e is a group, it is first coerced to a binding by attaching the names from the decl, as discussed in2.3.6. Thus in MKBINDD[[x: INT, y: REAL], [3, 3.14]] the second argument is coerced to [x~3,y~3.14], and things then proceed as before.Bindings may also be used in LET expressions. Here the types are often redundant, and it is betterto use the MKBINDP primitive to bind the value directly to a pattern. The syntactic type of the resultis the decl whose type is the syntactic type of the value. Thus [x~3, y~3.14] is short forMKBINDP[PATT[x, y], [3, 3.14]]; its syntactic type is MKDECL[[x, y], D[3, 3.14]]=MKDECL[[x, y],INTXREAL]=[x: INT, y: REAL].A decl D in a block is interpreted somewhat differently. It becomes the argument of the NEWFRAMEprimitive, which turns the type of the decl D.T into the corresponding VAR type VT=D.T.MKVAR[],allocates a new value v of type VT, and makes the binding MKBINDP[D.P, v] over the scope of theblock. Thus {x: INT; y: REAL; S} becomesLET [x, y]~[VAR INT, VAR REAL].NEW IN SHere D=[x: INT; y: REAL], VT=[VAR INT, VAR REAL], and v=[VAR INT, VAR REAL].NEW. Note thatthe types might have defaults, which are used to initialize the values as part of the NEW operation.Actually this is a bit oversimplified, since NEWFRAME has to separate the bindings in the block fromthe decls, construct the variable binding just described from the decl, and then combine it with thebinding from the block. Thus {x: INT; y: REAL; z~TRUE; S}becomesLET [x, y, z]~([VAR INT, VAR REAL].NEW PLUS [TRUE]) IN Sor more readablyLET x~VAR INT, y~VAR REAL, z~TRUE IN SAnomaly: In Cedar the names in a block are introduced recursively, so that the d's and b's can referto each other. It is possible for a binding or type to refer to a value which has not yet beeininitialized, with undefined results. See 3.4.1 for a further discussion of this point.2.4.6 ClassesAnother important use of a declaration is to characterize the cluster of a type. Since the cluster isjust a binding, it is characterized by its type, which is a decl. When used for this purpose, a decl iscalled a class. See 4.1 for further discussion of classes, and an enumeration of the primitive classesof Cedar.2.5 ProgramsThis section describes how meaning is assigned to kernel programs.2.5.1 Structure of programsA kernel program is an expression, which is either atomic (a name or literal), or is an applicationwhich involves sub-expressions: the proc being applied, and the arguments. The concrete syntaxtreats certain kinds of expressions specially: modules, blocks (which introduce new variables andreturn no value), and statements (which return no value). All desugar into simple expressions,however, and are treated identically in the kernel.f2tGF{ q_6^WS\D Z{xqI Xuqxqtqxqtq/xq Wsxq# THuqB Ruq(, Q@?xqxq Ouquqxqxquqxqxqzq uqxqxq N8tztqxqtqxqtq K xqPu Iq "xuqtqxqxuquq H xqxquqxquqxq F DxqXtqxqtqxq Cy AuqxOqtG qtqXuqx @qqxqxqtqxqtqxqtqtqtqtqxqtqtqtqtqtq >Stq ;uq/ :>D 8 76xqXtqxqtqxqtqxq 5 4.uqxOqXxqtG qtqXtqtquqx 2q 1&uqXxtGxOqtGqXxqtquqx -rqV ,wW * L &rX #q^ "E[ rqK = jwX ?q> @rX qM #6 I X + TVk( 2.5PROGRAMS272.5.2 NamesA name is a part of a program which usually serves to denote a value. There are two contexts inwhich the occurrence of a name n denotes a value:It may occur as an expression. Then n denotes the value bound to it in the scope in whichthe expression appears (see 2.5.4 for details).It may occur after a dot, as in e.n. Then the expression e.n denotes the binding for nsupplied by e (see 2.4.4 and 4.14 for details):the value bound to n in e, if e is a binding;the value bound to n in the cluster of e, if e is a TYPE;roughly (De).n[e] otherwise.There are also two defining contexts for a name n (see 2.5.5 for details):It may occur before a ~ in a binding constructor, as in n~e. The value of e is the valuebound to n in the binding denoted by the constructor (see 2.3.5 for details).It may occur before a : in a declaration constructor, as in n: t. The value of t is the type ofn in the declaration denoted by the constructor (see 2.4.5 for details).These constructors are usually recursive in Cedar; that is, the expression n elsewhere in theconstructor denotes the value bound to n in that constructor; see 2.5.6 for details. In the kernelthey are non-recursive unless preceded by REC.A name is not a value, but there are values of type ATOM which are related to names. An atom hasa print name which is a rope (an immutable sequence of CHARs). A name following a $ is an atomliteral; $n denotes the atom with print name n. Other properties of atoms are described in 4.5.1.1.Caution: Current Cedar has several complications in its treatment of names:In an argBinding27, n: e may be written instead of n~e. The syntactic context distinguishesthis from a declaration, but this usage is not recommended.An argBinding is not recursive: in {a~1; f[a~3, b~a+1]} b is bound to 2, not to 4.The declaration in an import list is non-recursive: IMPORT M is short for IMPORT M: M,and the second M denotes its binding in the currounding scope (i.e., the binding suppliedby the DIRECTORY). Inside the body of the module, of course, M denotes the importedparameter. Names which appear in an enumerationTC54 are treated specially; see 4.7.1.1 for details.2.5.4 ScopeA scope is a region of the program in which all names retain the same meanings (note that manynames denote variables, which can change their values in the same scope, but each name continuesto denote the same variable). In the kernel there are only three constructs which introduce a newscope, l, LET and REC. In current Cedar, these are sugared in a variety of ways: modules, importlists, proc bindings, blocks, exit labels, open, iterators, safeSelects and withSelects. All havestraightforward desugarings, however.2.5.5 ConstructorsThe kernel has constructors, denoted [...], to make expressions which denote group, decl andbinding values more readable. There is one flavor of constructor for each class:A binding constructor is a list of binding elements (b in the kernel syntax) of the form p~eor d~e. The presence of the ~ distinguishes it from the others. Here d is a decl element f2t;G?q _rX \q^ [,xqX xq4WP.TxqxqxStqxq&Qcxqxqxq ORxqxqxqtqMAzxqxqxq JrqxqG6xqxq xq F:xqEC:xqxq xqB^xqI @rq#xq > xq< <&uq 93tq 8Or q+tq r 6qxq"xq7 3rqD1H 1}1Hqxqxqxqxq%/7-l$xqxqxqxqxqxq+ %tqxq tqxqxq) xq-( tqxq& $0&$}$0q2 1rX qM *rq(  T zyququq5 V r srX HqY I l"rqxqxqxqxq1 xq TVk(THE KERNEL LANGUAGE2.328(not a declaration), and p is a pattern, in which the names are being defined rather thanevaluated.A decl constructor is a list of decl elements (d in the syntax) of the form p: t. The presenceof the : without any ~ distinguishes it from the others. Again, p is a pattern.A group constructor is a list of expressions. Note that decl and binding elements are notexpressions, although constructors are expressions.Constructors are useful for making decls and bindings where the names are literal. This is thenormal case, and in fact the only case in current Cedar. If you want to make them out of otherdecls, for instance to bind an expression to a decl which is tha value of a name dn, you cannot use aconstructor; [dn~e] would bind the value of e to the name dn, not to the decl which is its value.You have to write the decl-constructing primitive directly: MKDECL[d, e].The only kinds of constructor you can write in current Cedar are:Decl constructors for proc domains and ranges, and for records and unions (fields43 in thesyntax).Binding constructors for arguments in an application, or as an expression alone if a recordor array value is needed (argBinding27 in the syntax).2.5.6 RecursionIn the kernel, you get recursive definition of names only if you write REC (or the unsugared formFIX) explicitly. In Cedar, on the other hand, decls and bindings are normally recursive, except forargBindings and import lists.The recursion is legal in a block or interface body (although anomalies are possible in some caseswhen names are used before they are defined; see 3.4.1). In fields it is illegal.2.6 Conveniences2.6.1 CoercionA coercion is a proc which is automatically applied under some circumstances to map a value ofone type T (called the source) to a value of another type U (called the dest), e.g. from [0..5) to INT.Coercions are obtained from the clusters of the types involved. The coercion mechanism adds nonew functionality, since the programmer could always write the applications himself, but it isimportant in concealing some of the distinctions made by the type system when they are distractingrather than helpful.There is exactly one (desugared) context in which a coercion is applied: when an expression e ofsyntactic type T appears as an argument in an application which expects a value of type U; thismeans that there is a binding n: U~e. Since nearly all Cedar constructs are desugared to application,coercions are widely applicable. The only (desugared) context in which there is no coercion is forthe first operand of dot, since in that case the cluster of the operand is used to interpret the namewhich is the second operand. Thus in the expression e.n, it is always De, the syntactic type of e, thatis used to look up n, regardless of the fact that this expression may appear as an argument to aparameter of type U. If e is not a type or binding, however, then e.n desugars to P[e], whereP=LOOKUPC[De.Cluster, $n],and in the application of P, e does appear as an argument and can becoerced. Usually the cluster for T is set up with procs which take an argument of type T, so thedomain of P is De and no coercion happens. This isn't always true, though; a subrange T of INTinherits the arithmetic procs of INT, for example, and there is a coercion from T to INT when PLUSis applies.f2tGF{ q_rqxq3 ^W [r qxqxq Z{>xq X#UrVq rq St N Q2& PlKxq N xqxqxq xq% Md9uqxqxq J9AG.Hn}GqF]D!3B"C}Bq >rX ;Wq>uq 9uq@ 8O  5$Y 3O .wX *r 'q2+ &xq rqxq rq tq $C #F !D  Txq `xq/xq  xqxqxqA XC L P/xq zxqxq xq1 Hxqxqxq xqxq xquqzxqxqxqxqxq @xq,xq xqzxq1xqt 8qtq,xqxtqu q TVk(2.6CONVENIENCES29If TgU it is sometimes natural to think in terms of a coercion from T to U that is implemented bythe identity function. In fact, implication is stronger than that, since it propagates through manytype constructors, including PROC, when coercion does not. Implication is discussed in 2.4.2 and4.12.There is a rather general rule for finding coercions from the clusters of types, though it is not ofmuch practical importance in current Cedar, since there is no way for the user to define coercions.The rule goes like this. Each cluster may have a From item and a To item. T.From should consist ofpairs with type [tag: ATOM, proc: T_U], and T.To of pairs with type [tag: ATOM, proc: U_T]. Ignorethe tags for the moment. Consider the binding n: U~e, where De=T, and TgU is false. For eachproc P in T.From or U.To we try n: U~P[e]. If P: T_V is in T.From, it maps e to a value of type V, and we have to bind n: U~P[e]. IfVgU we are done; otherwise we can recurse on this sub-problem.If P: V_U is in U.To, we have to bind m: V~e. If TgV we are done; otherwise we canrecurse on this sub-problem.The whole process fails if no path of coercion procs takes us from T to U. The search can terminatewhen all paths have been explored, and a particular path can be abandoned when a type appearson it for the second time. Since the search is done statically (by the compiler), and since the resultsof an attempt to coerece T to U can be cached, the time required for the search is not a problem.There are two obvious difficulties with this scheme. First, it may transform erroneous applicationsinto legal ones, but coercing an argument is ways not intended by the programmer. Second, morethan one path of coercion procs may exist, and different paths may give different results. Thesecond difficulty can be avoided, and the first minimized, if every coercion proc P is chosen so thatit has a (partial) inverse, and P1[P{x]=x for all x in P.DOMAIN. This says that a coercion does notlose information, and that different paths give the same answer. Sometimes this is not feasible, e.g.for the narrowing coercion from INT to [0..5). The following rule gives the builder of clusters controlover proliferating coercions:If two procs on a coercion path have non-nil tags, they must have the same tag.In general, coercions that don't lose information can have NIL tags, and others should have differenttags. The coercions in current Cedar are described in 4.13. All have NIL tags, and none losesinformation except the subrange narrowing. Note that coercions extend componentwise to groupsand bindings.2.6.2 ExceptionsThe basic idea behind exceptions is to extend the value space, so that it includes not only ordinaryvalues, but also a set of exception values. An exception value has the special property that wheneverit appears in an application, it becomes the value of the application, so that it propagates upthrough the control stack of the program until it finally abecomes the value of the whole program.Of course this isn't always what is wanted, so there is a special HIDE construct which is not anordinary application, but takes its argument value, ordinary or exception, and bundles it in a variantrecord which is a normal value. Then ordinary code can be used to test for the exception and takeappropriate action. This construct is sugared to give distinctive ways of catching an exception: in thekernel with BUT (2.2.4), and in Cedar with ENABLE, EXITS and REPEAT (3.4.2). Cedar has twokinds of exception: GOTO labels and ERRORs, which must be raised and caught separately, and haveslightly different semantics. The main point of this treatment is that it does not require continuations or any other baroqueexplanation of how control is transferred to catch an exception. The view is that exceptions aresimply a convenience feature; the same job could be done by returning a slightly larger result fromeach proc, with an appropriate status code. f2t8 G?q _xzxq=xqxq ^WM \tq2 [O X$B VT U-xq xqxq S xqtqxqxzxqxqxqtqxqxzxq Rxq&xqxqxqzxqxqxzxq PxqxqxqxqxqxqxqxqxqN8xqxzxqxqxq xqxqxqxqxqxqLxzxq;J\xqxzxqxqxqxqxqxzxqH F@xqxq DE CxS AxqxqB >V =E> ;*0 :=Fxq 8x9F}8qxqxqxqxqxquq 75K 5rqtq 9 4-1-xqxq /}9tq' - *>tq )J R ' #rX qC rq - D E 7uq  Q =  ?rq |uq tqtqtq tq tq2 t IT 5 AF 'TVk(THE KERNEL LANGUAGE2.330An exception consists of a code and an optional argument value. The type of the code is ERROR T,where T is the type of the argument which does with it. GOTO labels always have empty arguments.The argument is a way of passing some information along in addition to the identity of theexception.A proper treatment of exceptions in the type system would require that each proc range include allthe exceptions that can emerge from an application of the proc. In fact, this is not required or evenpossible in current Cedar.Cedar also has signals, which historically were viewed as a kind of exception but now have a verydifferent interpretation, as a way of obtaining dynamic rather than static scoping for names. Theyare discussed in 3.4.3.1.2.6.3 FinalizationThis subject is discussed in 3.4.3.1.2.6.5 ConcurrencyThis subject is discussed in 4.10, where the Cedar facilities for writing concurrent programs aregiven. Writing good concurrent programs, or even correct ones, is another matter, which is beyondthe scope of this manual to more than hint at. Unfortunately, an adequate reference is lacking.2.7 MiscellaneousThe different kinds of allocation are discussed in 4.5. Static values are defined in 3.9.1.2.7.1 PragmasA pragma is a construct that does not change the meaning of the program, except perhaps to makesomething illegal which was legal without the pragma. Its purpose is to affect the implementation,generally by requesting optimization to favor one criterion over others. The pragmas in currentCedar are:INLINE, which causes a proc body to be expanded inline when it is applied. See 3.5.1 fordetails.PACKED, which causes array components that fit in 8 or fewer bits to be packed, and theexpense of more expensive code to access them.CHECKED, which forbids application of unsafe procs in a block, and adds runtime checkingfor some primitive procs which are otherwise unsafe (in particular, narrowing to a subrange,and assigning a proc).PRIVATE, which forbids access to items in an interface or instance except to modules whichEXPORT (or SHARE) it.MACHINE DEPENDENT, which allows positions of record fields and representation values forenumeration elements to be specified (strictly, it is the absence of MACHINE DEPENDENTthat is the pragma)f2tGF{ q _rqrqtqxq ^Wxqtq$ \C [O X$M Vb U Q rq2 Pm5$ N JrX Gq# CrX @qG ?; =\ 9rX 6cq\ 2drX /9qY -G ,1V *(Utq7&$ytq="' tq?J=tq4tqtqatqtq/ 1tqtYqTVk($2.8RELATIONS AMONG GROUPS, TYPES, DECLARATIONS AND BINDINGS312.8 Relations among groups, types, declarations and bindingsCedar has are four closely related basic ways of building product values from simple values (all aregiven precise meanings in 2.2.1 and 2.2.2):a group is simply an n-tuple of values (see 2.3.4);a X-type is the type of a group (if x: T and y: U then [x, y]: TXU) (see 2.4.5);a binding is an n-tuple of [name, value] pairs (see 2.3.5);a declaration is the type of a binding, an n-tuple of [name, type] pairs (see 2.4.5).Figure 21 illustrates the relations among these kinds of objects. In current Cedar most of theseobjects can be constructed and manipulated only as interfaces and instances. In the kernel and themodeller, all of them are first-class citizens. The primitives which go between them are defined in2.2.[a: Ta~ea, b: Tb~eb]:[a: Ta, b: Tb][a: TYPE~Ta, b: TYPE~Tb] bindingut"u%5t3?u5t=fq=FA=`F=.F=fpuqp&y=0F'=fr+=F-=fr/=mF1Y=fpq=0s ;qv;Ux;qv;Ux;qs!2v;Ux;uv;Ux1;qv;Ux;qv;Ux;q9s!s/?s5yX7 0s( -q? , yq9 * (-<&2$$vq"7yq v$yqvqvqvqvqvqvq vqvq1 )! vq vqvq :vqvqvqvq0 c ,TVk(SYNTAX AND SEMANTICS332Chapter 3. Syntax and semanticsThis chapter gives the concrete syntax for the current Cedar language, together with an informalexplanation of the meaning of each construct, and a precise desugaring of each construct into thekernel language defined in 2. The desugaring, together with the definitions of the kernelprimitives used in it, are the authority for the meaning; the informal explanation is just for yourreading pleasure. However, paragraphs beginning Anomaly or REstriction document properties ofCedar not captured in the desugaring. The primitive procs and types of Cedar are specified in 4. In addition to the grammar rules and desugaring, there are examples for each construct. These areintended to illustrate the constructs and do not form a meaningful program. The Cedar Manual haslonger examples which do something interesting, and also illustrate the use of the standard Cedarpackages.There are several summaries which may be useful as references:A two-page summary of all the syntax, desugaring and examples in this chapter(CLRMSumm.press). A one-page summary of the full syntax (CLRMFullGram.press).A shorter and less cluttered summary of the syntax for the safe language; it also omits anumber of constructs which are obsolete or intended only for efficiency hacking(CLRMSafeGram.press). The chapter begins with a description of the notation (3.1) The next sections deal systematicallywith the rules of the grammar, explaining peculiarities of the syntax and giving the semantics:3.2, rules 56-61: The lexical structure of programs. 3.3, rules 1-5: Modules.3.4, rules 6-10: Blocks, OPEN, ENABLE, EXITS.3.5, rules 11-13: Declarations and bindings.3.6, rules 14-18: Statements.3.7, rules 19-27: Expressions.3.8, rules 28-35: Conditional constructs: IF and SELECT.3.9 treats various miscellaneous topics. 4 deals with the syntax and semantics of types.The order of the grammar rules is: module, block, declaration, statement,expression, conditionaltype, name, literaland top-down within these. f2zG GN q_{j Y?q8$ W &0 V7< T 3& S/)yqy q Q^ N[ LU KxA I F9Dq%<&Bv q@'vq>==<,!-(;5vq 8 "> 6[4.71/~zqzqzq-&.*(v&,zqzq #\ #& TVk(3.1 NOTATION333.1 NotationThis section describes the notation used in the grammar, the desugaring, and the commentary ofthis chapter.3.1.1 Notation for the grammarThe grammar is written in a variant of BNF:Bold parentheses are for grouping: ( interface | implementation).Item | item means choose one.?item means zero or one occurrences of item.item; ... means zero or more occurrences of item separated by ";". The separator may also be ",",ELSE, IN, or OR, or it may be absent. If the separator is ";", a trailing ";" is optional. item; !.. is just like item; ... but there is at least one occurrence.A terminal is a punctuation character other than bold ()?|, or any character underlined, or a word inSMALL CAPS. Note that [] and {} are terminals, and do not denote optional occurrence and repetition as they do in manyother variants of BNF.The rules are numbered sequentially. Special symbols mark constructs with special properties:=unsafe;=obsolete;=machine-dependent;=efficiency hack.The grammar is written so that a non-terminal never expands to the empty string. When an elementof a rule is optional, that is always indicated explicitly by "?" or "..." .The following non-terminals are so basic to the language and so frequently used, that they arerepresented in the grammar by abbreviations:b=binding13d=declaration11e=expression19n=name56 (identifier)s=statement14t=type36I'm afraid this means that you must learn the meaning of these six abbreviations in order to readthe grammar.With the exception of these abbreviated non-terminals, each use of a non-terminal is cross-referenced with a small superscript number59, unless the non-terminal is defined in one of the nextfew rules. If a non-terminal (other than e, t or n) is used in more than one rule, then all the rulesthat use it are listed in a comment after its definition. Except for the entries in Table 31, a terminal symbol appears in only one rule. These duplicationsdo not lead to ambiguity. In most cases they are harmless, since the symbol has essentially the samemeaning in each case, and the rules are separate only for greater readability, to highlight an unusualuse of a construct, or for historical reasons. In some cases, however, the symbol has quite differentmeanings in different rules. These are marked on the left as followsThe rules marked with are obsolete and should be avoided.6In the rules marked with * the symbol has a different meaning than in the others, andconfusion is quite possible. The programmer should bear these cases in mind.0In the rules marked with * the symbol has a different meaning than in the others, but thecontext is sufficiently clear that confusion is unlikely.A superscriptxn indicates that the terminal is repeated n times in that rule. f2zG;G?q _sX \q4& [, W-yX Tq( R~Xsq sqsq Psq Ovsq, MsqY LnzqtzqzqL Jsqsq& If6sq 9I.w?xIf GzGqXzG&|z= F E qX" C8B@ ?= :a@ 8=sqsq 5C 4. ! 2sq2w 1&sq 1lw /sq /w .sq.dw.qX ,sq ,w +sq+\w )qL ( $P #_ #w#_q7 !%= W6 ,] C $? G <;.l}qBC.}qC 2 wx q)vq TVk(}SYNTAX AND SEMANTICS334SymbolsRulesExplanation0( )19, 25, *51.1, *54expr, subrange, *position in record or enumeration[ ]19, 25, 37, 43, 51constructor/built-in/funnyAppl, subrange, application, typeName, fields, mdFields0{ }2, 6, 8, 13, *54interface body, block, enable, machine code,*enumerationTC,2,3, 6, 7, 9, 17, 27, see note in 3.2.29,30,32, 34, 35, 43, 51, 52;6, 8,10, 17, 27.1, 30,see note in 3.2.33,350:1, 2, 3, 5, 7, 11, 13,introducing names with types, except *51.1=position, 18, 27, 33, 34, 51,7=open, 27=argBinding 34=withSelect*51.1, 53.19, 37dot notation for e is repeated for types0..25x4, *51.1subrange, *positiont0*21, *53infixOp, *tag+21, 58infixOp, exponent20, 21, 58prefixOp, infixOp, exponent=13, 22binding, infixOp=>6, 9, 17, 31, 33, 35, 52exits, enable, repeat, select choicesx4, unionTC6_14, 16, 18, 21, *55s, e_STATE, iterator, e, *defaultTC0~2, 3, 13, 20, *22, *27interface,implementation,b,argBinding,*unaryOp,*relOp~~7, 34open, withSelect6ANY*9, 40, 43*enable, variableTC, fields6CODE*13, 23*new exception, convert t to eENDCASE31, 52select endChoice, unionTC0ERROR*19, *24, 41.1*expression, *funnyAppl, transferTCIMPORTS2, 3 interface and implementationIN18, 22iterator, relOpLONG38x2, 45.1, 48cardinal/unspecified, pointer, descriptorNOT20, 22prefixOp, relOpNULL14, 27, 52, 55statement, argBinding, unionTC, defaultTCPACKED44, 45array, sequenceSELECT FROM29, 32, 34, 52selectx3, unionTCSHARES2, 3 interface and implementation0SIGNAL*24, 41.1*funnyAppl, transferTCTRASH27x2, 55x2argBinding, defaultTCTRUSTED6, 13block and machine code6USING1, *5directory, *locks0WITH*32, 34*safeSelect, withSelectTable 31: Terminal symbols appearing in more than one rule3.1.2 Notation for desugaringThe right-hand column is desugaring into the Cedar kernel language, or in a few cases intocomments describing the meaning in English. This is a purely textual transformation; i.e., it is doneon the text of the program, not on the values. The rewriting is done one rule at a time; a singlestep of rewriting involves elements from exactly one rule. The desugaring is specified by slightlyinformal but straightforward rewriting rules, in which:An occurrence of a non-terminal (written in bold) denotes the text produced by that non-terminal in the grammar rule.A | reflects a corresponding alternation in the grammar rule, ? reflects a correspondingoptional item in the grammar rule, and (bold parentheses) are for grouping as in a grammarrule. As in grammar rules, literal parentheses are underlined.f2zG GN q `3a_a `3 _ `3 _ $`3"_"(`3_ ^zy% ]F ] VF] F%]"sF [} q%2 ZW%*%X( W} q%,%V4 T%SrR P%OO M} q%5 L%&K, I%( Hj} qHwHjq%} G q  E% DG % B % A%%AwAq @$} q%zq >} q%5 =b% <} zq % :} zq% 9?zq% 7} zq %# 6}zq% 5zq% 3zq4w3q %) 2Zzq% 0 zq%, /zq% .7zqzq %.}w.7q ,zq% +u} zq% *zq*Zw*q*Zw%*q (zq% 'R} zq% %} zq% %ca$a %c $ %c $ %%c"s$"s& yX6  gq7 yq! _yqyq# ,2 W/s q9{ #sq:sq sqsq! 9 TVk(e3.1 NOTATION35Everything else is taken literally.An underlined non-terminal in the right column means that the desugaring specified for that non-terminal must be done in order to obtain a legal program. Otherwise the transformations can bedone in any order, yielding a legal program at each step.Every occurrence of e (expression) and t (type) in the desugaring is implicitly parenthesized, so thatthe desugared program parses as the rewriting rule indicates. To reduce clutter, these parenthesesare not written in the desugaring rules.For type options like PACKED, the desugaring of the construct in which they appear is a call on abuilt-in a type constructor which takes a corresponding BOOL argument defaulting to FALSE; if theattribute is present, the argument is supplied with the value TRUE.Examples: the following rule for subranges:subrange ::= ( typeName | ) (( [ e1 .. e2 ] | [ e1 .. e2 ) ) |(typeName | INT).MKSUBRANGE ( [e1, ( e2 | e2.PRED ) ] ) |( ( e1 .. e2 ] | ( e1 .. e2 ) ) ) [e1.SUCC, ( e2 | e2.PRED ) ] )generates these desugaringsIndex [ 10 .. 20 ]Index.MKSUBRANGE[10, 20]Index [ 10 .. 20 )Index.MKSUBRANGE[10, 20.PRED ]( 1 .. 100 )INT.MKSUBRANGE[1.SUCC, 100.PRED ]Names introduced in the desugaring are written with one or more trailing prime ("(") characters.Such names cannot be written in a Cedar program, and hence they are safe from name conflicts.The desugaring is constructed so that the ordinary scope rules prevent multiple uses of these namesfrom being confused.3.1.3 Notation for the commentaryEach section of the commentary begins with grammar rules, desugaring and examples for part ofthe language. It continues with text which explains the meaning of the constructs. Generally themeaning is fairly clear from the desugaring, and this text is short. For blocks and especially formodules, however, there are many non-obvious implications of the desugaring, and a number ofrestrictions; these constructs have a lot of explanatory text.Some kinds of information are put into specially marked paragraphs, which begin with one of thefollowing italicized words:Anomaly: the meaning of this Cedar construct is not explained by desugaring into thekernel, but by the special rule given here.Caution: here is an implication of the definition which might surprise you.Performance: facts about the time or space required by some construct.Representation: the values of a data type are represented in terms of other types like this.Restriction: a construct is not fully general, and will cause a static error unless theadditional conditions stated here are satisfied.Style: advice about good Cedar style.Symbols written in SANS-SERIF SMALL CAPITALS are in the kernel but not in current Cedar. Thesuperscript notation used to cross-reference non-terminals in the grammar is also used in theexamples, usually to point to a rule whose example introduces a name. f2zG;G?q_  ] /]K]. [yqN Z{5 WPsqsq> US TH% QzqD O0zqzq N5zq J+ GsXqsq sqzGs F;sqXEwF;qEwF;qEwF;qEwF;qFyF;sqs*sqsqzsqt qsqsEwF;qsqsEwF;qsqsEwF;qzqsqsqs DMq DydDMCwDMqCwDMqDyDMCwDMqCwDMqX *t q =F*t qzq ; *zqt qzqzq 8Luq 7Y 5/1 4  0 yX ,qV +]'6 )L (U4 & 1 #P ""yq2F$yqDy q;>y qNy q2b & yq  t~t~t~tq0 [ F < NTVk(SYNTAX AND SEMANTICS336 3.2 The lexical structure of programs56 name ::= letter (letter | digit)...-- But not one of the reserved words in Table 32.57 literal ::= num ?( ( D|d | B|b ) ?num ) |-- INT literal, decimal if radix omitted or D, octal if B. |digit (digit |A|B|C|D|E|F) ... ( H|h ) ?num |-- INT literal in hex; must start with digit. |?num . num ?exponent | -- REAL as a scaled decimal fraction; note no trailing dot. |num exponent |-- With an exponent, the decimal point may be omitted. |' extendedChar | digit !.. C |-- CHAR literal; the C form specifies the code in octal. |" extendedChar ... " ?L |[ ('extendedChar), ...] -- Rope.ROPE, TEXT, or STRING. |$ n-- ATOM literal.58 exponent ::= (E|e) ?(+ | ) num-- Optionally signed decimal exponent.59 num ::= digit !..60 extendedChar ::= space | \ extension | anyCharNot'"Or\ 61 extension ::= digit1 digit2 digit3 |-- The character with code digit1 digit2 digit3 B. | (n|N | r|R) | (t|T) | (b|B) | -- CR, '\015 | TAB, '\011 | BACKSPACE, '\010 | (f|F) | (l|L) | ' | " | \-- FORMFEED, '\014 | LINEFEED, '\012 | ' | " | \Examplesm, x1, x59y, longNameWithSeveralWords: INT;n: INT~1+12D+2B3+2000B-- = 1+12+1024+1024 +1H+0FFH;-- +1+255r1: REAL~0.1+.1+1.0E1-- = 0.1+0.1+0.1 +1E1;-- +0.1a1: ARRAY [0..3] OF CHAR~['x, '\N, '\', '\141];r2: ROPE~"Hello.\N...\NGoodbye\F";a2: ATOM~$NameInAnAtomLiteral;The main body of the grammar (rules 1-55) treats a program as a sequence of tokens. Rules 56-61give the syntax of most tokens. A token is:A literal57. More information about literals of type T can be found in 4, as part of thetreatment of type T.A name56, not one of the reserved words in Table 32. Note that case is significant innames.A reserved word, which is a string of uppercase letters that appears in the list of reservedwords in Table 32. A reserved word may not be used as a name, except in an ATOMliteral.A punctuation symbol: any printing character not a letter or digit, and not part of one ofthe two-character sequences below. The legal punctuation symbols in programs are: ! @ # $ ~ * + = | ( ) { } [ ] _ ^ ; : ' " , . < > / The following ASCII characters are not legal punctuation symbols (and must notappear in a program except in an extendedChar60): % & \ ? Note that Cedar uses a variant of ASCII which includes the characters _ (instead of the underbar ) and ^(instead of the circumflex ). Note also that the character written here is the ASCII minus character, code55B, and not any of the various dash or typographer's minus characters with other codes, which are not in thestandard ASCII set.f2zG GN q _sX& \w>szsXqsqsqs*q2 [,w>sXqsqZ[,squZ<[,sqDZ+[,sqZJ[,sqsqsqs*qzq5s YqsqsqYpYsqSYp:YsqYpYsqYpYsq_YpFYsqYpYsqsqsqYp Ysq Yp Ysqsqs*qzq(s X$q sqsq*zq4w>s VqX s*q7s UqsqsqTUs*qzq2s Sqsqsq|S`cSs*qsqsqvqzqzqzqs Rq*zq Pw>sXqsqPXPsqPXPsqsqsqsq*& O w>sXqs Mw>s Xqsq sq Lw>sXqKwwLqKwwLqKwwLqs*qsKwwLqsKwwLqsKwwLqs Jqsq#IJsqBI#eJsqmIJsqII RJsqsqsqPIoJsqIJsqsqsqI Jsq!I"Jsqsq*zqszGqXsqzqsq Hsq#HZyHsqHZHsqsqsqHZc`* <zq * ;X* 9zqzGqX 8Pzq sq 6zq 3Hyq 2'/0 w/q!vq$.Avq+,/w+q>*e( C& 9z%q"1(!)O; zq;'wqr azzCx)<Da#q(#z%z|B'>z jTVk(_3.2 THE LEXICAL STRUCTURE OF PROGRAMS37One of the following two-character symbols (used in the grammar rules indicated):~=not equal22<=less than or equal22~=greater than or equal22~>not greater than22=>chooses8,17,30,31,33,35,52..subrange constructor25,51.1~~bind by name6,34 f2zG*!G?q_N^W*X^w\q*]w[Oq* [wYq*ZwXGq*XwVq*W wU?q*UwSq* Tw2TVk(fSYNTAX AND SEMANTICS338ABSALLANDANYARRAYATOMBASEBEGINBOOLBOOLEANBROADCASTCARDINALCEDARCHARCHARACTERCHECKEDCODECOMPUTEDCONSCONTINUEDECREASINGDEFINITIONSDEPENDENTDESCRIPTORDIRECTORYDOELSEENABLEENDENDCASEENDLOOPENTRYERROREXITEXITSEXPORTSFINISHEDFIRSTFORFORKFRAMEFREEFROMGOGOTOIFIMPORTSININLINEINTINTEGERINTERNALISTYPEJOINLASTLENGTHLISTLOCKSLONGLOOPLOOPHOLEMACHINEMAXMINMODMONITORMONITOREDNARROWNEWNILNOTNOTIFYNULLOFOPENORORDEREDOVERLAIDPACKEDPAINTEDPOINTERPORTPREDPRIVATEPROCPROCEDUREPROCESSPROGRAMPUBLICREADONLYRECORDREFREJECTRELATIVEREPEATRESTARTRESUMERETRYRETURNRETURNSSAFESELECTSEQUENCESHARESSIGNALSIZESTARTSTATESTOPSTRINGSUCCTEXTTHENTHROUGHTOTRANSFERTRASHTRUSTEDTYPEUNCHECKEDUNCOUNTEDUNTILUSINGWAITWHILEWITHZONETable 32: Reserved words and predefined namesThe program is parsed into tokens by starting at the beginning and successively taking from thefront the longest sequence of characters which forms a token according to the rules above, after firstdiscarding any number of initial whitespace characters or comments. The whitespace characters are space, tab, and carriage return. A Tioga node boundary isalso treated as a whitespace character.A comment is one of:A sequence of characters beginning with --, not containing -- or a carriage return,and ending either with -- or with a carriage return.A Tioga node with the comment property.Note that whitespace and comments are not tokens, but may appear before or after any token; theyare token delimiters, and hence cannot appear in the middle of a token. Whitespace and commentsthus do not affect the meaning of the program except:When they delimit a token.Within a CHAR literal or a ROPE literal, where they are taken literally. Thus ' is equal to'\040, and "Iam --not--" is equal to "I\Nam --not--" and different from "I\Nam ".f2zG GN q `3a_a `3 _ `3 _ $`3"_"(`3_ ]z \F Z Y X# V Ua T R Q> O N| M K JY H G F6 D Ct B @ ?Q = < ;. 9 8l 7  5 4I 2 1]\FZYX#VUaTRQ>ON|MKJYHGF6DCtB@?Q=<;.98l7 54I21'G]'G\F'GZ'GY'GX#'GV'GUa'GT'GR'GQ>'GO'GN|'GM'GK'GJY'GH'GG'GF6'GD'GCt'GB'G@'G?Q'G='G<'G;.'G9'G8l'G7 'G5'G4I'G2'G15]5\F5Z5Y5X#5V5Ua5T5R5Q>5O5N|5M5K5JY5H5G5F65D5Ct5B5@5?Q5=5<5;.5958l 0a0&a 0 0& 0 0& %0"s0&"s,'yX) (q!; 'x#> % y qyq#**"#.$+1yq H kS 17zq zq2   /DTVk(3.2 THE LEXICAL STRUCTURE OF PROGRAMS39Both reserved words (Table 32) and most names with predefined meanings (Table 45) are madeup entirely of upper case letters. They should not be rebound by the program; in some but not allcases the compiler forbids their rebinding. All are at least three characters long except thefollowing:DO GO IF IN OF OR TO.A note on lists of items and their separators:Semi-colons are used to separate declarations, bindings and statements in a body10, and toseparate choices in a select statement29,32,34 or in an exits6, 17 or enable8,27.1. Commas are used to separate declarations in fields43,51 (i.e., in a proc domain or range, arecordTC or a unionTC), bindings in an application27 or an open7, choices in a selectexpression29,32,34 or in a unionTC52, expressions in a choice6,9,17,30,35,52, items in imports,exports or shares lists2,3. In general these sequences may be empty, and an extra separator at the end is harmless when thereis some kind of closing bracket, except when the sequence is bracketed with [].The braces which delimit a block6, interface body2, choices in an enable8, or MACHINE CODE body13may be replaced by BEGIN and END reserved words. BEGIN replaces "{" and END replaces "}". Ifone brace is replaced, its matching partner must also be replaced. The braces delimiting anenumTC54 may not be replaced by BEGIN and END.3.3 Modules 1 module ::= DIRECTORY (nd (: TYPE (nt | ) | ) l [ (nd : ( (TYPE nt | TYPE nd) | TYPE nd), ... ] IN ?(USING [ nu, ... ] ) ), ... ; LET (nd~RESTRICT[nd, [$nu, ... ] ] ), ... ( interface | implementation )IN ( interface | implementation ) 2 interface ::= nm, !.. : ?CEDAR DEFINITIONS LET r(~[ nm: INTERFACETYPE[[ nm, ...]] ] IN (imports | l=>r() IN?locks (imports | ) ?(SHARES ns, ...)-- SHARES allows access to PRIVATE names in ns.~ ?access12 { ?open7 (d | b); !.. } .LET REC nm~open [ ?(l(~locks, ) (d | b), ... ] IN MKINTTYPE[nm] 3 implementation ::= nm : ?CEDAR LET r(~REC [(ne: ne) , ... , FRAME: TYPE nm , nm: FRAME,CONTROL: PROGRAM] ?safety ( PROGRAM ?drType42 |IN (imports | l=>r() IN MONITOR ?drType42 ( | locks) )( | ( LET LOCK:MONITORLOCK IN LET l(~(l IN LOCK) IN | ))(imports | ) LET b(~NEWPROGINSTANCE[block].UNCONS IN?(EXPORTS ne, ...) [ (ne~BINDDFROM[ne, b(] ), ... , FRAME~MKINTTYPE[block], ?(SHARES ns, ...) nm~b( , CONTROL~b(.nm] where the block body is desugared:~ ?access12 block . [( | ( | l(~locks,)) (d | b), ... , nm: PROGRAM drType~{s; ...}]3.1imports ::= IMPORTS ( (niv : | ) nit ), ... --In 2, 3.l [( nit:nit), ...]=>r( IN LET (niv | nit)~ nit PLUS nit.BINDING4 safety ::= SAFE | UNSAFE --In 3, 41.5 locks ::= LOCKS e ?( USING nu: t)l ?( [nu : t] ) IN e f2zG*!G?q _O ^W_ \X [O Xzqzqzqzqzqzqzq U.Sty q0SwStqQR6w QqR6wQq R6wQqOyq,OwOqNNZwNq NZwNqL Lw LqLwLqLwLqK KRwK q H@ G0M DDKwDqDKwDqDKwDqzqzqDKw Bq zqzqzqzq @U ?y?w?yqzqzq :sX 7{w>sXqzqsq6x7{qsqzqsq6x7{qsqsqsqs*qTVk(%*`7{yqXw6{7{qwqwtqw6{7{qwqtqw6{7{wqwqtqw6{7{wqwqutGqX 5!wtq5{5qwqwqwqwq*uqw5{5quqw5{5qw5{5qwqwqwq 3wq wqw*uqwqwqwqw qw 2}>wXq1{12qwqwtqt q*uqzqw1{2qu qw1{2qwquGqXwqwqyqzwqu 0-wqwqwqwqwqwtq/{0-qw*qtqtq w/{0-q .?wq.}.?qwq.}.?qwqwqwqwq*uGqXw-{.?qwqwqzqwqwqwqwqwqwqtGuqXuqw-{.?q +}>w Xq*{+qwtqtG*uqXzquqw*{+qw*{+wqwquqtqw*{+q*)&w({)&quqtqtq '8wqwqtqwq'~}'8qw*uqwqwqyqzwquq2 %tGqXwq%}%qwqwqw*wqwqwquqtqt uGqXuqzqA`%|yA%yquqtqG%|yHQ%uqwqw $0qwqwq*uqzquqw:^#.=$0ququ "wtq"{"qw*qw"{"quqw"{"qzqwqwququqwH"t.K"q  wqwtq 1{ qw*qw 1{ qzqtqzqw 1{ q$ wq}q*wqwqwqwqzqwqwqwqwqwC{qtqwqwqwq }w qtqwqwqU{qwqwqU{qwqwqtG*yqXwqwU{qwU{wqwqzququqwU{qwqwU{wqwU{quqwU{qu }>wXqtqwqtqtG p}>wXqtqwqtq{pqw*yqwqw{pqwqwquqwTVk(MODULES3.340ExamplesDIRECTORY -- For BufferImpl below.Rope: TYPE USING [ROPE, Compare], -- There should always be a USING clause CIFS: TYPE USING [OpenFile,Error,Open,read],-- unless most of the interface is usedIO: TYPE IOStream,Buffer: TYPE;-- or it is exported.Buffer: DEFINITIONS ~Handle: TYPE~REF BufferObject;BufferObject: TYPE=Rope.ROPENew: PROC RETURNS[h: Handle];Get: PROC[h: Handle] RETURNS[BufferObject];Put: PROC[h: Handle, o: BufferObject];BufferImpl: MONITOR [f: CIFS.OpenFile] -- Implementations can have arguments.LOCKS Buffer.GetLock[h]^ -- LOCKS only in MONITOR, to specifyUSING h: Buffer.Handle-- a non-standard lock.IMPORTS Files: CIFS, IO, Rope-- Note the absence of semicolons.EXPORTS Buffer-- EXPORTS in PROGRAM or MONITOR.~ { -- module body -- } . -- Note the final dot.Modules serve a number of functions (which might perhaps better be disentangled, but are not):A file of text (BufferImpl.mesa), or its translation into object code (BufferImpl.bcd).The unit handled by the editor, named in DF files and models, and accepted by thecompiler, the binder, and the loader.A set of related structures (types, procedures, variables) which are freely accessible to eachother, hiding secrets or irrelevant information from other modules.A procedure which can accept interface types and bindings as arguments, and returnsinterface values as results.The first two uses are not relevant to the language definition, and are not discussed further here.The others are the subject of this section. There are two kinds of modules: interface modules (written with DEFINITIONS) and implementationmodules (written with PROGRAM or MONITOR). They have the same header (except that interfaceshave no EXPORTS list); it defines the parameters and results of the module viewed as a proc (3.3.1)and specifies the name nm of the module. The bodies (following the ~) are different. Table 33summarizes the structure of modules and their types; it omits a number of details which are givenin rules 1-3 and explained in the text.ExampleModuleModule typeResultResult typeDIRECTORY Rope, IO;Interface [Rope: TYPE Rope, IO: TYPE IO]InterfaceTYPE Match Match: DEFINITIONS~{...}module _[TYPE Match]DIRECTORY Match, Rope, IO;Implementation [Match: TYPE Match, ExportedMatch MatchImpl: PROGRAM module IO: TYPE IO, Rope: TYPE Rope, instance IMPORTS R: Rope, I: IO R: Rope, I: IO]_[Match] EXPORTS Match~{...}Table 33: Interface and implementation modulesf2tF{ q _r \tG*qXtGx qX [,tGqXtq *tq Ytqtq*) X$tq VtG*qX Sut q Qtqtq Pm tqt Nqtqtq Metq tq Ktq H tq*& G2tq*tGqXtq yEtq* D*tq*" Btq*tqtqtq A"* =W;xq(x q9G A75kC3=180  ,D +\) (17t q & tqtq4 %)tq%1 #x#{#q; ! *- 3% + + (+=vW+WCh+h r(X=vD |tqxOqX(xqtqxqxqtqxq=vDtqx qxqt q(zqtqxq tqxqxOq X(xqtqxq=vDx Iqxqtq(xqtqxqxqtqxO=vq XtqxqxOqXxq(xqxOqXxqzqxq Atqxq    ( =v D! ! r/ TVk(:3.3MODULES41The ensuing sub-sections deal in turn with:3.3.1: Modules as procedures, and the interface or instance values obtained by applyingthem.3.3.2: How modules are applied.3.3.3: Module parameters: the DIRECTORY and IMPORTS lists; USING clauses.3.3.4: Interface module bodies and interfaces.3.3.5: Implementation module bodies; the EXPORTS list.3.3.6: SHARES and access12.The meanings of the other parts of a module header are discussed elsewhere:CEDAR in 3.4.4.MONITOR and LOCKS in 4.10.3.3.1 Modules and instancesA module is a proc which takes two kinds of arguments:Interfaces, declared in the DIRECTORY list. These arguments are supplied by the model (oron the compiler's command line),Instances of interfaces, declared in the IMPORTS list. These arguments are also supplied bythe model (or in a config file passed to the binder, or implicitly by the loader).3.3.3 discusses the types of these arguments and how they are declared. In addition, animplementation may take PROGRAM arguments declared in the drType following PROGRAM orMONITOR. These are ordinary values; they are discussed in 3.3.2.1.When a module is applied to its arguments, the resulting value isFor an interface module, an interface.For an implementation module, a binding whose values are instances: one interface instance for each interface it exports;one for the program instance, also called a global frame;one for the program proc derived from the module body (3.3.2.1), calledCONTROL.This application cannot be written in the program, only in the model; it is described in 3.3.2.An interface (sometimes called an interface type) is a type, as the latter name suggests. This type is adeclaration (obtained from the declarations which constitute the module body), with an extendedcluster that includes all the bindings in the module body that don't use declared names (3.3.4). Inthe example, the Buffer interface (obtained by applying the Buffer module to the arguments declaredin its DIRECTORY) has declarations for New, Get, and Put, and its cluster includes values for Handleand BufferObject.An interface instance is a value whose type is an interface; such values are the results ofinstantiating implementation modules. In the example, BufferImpl returns (exports) an instance ofBuffer.A program instance or a global frame is a frame, as the latter name suggests, i.e., a binding obtainedfrom the bindings and declarations of an implementation (PROGRAM or MONITOR) module body,just like any proc frame (3.3.5). Normally code outside the module does not deal with the instancedirectly, but only with the exported interface instances. In the example, BufferImpl exports aprogram instance for the module and a CONTROL proc. f2t;G?q _(]rqrq[YWOtqtqtqT0R+tqPGtq P}PGq MKJtq Hltqtq DmrX ABq5>r qtq& t=fq;rqtq+9t|t8q 72+' 5 tqtq 4*tq< 0A.rq,OD*>5(- rqr q&  6$tq "@] rqr q8  H  &8  xqxq! tqxqxqxq%x qx q V rq7  x q! Nxq #rqr qB 5tqtq L 7 x q tq TVk(eMODULES3.342In most cases, there is:Exactly one application of each module, and hence exactly one interface or one instance. Only one module which exports an interface.Only one interface exported by a module. Only one argument of the proper type for each module parameter (3.3.3); hence it isredundant to write the arguments explicitly. When these conditions hold, there is a close correspondence among the following four objects: an interface module;the interface it returns (since its arguments need not be written explicitly); the implementation module which exports the interface;its instance (again, since its arguments need not be written explicitly).The distinctions made earlier in this section then seem needless; it is sufficient to simply considerthe interface and implementation modules, and identify them with the files which hold their text. Inmore complicated situations, however, it is necessary to know what is really going on.In the example at the start of this section, BufferImpl is an implementation module with sevenparameters:Four interface parameters, declared in the DIRECTORY: Rope, CIFS, IO and Buffer.Three instance parameters, declared in the IMPORTS: Files (of type CIFS), IO (of type IO),and Rope (of type Rope). Since the instance parameters are declared in an inner scope, theinstance Rope is the one visible in the module body; the interface Rope is visible only in theheader. The same is true for IO, but both the interface CIFS and the instance Files arevisible in the body.When BufferImpl is compiled, the four interface parameters must be supplied, in the form of(compiled) interface modules named Rope, CIFS, IO and Buffer. When BufferImpl is instantiated(normally by loading it), the three instance parameters must be supplied, i.e. there must be otherinstantiated implementation modules which export the Rope, CIFS, and IO interfaces. Normallythere will be one of each, and the entire program will consist of eight modules: the interface modules Rope, CIFS, IO and Buffer;implementation modules normally named RopeImpl, CIFSImpl, IOImpl and BufferImpl, eachexporting an instance of the corresponding interfaceThe instantiated BufferImpl exports an instance of Buffer, which can then be used as a parameter bysome other module.3.3.2 Applying modulesA module is not applied to all its arguments at once. Instead, the arguments are supplied in twostages:A module is applied to its interface (DIRECTORY) arguments by compiling it; the result is aBCD (represented by a .bcd file). The bcd is still a proc, with instance parameters. Like anyproc, a module can be applied to different arguments (i.e., different interfaces) to yielddifferent BCDs.A BCD is applied to its instance (IMPORT) arguments by loading (or binding) it; the result isa program instance, together with any interface instances exported by the module. Again,the BCD can be applied to different arguments (i.e., different interface instances) to yielddifferent instances. Indeed, because an instance may include variables, even two applicationsto the same arguments will yield different instances.These two stages are separated for several reasons:f2tF{ q _]Y[++X)V{KT$ R^QOON6LI K P I&; HR D x q' CT @'tqxqxqxqxq>tqxq xqxq xq= xqxqD;xq6xq: xqxqxq8 6<x qI 4 xqxqxqxqx q 34I 1 $xqxqxq 0,L-xqxqxqxq+| xqxqxqx q)+ ' x qxq* & "rX qF n tq,tqxq xq4Stq2tqtq#P*tqO>"3 3 VTVk(3.3.2 APPLYING MODULES43All the type-checking of a module can be (and is) done in the first stage, by the compiler.The only type error possible in the second stage is supplying an unsuitable argument.Compiling is much slower than loading, and a module needs to be recompiled only whenits interface arguments change, not when the interface values change. The latter are changesin the implementations of the interfaces, and are much more common.When there are multiple instances of the same module with the same interface parameters,they automatically get the same code.We've always done it that way.3.3.2.1 Initializing a program instanceThe statements in the body of an implementation module form the body of a proc called theprogram procedure. The function of this proc is to initialize an instance of the module. A programinstance PI may be uninitialized, because no code in the module is executed when the instance ismade. It is the job of the program proc PP( to initialize PI, perhaps using the PROGRAM argumentsif there are any. Until PP( has been called, PI is not in a good state. It would be better to supplythe PROGRAM arguments along with the imported instances, and call PP( as part of making PI, sothat PI is never accessible in its uninitialized state. But it isn't done that way; hence theprogrammer must ensure that PP( is called before any use is made of PI. To confuse things, PP( isnot an ordinary procedure but a PROGRAM, and it must be called using the START construct (see4.4.1). Note that in addition to the statements of the module body, PP( also contains the type-specific initialization code for any variables or non-static values in the instance; e.g., if x: INT_3, thevalue of x will not be 3 until after PP( has been called.There is some error detection associated with this kludge. If a proc in the instance is called beforethe instance has been initialized by START, a start trap occurs. At this point, if PP( takes noarguments it is called automatically, and the original call then proceeds normally; if PP( does takearguments, there is a Runtime.StartFault ERROR.Caution: If the module is a monitor, PP( runs without the monitor lock; if another process calls intothe module while PP( is running, it will not wait, but will run concurrently with PP(. This isunlikely to be right. It is unwise to rely on a start trap to initialize a monitor module; call PP(explicitly with START.Caution: If a variable in the instance is referenced before the instance has been initialized, no erroris detected, and the uninitialized value will be obtained. PP( can still be called to initialize theinstance, and may still be called automatically by a start trap.The program proc is bound to the name CONTROL in the result of an implementation module if itstype is PROGRAM[] RETURNS[] (otherwise the proc Runtime.ReportStartFault is bound to CONTROL).This allows the modeller (and binder) to get access to PP so as to control the order in whichmodules are started.3.3.3 Parameters to modules: DIRECTORY and IMPORTSThe interface parameters of a module are declared in the DIRECTORY. An interface I has type TYPEn, where n is any one of the names given before DEFINITIONS in the header of the interface modulethat produced I. The INTERFACETYPE primitive in the desugaring takes a list of names and returns atype which implies TYPE n for each n in the list. The reason for allowing several names is to aidconversion of an interface from one name to another; both names can continue in use for a while.The use of these names provides a clumsy check that the proper interface is supplied as anargument. DIRECTORY n: TYPE and DIRECTORY n are both short for DIRECTORY n: TYPE n.An interface is a type which can only be used: f2tG5G?q_F^WR[0Z{M XAVAU!R NrX KqR Jr qG Hxq!4 G xzqxqtq Exzqxq5 Dtq7xzqxq Bxq4" @ xzqxqxzq ?ytq!tq ==xzq xqtq :xqxzq 7R 6>"tqr qxzq 4@ xzq 36 xqtq 0 rqxzq" . xzq"xzq -'1xz +q tq (TrqI &9xzq& %L7 "! tq1 tqtqxq tq 3xq  rX|r| kq6tqxq t xqxq&t q& c xqu q@ tqxq xq= [ A S Stqxqtqtqxqtqxqtqxq (.TVk(MODULES3.344Before a dot (4.14), to obtain a value from its cluster, which simply consists of thebindings in the interface module body (3.3.4).In an IMPORTS list as the type of an instance parameter to a module. The USING clause in the DIRECTORY, if present, restricts the cluster of the interface to contain onlyitems with the names nu, ... Thus in the example, only ROPE and Compare are in the cluster of Ropein the BufferImpl module. This means that Rope.ROPE and Rope.Compare are legal, but Rope.n for anyother n will be an error. Note that USING affects only the cluster of the parameter; it does not affectthe clusters of any types or the bodies of any INLINE procs obtained from the interface. Thus withinRope, Compare might be bound byCompare: PROC[r1, r2: ROPE] RETURNS [BOOL]~INLINE {IF Length[r1]~=Length[r2] THEN ... }A call of Rope.Compare in BufferImpl is perfectly all right, even though Rope.Length in BufferImplwould be an error.In the example, CIFS, IO, and Rope are interfaces. They are the types of three IMPORTS parametersnamed Files, IO, and Rope (if the IMPORTS clause gives no name for the parameter, the name of theinterface is recycled). An actual argument for an IMPORT parameter must be an interface instance,i.e., a value whose type is an interface type. Such a value is obtained from one or more moduleswhich export the interface (3.3.5). An instance is a binding; in this binding the value of a namedeclared in the interface is provided by the exporter; the value of a name bound in the interface(e.g., x~3) is just the value that the interface binds to the name (in this case, the value 3). This rulehas two effects:The client can ignore the distinction between names bound and declared in the interface,since both appear in the instance binding and are referenced uniformly with dot notation.This means that the client is not affected, for example, when a proc is moved from anINLINE in the interface to an ordinary definition in an implementation.The client can often ignore the distinction between the interface and the instance, since allthe values in the interface are also in the instance, with the same names. This is themotivation for the shorthand which allows the name of an IMPORT parameter to default tothe name of the interface; the interface is no longer accessible, but I.x has the samemeaning (namely 3) whether I is the interface or the instance.Anomaly: Nmaes bound to inline procs in an interface do not appear in the interface binding, butonly in an instance. This somewhat duboius rule ensures that clients won't have to add to theirimports lists if a proc stops being an inline.Restriction: An interface module may not import more than one instance of a given interface I. If an implementationmodule P imports more than one instance of I, the principal instance of I is the one with no name in the IMPORTS list(which is therefore named I by default). If P imports only one instance of type I, then that instance is the principalinstance.Restriction: Often an interface module has no IMPORTS, because it only needs access to the static values (types andconstants) bound in its interface parameters, and does not need values for any names declared there (procs and interfacevariables). If an interface module does have IMPORTS, however, and there is more than one instance of any importedinterface around, then there is a restriction on the argument values. Suppose that Int1 imports Int2, and that a programmodule P imports Int1. Then Int1 may only import one instance of Int2, and if P also imports Int2, the principal instanceof Int2 in P must be the same as the value of Int2 imported by the Int1 imported by P. For example, withDIRECTORY Int2; Int1: DEFINITIONS IMPORTS Int2V: Int2 ...DIRECTORY Int1, Int2; P: PROGRAM IMPORTS Int1V: Int1, Int2V: Int2 ...we must have in P that Int1V.Int2V=Int2V.3.3.4 Interface module bodiesThe body of an interface module I is a collection of bindings (e.g., x: INT~3) and declarations (e.g.,y: VAR INT or P: PROC[a: INT] RETURNS [REAL]). Restriction: Only certain things may follow the ~ in one of the bindings13:f2tF{ q_M^W([tq8 Ytqtq* X#xW{X#qtqxqx V5qx qxqtqx qxq Txq tq> S-,tq( QxqxqyP%xOtqxqXxqtqtqtqtGNqXxqxqxqxqtG Mqx qx q%x qx Kq IA xqxqxq&tq Gxqxqxqtq, F9)tq DG C1rq; AH @)xq(9 >  B{t {t {t{t{t"{t {t {t l{t{t"{t{t {ty.tG{t{t tt{t{t{tyt{t{t{ttt{t{t{t{t{t{t  {t{t{t{t{ rX qxq#xqtq xqtqtqxqtqxqtqtqtq r q= } qTVk(`3.3.4 INTERFACE MODULE BODIES45If it is an expression, it must be static (3.9.1).If it is a block (providing the body of a proc), it must be INLINE (because there isn't anyplace to put the compiled code).It may not be CODE.The result of applying an interface module is an interface (3.3.2), which is a type I obtained byapplying the primitive MKINTTYPE to the d's and b's of the body. This type is simply the declarationobtained by collecting the declarations in the body, with a cluster which is extended to include allthe bindings of the body. However, MKINTTYPE omits any inline proc bindings from the type'scluster, instead leaving the proc declarations in I. It puts an extra item BINDING in I's cluster withthe inline procs in it. When an instance Inst of I is imported, the binding actually imported is InstPLUS I.BINDING. This slightly dubious arrangement ensures that clients don't have to change importslists if a proc stops being inline. This policy is not extended to other items, however, even thoughthey might change from being bound in the interface to being interface variables.The interface returned by Red, Blue, Green: DEFINITIONS~...has the types TYPE Red, TYPE Blue, and TYPE Green.The types and expressions in declarations and bindings may refer to other names in the bindings asusual, but they may not refer to names introduced in the declaration, except that:Any declared name may be usedin the body of an INLINE, or after a "_" in a defaultTC55 in the fields43 of a transferTC41 which is the type of adecl in the interface's body.A declared (opaque) type may be used anywhere.For example, if an interface containsI: DEFINITIONS~x: INT~3;y: VAR INT;T: TYPE[ANY]then the following may also appear in the interface:xx: INT~x+1;P: PROC RETURNS[INT]~INLINE {RETURN[x+y]};Q: PROC [INT_y];V: TYPE~RECORD[f: REF T, g: U]but the following are illegal:xy: INT~y+1;U: TYPE~INT_y;W: TYPE~ARRAY [0..y] OF INT;The values of the bindings can be accessed directly by dot notation in any scope in which theinterface value is accessible. Thus if the value of the previous interface module is bound to J, e.g.,because J: TYPE I appeared in the DIRECTORY, then J.x is equal to 3. The declarations cannot beaccessed directly (J.y is an error). The declarations in an interface module are not quite like ordinary declarations. They are of threekinds, depending on whether the type of a declaration is:A transfer type; this is just like a declaration of a transfer parameter to an ordinary proc,except that it is readonly. TYPE[ANY] or TYPE[e]; the type being declared is an opaque type or exported type, discussedin 4.3.4. The expression e must be static. TYPE[ANY] or TYPE[e] is not allowed in anordinary declaration; except in an interface, a type name must be bound to a type valuewhen it is introduced. f2tG1G?q_2]"tq[Y tq V|Rxq T uqD StY Quq/ Plxquqxq N&xqxq+x MduqxquqM K=" J\M G1 ExO qXt q D) tqxtqxqtqxq @*5 ?z rq;=";tq9 9F}9q 9F}9q9F}9q7|5$. 2% 1HxqXt q /xqtq .@xqtGq ,xqXtqtq +80 )xqXtqxq (0xqtqtqtqtqtqxqxq &xqtqtqxq %(xqtqtqxqtGxqXxqxq # " xqXtqxq  xqtqtqx qtqtqxqtGq 0* i'.xq xqtqxqtqxqxq* a xq 6$< 3Z/- ~tqtqtqxq!rqr q  xqtqtqtqxq v6 TVk(MODULES3.346VAR T, or READONLY T for any type T except TYPE; this is an interface variable; discussedin 3.3.4.1 below. In an ordinary declaration in a block you can't write VAR T, but mustwrite simply T; you can also write simply T here, but this is not recommendedAn interface instance II has the interface type I if for each item n: T in the interface, there is anitem n~v in the instance, and v has type T. This is the same rule which determines that a bindinghas the type of a declaration; e.g., that a proc argument has the domain type. In this respect there isnothing special about an interface.Note that a name can be declared PRIVATE in an interface, even though it must be declared PUBLICin the exporter (3.3.6). This can be useful if the name is used in a type constructor or inline procin the interface, but its value should not be accessible to the client.3.3.4.1 Interface variablesAn interface variable v gives clients of an interface direct access to a variable in a program module,namely the variable which is exported to v. This is the only kind of variable parameter in currentCedar.If you use the obsolete shorthand of T for VAR T in an interface variable declaration, you cannotdeclare a transfer type variable as an interface variable, since that already means passing the transfervalue.Caution: the variable which is exported to provide the value for an interface variable is notinitialized until its module is initialized (3.3.2.1). However, there is nothing to stop it from beingaccessed sooner.Performance: An interface variable can be read and (if not READONLY) set directly, which issignificantly faster than Get and Set procs. Of course, the implementor gives up some control. It isnot quite as fast as access to an ordinary variable, since there is an extra level of indirection whichcosts one or two extra instructions each time. There is also one pointer per interface variable permodule which refers to it. If you use a private interface variable and inline Get and Set procs, youpay nothing in performance, but retain the option of changing the proc definitions later.You can get direct access to all the variables of a module by using a POINTER TO FRAME type(4.5.3).3.3.5 Implementation module bodiesThe body of an implementation module Imp is simply a block. This block plays two roles. On theone hand, it is an ordinary block, the body of an almost ordinary proc PP( called the PROGRAMproc, which has parameters and results like any other. PP( is special in one way: it has a PROGRAMtype rather than a PROC type. When PP( is applied (using the special construct START; see 4.4.1),its declarations and bindings are evaluated, its statements are executed, and its results are returnedas with any proc. The only difference is that the values bound to the names introduced in the block(i.e., the frame of PP() are retained after the proc returns; in fact, forever (unless Runtime.Unnew isused to free the frame). Procs local to the block can access these values in the usual way, and valuesof exported names can also be accessed through interfaces, as explained below; see 3.3.2.1.As with any proc (3.5.1), the frame of PP( includes the parameters and results from Imp's drType42as well as the names introduced in the block's d's and b's. It also includes an additional item:Imp: PROGRAM T~PP(where Imp is the name of the module and T is its drType.The body of Imp has a second role: to supply values for the names declared in the interfacesexported by Imp. For each interface Ex which Imp exports, an interface value ExI of type Ex isconstructed. Each name n in ExI acquires a value as follows:f2tF{ q_tqxqtqxq xqtq rq ^WHtqxq \xqxq" Yxqxqxqxq X$xqxqxq xq7 V>& U Qtq2t PmqI NE JrX Gqxq1 F;#xq2 D A#xqtqxq1 @_ > ;Yrq4! 9 S 8Q 5&r q/tq 3 xqxq? 2^ 0 S / ;xqxq -V *g?tqtqtq ( $rX !qxq6 5Dxzqt q2xzqt -qtq xzq)tq Z %+6  xzq)x q [ [ n'xzqxq} q^yfxqXtqxqxz qxqxq xq,! 3xq xqxqxqxq  xqxq@TVk(3.3.5 IMPLEMENTATION MODULE BODIES47If n: T is in Ex and n: PUBLIC T~x is in the body of Imp, then n~x is in ExI. This is aslightly peculiar kind of binding; as in an ordinary binding, x must be coerceable to T(4.13). Note that n must have PUBLIC access (3.3.6) in the body.If n is declared in Ex and not bound in the body of Imp, then n~UNBOUND is in ExI.UNBOUND is a special value with the following properties:For a proc P, it causes a Runtime.UnboundProcedure signal on any application of P.For a variable v, it causes a Runtime.PointerFault error on any reference to v.For a type T, it causes no problem.If n~x in Ex, then n~x in ExI. Thus any names bound in the interface are bound the sameway in any interface value.Caution: A name can be exported to several interfaces without any warning, if it has a suitable type.This is unlikely to be what is wanted.The result of instantiating Imp is a binding with:One item for each exported interface Ex, namely Ex: Ex~ExI, where ExI is the interfacevalue constructed above. Here Ex is the name nd given to the interface in the DIRECTORY.One item CONTROL: PROGRAM[] RETURNS [], whose value is the program proc PP( if thathas no arguments, and otherwise Runtime.ReportStartFault.One item for the type of the module's global frame, namely FRAME~TYPE Imp. One item for Imp itself, namely Imp: FRAME. The value of this item is the programinstance, i.e., the frame of the module's body. This binding is accessible in a model, where it can be used to get access to the interface instances,the program proc, the global frame type, and the program instance. You can pass FRAME as an argument to a DIRECTORY parameter I: TYPE Imp; like an interface; Iprovides access to constants bound in the module, and allows you to declare an IMPORTS parameterwhose argument will be a program instance of the module. From I you can also obtain a first-classCedar type POINTER TO FRAME[I]; see 4.3.5. I's cluster includes a coercion from I to POINTER TOFRAME[I], and the proc COPYIMPLINST (applied by the funnyAppl NEW), which is the same as theproc of the same name in cluster of POINTER TO FRAME[I].You can import Imp into another module (by writing DIRECTORY Imp ... IMPORTS ImpInst: Imp ...),and obtain access to all the variables and procs of the program instance.3.3.6 PUBLIC, PRIVATE and SHARESCedar has a rather complicated mechanism for controlling access to names. Most uses of it are nowconsidered to be obsolete, with the following exceptions:Names to be exported must be declared PUBLIC.Names included in an interface for use in inline procs etc., but not intended for use byclients, should be declared PRIVATE.Access to a name is declared by writing PUBLIC or PRIVATE right after the colon in a declaration ofa name:x: PUBLIC TIn the Cedar syntax these colons occur in the declarations11 and bindings13 in bodies10, fields43,51,and interface modules2, and in the tag53 of a unionTC. You can set a default access for all thenames in a module2, 3 or record50 by writing PUBLIC or PRIVATE just before the { or RECORD; this isoverridden by an explicit PUBLIC or PRIVATE inside. By default, an interface is PUBLIC and animplementation is PRIVATE. f2tG- G?q_xqxqxqxqtqxqxq xqxqxqxq ^W6xqx\q xq tqZ{xqxqxqxquqxqXuq2V xq xqxqTxq xqxqR xqPlxqxqxqxqxqxq'N KrqF J9" GxqD!xqxqxqxqxqC2xq xB{C2qtq@tqtqtq%xzq?Vxq<