MesaToCedar.Doc

Moving from Mesa to Cedar

Jim Morris

Introduction

This section assumes you already know how to program in Mesa (or that you have a Mesa program to be converted), and is intended to explain the differences for programming in Cedar.

Let's suppose you want to run a simple program in Cedar. If an existing Mesa 5 or 6 program uses fairly vanilla stuff, it's easy to convert: the names of most interfaces and some procedures have changed, but the functionality is basically the same. The most obvious differences will be with strings and I/O. You should only need to know about two interfaces for these: Rope and IO, respectively. In general, the Cedar/Pilot community has dropped the use of "Defs" as a suffix for definition file names, and introduced the suffix "Impl" for implementation files; e.g. "InlineDefs" became "Inline". Here's what you need to do to your Mesa 5 or 6 program:

(a) Change all STRINGs to ROPEs (actually Rope.ROPE). Remove all allocations and deallocations of strings. Change all references to StringDefs routines to use Rope or IO routines. Rope provides procedures to parse and manipulate ropes. IO provides procedures to convert ROPEs to numbers and back as noted below. One can now put special characters in rope literals by using the escape character "\". ". . . \n . . ." inserts a carriage return (newline), ". . . \t . . ." a tab, ". . . \\ . . ." a backslash, and ". . . \123 . . ." the character whose octal code is 123. Note a ROPE is immutable, unlike a string. Appending a character creates a new ROPE.

(b) You should use INTs (32-bit integers) for all your numbers, unless you know that you don't need numbers that large and that you do need the efficiency. In those cases use INTEGER or NAT = [0..77777B]. Avoid using CARDINALs or LONG CARDINALs; their main use is in dealing with STRINGs. The compiler recognizes the abbreviation INT for LONG INTEGER, BOOL for BOOLEAN, CHAR for CHARACTER, and PROC for PROCEDURE.

(c) Change all references to IO packages of all kinds (streams, files, TTY) to use equivalent IO routines. IO is the only interface you should need to know about for I/O of almost any type of variable or constant (ROPE, INT, etc.) to almost any type of device (keyboard, display, files, temporary buffer etc.). IO contains:

(i) A set of CreateX routines for each kind of stream X (file, display, etc.)

(ii) A set of GetX routines for each type X (integers, ropes, etc.)

(iii) A PutF routine that can be used with any type (integers, ropes, etc.) via a set of inline procedures (int, rope, etc.) which are used to tag the type of the arguments. It also provides a format argument which may be used to get FORTRAN-style formatting of output. For example, the format "%g" prints almost anything in default free-format:

stream.PutF["The sum of %g and %g is %g.\n", int[x], int[y], int[x+y] ]

(iv) A PutFR routine that is identical to PutF except that it produces a rope as output instead of putting its result on a stream, and an RS routine that makes a rope look like a stream so that the GetX procedures can be used. Thus one can convert various types to and from ropes; e.g. the following code converts an integer to a rope and back:

r: ROPE ← PutFR[, int[i]];

j: INT ← GetInt[RS[r]];

(d) Make use of LISTs and SEQUENCEs instead of ARRAYs and DESCRIPTORs for ARRAYs. The interface List contains some useful routines.
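
As a rough sketch of steps (a), (b), and (d) taken together, a converted fragment might look like the following. It assumes that the Rope interface provides Concat and Length under those names (check the interface for the exact signatures), and the names Table, elems, max, and t are made up for illustration. The REF and NEW constructs are explained in the next part of this section.

r: Rope.ROPE ← "Total:";  -- no allocation or deallocation needed

s: Rope.ROPE ← Rope.Concat[r, "\t42\n"];  -- assumed routine; yields a new rope, r is unchanged

n: INT ← Rope.Length[s];  -- assumed routine; INT rather than CARDINAL

Table: TYPE = RECORD[elems: SEQUENCE max: NAT OF INT];  -- hypothetical type; replaces a DESCRIPTOR FOR ARRAY OF INT

t: REF Table ← NEW[Table[100]];  -- the length is chosen at allocation time

FOR i: NAT IN [0..t.max) DO t.elems[i] ← 0 ENDLOOP;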

The changes in the Cedar language from Mesa 6 [L1, L2, L3] are fairly easy to understand for simple programs:

(a) REFs provide automatic deallocation and easier allocation:

Node: TYPE = REF Rec;

Rec: TYPE = RECORD[first: INTEGER, rest: Node];

. . .

x: Node ← NEW[Rec ← [5, NIL]];

(b) Runtime types via REF ANY give looser binding:

TNode: TYPE = REF B1Rec;

Node: TYPE = REF B2Rec;

x: Node ← . . . ;

t: TNode ← . . .;

q: REF;  -- a REF to any record

. . .

q ← t; q ← x;  -- both of these are legal

t ← NARROW[q];  -- raises NarrowRefFault if q is not a TNode

q^ ← E is always illegal. You cannot update through a REF ANY.

. . .

The type can also be checked explicitly, either with a discriminating WITH statement:

WITH q SELECT FROM

m: TNode => {t ← m; q ← m.lson};

n: Node => {x ← n; q ← n.rest};

ENDCASE => ERROR;

or with ISTYPE and NARROW:

IF ISTYPE[q, TNode] THEN {t ← NARROW[q]; q ← t.lson}

ELSE IF ISTYPE[q, Node] THEN {x ← NARROW[q]; q ← x.rest}

ELSE ERROR;

REF ANY is preferred to the use of variant records.

(c) LISTs are built into the language:

Node: TYPE = LIST OF INT;

x: Node ← CONS[5, NIL];

y: Node ← LIST[5, 6];  -- same as CONS[5, CONS[6, NIL]]

i: INT ← y.first;  -- i is 5

z: Node ← y.rest;  -- z is CONS[6, NIL]

FOR l: Node ← y, l.rest UNTIL l=NIL DO . . . ENDLOOP;

(d) Ropes, Atoms, Sequences, and Ints are also built-in. See [L3].

(e) To protect yourself and the garbage collector from obscure errors you should program in the safe subset of the language. See [L4] for a complete description of the safety restrictions. To get a program into the safe subset, prefix each PROGRAM, MONITOR, and DEFINITIONS module with the word CEDAR. The compiler will then tell you when you are straying outside the safe subset. You can wave the compiler off any block by placing the word TRUSTED before it. If you call a procedure from an unsafe interface (i.e. one that doesn't start with CEDAR DEFINITIONS), the compiler will complain unless the call is in a TRUSTED block. As of the 3.2 release only a few basic interfaces (Rope, IO, etc.) are safe, so you should probably wait until a later release to convert programs that import many other interfaces; otherwise, you will have to plaster your code with TRUSTEDs that will be unnecessary later.
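
A module in the safe subset might be laid out as in the following sketch. Example, Touch, and OldDefs.Poke are hypothetical names; OldDefs stands for any interface that does not begin with CEDAR DEFINITIONS.

Example: CEDAR PROGRAM

IMPORTS OldDefs =

BEGIN

Touch: PROC [n: INT] = {

TRUSTED {OldDefs.Poke[n]};  -- hypothetical call into an unsafe interface; safety is not checked inside this block

};

END.

Keeping each TRUSTED block as small as this one should make the blocks easy to find and remove once the imported interface has been made safe.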

Restrictions of the safe language

The @ operator is not permitted. There are three general ways to cope with this restriction: specializing, copying, and indirecting. For example, suppose you have a program that says

W: ARRAY [0..100) OF Z;

P[@W];

FOR i IN [0..100) DO . . . Q[@W[i]] . . . ENDLOOP;

To eliminate the first @ by specializing we would make a copy of the procedure P that dealt with W directly, which is not very satisfactory. To eliminate the first @ by copying we would pass the array W in by value and back by result, which is also not very satisfactory. It is best to deal with the first @ by indirecting; just allocate W from collectable storage, writing

W: REF ARRAY [0..100) OF Z = NEW[ARRAY [0..100) OF Z];

P[W];

Eliminating the second @ by specialization is plausible if Q knows it is always dealing with array elements: pass a reference to W along with an index. Otherwise, deciding between copying and indirecting depends upon the size of a Z. If it is small, copy it, writing "W[i] ← Q[W[i]]". If it is big, create references to it and pass those, writing

W: ARRAY [0..100) OF REF Z;

P[W];

FOR i IN [0..100) DO . . . Q[W[i]] . . . ENDLOOP;
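
For comparison, the specialization alternative mentioned above might look like the following sketch. QAt is a hypothetical variant of Q that takes the array reference and an index instead of a pointer to an element, and W is assumed to be the collectable REF ARRAY declared earlier.

QAt: PROC [a: REF ARRAY [0..100) OF Z, i: [0..100)] = {. . . a[i] . . .};  -- hypothetical procedure

FOR i IN [0..100) DO . . . QAt[W, i] . . . ENDLOOP;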

The form of variant record discrimination that does not copy the value to a new location cannot be used. Suppose you have a variant-record data structure like

T: TYPE = REF TR;

TR: TYPE = RECORD[SELECT t:* FROM

name, string => [x: ROPE],

link => [i: INT, r: T],

ENDCASE];

and are accustomed to performing discriminations like

e: T;

WITH x: e^ SELECT FROM

name, string => S0[x];

link => {S1[x.i]; S2[@x]};

ENDCASE;

You should declare a set of REFs to bound variant types like

Name: TYPE = REF name TR;

String: TYPE = REF string TR;

Link: TYPE = REF link TR;

and rewrite the discrimination to be

WITH e SELECT FROM

x: Name => S0[x];

x: String => S0[x];

x: Link => {S1[x.i]; S2[x]};

ENDCASE;

The type of x is now a REF type, not a TR, so various other types need to be adjusted, and the @ in S2 is no longer needed. If S0 is a large block, you will probably need to introduce a procedure to avoid copying it.

Variant records cannot be overwritten. Similar techniques can be used for sanitizing a program that overwrites variant records. Assuming the declarations of T and TR from above, suppose you wanted to write

x: T ← NEW[TR ← [name["END"]]];

x^ ← [link[5, x]];

The specialization/copying technique is to simply update the thing that points at the record, writing "x ← NEW[TR ← [link[5, x]]]". However, if you don't know all the places that point at the record, you must introduce another level of indirection, writing

T: TYPE = REF REF TR;

x: T ← NEW[REF TR ← NEW[TR ← [name["END"]]]];

x^ ← NEW[TR ← [link[5, x]]];