TreeDoc.tioga, by Russ Atkinson and Ed Satterthwaite Last edited by Atkinson, April 3, 1980 2:37 PM Last edited by Satterthwaite, August 23, 1983 4:53 pm This document is a rough attempt to describe the nature of the tree structure that is used by the first four passes of the Cedar compiler. It assumes familiarity with Cedar. Pass 1 of the Cedar compiler perfoms the initial parsing of text into an uninterpreted tree. Syntax errors are handled by this pass. As the parsing proceeds, names and literals are turned into appropriate entries in various tables that the compiler keeps. The resulting tree consists of nodes that indicate the structure of the program and leaves that are indices into the tables. Pass 2 builds semantic table entries for the definitions of identifiers, builds body table entries for the procedure bodies and blocks, and performs other minor bookeeping functions. Pass 3 binds identifiers, computes and checks types, supplies (most) coercions, and collects assorted attributes. Pass3P is a post-processor for pass 3; it expands calls of inline procedures, using substitution strategies chosen according to attributes computed by pass 3. Pass 4 does constant folding, rechecks types (with more complete subrange information), lays out frames and records, and collects assorted attributes. Pass 5 generates the obect code. In order to discuss the tree, we need to show its representation in Cedar terms. The following definitions were taken from [Indigo]Bcd>Tree.mesa: -------------------------------------------------------------------------------- Link: TYPE = RECORD [ SELECT tag: * FROM subtree => [index: Tree.Index], -- index to the tree node table hash => [index: Symbols.Name], -- index to the symbol hash table symbol => [index: Symbols.ISEIndex], -- index to the semantic entry table literal => [info: Literals.LitIndex], -- index to the literal table ENDCASE]; Node: TYPE = MACHINE DEPENDENT RECORD [ free (0: 0..0): BOOL, -- reserved for allocator name (0: 1..8): Tree.NodeName, -- the kind of tree node attr1 (0: 9..9), attr2 (0: 10..10), attr3 (0: 11..11): BOOL, -- kind-dependent fields shared (0: 12..12): BOOL, -- TRUE iff node is shared nSons (0: 13..15): [0..maxNSons], -- number of sons info (1): Tree.Info, -- kind-dependent field son (2): ARRAY [1..1) OF Tree.Link]; -- the sons (a fake sequence) Info: TYPE = UNSPECIFIED; -- usually a source index or a type maxNSons: NAT = 7; -- max # of sons (except for lists) Index: TYPE = Table.Base RELATIVE POINTER[0..Table.Limit) TO Tree.Node; -------------------------------------------------------------------------------- The descriptive format we will use is similar to Cedar syntax for declaring variant records, in that the name will appear as a variant tag and will be followed by a number of named fields. This is only a convention. No such variant structure is actually used in the data structures representing the tree. Instead, the correspondence between the fields described below and the sons of a node is strictly positional. Where the attributes or information fields have special significance, their descriptions follow the descriptions of the sons. We will use the names of nodes to characterize node kinds; think of these as types in a somewhat looser sense than Cedar types. There are several places where unions of node kinds are used. The major divisions below define the common unions, namely type, stmt, expr, and range. For other unions, the notation X | Y means that either node kind X or node kind Y is acceptable. In most cases, a single kind of node has a fixed number of sons. Exceptions are noted in the descriptions of the individual nodes. As used below, an id is a Tree.Link where the tag is generally $hash between passes 1 and 2 and is generally $symbol after Pass3. A literal is a Tree.Link where the tag is $literal, and a node is a Tree.Link where the tag is $subtree. The notation sourceLink means the index in the source stream of the leftmost symbol of the construct. ******************************************************************************* The following nodes support general purpose constructs in the parse tree. ******************************************************************************* list => [ ANY* ] A list may have from 0 to almost any number of elements, but the representation of lists specially encodes most of the common cases. Usually, a list of 0 items is Tree.Null, a list of 1 item is the item itself, and other lists are represented by a list node, where the sons are the list items. Further, small lists have their lengths directly in the list node, while large lists do not (nSons = 0, and a special end marker is used). For these reasons, consider using TreeOps.ScanList and TreeOps.UpdateList to access lists of values. Lists are normally homogeneous. As used below, the node kind of a homogeneous list of nodes of node kind X will be written list[X]. item => [ tag: ANY, value: ANY ] info: sourceLink (for values that are stmt-like; not to be depended on otherwise) An item is a heterogenous pair of values, the first of which is often interpreted as a tag field, and the second as a value field. Details of items and lists of items will be given as the items are used. We will write item[tag: X, value: Y] to denote the node kind of an item that has a tag field of node kind X and a value field of node kind Y. none => [ ] There is one distinguished value of this node kind, Tree.Null, that serves as a NIL value and is often used to indicate an omitted option. ******************************************************************************* The following nodes support lambda bodies and decls in the parse tree. The attr2 and attr3 fields of nodes that have kind decl or are used in decl-like ways (such as the item nodes in an imports list) are used internally as mark bits for the tree traversals done by passes 3 and 4. ******************************************************************************* module => [ directory: list[ diritem ], imports: list[ item[tag: id, value: expr] ], exports: list[ id ], shares: list[ id ], locks: lambda, body: decl ] attr1: unused (formerly, resident) info: sourcelink This node is used as the root node for a compilation. There should be only one such node in the syntax tree. The directory, imports, and exports come from the DIRECTORY, IMPORTS and EXPORTS clauses respectively. For each item in an imports list, the tag field contains an id that is the formal parameter identifier of an imported module instance, and the value field indicates the type of that instance. These will generally be the same id; if an explicit instance name appeared in the source text, attr1 of the item node will be set. The decl node in the body field is the declaration of the interface, program or monitor. The type appearing in that decl includes the run-time formals of a program or monitor; the compile-time (DIRECTORY) and load-time (IMPORTS) formals appear in the module node as described above. Only a monitor has a non-null locks field This node is discarded after pass 3; the tree for each body is subsequently located through the body table. diritem => [ name: id, (from: string | type: id), using: list[ id ] ] A diritem comes in two flavors, distinguished by the type of the second field. In the old form (second field named from), that son is a string literal; the node gives a formal name for a module type and a file name in which the module is to be found. This form is obsolescent (e.g., the modeller ignores any such file name). In the newer form (second field named type), the type field contains the identifier of the type of the interface; if it is empty, the formal name (in the name field) is assumed to be the type name. In either case, the using field is a restriction on the list of identifiers in the named module. If the list is empty, there is no restriction; i.e., all names in the binding provided by the module are visible. lambda => [ formals: list[ decl ], lock: expr ] attr1: implicitMonitorLock attr2: public This node encodes a nameless procedure that is to be invoked to compute the address of a monitor lock as specified in a LOCKS clause. The number of formals is restricted to 1 (LOCKS ... USING formal) or 0 (otherwise). If there is no LOCKS clause for a module that is a MONITOR, attr1 is set to indicate an implicit declaration of LOCK: MONITORLOCK in the module, and attr2 is the public attribute for that lock. body => [ opens: list[ item[tag: id, value: expr] ], decls: list[ decl ], stmts: stmt, locks: expr ] attr1: checked attr2: trusted attr3: inline info: bodyTableIndex (after pass 2) This node represents the body of a program or procedure. Each item in the opens list has an identifier (possibly anonymous, represented by Symbols.nullName) as the tag field, and an expression in the value field indicating the module or expression to be opened. The locks field is set by pass 3 for entry procedures; it is a copy of the locks clause for the module with any formals bound to the actuals appropriate for the particular body. It is used to generate the code for locking and unlocking the monitor upon entry and exit. decl => [ names: list[ id | item[ key: id, value: item[ position: expr, bits: range ] ] ], type: type, init: list[ expr | void ] | signalinit | body | inline | entry | internal ] attr1: immutable ("= initialization") attr2: public info: sourcelink A decl node represents a declaration. The names field must be a list of ids except in a machine dependent record declaration, where it can be a list of items in which the key is the name being declared and the value is a description of its position in the record. In a record declaration (including the decl of an argument or return record), the init field specifies the default value for the corresponding field (and would more properly be encoded as part of the cluster of the type of that field); see the description of typedecl for the interpretation of default exprs and void. Otherwise (e.g., in a body or block), the init field specifies initialization. If init = Tree.Null, there is no explicit initialization. If the init field contains an expr that is a compile-time constant and attr1 is set, the (folded) value of init is attached to the symbol table and the declaration is deleted by pass 4. Otherwise, the initialization is converted to assign[id1, ... assignx[idn, expr] ... ] by that pass. In addition to ordinary expressions, several special forms are allowed as initial values. A signalinit node is used to generate signals codes. A body node is allowed for a procedure or program type, in which case entry and internal nodes may appear above the body node (but only between passes 1 and 2, see below). An inline node is used for initialization of a procedure to a MACHINE CODE body. typedecl => [ name: list[ id ], type: type, default: list[expr | void] ] attr1: immutable (always TRUE) attr2: public info: sourcelink This node represents the declaration of a type. The value of the type identifiers is given by the type field, and any explicit default initial value for the type is given by default, as follows: default~Tree.Null T default~e T_e default~$list[] T_ default~$void[] T_TRASH default~$list[ e, $void[] ] T_e | TRASH void => [ ] This node is used to indicate empty initialization to an unpredictable value (TRASH). signalinit => [ ] This node represents the generator of a unique CODE value for a signal or error. inline => [bytes: list[list[expr]]] This node is used for a MACHINE CODE block. Each expression repesents a machine code byte; each sublist indicates the grouping of bytes into machine instructions. entry, internal => [body: body] These nodes encode the ENTRY and INTERNAL attributes of bodies. They are removed by pass 2, which moves the attributes into the corresponding body table entries. ******************************************************************************* The following nodes are used for types, and may appear wherever type appears as a kind of node. An id (including a predeclared id such as INT or BOOL) may also appear as a type. In certain constructs (such as NARROW), a type is optional; the corresponding type field may then be Tree.Null. After pass 2, the info field of each type constructor node (one with a name ending in -TC, except discrimTC or frameTC) contains the CSEIndex of the symbol table SERecord constructed to represent the type. Note that these SERecords are not linked together until pass 3 (by the first subpass of the declaration/type processing in Pass3D) and may have incomplete attribute fields until the end of pass 4. Each dot node used as a type expression is converted to a cdot or discrimTC node by pass 3 according to the type of the left operand. At the same time, the info field of the converted node is set to the SEIndex of the selected type. Forward reference problems must be handled especially gingerly while processing the selection of a discriminated variant record (see Pass3V). ******************************************************************************* basicTC => [code: expr] attr1: ordered, info: CSEIndex This node does not currently appear, but it is available so that collections of predeclared symbols, including the basic types, such as UnderType[INTEGER], can be generated from source text if desired. enumeratedTC => [elements: list[id | item[ tag: id, value: expr ]]] attr1: public -- for the ids attr2: machineDependent info: CSEIndex This node represents an enumeration type, such as {red, blue, yellow, green}. For a MACHINE DEPENDENT enumeration, the elements field may contain a list of items, where the keys declare the literals of the enumeration and the values specify the representations. implicitTC => [ ] info: CSEIndex This node represents a type designated by * in a variant record type constructor; it generates an enumeration type in which the literals have the same names as the tags of the variants. subrangeTC => [ type: ( type | none ), range: range ] info: CSEIndex This node represents a subrange type constructor, such as CHAR ['a..'z]. If no explicit type appears, as in [0..6], the type is a link to the predeclared symbol INT. recordTC, monitoredTC => [ elements: list[decl] ] attr1: machineDependent attr2: variant -- has an embedded union or sequence field attr3: painted info: CSEIndex This node represents a record type constructor. For a variant record, a unionTC or sequenceTC node occurs as the type of the last element in the list of decls. A monitoredTC node is generated for a MONITORED RECORD; it is just like a recordTC node except it implicitly declares LOCK: MONITORLOCK as the first field of the record (and that decl appears in both the tree and symbol table structures). The painted attribute is set unless the record type is one generated implicitly to describe a LIST node (see below); however, it is ignored in a program module unless attr1 or attr2 is set (this is a kludge to make module replacement practical). After pass 3, attr1 remains set only if the record type was specified to be MACHINE DEPENDENT and postions of the fields were specified explicitly. unionTC => [ tag: decl, variants: list[decl] ] attr1: machineDependent -- inherited from enclosing record or variant attr2: overlaid info: CSEIndex This node represents a union type constructor, which currently is restricted to occur as the type in the last field in a record type or (recursively) one of its variants. The tag field is a declaration for the discriminating tag; if the tag is COMPUTED, the id in that declaration is Symbols.nullName. The variants field is a list of alternatives with an entry for each variant. Such an entry has the form of a type declaration, typedecl[ list[id], variantTC[...] ], where the id(s) name the particular variant(s) and the variantTC is similar to a recordTC but describes only the fields in the particular variant (see below). After pass 3, attr1 remains set only if the enclosing record type was specified to be MACHINE DEPENDENT and postions of the fields were specified explicitly. variantTC => [ decls: list[decl | type] ] attr1: machineDependent -- inherited from enclosing union attr2: variant -- has an embedded union or sequence field attr3: painted info: CSEIndex This node represents the constructor of (the variant part of) a discriminated variant record type. It is similar to a recordTC except the decls describe only the fields in the particular variant; the common fields and tag do not appear. Pass 2 constructs a symbol table entry for a variantTC that is an SERecord.cons.record.linked with a linkType field that points to the enclosing record type. It is through this link that the common fields can be found. In the other direction, the enclosing record type has a final field that is a union, from which all the variants can be found. Note that the compiler assumes that it can find such a union field; use care in copying symbol table entries for variant record types. See recordTC for discussion of the attributes. sequenceTC => [ tag: decl, range: type ] attr1: machineDependent attr2: overlaid -- not meaningful attr3: packed info: CSEIndex This node represents a sequence type constructor, as in SEQUENCE length: NAT OF CHAR. Currently such a constructor may occur only as the type of the last field of a record (or variant thereof). See unionTC for a discussion of the tag field. Note that the type of the tag field implies the allowable domains of corresponding sequence values but should not itself be considered the domain type.. After pass 3, attr1 remains set only if the enclosing record type was specified to be MACHINE DEPENDENT and postions of the fields were specified explicitly. refTC, pointerTC => [ referent: type ] attr1: ordered attr2: basing attr3: readOnly info: CSEIndex These nodes represent pointer or ref type constructors, as in POINTER TO INTEGER. A refTC indicates a pointer into counted storage; attr1 and attr2 are always FALSE. Note that LONG POINTER TO T and REF T are represented by longTC[pointerTC[T]] and longTC[refTC[T]] respectively; i.e., a refTC always occurs as the operand of a longTC. (This was probably an unfortunate choice in the treatment of LONG.) varTC => [ referent: type ] attr1: ordered -- always FALSE attr2: basing -- always FALSE attr3: readOnly info: CSEIndex This node represents a VAR or READONLY type constructor. Such types are represented by pointers that are unconditionally dereferenced on each mention. They are currently legal only in interface modules and are always short (MDS relative) addresses of variables in global frames. Much of the machinery to support VAR parameters has been added to the Cedar compiler but is disabled. listTC => [ referent: type ] attr1: ordered attr2: basing attr3: readOnly info: CSEIndex The operator listTC is just like refTC except the referent of the type that it constructs is a LIST node. The listTC and corresponding recordTC nodes require special processing in pass 2, since they define a recursive type without introducing any type identifiers. See linkTC below. linkTC => [ ] info: CSEIndex This node represents the type of the rest field in a LIST node. The compiler deals with the recursion mentioned above by binding this node to the nearest enclosing listTC node. longTC => [ subtype: type ] info: CSEIndex This node represents application of the operator LONG to a type. It may be used to make long numeric (e.g., LONG CARDINAL), pointer, or descriptor types; note that is occurs implicitly for each REF type. arrayTC => [ domain: ( type | none ), range: type ] attr3: packed info: CSEIndex This node represents an array type constructor, as in ARRAY CHAR OF NAT or ARRAY OF CARDINAL. The domain is the index type and the range is the type of the elements. If the index type is omitted, domain = Tree.Null. This latter form was intended only for making array descriptors, but is is not prohibited elsewhere and the implied index type is CARDINAL. arraydescTC => [ referent: type ] attr3: readOnly info: CSEIndex This node represents a descriptor type constructor, as in DESCRIPTOR FOR ARRAY OF CARDINAL. The referent type must evaluate to an array type. zoneTC => [ ] attr1: uncounted info: CSEIndex This node represents the zone type constructor. Note that there is no constructor for the type of a zone dealing in short pointers; MDSZone is available only as a predeclared symbol. relativeTC => [ base: id, offset: type ] info: CSEIndex This node represents the constructor of a relative pointer type , e.g, Base RELATIVE POINTER TO T. The base type must be designated by an identifier (for syntactic reasons) and must evaluate to a pointer type. The offset type must be a pointer or array descriptor type; its referent type becomes the type of the relocated and dereferenced relative pointer, e.g, T in this example. procTC, processTC, signalTC, errorTC, portTC, programTC => [ domain: list[decl], range: list[decl] ] attr3: safe info: CSEIndex These nodes are used for the obvious constructs. All of them represent transfer constructs that behave more or less like procedures but have different implementations of the transfer operations. For procTC, signalTC and errorTC, the domain and/or range may be anyTC[], corresponding to PROC ANY, etc. definitionTC => [] info: CSEIndex This node is used to construct the type of the module identifier in an interface module. Its use parallels that of programTC in a program module. anyTC => [] info: CSEIndex This node represents the type constructor ANY, which may appear in REF ANY or as the domain or range of a proc or signal type. opaqueTC => [ size: expr ] info: CSEIndex This node represents an opaque type. If size = Tree.Null. the SIZE attribute of that type is not specified. paintTC => [ ? ] info: CSEIndex This node will be used to apply paint to an existing type, an operation that is not yet implemented. dot => [ left: type, selector: id ] This node results from T.N or (obsolescent) N T appearing as a type expression. It is converted to cdot or discrimTC by pass 3. apply => [ operator: type, args: expr ] This node results from T[x] appearing as a type expression. It either is converted to discrimTC by pass 3 (obsolescent) or represents the result of applying a sequence-containing record type to a value that fixes the number of elements in the sequence, e.g., TEXT[n+3]. cdot => [ interface: type, selector: id ] info: Symbols.Type This node represents the selection of a type constant from an interface type or instance. The info field is set by pass 3; it is not necessarily a CSEIndex. discrimTC => [ left: type, selector: id ] info: Symbols.Type This node designates a discriminated record type, e. g., T.V (or V T or T[V]). The info field is set by pass 3; it is a Symbols.ISEIndex for the identifier introduced by the type declaration for the particular variant with id matching selector (see above). frameTC => [name: id] This node results from FRAME [ Program ]. It abbreviates a pointerTC with a referent type that is the record type of the global frames for the corresponding program module. ******************************************************************************* The following nodes are used for statements and may appear wherever stmt appears. The info field of each node is a sourceLink. Each pass of the compiler stores the sourceLink into a global variable, where it is available to reporting errors, etc., before processing a node, and restores the old value of that variable on exit. A list of statments is also an acceptable alternative for stmt. Such a list normally arises from a "compound statement," i.e., a block without local declarations. ******************************************************************************* assign => [ lhs: expr, rhs: expr ] attr1: initialization attr2: counted attr3: composite info: sourceIndex This node encodes an ordinary assignment statement, e.g., x _ y. If attr1 is set (after pass 4 or in an inline from an included module), the assignment was created by rewriting an initializing declaration. extract => [ lhs: exlist[list[expr]], rhs: expr ] info: sourceIndex This node encodes an assignment statement in which the lhs is an extractor. Extractors are treated specially because the type of the rhs establishes the type of the lhs (normally, the type of the lhs is the target type for the rhs and determines the coercions to be applied to the rhs). The lhs is converted by pass 3 to a list of assignments with empty rhs fields, i.e., the ith list element becomes assign[left~expri, right~Tree.Null]. Pass 3 sets the attribute bits in these nodes as described under assign. It also inserts an exlist node above the lhs list and stores the type of the rhs in the info field of that node. apply => [ operator: expr, operand: list[ expr | item[tag: id, value: expr] ], (catch: catch | )] attr1: argsBuilt info: sourceIndex This node is generated by the parser to encode any construct with the syntactic form of an application, op[args]. In a statement context, it is converted to one of the nodes below by pass 3 according to the type of the operator value. If the catch is Tree.Null, the third son of this node (and any node to which it is converted) is usually omitted. The second alternative for args encodes a keyword constructor. Pass 3 normalizes every argument list and converts it to positional notation by sorting the keyword items and supplying default values as needed. If attr1 is set, the operator is being applied to an already built argument record, not one constructed as part of the application, using the syntactic form APPLY [ operator, operands ]. call, portcall => [ operator: expr, args: list[ expr ] (, catch: catch | )] attr1: argsBuilt info: sourceIndex These nodes result from op[args], where the type of op is a proc or port type respectively. Note that args has been converted to positional form. subst => [ operator: expr, body: stmt ] info: sourceIndex This node results from the expansion of an inline procedure; it replaces the corresponding call node. signal, error, xerror, start, join => [ appl: expr ] -- before pass 3 info: sourceIndex These nodes result from SIGNAL expr, ERROR expr, RETURN WITH ERROR expr, START expr and JOIN expr respectively. In each case, the expr is expected to be an application, the operator of which has a type appropriate for the operation implied by the node name. In pass 3, these node names are pushed down to the appl itself and the node is replaced in the type by its son (see below). signal, error, xerror, start, join => [ -- after pass 3 operator: expr, args: list[ expr ] (, catch: catch | )] attr1: argsBuilt attr1: unlockMonitor -- for xerror only, after pass 3 attr2: localVariblesOnly -- for xerror only; see return info: sourceIndex These nodes result from pushing the node name down to an apply node, as described above. In addition, an apply node may be converted directly to one of these nodes according to the type of its operator. Note that args has been converted to positional form. syserror => [ ] info: sourceIndex This node results from ERROR appearing as a statement. block => [ decls: list[decl], stmts: stmt ] attr1: checked attr2: trusted info: bodyTableIndex This node results from a block. Normally, only a block with embedded local declarations generates this form; most blocks without declarations are represented by lists of statements. A block with labeled EXIT clauses is encoded as label[body~block[decls, stmts], exits~[...]], where stmts does not include the exits clauses. A block with OPEN or ENABLE is treated similarly, making the rest of the block the son of an open node or enable node respectively. An unfortunate consequence is the awkward nesting of scopes for these constructs (the locals of the block have the smallest scope). if => [ cond: expr, then: stmt, else: ( stmt | none )] info: sourceIndex This node results from a conditional statement, e.g., IF p THEN stmt ELSE otherStmt. If the ELSE part is omitted, else = Tree.Null. case => [ value: expr, cases: list[ item[tag: list[expr], value: stmt] ], else: ( stmt | none ) ] attr1: relETests attr2: constantTests info: sourceIndex This node encodes the SELECT ... FROM form of case statement, e.g., SELECT v FROM cases ENDCASE => else. Each item in the cases list has a list[expr] as its tag component and a stmt as its value component. The case test is the OR of these expressions, each of which is a relation with a left operand of Tree.Null (an = is supplied if a relational operator does not appear explicitly). Each pass of the compiler places the value tree in a global variable; the expression processors use that variable when they encounter Tree.Null as an operand. If there is no statement following ENDCASE, the else field is Tree.Null. bind => [ binding: item[tag: id, value: expr], selector: ( expr | none ), cases: list[ item[tag: ( decl | list[id] ), value: stmt] ], else: ( stmt | none ) ] attr1: indirect attr2: rtTest attr3: copy info: sourceIndex This node encodes the WITH ... SELECT form of case statement. There are two forms, distinguished by the kind of the tag fields in the cases list and also encoded, after pass 3, in attr3. If those tags are decls (new form; REF ANY or variant record discrimination), the discrimination tests the type of the value denoted by the value field of the binding. In this case, the tag in the binding must be Symbols.nullName, and selector must be Tree.Null. Execution of this construct assigns a copy (attr3 set) of the bound value to that variable declared in the first element of the cases list for which such an assignment is type correct. It then executes the corresponding stmt, in which the discriminated value is referenced through that variable. If those tags are lists of ids, this node encodes the old form of discrimination (unsafe and obsolescent). The binding opens the value to be discriminated; if the tag field of the binding is not Symbols.nullName, it introduces a name that can be used to reference that value in the selector, cases (with discriminated type) and else fields (see the discussion of open). Execution of this construct tests the tag of the opened value against the ids appearing in the cases list. If the value is a COMPUTED variant record, the expr in the selector field is used to compute a tag; otherwise, the selector field must be Tree.Null. References to the discriminated value within the cases are replaced by cast[binding.value]; thus the tree structure is reevaluated for each reference (open "by name"). In other respects, these nodes are similar to the case node shown above. do => [ for: ( upthru| downthru | forseq | none ), while: ( expr | none ), opens: ( list[ item[tag: id, value: expr] ] | none ), body: stmt, exits: ( list[ item[tag: list[id], stmt: stmt] ] | none ), finished: ( stmt | none )] info: sourceIndex This node encodes a DO statement. If there is no FOR clause, for = Tree.Null; otherwise, it encodes that FOR clause as described below. If there is no loop predicate, while = Tree.Null; otherwise, it encodes a WHILE test (an UNTIL test is transformed to WHILE NOT test). The opens list is interpreted as for an open node, and the exits list is interpreted as for a label node. The finished statement results from using FINISHED. return, resume => [ values: list[ expr | item[tag: id, value: expr] ] ] attr1: entryProc attr2: localVariablesOnly attr3: resultsBuilt -- Resume in Pass3M uses attr1, a bug ? info: sourceIndex These nodes result from RETURN or RESUME. A RETURN from an entry proc is marked to indicate that unlocking the monitor is part of the return operation. In general, the return record must be constructed before the lock is released; since this is (was?) awkward for the architecture, attr2 is set to indicate that any variables mentioned in the return list are local to the proc and need not be so protected. Normally, a RETURN statement builds the result record, as in RETURN[x]. If the return record is already constructed, as in RETURN P[x], attr3 is set. result => [ values: list[ expr | item[tag: id, value: expr] ] ] info: sourceIndex Inline expansion converts each return in the body of an expanded proc to result. This node appears only after pass 3 and is eliminated by pass 4 whenever the expanded body is sufficiently simple. reject => [ ] info: sourceIndex This node encodes an explicit REJECT statement. label => [ body: stmt, exits: list[ item[tag: list[id], value: stmt] ] ] info: sourceIndex This node results from the use of EXITS within a block. Each item in the exits list has a tag field which is a list of identifiers, and a value field which is a statement. The attr1 field of each item is used internally to detect whether at least one of the ids in the tag field is referenced. goto => [ label: id ] info: sourceIndex This node results from GO TO label. The label must be one of the identifiers used in an enclosing label node, or all bets are off. wait, notify, broadcast => [ condition: expr ] info: sourceIndex These nodes encode the WAIT, NOTIFY and BROADCAST operations. lock, unlock => [ monitor: expr ] info: sourceIndex These nodes are generated internally by the compiler, to lock and unlock the monitor upon entry to, and exit from, an expanded inline procedure. continue, retry => [ ] info: sourceIndex These nodes result, respectively, from CONTINUE or RETRY. free => [ zone: ( expr | none ), var: expr , ?: none (, catch: catch | ) ] attr1: countedContents attr2: longPath -- for the var attr3: countedZone info: sourceIndex This node encodes a FREE statement. If zone = Tree.Null, the default system zone is used. The var normally has the form addr[v] and is converted to v prior to code generation; when there is not an explicit addr operation, this operand is converted to uparrow[expr]. catchmark => [ scope: stmt ] info: sourceIndex This node is internally generated, by pass 3. It is used to place an implicit label required by a CONTINUE, or RETRY. The code generators keep a stack of such labels and use the top one when a jump target is needed. stop => [ ] info: sourceIndex This node represents a STOP statement. It may occur only in the main body of a module and is obsolescent. restart => [ program: expr ] info: sourceIndex This node encodes RESTART p, where p is a PROGRAM or a pointer to a global frame. It is obsolescent. syserror, null, exit, loop => [ ] info: sourceIndex These nodes result, respectively, from ERROR, NULL, EXIT or LOOP used as a statment. dst, lst, lste, lstf => [ state: expr ] info: sourceIndex These nodes represent operations for saving and restoring the STATE of the machine. They result from _ STATE, STATE _, TRANSFER WITH and RETURN WITH respectively. The operand must be a variable that is suitable for use as a state vector. open => [ opens: list[ item[tag: id, value: expr] ], scope: stmt ] info: sourceIndex This node results from the OPEN construct. Each item in the opens list has an id as the tag, and an expression as the value. An anonymous item has tag = Symbols.nullName. The scope of an item in the opens list is the remainder of that list plus the statement that is the scope. Processing of the scope in pass 3 generates additional references to the value fields of the opens items. Since this generates shared substructure, all the references to a value are funnelled through an openx node (see below) with that value as its son, unless value is itself a terminal node, in which case it is copied on each reference. checked => [ scope: stmt ] attr1: checked attr2: trusted info: sourceIndex This node attaches the CHECKED or TRUSTED attribute to a range of statements. Note that these attributes can also be encoded in the attr fields of a body or block node; a checked node is generated only when the scope is not a proper block (with local decls). enable => [ catches: catch, scope: stmt ] info: sourceIndex This node results from using the ENABLE clause within a block. The scope of the enable clause is the given statement, and the handlers for any signals (or errors) are given by the catch (see below). syscall => [ ? ] info: sourceIndex This node is generated internally to force a call through the system dispatch (SD) vector. ******************************************************************************* The following nodes are parts of statements but may not themselves be used as statements. ******************************************************************************* forseq => [ control: (id | decl), init: exp, next: exp ] attr2: counted attr3: composite This node encode the FOR id _ init, exp form of FOR clause. The attribute bits qualify the implied assignments (see assign). upthru, downthru => [ control: (id | decl), interval: range, ?: none ] attr2: counted attr3: composite These nodes encode the FOR id IN interval and FOR id DECREASING IN interval forms of FOR clause respectively. If the id is omitted (THROUGH interval), control = Tree.Null. The attribute bits qualify the implied assignments (see assign). caseswitch => [ ? ] This node is generated internally, by Pass4S. It is used to collect spans of case arms in which selection can be done by an indexed jump; all the case tests must be equality tests against constants, and those constants must be reasonably dense over their span. casetest => [ selector: expr, (stmt: stmt | expr: expr) ] This node is generated internally in conjunction with a caseswitch node to reencode the item (statement or expression) selected by a particular constant. ditem => [ tag: decl , value: (stmt | expr) ] attr1: indirect attr2: rtTest attr3: tagTest info: bodyTableIndex -- like a block This node is generated internally to reencode items in the arms of the new form of discriminating selection. See istype (below) for interpretation of the attributes. catch => [ catches: list[ item[tag: list[expr], value: stmt] ], else: stmt ] info: bodyTableIndex -- Trinty and later compilers In a catch node, each of the catches items has a tag that is a list of expressions, and a value that is a statement (or list of statements). The else field is used to mark the ANY case. The expressions must evaluate to signals or errors; the corresponding statement is the catch phrase for those signals and errors given by the values of the expressions. ******************************************************************************* The following nodes are used as expressions and may appear where expr appears as a kind. The info field is uniformly set in pass 3 to the type of the expression and is used by that and subsequent passes. In the current compiler, it is the result of calling UnderType and thus uniformly a CSEIndex. The user should also be prepared for an id, a symbol, or a literal to appear as an expression. Two patterns of attribute bit usage are common in expressions. In arithmetic operators, the attributes select among the various hardware implementations of the corresponding operations; see plus, etc. In operators that form addresses, attr2 indicates whether the corresponding address can be represented as a 16-bit (MDS relative) pointer, or whether a 32-bit representation is required (longPath). ******************************************************************************* apply => [ proc: expr, args: list[ expr | item[tag: id, value: expr] ] (, catches: catch | )] info: sourceIndex This node results from the application of a procedure to its arguments, with a potential catch phrase. Note that an apply node may be used as either a statement or as an expression, determined only by the enclosing context. Pass 3 converts each apply node to a node in which the implementation of the application to be performed is explicit. In an expression context, the possible resulting node names are those listed below, through rowcons. A number of the node kinds resulting from this conversion have analogous statement forms; see the discussion of those forms for information on the conversion and the resulting nodes. As an expression,, an application may have proc = Tree.Null, in which case it must be converted to one of the nodes with kind construct, rowcons or listcons. callx, portcallx => [ operator: expr, args: list[ expr ] (, catch: catch | ) ] attr1: argsBuilt info: CSEIndex See call and portcall. signalx, errorx, startx, fork, joinx => [ appl: expr ] -- before pass 3 info: CSEIndex signalx, errorx, startx, fork, joinx => -- after pass 3 [operator: expr, args: list[ expr ] (, catch: catch | )] attr1: argsBuilt info: CSEIndex See signal, error, start, etc. index => [ array: expr, index: expr ] attr2: longPath info: CSEIndex This node represents an array indexing operation, such as a[i]. dindex => [ desc: expr, index: expr ] attr2: longPath info: CSEIndex This node represents indexing operation applied to an array descriptor, such as d[i]. Note that it implicitly dereferences the pointer component of the descriptor. seqindex => [ seq: expr, index: expr ] attr2: longPath info: CSEIndex This node represents an array indexing operation, such as a[i]. It does not implicitly dereference or select select a field; thus s[i] is converted to s^.seqPart[i]. reloc => [ base: expr, offset: expr ] attr2: longPath info: CSEIndex -- = offset.REFERENTTYPE This node converts the relative pointer in offset to an absolute pointer by adding base and implicitly dereferences the result. construct => [ type: (type | node), fields: list[(expr | none)] ] info: CSEIndex This node represents the CONS operation of a record type. The fields list is positional with defaults make explicit; a null field is not set and has an undefined value (TRASH). union => [ tag: expr, fields: list[(expr | none)] ] info: CSEIndex This node represents the CONS operation of a union type. rowcons => [ type: (type | node), elements: list[(expr | none)] ] info: CSEIndex This node represents the CONS operation of an array type. The elements list is positional with defaults make explicit; a null field is not set and has an undefined value (TRASH). cons => [ zone: ( expr | none ), first: expr, rest: expr ] info: CSEIndex This node represents the CONS operation of the implicit record type that is generated to represent a LIST node. listcons => [ zone: (expr | none), list: list[expr] ] info: CSEIndex This node represents the LIST operation of the implicit record type that is generated to represent a LIST node; it abbreviates a nest of nodes with node kind cons. substx => [ operator: expr, body: stmt ] info: CSEIndex See subst. syserrorx => [ value: expr ] info: CSEIndex This node represents ERROR used as an expression. addr => [ var: expr ] attr2: longPath info: CSEIndex This node results from @expr. new => [ (zone: expr | none), type: type, init: expr (, catch: catch | ) ] attr1: readOnly -- = init attr2: long attr3: countedZone info: CSEIndex This node represents an application of NEW. uparrow => [ ref: expr ] attr2: longPath info: CSEIndex This node represents the dereferencing of a REF or POINTER value. dot => [left: expr, right: id] attr2: longPath info: CSEIndex This node results from selection of the form X.Y. It is converted to dollar (simple selection), dot (selection with one level of dereferencing left) or cdot (selection of a constant from an interface type or instance) by pass 3. Note that a dot node may also be used as a type, depending on the surrounding context. dollar => [left: expr, right: id] attr2: longPath info: CSEIndex This node results from selection of the form X.Y when left is not dereferenced and right does not name a constant (see dot). cdot => [left: expr, right: id] info: CSEIndex This node results from selection of the form X.Y when left is an interface type or instance and right names a constant (see dot). plus, minus, times, div, mod => [ left: expr, right: expr ] attr1: real attr2: long attr3: signed info: CSEIndex These nodes represent the obvious binary infix operations. Arithmetic operators currently have five possible implementations: 16-bit unsigned integer (CARDINAL) ~real, ~long, ~signed 16-bit signed integer (INTEGER) ~real, ~long, signed 32-bit unsigned integer (LONG CARDINAL) ~real, long, ~signed 32-bit signed integer (INT) ~real, long, signed 32-bit floating (REAL) real, ~long, signed The compiler chooses an implementation based upon the types of the operands (using context to resolve ambiguities) and records its choice by setting the attribute bits as indicated above. uminus => [ value: expr ] attr1: real attr2: long attr3: signed info: CSEIndex This nodes represents the unary infix operation of arithmetic negation. See above for interpretation of the attributes. relE, relN, relL, relGE, relG, relLE => [ left: expr, right: expr ] attr1: real -- 32-bit REAL comparison (IEEE conventions) attr2: long -- 32-bit comparison info: CSEIndex These nodes result from the obvious binary infix operations. in, notin => [ value: expr, range: range ] info: CSEIndex These nodes result from 'value IN range' or 'value NOT IN range'. not => [ value: expr] info: CSEIndex This node results from logical (Boolean) negation. or, and => [ left: expr, right: expr ] info: CSEIndex These nodes result from the obvious binary infix operations. ifx => [ test: expr, then: expr, else: expr ] info: CSEIndex This node results from an IF expression. See the discussion of if. casex => [ value: expr, cases: list[ item[tag: list[expr], value: expr] ], else: expr ] attr1: relETests attr2: constantTests info: CSEIndex This node results from case selection in an expression context. See case. bindx => [ binding: item[tag: id, value: expr], selector: ( expr | none ), cases: list[ item[tag: ( decl | list[id] ), value: expr] ], else: expr ] attr1: indirect attr2: rtTest attr3: copy info: CSEIndex This node results from discriminating selection in an expression context. See bind. assignx => [ lhs: expr, rhs: expr ] attr1: initialization attr2: counted attr3: composite info: CSEIndex This node results from using an assignment as an expression. See assign. extractx => [ lhs: expr, rhs: expr ] info: CSEIndex This node results from using an extractor as an expression. See extract. lengthen, shorten, float, base, length => [ operand: expr ] attr1: real attr2: long attr3: signed info: CSEIndex These nodes result, respectively, from LONG, SHORT, FLOAT, BASE, and LENGTH used as operators. The lengthen, shorten and float nodes are also generated by the compiler, to record short-to-long and fixed-to-float coercions respectively. (SHORT and FLOAT are currently not available as explicit source-level operators.) pred, succ, abs => [ operand: expr ] attr1: real attr2: long attr3: signed info: CSEIndex These nodes represent applications of the operators PRED, SUCC and ABS. See the binary arithmetic operators for interpretation of the attributes. min, max => [ operands: list[expr] ] attr1: real attr2: long attr3: signed info: CSEIndex These nodes result, respectively, from MIN and MAX. Any nonzero number of sons is acceptable. See the binary arithmetic operators for interpretation of the attributes. ord, val => [ operand: expr ] info: CSEIndex This nodes represent applications of the operators ORD and VAL. all => [ component: expr ] info: CSEIndex This node results from an ALL expression. arraydesc => [ args: list[expr] ] info: CSEIndex This node results from 'DESCRIPTOR [ ... ]'. The length of the list must be 1 (to use an array of known type) or 3 (to use explicitly given pointer, length, and type). nil => [ type: ( type | none ) ] info: CSEIndex This node a NIL value of the indicated type. If type = Tree.Null, the type must be implied by the target context. void => info: CSEIndex This node results from a NULL expression or an omitted expression in a constructor. atom => [ name: literal ] info: CSEIndex This node results from an atom literal; the name field contains a Symbols.Name. clit => [ c: literal ] info: CSEIndex This node results from a character literal, and is used to distinguish a character literal from an integer literal. textlit => [ s: literal ] info: CSEIndex This node is generated internally and used to represent a string literal that has type REF TEXT or Rope.Text. llit => [s : literal ] info: CSEIndex This node is used to represent a string literal that must appear in the local data of a module, which allows the string literal to be swapped along with the code for the module. It results from literals of the form "..."L. mwconst => [ n: literal ] info: CSEIndex This node represents a multi-word constant. create => [ program: expr ] attr1: frame info: CSEIndex This node encodes the form NEW Program, which is obsolescent. If attr1 is set, the argument is a pointer to a global frame; otherwise, it is a program variable. istype => [ value: expr , type: type ] attr1: indirect attr2: rtTest attr3: tagTest info: CSEIndex This node represents the ISTYPE operation, as in ISTYPE[expr, type]. The attribute bits are set to indicate the implementation of the test. If attr1 is set, the value is a pointer or ref and the tests apply to the referent. Attr2 indicates that a run time test of the type code (canonical referent type) is required; attr3, that a test of one or tag values is required. Both attr2 and attr3 can be set. narrow => [ value: expr, type: ( type | none ) (, catch: catch | ) ] attr1: indirect attr2: rtTest attr3: tagTest info: CSEIndex This node results from the NARROW construct. Note that NARROW[value] alone produces Tree.Null as the type field. See istype for interpretation of the attributes. loophole => [ value: expr, type: ( type | none ) ] info: CSEIndex This node results from the LOOPHOLE construct. Note that LOOPHOLE[value] alone produces Tree.Null as the type field. size, first, last, typecode => [ type: type ] info: CSEIndex These nodes result from type.SIZE, type.FIRST, type.LAST, and type.CODE used as expressions. cast => [ value: expr ] info: CSEIndex This is an internally generated node, used to indicate the result of applying a coercion that does not change the bits representing the coerced value. It is a bookkeeping device and is omitted when the resulting type can be pushed down to the value node (e.g., when the node kind of value is mwconst) or when the compiler does not need the additional bookkeeping information (e.g., most cases of record-to-single-component coercion). As a kludge, this node is also used with a null value as the init in a decl node to indicate that the corresponding variable is initialized by the context (e.g., the decl in the new form of discriminating selection). pad, chop => [ value: expr ] info: CSEIndex These are internally generated nodes similar to cast but used when the apparent length of the bit string representing the value changes. The pad operation occurs when a discriminated value is assigned or compared to an undiscriminated form that is longer; chop is used inside the arms of an old-form discriminating selection as an operator on the disciminated value when the variant is shorter than the general case. safen => [ value: expr ] info: CSEIndex This node is generated by pass 4. It only appears as the immediate son of a construction or extraction operation and indicates that the value should be precomputed and assigned to a temporary before evaluating the constructor or extractor, typically to protect the value from side effects or to ensure that REFs in argument records remain visible to the garbage collector (obsolete?). check => [ value: expr , bound: expr ] info: CSEIndex This is an internally generated node representing a bounds check operation. The bound is always a constant; the bias of the value is adjusted so that it passes the check iff it lies in the interval [0..bound). proccheck => [ value: expr ] info: CSEIndex This is an internally generated node representing the operation that checks for assigning a local proc value out of scope. It occurs only in Cedar CHECKED regions. gcrt => [ ref: expr ] info: CSEIndex This node represents the operation that obtains the canonical referent type of a REF ANY. It is used internally by the code generators to expand operations such as ISTYPE. openx => [ base: expr ] info: CSEIndex This node is used to channel the sharing of subtrees that arises from open or selection through a common node. See open. syscallx => [ ? ] info: CSEIndex This node is used to encode an explicit transfer through the system dispatch vector that returns a result. (Not currently used?) stringinit => [ size: literal ] info: CSEIndex This node is generated by pass 3 to encode the obsolescent form of string initialization that allocates space for a StringBody of size size in the local frame. procinit => [ ? ] info: bodyTableIndex This node is generated by pass 4 to indicate the need to initialize a variable with a descriptor for a nested procedure. It appears in the enclosing scope and replaces the corresponding body node. ******************************************************************************* The following nodes are used as parts of expressions and types, but may not appear as either expressions or types. ******************************************************************************* range = type | intCC | intCO | int OC | intOO A range is simply a convenient name for several possible kinds of nodes. intCC , intCO, intOC, intOO => [ low: expr, high: expr ] These nodes represent intervals, where C indicates a closed end to the interval, and O indicate an open end to the interval. exlist => [ vars: list[expr] ] info: CSEIndex See extract and extractx. initlist => [ ? ] XXX. self => [ ? ] info: CSEIndex This node is a transient that is used only during pass 3 during the conversion from object notation to the corresponding procedure call. thread => [ ? ] info: CSEIndex This node is used only by pass 3 and Pass3P. It chains together all calls of a given inline procedure, so that all such calls can be processed together by Pass3P. ******************************************************************************* The following nodes appear in Tree.mesa but are spares or obsolete (unused). ******************************************************************************* unit , spareTC, spareS3 sequence, mergecons