Appendix A: Compiler
The Mesa compiler translates Mesa source files into corresponding object files. An object file contains the executable code for the module (if any), a binary configuration description (for use by the binder or loader), and a symbol table (for inclusion by other programs or for use by the debugger). By convention, an object file has a name with extension ".bcd".
The Mesa Language Manual describes the syntax and semantics of the Mesa source language. This appendix describes the operation of the compiler, including the compile-time options and messages.
Preparing Source Files
The compiler accepts ASCII text files. In a source file, any sequence of characters that begins with a ↑Z is skipped up to (but excluding) the next carriage return (or end of file). This convention accommodates Bravo formatting codes. You may use such formatting in your source files as you see fit. Note, however, that Mesa does not interpret any information about fonts, position, etc., attached to source text that it displays (e.g., in identifying the location of an error or breakpoint).
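The skipping rule can be illustrated with a short sketch. It is written in Python purely for illustration and is not the compiler's own code. The function name strip_bravo_trailers is hypothetical, and the sketch assumes that ↑Z denotes the control-Z character (code 26) and that lines end with a carriage return.

```python
CTRL_Z = "\x1a"  # assumption: the character written as "↑Z" in the text
CR = "\r"        # assumption: carriage return terminates each line


def strip_bravo_trailers(source: str) -> str:
    """Drop each run that starts at a control-Z and stops just before
    the next carriage return (or at end of file)."""
    kept = []
    skipping = False
    for ch in source:
        if skipping:
            if ch == CR:
                # The carriage return itself is excluded from the skip.
                skipping = False
                kept.append(ch)
        elif ch == CTRL_Z:
            skipping = True
        else:
            kept.append(ch)
    return "".join(kept)
```

Under these assumptions, a line such as "BEGIN" followed by a control-Z and formatting codes is seen by the compiler as just "BEGIN" followed by the carriage return.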
The recommended extension for naming any Mesa source file is ".mesa".
Standard Bravo macros useful during the editing and compilation cycle are described in the Context Switching section later.
Running the Compiler
The compiler takes commands from the command line. The command line syntax is described below in greater detail; the simplest form of command is a list of file names, such as
>Compiler.image sourcefile1 sourcefile2 ... sourcefilen
If you supply the command sourcefile with no period and no extension, the compiler assumes you mean sourcefile.mesa.
During compilation, the display is turned off and a die is displayed in the cursor. The number on the die identifies the pass of the compiler that is running. This allows you to check the progress of the compilation and also provides useful feedback to the maintainers of the compiler when something goes drastically wrong. The cursor moves down the screen to indicate progress through a sequence of commands and to the right as errors are detected.
Fine point:
Don’t confuse the compiler’s cursor with DMT’s.
The compiler reports the result of each command on the file Compiler.log with a message having one of the following forms (each * is replaced by an appropriate number; bracketed items appear only when relevant):
file.mesa -- source tokens: *, time: *
[code bytes: *, links: *, frame size: *]
[* warnings on file.errlog]
Compilation was successful. The object file is file.bcd. For a DEFINITIONS module, the middle line is not meaningful and is omitted. Otherwise, "links" is the number of items imported by the module, and "frame size" is the size of the global frame (in words), exclusive of the links. The third line appears only if warning messages were logged. The compiler issues warnings for certain constructs that are technically correct but nonsensical or likely to be unintended. Warnings do not prevent writing a valid object file, but you should usually investigate them.
file.mesa -- aborted, * errors [and * warnings] on file.errlog
Compilation was unsuccessful. You will find the error messages (and warning messages, if any) in the indicated file. If the errors were detected during the early phases of compilation, no object file was written (and any existing object file with the same name was deleted).
File error
The compiler could not find the specified file.
At the end of compilation, the message "Type Key" is displayed in a flashing cursor if there are errors or warnings. (You can change this behavior by using switches, which are described below.) Typing any key will cause the compiler to exit.
Fine point:
Typing Shift-Swat aborts the Executive’s current command sequence (i.e., causes any commands in REM.CM to be ignored); Ctrl-Swat invokes the Mesa Debugger; any other character causes normal exit from the compiler.
Command Line Arguments
The compiler lets you control the association between modules and file names when you invoke it. It accepts a series of commands, each of which has the form
outputFile ← inputFile[id1: file1, ..., idn: filen]/switches
Only inputFile is mandatory; it names the file containing the source text of the module to be compiled, and its default extension is .mesa. Any warning or error messages are written on the file outputRoot.errlog, where outputRoot is the string obtained by deleting any extension from outputFile, if given, otherwise from inputFile. If there are no errors or warnings, any existing error log with the same name is deleted at the end of the compilation.
If a list of keyword arguments appears between brackets, each item establishes a correspondence between the name idi of an included module, as it appears in the DIRECTORY of the source program, and a file with name filei; the default extension for such file names is .bcd. (If the name of an included module is not mentioned on the command line, its file name is computed from information in the DIRECTORY statement).
The optional switches are a sequence of zero or more letters. Each letter is interpreted as a separate switch designator, and each may optionally be preceded by - or ~ to invert its sense.
If outputFile (and ←) are omitted, the object code and symbol tables are written on the file inputRoot.bcd, where inputRoot is inputFile with any extension deleted. Otherwise code and symbols are written on outputFile, for which a default extension of .bcd is supplied. If the compiler detects any errors, the output file is not written and any existing file with the same name is deleted.
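The file naming rules above can be summarized in a small sketch. It is written in Python for illustration only and does not reproduce the compiler's actual code. The names resolve_names, _root, and _with_default are hypothetical, and the sketch assumes that an extension begins at the last period in a name.

```python
from typing import Optional, Tuple


def _root(name: str) -> str:
    """Delete the extension, assumed here to start at the last period."""
    return name.rsplit(".", 1)[0] if "." in name else name


def _with_default(name: str, ext: str) -> str:
    """Supply a default extension only when the name contains no period."""
    return name if "." in name else name + ext


def resolve_names(input_file: str,
                  output_file: Optional[str] = None) -> Tuple[str, str, str]:
    """Return (source file, object file, error log) for one command."""
    source = _with_default(input_file, ".mesa")
    if output_file is None:
        # No "outputFile ←" part: both names derive from the input root.
        obj = _root(input_file) + ".bcd"
        errlog = _root(input_file) + ".errlog"
    else:
        obj = _with_default(output_file, ".bcd")
        errlog = _root(output_file) + ".errlog"
    return source, obj, errlog
```

For instance, resolve_names("ReadData", "ReadOldFormat") yields ReadData.mesa, ReadOldFormat.bcd, and ReadOldFormat.errlog under these assumptions.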
The compiler accepts a sequence of one or more commands from the Executive’s command line (through the file Com.cm). Commands are separated by semicolons, but you may omit a semicolon between any two successive identifiers (file names or switches), or between a ] and an identifier (but not between an identifier and a /). Note that any required semicolon in an Alto Executive command must be preceded by a single quote (’).
You can set global switches by a command with an empty file name. In the form /switches, each letter designates a different switch. Unless a command to change the global switch settings comes first in the sequence of commands, you must separate it from the preceding command by an explicit semicolon.
Examples:
>Compiler ReadOldFormat ← ReadData[DataFormat: OldFormat]
Compile the program ReadData.mesa that has the included interface DataFormat in its DIRECTORY statement. Use the file OldFormat.bcd (which contains the declaration DataFormat: DEFINITIONS = . . .) as the source of this interface. Put the object program in the file ReadOldFormat.bcd.
>Compiler /-aj SymStuff[Table: LongTable]/n SymExtra[Table: LongTable]
Compile the files SymStuff.mesa and SymExtra.mesa, getting the definition of Table from LongTable.bcd. Produce object files SymStuff.bcd and SymExtra.bcd. Do not produce code for the Alto; cross-jump both modules; generate NIL checks for SymStuff only (the switches are explained below).
Compiler Switches
Switches allow you to modify command input. A command has the general form
file[/s]
where [ ] indicates an optional part and s is a sequence of switch specifications. A switch specification is a letter, identifying the switch, optionally preceded by a ’-’ or ’~’ to reverse the sense of that switch. The valid switches are
a    Alto instruction set (default)
b    bounds checking
f    implementation of floating point operations (default)
j    cross-jumping optimization
l    implementation of long pointer instructions in Alto emulation
n    NIL pointer checking
p    pause after compiling file if there are errors
r    terminate compilation and run the program contained in file
s    sort global variables and entry indices (default)
u    uninitialized variable checking
w    log warning messages (default)
y    warning on runtime calls
Each switch has a default setting. The command sourcefile is equivalent to sourcefile/a~bf~j~l~n~ps~uw~y if you use the standard defaults; i.e., the compiler generates code for the Alto instruction set, converts long pointer accesses to short, does not cross-jump the code, does not pause after compiling file, sorts variables, and logs warning messages. It does not do bounds, NIL pointer, or uninitialized variable checking, and does not warn about runtime calls. It generates inline instructions for floating point operations. Note that the "r" switch changes the interpretation of file, which should name a subsystem when used with this switch.
You can also change the default setting of any switch by using a global switch. Any switch given with no file name (i.e., just a slash and switches) establishes the default setting for that switch. Unless overridden or reset, that default applies to all subsequent commands.
Fine Point:
Any global switches given at other than the beginning of the command line must be preceded by a semicolon (quoted to the Alto Executive), or the command parser will assume that they are local switches on the previous file. The command parser only allows a single slash after a given file, so some cases of missing semicolon are flagged.
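The interaction of defaults, global switches, and local switches can be modeled with a short sketch. It is written in Python for illustration only and is not the compiler's command parser. The names DEFAULTS and apply_switches are hypothetical. The table copies the standard defaults from the equivalence given above; "r" is left out because it changes how the file name is interpreted rather than turning an option on or off.

```python
# Standard defaults, taken from sourcefile/a~bf~j~l~n~ps~uw~y.
DEFAULTS = {
    "a": True,  "b": False, "f": True,  "j": False,
    "l": False, "n": False, "p": False, "s": True,
    "u": False, "w": True,  "y": False,
}


def apply_switches(spec: str, base: dict) -> dict:
    """Return a copy of `base` updated by a switch string such as "-aj".

    A "-" or "~" inverts only the letter that immediately follows it.
    """
    settings = dict(base)
    invert = False
    for ch in spec:
        if ch in "-~":
            invert = True
            continue
        if ch not in settings:
            raise ValueError(f"unknown switch: {ch!r}")
        settings[ch] = not invert
        invert = False
    return settings


# Global switches become the defaults for later commands;
# local switches affect only the command they are attached to.
global_settings = apply_switches("-aj", DEFAULTS)        # the command /-aj
sym_stuff = apply_switches("n", global_settings)         # SymStuff.../n
sym_extra = apply_switches("", global_settings)          # SymExtra (no local switches)
```

In this model, sym_stuff has NIL checking on while sym_extra does not, and both have cross-jumping on and the Alto instruction set off, which matches the earlier SymStuff and SymExtra example.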
Here is some information about the options:
a[lto instructions]
The compiler generates code according to the microcode restrictions of the Alto, including order of instruction bytes, alignment, and minimal stack restrictions. It also converts all long pointers to short before dereferencing them.
b[ounds]
If bounds checking is specified, the compiler inserts code to check that values are within range for all assignments to subrange variables and all indexing operations. Checking is also inserted for all assignments of signed values to unsigned variables and vice-versa. If the value is out of range, the signal BoundsFault is raised (see Mesa System Documentation). The compiler performs some bounds checking during compilation and does so independently of the setting of the /b switch. If it can deduce that no bounds failure is possible, the runtime check is omitted; if a bounds failure is unavoidable, it reports the error during compilation. Compile-time bounds checking is based upon the assumption that all variables are initialized before use.
Fine Point:
Bounds checking in indexing operations is suppressed if the declared index type is empty, e.g., [0..0).
f[loating point]
The compiler generates inline instructions for the common floating point arithmetic. If a program using variables of type REAL is to be run on microcode that does not support floating point, it should be compiled /~f.
j[umped]
Cross-jumping is a peephole optimization technique that potentially shortens the object code. The reduction in code size can range from negligible to 20% depending upon coding style. If cross-jumping is specified, the correspondence of source to object is no longer one-to-one. This affects the debugger’s ability to set breakpoints and identify code locations (see Mesa Debugger Documentation). On the other hand, one can still set entry and exit breaks on all procedures. The entire compiler and Alto Mesa system are cross-jumped.
l[ong pointers]
If specified, the compiler will generate code for a variation of the Alto instruction set that has additional long pointer instructions and fewer alignment and minimal stack restrictions. Programs compiled /l will not run on an Alto, even if they contain no long pointers.
n[il]
If NIL checking is specified, the compiler inserts code to check for a null value prior to any operation that dereferences a pointer. Note that indexing operations using an array descriptor or a string also imply dereferencing and are checked. If the pointer value is NIL, the signal PointerFault from interface TrapDefs is raised. No compile-time checks for NIL are performed.
Fine Point: No NIL checks are provided in the dereferencing of relative pointers.
Depending upon coding style, these runtime checks can increase the size of the compiled code substantially.
p[ause]
This switch is unusual in that its meaning is slightly different depending on whether it is a global or local switch. As a global switch, it specifies pausing (p) or not pausing (-p) just before exiting from the compiler in the event of errors or warnings; the global default is to pause. As a local switch, it specifies pausing just after compiling the specified file if that file or any preceding file contained errors; moreover, any remaining commands are ignored. The local default is not to pause but to continue with the next input file.
r[un]
This is described below under Context Switching.
s[ort]
Normally, the compiler sorts certain items by frequency of use before assigning addresses. This helps to keep the object code compact. If sorting is suppressed (-s), the assignments of global frame offsets and entry indices depend only upon order of declaration in the source text. This switch was added in anticipation of tools allowing inexpensive correction and replacement of modules in a configuration. These tools are not yet available.
u[ninitialized variables]
If the /u switch is given, the compiler issues warning messages for uses of apparently uninitialized variables (but not fields of records). The algorithm used to detect suspicious usage is based upon the following assumptions:
The entire body of a procedure is executed before the bodies of any procedures declared within it.
Within any procedure, the order of execution is equivalent to the order of appearance of source text (for the purposes of variable initialization).
The bodies of the contained procedures are executed in order of appearance.
The algorithm works fairly well for detecting certain common errors, but it is obviously not foolproof. There is no guarantee that all uses of potentially uninitialized variables are reported; conversely, properly initialized variables are sometimes flagged when the initialization depends upon the order of execution of subprocedures. (Performance with respect to global variables is improved by putting the initialization code for a module either in the main body or the lexically first procedure.)
w[arnings]
Log (w) or ignore (-w) certain legal but suspicious constructs that can be detected by the compiler.
y[ell about runtime calls]
This switch is intended for use by programmers writing such things as bootstrap loaders, where the standard Mesa runtime machinery is unavailable. It flags operations (certain divisions, for example) that generate calls to system functions.
Examples:
>Compiler foo
Compile foo using all the default switch settings.
>Compiler foo/-wj
As above, but suppress warning messages and do cross-jumping.
>Compiler /-p file1 file2 file3
Use this form if you want the compiler to press on no matter what. If it is part of a command file, the next (Executive) command will be executed whether or not there were errors.
>Compiler file1 file2/p file3
Use this form if you want the compiler to pause before compiling file3 if either file1 or file2 does not compile successfully. If file3 depends upon the others (by including them), this can save a lot of wasted time and effort.
>Compiler file1/p ’; /-p file2 file3
Use this form if you want the compiler to pause before compiling file2 if file1 does not compile successfully. Press on to the next Executive command even if file2 or file3 does not compile.
Context Switching
If you are a Bravo user, you might find the following macros useful for switching between Bravo and the Mesa compiler. They are included in <Mesa>User.cm.
bravo/m filename
This invokes Bravo with two windows, gets filename.errlog in a smaller, bottom window, and gets filename.mesa in the top window. (Be sure not to use filename.mesa on the command line.)
q[uit]/m
This Bravo command writes out the file in the selected window (say filename.mesa) and terminates Bravo. It then specifies the following sequence of (Executive) commands:
>compile filename
>bravo/m filename
The command line switch "/r" (run) causes the compiler to terminate by running some other program instead of returning to the Alto Executive. You may specify either a ".image" or a ".run" file; if you omit the extension, ".image" is assumed. Any switches after the "r" and any other text remaining in the command line after the command specifying this switch are copied to the file Com.cm for inspection by the new program. This facility is primarily intended for use in command files.
Examples:
>Compiler sourcefile Mesa/r sourcefile
Compile sourcefile; then invoke Mesa.image to load and start sourcefile.bcd. Note that "Compiler sourcefile; Mesa sourcefile" has the same effect but is slower, because it returns to the Alto Executive before invoking Mesa. (There are overheads of several seconds associated with both restarting the Executive and reestablishing the Mesa environment.)
>Compiler sourcefile Ftp.run/r Iris store sourcefile.bcd
Compile sourcefile, then store the object file on Iris. Note that you must supply the ".run" extension to invoke Ftp in this way.
Fine point:
You can run Bravo using the "/r" switch, but the current version (7.5) will not correctly find switches or arguments on the command line.
Error Messages
The compiler writes error and warning messages for sourcefile.mesa on sourcefile.errlog. Each pass detects certain classes of errors. Error messages are logged in (approximate) source order by each pass. Within a single pass, the compiler does its best to complete its analysis in spite of any errors. With the exception of "correctable" syntactic errors, detection of an error by one pass causes all following passes to be skipped. Thus you will sometimes get a new set of error messages after correcting all those reported by a previous run of the compiler. The compiler never writes a bindable or loadable object file if it detects any errors.
The compiler also logs warning messages. These are advisory only and are intended to draw your attention to suspicious usage. They do not abort compilation or invalidate the object file (but they should be checked).
Here is a trivial and nonsensical program that illustrates the form of the compiler’s error messages.
Sample: PROGRAM =
BEGIN
i: INTEGER,
i ← j+TRUE;
END.
i: INTEGER,
↑ Syntax Error [46]
Text deleted is: ,
Text inserted is: ;

j is undeclared, at Sample[52]:
i ← j+TRUE;

TRUE has incorrect type, at Sample[52]:
i ← j+TRUE;
The first message is generated by the first pass and shows how syntactic and lexical errors are reported. The arrow points to the first symbol that is necessarily invalid (or one symbol before it), and the decimal number is a character index in the source file. Of course, the compiler cannot know what you intended, and the "real" error might have occurred quite a bit earlier. The compiler tries to fix these errors as best it can by local deletion and insertion of symbols. These symbols are not written into the source file but are reported to help you interpret subsequent messages. If the compiler cannot find a way to continue parsing, or if too many of these errors accumulate, it gives up.
Fine Point:
In order for the arrow to line up under the syntax error, you need to be viewing the file with a fixed pitch font.
If you are editing the program and its error log in the debugger, you can use the Position command on one of the menus of the source window to locate the errors, given the character indices in the error log.
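Outside the debugger, a character index from the error log can be converted to a line and column with a few lines of code. The following Python sketch is only an illustration. The function name locate is hypothetical, and the sketch assumes that character indices count from zero and that lines end with a carriage return; neither convention is stated in this appendix.

```python
def locate(source: str, index: int, newline: str = "\r") -> tuple:
    """Map a character index to a (line, column) pair, both counted from 1.

    Assumes zero-based indices and `newline` as the line terminator.
    """
    if not 0 <= index <= len(source):
        raise ValueError("index lies outside the source text")
    # Lines completed before the index determine the line number.
    line = source.count(newline, 0, index) + 1
    # The current line starts just after the last terminator before the index.
    last = source.rfind(newline, 0, index)
    line_start = 0 if last < 0 else last + len(newline)
    return line, index - line_start + 1
```

To use it, read the whole source file into a string and pass the bracketed number from the error message as the index.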
The other error messages report "semantic" errors. Errors are located by displaying a line of source text (the second line in each message) as well as the character index (a decimal number) and the enclosing procedure or program name (the identifier preceding the number). The text of the error message is intended to be reasonably self-explanatory. Sometimes it refers to an identifier or expression. The compiler reconstructs these expressions from the parse tree; in later passes, the reconstruction often reflects rearrangement or constant folding so it may not exactly duplicate the source code. As subexpressions, "?" indicates an undeclared identifier and "..." indicates either a cutoff because of depth of nesting or an expression form the compiler cannot reconstruct from the parse tree.
Compiler Failures
The message reporting a compiler failure has the following form:
FATAL COMPILER ERROR, at id[index]:
(source text)
Pass = n, signal = s, message = m
Such a message indicates that the compiler has noticed some internal inconsistency. The compiler will skip the remainder of the command line if this happens. If you get such a message (or encounter other compiler problems), you should submit a change request as described in Section 1.8. Be sure to preserve the relevant files and to mention the octal codes identifying the pass (n), signal (s) and message (m) in your change request.
Current Limitations
The following limits are built into the current implementation of Mesa and are enforced by the compiler:
The number of interface items declared in a single DEFINITIONS module cannot exceed 128.
Neither the number of procedure bodies nor the number of signal codes defined in a single PROGRAM module can exceed 128.
The size of the frame or record required by a procedure or program cannot exceed 4096 words.
Procedure declarations cannot be nested more than five levels deep, counting catch phrases as procedure levels.
The compiler allocates its internal tables dynamically and tries to adjust their relative sizes to accommodate the program being compiled. When it is unsuccessful, it reports failure with a message of the form:
Storage Overflow in Pass n
Usually, the best thing to do is split your program into two or more smaller modules. If the Pass is 5, you can sometimes get your program compiled by removing code from the main body (into a procedure called by the main body), or reordering the procedures so that the largest ones come near the end. The reason that this works is that pass 5 reuses the parse tree space from earlier procedures to hold code generated for later procedures. If the main body (first processed) or one of the first procedure bodies is large, there is not much space for the code.