CSL Notebook Entry
To Cedar Programmers Date February 17, 1984
From Ed McCreight, Lyle Ramshaw Location PARC/CSL
Subject Pascal-to-Mesa Translation File [Indigo]<Tex>PasMesa>PasMesa.tioga
XEROX
Abstract PasMesa is a Pascal-to-Mesa source translation tool that runs in Cedar. It can help import Pascal programs into either the Cedar or Alto Mesa 6 environments.
An Overview of PasMesa (Ed McCreight)
Mesa could well be described as a derivative of the Pascal programming language. Most of the features of Pascal are also available in Mesa, and the two syntaxes differ mostly in detail. But Mesa, even though it may be an excellent language, is available only within Xerox, loosely construed. Software interchange with the rest of the world can only take place in a more widely-used language, and Pascal is a reasonable choice (although Fortran would be another).
The most straightforward way of importing a relatively small amount of Pascal software into a relatively large Mesa environment is to translate the Pascal as directly as possible into Mesa source code. The translation is straightforward precisely because of the simplicity of Pascal and the similarity of the two languages. I have implemented in Mesa a Pascal-to-Mesa translator called PasMesa. A fair thing to say about this translator is that it has compiled some Pascal programs to Mesa, and, usually after some modifications, the programs have worked. It is a bit early to claim that PasMesa will translate any Pascal program to runnable Mesa. In any case, PasMesa's translations are not perfect. I assume that the person doing the translation is both a Pascal and a Mesa programmer, and that he is prepared to do an average of 15 to 30 minutes' work per (paper) page of source code to repair the translation where it failed. All known translation failures result in errors that are caught by the Mesa compiler (except for backward non-local GOTO statements, which result instead in uncaught signals; see Restriction 6 below).
Error recovery in PasMesa is non-existent, and the error reporting is only rudimentary. PasMesa is not at all intended as a means by which one could reasonably develop a Pascal program from scratch. I make no guarantees about what will happen if the Pascal program will not pass through somebody's standard production Pascal compiler without complaint. If you want to write a Pascal program for export to the external world, I suggest that you do it on a machine with a real Pascal compiler, such as a VAX, and then import the result to Mesa.
The Pascal program is assumed all to fit on one (perhaps gigantic) Pascal source file. To coalesce a multi-module program into a single module looks like a straightforward editing task; however, I might be receptive to a chorus of complaints in this area.
One final generality: PasMesa runs rather slowly. I can think of ten ways to speed it up, and I have done none of them. If PasMesa, in its whole lifetime, executes for more than 10 Dorado-hours, I will be amazed, and I might then be willing to put some work into tuning. Otherwise, just be patient. Remember, it's faster than doing it with a text editor.
Further generalities (Lyle Ramshaw)
As part of an effort to import Donald E. Knuth's TeX82 document compiler into Cedar, I have translated Ed McCreight's PasMesa itself into Cedar, and added various new features. Most of these new features are connected with making it possible to do the translation of a large system completely automatically. That is, I considered it cheating to have to modify the resulting Mesa code by hand; instead, I would add some other funny feature to PasMesa itself.
The code that this new PasMesa outputs is intended to run in Cedar; I made some effort to arrange things so that the new PasMesa could also produce code that would compile under Alto Mesa 6, but my success at this endeavor has not been tested. Furthermore, I have not made any effort to produce a version of the Pascal runtime routines in Alto Mesa 6 that is compatible with the new PasMesa.
Not only does the Cedar PasMesa run slowly, it also runs impolitely! It does many, many allocates of long and short Ropes, and hence tends to pollute your virtual memory quite severely. These problems could be fixed with work, but I wasn't willing to work on the performance of PasMesa any more once it could translate TeX at least as fast as the Cedar compiler could compile the result. I recommend that you Rollback after running PasMesa extensively, and perhaps before as well.
Running the Compiler (Lyle Ramshaw)
The compiler is called PasMesa, and it can be obtained by typing
Bringover -p [Indigo]<Tex>Top>PasMesa.df
to a CommandTool. You can invoke PasMesa itself on a particular program by typing
PasMesa maze
where ``maze.mod'' is the name of a text file that you must have already made up that gives PasMesa its instructions. Greg Nelson wrote a program called Maze.mesa in Cedar as a sample for the summer interns of 1983. I translated Greg's program by hand into Pascal, and then I used PasMesa to translate it back into Mesa. I shall use this ``maze.pas'' program as an example. If you want to look at other examples, you can check out the programs Tangle, TFtoPL, PLtoTF, PoolType, DVItype, and (for a final exam) TeX, all available through DF files on [Indigo]<Tex>Top>.
The ``.mod'' file gives PasMesa basically two kinds of information. First, most significant Pascal programs are too large to fit into one Mesa module that the current Mesa compiler can digest. To address this problem, PasMesa has facilities for taking the various components of the outermost block of the Pascal input program and parceling them out to a collection of Mesa modules. The ``.mod'' file declares these Mesa modules and tells PasMesa what outer block items to put where. After this modularization information, the ``.mod'' file also allows the user to specify the settings of various switches and parameters.
The format and features of ``.mod'' files are rather complex, rather like those of DF files. The reasons for this complexity are probably largely the same as well: in both cases, a body of software that had to deal with the complex Cedar development world gradually grew up by accretion of features. But enough of apologies. Let's look at ``maze.mod'':
(configuration maze(graphics, graphics←to←press, random, pascal←basic, pascal←novice←files);
definitions maze←private(pascal←basic*); forward;
program maze←impl(pascal←basic, pascal←novice←files, maze←private, random); forward;
program maze←graphics←impl.hint$(graphics, graphics←to←press, maze←private, pascal←basic); forward;
begin
maze←private, maze←graphics←impl := draw←maze;
maze←private, maze←impl := $rest$;
end) := maze.pas;
INVENTFILENAMES TRUE;
MAKEFILE compile←maze;
PREDEFINE
function choose(i,j: integer): integer; external;
function init(i,j: integer): integer; external;
begin
end
TARGET cedar;
You're probably wondering right off why there are back-arrows in the middle of the identifier names. Those characters have code '\137. In the Xerox character set, that is a back-arrow all right. But in ASCII, it is an underline; as such, it is the character generally used when writing Pascal code to separate the words in multi-word identifiers. PasMesa, like all Pascals, ignores the case of characters in its input. When translating an identifier to Mesa, it capitalizes the first letter and makes the rest lower case. If the identifier contains any underscores, they are removed, and the letter immediately following the underscore is made upper case as well. Thus, the Pascal identifier ``graphics←to←press'' will be translated by PasMesa into the Mesa identifier ``GraphicsToPress''. It's a little funny at first, but you'll get used to it. Remember that assignment in a Pascal program uses the operator ``:='' rather than a left-arrow. For more details and a pointer to a nifty hack program named CapsArrows that will help you translate back and forth between these two identifier syntaxes, see Restriction 1 below.
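The identifier rule just described is small enough to sketch in a few lines. The following Python is only an illustration of the rule as stated in this memo, not PasMesa's actual code; the function name is made up for the example, and it accepts either the ASCII underscore or the Xerox back-arrow as the separator.

```python
def pascal_to_mesa_identifier(name: str) -> str:
    """Translate a Pascal identifier to the Mesa form described above.

    Hypothetical helper: the name and signature are assumptions made
    for this illustration.
    """
    # Treat the Xerox back-arrow and the ASCII underscore alike, and
    # ignore the case of the input, as Pascal does.
    words = name.replace("\u2190", "_").lower().split("_")
    # Capitalize the first letter of each word; the separators vanish.
    return "".join(w[:1].upper() + w[1:] for w in words if w)
```

For instance, calling this on ``graphics_to_press'' yields ``GraphicsToPress'', matching the example in the text.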
The first half of the mod file looks like a bastard form of Pascal block. You should think about it as a declaration of the configuration that the Pascal program is going to be translated into. PasMesa assumes that every Pascal program will be broken up into a number of modules that will be bound up in one binding step into the final configuration. The identifier after the word ``configuration'' is the name of that config; in our example, that name is ``maze'' in Pascal form, which translates into ``Maze'' in Mesa form. PasMesa will write the config itself onto the file named ``maze.config''.
The identifiers in the parentheses after the word ``maze'' are the arguments to the config, that is, they are the definitions modules that this config will import. Therefore, the config that PasMesa writes will have a header of the form:
-- file: Maze.config
-- Pascal-to-Mesa Configuration
-- Pascal-to-Mesa translator output, translated at January 22, 1984 4:22 pm
Maze: CONFIGURATION IMPORTS Graphics, GraphicsToPress, Random, PascalBasic, PascalNoviceFiles
The next three lines declare the modules that are involved. Definitions modules are declared with the keyword ``definitions'' and implementations modules are declared with the keyword ``program''. Each module that you declare takes, as arguments, the names of the modules that it depends upon. PasMesa is a one-pass operation: you must declare a module before you use it as an argument to another module. Any module that is imported by the config counts as declared, of course. Usually, you will want to declare all of the definitions modules first and then the implementations modules, so that any of the latter can take any of the former as arguments. The order in which you declare modules is also the order in which they will be compiled, by the way.
What does it mean for one module to take another as an argument? It usually means to DIRECTORY, IMPORT, and OPEN it. Pascal has no sense of name scopes local to a module, and PasMesa only outputs unqualified Mesa names. Thus, a module has to OPEN anything that it DIRECTORY's in order to get any good out of it. Note that there can't be any name conflicts in the Mesa version of the program because there weren't any name conflicts in the Pascal version. IMPORT'ing is a bit of an issue, though. The current Cedar compiler will issue a warning message if a module IMPORT's something that it doesn't really need. Furthermore, you can't turn off this warning message without turning off all warning messages, which would be rather dangerous. Thus, PasMesa offers the following feature: if you put an asterisk after the name of an argument in the declaration of a module, the declared module will DIRECTORY and OPEN but not IMPORT that argument. I recommend that you leave the asterisks off until you get warnings from the compiler, and then add them as needed.
A word for wizards: When one module takes another as an argument, the argument module is almost always a definitions module rather than a program module. But not always. It is possible in both Mesa and Cedar for a module to DIRECTORY and IMPORT a program module; the former is said to be taking the latter as a POINTER TO FRAME. Doing so has definite disadvantages: it introduces a compile-order dependency between implementations modules and it confuses the debugger in many cases. But it also has an advantage that is relevant to PasMesa users: the importer gets access to all of the global variables of the importee, and the importee can have lots and lots of variables. You see, implementations modules have more slots in their global frames for variables than do definitions modules. Mumble. To make a long story short, PasMesa does allow you to declare a program module that takes another program module as an argument; do so only at your peril.
Each declaration of a module ends with one of three words: either ``forward'', ``external'', or ``trash''. The keyword ``forward'' corresponds to the normal case; PasMesa will produce source text for a forward module, will request that it be compiled at the appropriate time, and will request that it be bound into the resulting configuration. The keyword ``external'' is used for modules that are to be bound into the configuration, but should neither be written by PasMesa nor compiled anew. The prime examples of ``external'' modules are the modules that implement the Pascal runtime environment. The program Maze imports the Pascal runtime from the outside world, but many Pascal programs choose to bind the Pascal runtime routines into their configs. Such programs declare the various modules of the runtime and specify them to be ``external''. The third possibility is ``trash'': PasMesa won't write, compile, or bind a module that is declared to be ``trash''; this gives a way of naming a place to put things that you don't want. In summary, the three options can be thought of as follows:
forward: write this module, compile it, and bind it
external: don't write or compile this module, but do bind it
trash: don't write, compile, or bind this module.
The table above tells you what the various options mean for implementations modules, but the situation is a little different for definitions modules. In the definitions case, the notion of binding doesn't make sense. Instead, the important question is whether or not various implementations modules should be marked as exporting this interface or not. The basic rules by which PasMesa figures out who exports what are explained below. But they are modified for the case of a definitions module that is declared to be trash: PasMesa will guarantee that no implementations module exports a trash interface.
Everything in Maze.mod should make sense now up through the keyword ``begin'' except for the funny extension specified for MazeGraphicsImpl. What is going on here is the following. The procedure that draws the maze is easier to write in Cedar directly, since it wants to call procedures from Graphics, GraphicsToPress, and the like. If you give an explicit extension on any file name, PasMesa will use the filename that you have specified; if not, it will use the appropriate default extension for the file: either ``.cm'' or ``.mesa'' or ``.config''. In this case, PasMesa will write a header for the DrawMaze procedure into MazeGraphicsImpl.hint$ with a body of ``??'', pointing out that you must write this procedure yourself. When you replace ``??'' by the correct body, you should store the result out on the file MazeGraphicsImpl.mesa, where the compiler will be able to find it. Future runs through PasMesa will now avoid smashing your hand-written file because the PasMesa output will once again be written on the file MazeGraphicsImpl.hint$.
Between the ``begin'' and the ``end'' are a sequence of lines that look sort of like funny assignment statements. Each left hand side must have precisely two predeclared module names; the first must be a definitions module and the second an implementations module. On the right-hand side of the statement is a sequence of names of top-level Pascal items. PasMesa is prepared to help you split up the outermost block of your Pascal program. Splitting up inner blocks is much harder, and PasMesa offers you no help in that regard; so, if your Pascal program has a single procedure that is too long to fit, even all by itself, into a Mesa module, then you are in trouble. But PasMesa does allow you to split up the outermost block however you see fit. Call each thing in the outermost block that could generate corresponding Mesa code an item. There are four major types of items: types, constants, variables, and procs (procedures and functions). The executable code on the outermost block is also an item by itself. And it turns out that labels in that executable portion can also generate Mesa code, since non-local gotos to those labels are implemented using ERROR's, and someone has to declare the ERROR's. Thus, there are six kinds of items all told. In case you haven't guessed already, each statement in the block in the ``.mod'' file tells PasMesa that the definitions code and implementations code generated by the listed top level items is to be placed in the specified definitions module and implementations module.
In addition to giving the names of items explicitly, PasMesa allows you to give defaults in various ways. These defaults look like funny item names that begin and end with a dollar sign, to avoid name conflicts with genuine items. The four names $other←procs$, $other←types$, $other←consts$, and $other←vars$ tell PasMesa where to put those items of each of the four classes that aren't explicitly sent somewhere else. The pseudo-item $main$ means the executable code of the outer-most block, and the pseudo-item $global←labels$ tells PasMesa where to declare necessary ERROR's to implement non-local gotos. In addition, there is a super-default called $rest$, which tells PasMesa where to put everything that isn't classified by the rules above. The Maze program is quite lazy about breaking things up: it puts all definitions into MazePrivate, the external DrawMaze into MazeGraphicsImpl, and everything else into MazeImpl. Only small programs can get away with this easy-go-lucky attitude.
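The order in which these defaults apply can be sketched as a three-step lookup: an explicit item name first, then the default for the item's class, then $rest$. The Python below is an illustration under that reading of the text, not PasMesa's own logic; the function, the kind names, and the sample constant ``width'' are all invented for the example, and the pseudo-items are spelled with ASCII underscores.

```python
# Pseudo-item consulted for each kind of item when the item is not named
# explicitly. The kind names on the left are assumptions of this sketch.
DEFAULT_FOR_KIND = {
    "proc": "$other_procs$",
    "type": "$other_types$",
    "const": "$other_consts$",
    "var": "$other_vars$",
    "main": "$main$",
    "label": "$global_labels$",
}

def route_item(name, kind, rules):
    """Return the (definitions, implementations) pair for one top-level item.

    rules maps an item name or pseudo-item to the module pair given on
    the left-hand side of its statement in the .mod file.
    """
    for key in (name, DEFAULT_FOR_KIND[kind], "$rest$"):
        if key in rules:
            return rules[key]
    raise KeyError(f"no destination for {kind} {name!r}")

# The two statements of maze.mod, already translated to Mesa module names.
maze_rules = {
    "draw_maze": ("MazePrivate", "MazeGraphicsImpl"),
    "$rest$": ("MazePrivate", "MazeImpl"),
}
```

With these rules, the procedure draw_maze goes to MazeGraphicsImpl and any other item, such as a hypothetical constant ``width'', falls through to $rest$ and lands in MazeImpl.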
One more comment about the body of the block in the ``.mod'' file, concerned with EXPORTing. PasMesa has to figure out which of the implementations modules in a large system EXPORT which definitions modules. Instead of asking you for this information explicitly, PasMesa figures it out in the following devious manner: if there is a line of the ``.mod'' file of the form
ADefs, BImpl := stuff
in which some stuff is stuck in ADefs and BImpl, then BImpl is assumed to export ADefs. You can prevent PasMesa from drawing this conclusion, if necessary, by putting an asterisk after ADefs, as in the line
ADefs*, BImpl := stuff
Also, as mentioned above, ADefs won't be exported by anybody if it was declared to be trash.
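The export rule can be restated as a short sketch. This Python is an illustration of the rule as described here, with hypothetical names throughout; it takes only the left-hand sides of the statements, since the right-hand sides don't affect who exports what.

```python
def infer_exports(statements, trash_defs=()):
    """Work out which implementations modules export which interfaces.

    statements: (defs_name, impl_name) pairs, one per statement, in file
    order; a trailing '*' on defs_name suppresses the export.
    trash_defs: names of definitions modules declared to be trash.
    Both the function and its argument shapes are assumptions.
    """
    exports = {}
    for defs, impl in statements:
        exports.setdefault(impl, [])
        if defs.endswith("*"):
            continue  # the user asked PasMesa not to draw the conclusion
        if defs in trash_defs:
            continue  # nobody exports a trash interface
        if defs not in exports[impl]:
            exports[impl].append(defs)
    return exports
```

Applied to the two statements of maze.mod, this says that both MazeGraphicsImpl and MazeImpl export MazePrivate.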
After the ``end'' in the mod file comes a right paren and an assignment operator followed by the name of the source file where the Pascal program can be found.
The mod file finishes up with a list of switch names and switch settings by which the behavior of PasMesa can be appropriately modified for different situations. The switches must come in the correct order, and that order is alphabetical according to the name of the switch. The Maze example has only a few switches. The InventFileNames switch (remember, case is not significant in Pascal text) tells PasMesa to assume that the PascalNoviceFiles version of the file IO portion of the Pascal runtime package will be used. This version of file support tries (or tried, back when Ed wrote it) to be very nice to you. In particular, it tries to allow you to think of files in Pascal's unusual manner, as a sort of extensible record. The files are actually stored on disk, and the name of the file on the disk is derived from the name of the file variable in your Pascal program. To make this work, PasMesa has to generate some calls to a routine from the PascalNoviceFiles runtime package; the InventFileNames switch requests this service.
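Since the switch order is easy to get wrong, it may help to check a ``.mod'' file's switch names before running PasMesa. The following Python is a minimal sketch of such a check, assuming a plain case-insensitive comparison of the names; how PasMesa itself compares two-word names such as NAMED PARAMETERS is not stated here.

```python
def switches_in_order(names):
    """True if the switch names are in case-insensitive alphabetical order.

    Hypothetical helper; plain string comparison is an assumption.
    """
    keys = [n.upper() for n in names]
    return all(a <= b for a, b in zip(keys, keys[1:]))

# The four switches of maze.mod, in the order they appear there.
maze_switches = ["INVENTFILENAMES", "MAKEFILE", "PREDEFINE", "TARGET"]
```

The maze.mod switches pass this check as written.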
The MakeFile clause gives the name of the file on which you would like the compile-and-bind command file to be written; the default extension is ``.cm''. In the example, the command file will be called CompileMaze.cm.
The PREDEFINE clause is a place where you can put the declarations of procedures that will be available to the Pascal program even though they are not defined there. The block that you give here is treated as if it enclosed the outer block of your program. If editing your Pascal source program isn't a big problem, another alternative is to include such external function definitions in the outermost block there. In either case, the implementations of the external functions are done directly in Cedar.
The final switch that Maze invokes is to request that the output be intended for use by Cedar, rather than be either ``Mesa'' (meaning AltoMesa6) or ``Long Mesa'' (AltoMesa6 with LONG POINTERS).
Maze is a simple program, and doesn't stretch PasMesa to its limits; in fact, there are several switches that Maze doesn't need. Our next subject is a discussion of the rest of the possible switches and what they do.
There is an optional CAPITALIZE clause because of the unfortunate history of Pascal: in the original implementation, the character set was a 64-character set not even vaguely related to Ascii. PasMesa implements CHAR as full Ascii, including upper- and lower-case characters, but you may come across a program that is not prepared to deal with lower-case characters, even though they appear in the Pascal source file, because the programmer expects the Pascal compiler to capitalize everything. If you say CAPITALIZE CHARS, string constants that PasMesa represents as single characters or as packed arrays of characters will be capitalized. If you say CAPITALIZE EVERYTHING, PasMesa will even capitalize things that it will represent as Mesa strings. Mostly what this means is that the string constants that you write with WRITE or WRITELN will be capitalized. If you omit this clause, PasMesa will compile exactly the case you use in string or character constants.
The optional CompilerSwitches clause is followed by a Pascal string (written with single quotes; remember, you are talking to a Pascal compiler!); this string will be put into the compile-and-bind command file at the place where Mesa compiler switches belong. For example, we chose to compile the Mesa TeX with bounds checking and nil checking turned off, since this helped us out a little with our storage overflows in the Mesa compiler. Thus, the file TeX.mod includes the clause ``COMPILERSWITCHES '/-b-n';''.
Large arrays in Pascal programs cause PasMesa no end of hassle, since Mesa is not particularly well equipped to handle them. One of the ways that you might want to translate a Pascal array is as a Cedar ``computed sequence''. If you want to use this technique, you give the keyword ``ComputedSeqArray'' here, followed by a list of names of the arrays that you want translated this way. The names of the arrays can be fully qualified: ``mem←array'' means the variable ``mem←array'' in the outermost block, while ``output.names'' means the variable ``names'' that is local to the top-level procedure named ``output''. Qualification can go on as deep as necessary: ``a.b.c.name'' will work as well.
The next option has the keyword ``INLINE'', followed by a list of procedure names or function names (or array names, if you have also chosen to implement these arrays with the ProcArray feature discussed below, in which case they are really procedures also). If you specify that a proc should be INLINE, PasMesa will translate the body of the proc into the definitions module instead of the implementations module, and will mark it as INLINE. An adroit use of inlines can speed up your program substantially; for example, TeX's main memory array is implemented as an inline ProcArray. But be very cautious about this, since using inlines will make your module size problems worse.
The optional NAMED PARAMETERS clause tells PasMesa that the syntax for parameter procedures is like this:
PROCEDURE foo(x: INTEGER; PROCEDURE baz(y,z: INTEGER)).
Otherwise it assumes that it is like this:
PROCEDURE foo(x: INTEGER; PROCEDURE baz(INTEGER;INTEGER)).
The former form is clearly the more rational, and the one that Mesa uses, but the latter has some currency in the Pascal community. The original syntax in the Pascal User Manual and Report is:
PROCEDURE foo(x: INTEGER; PROCEDURE baz),
which is clearly inferior, since it gives you no clue to baz's parameters. PasMesa will also handle this form, but both it and the Mesa compiler thereby assume that baz takes no parameters. If you call foo giving as a parameter a procedure that takes parameters, the Mesa compiler will complain bitterly (and rightly so). In this paragraph the concept PROCEDURE is intended to include FUNCTIONs. Of course, many Pascal implementations don't allow procedures or functions as parameters at all.
The next option is PROCARRAY, followed by a list of the fully-qualified names of the variables that you would like to implement this way. PasMesa will replace the declaration of the array ``Array[IndexType] of ElementType'' by a declaration of a function that takes IndexType as its argument and returns a LONG POINTER TO ElementType. You get to supply the body of the procedure in Cedar.
Still one more option about arrays. This switch is called ``SPECIAL ARRAY'', and is followed by a list of fully-qualified names of arrays. The issue here is whether to store the array on the stack or in the heap. If you specify that the target language is Cedar, the default is to allocate arrays in the heap, that is, to replace an array variable by a variable of type LONG POINTER TO an array, with appropriate dereferencing on array accesses. If the target language is Mesa or Long Mesa, then the default is to allocate arrays on the stack. In either case, you can change from the default behavior to the opposite behavior by calling the array SPECIAL. Arrays that are allocated in the heap are allocated from the system UNCOUNTED ZONE. The ones that are local to a procedure are deallocated when that procedure returns, while the global ones are never deallocated.
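The SPECIAL ARRAY rule is just a default that can be flipped. A small Python sketch of the decision, using made-up names and lower-case strings for the three target languages:

```python
def array_storage(target, is_special):
    """Where an array lives, per the rule above: 'heap' or 'stack'.

    target is 'cedar', 'mesa', or 'long mesa'; is_special says whether
    the array was listed in the SPECIAL ARRAY clause. Hypothetical helper.
    """
    default = "heap" if target == "cedar" else "stack"
    if not is_special:
        return default
    # Calling an array SPECIAL selects the opposite of the default.
    return "stack" if default == "heap" else "heap"
```

So a SPECIAL array lives on the stack when the target is Cedar and in the heap when the target is Mesa or Long Mesa.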
The Pascal runtime
If you wrote your ``.mod'' file just right, you should be able to shove your program through the Mesa compiler after it has gone through PasMesa by just typing the name that you specified for the MakeFile. This MakeFile will compile the definitions modules first, in the order that you declared them; then, the implementations modules. Finally, it will call the binder on the top-level config.
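The order of work in the MakeFile can be sketched as follows. This Python only models the ordering described above; it does not reproduce the actual text of the ``.cm'' file, and the function and tuple shapes are assumptions of the sketch.

```python
def makefile_steps(modules, config):
    """List the compile and bind steps in the order described above.

    modules: (name, kind, disposition) triples in declaration order,
    where kind is 'definitions' or 'program' and disposition is
    'forward', 'external', or 'trash'. Only forward modules are compiled.
    """
    forward = [m for m in modules if m[2] == "forward"]
    defs = [name for name, kind, _ in forward if kind == "definitions"]
    impls = [name for name, kind, _ in forward if kind == "program"]
    # Definitions first, then implementations, then the binder.
    return [("compile", n) for n in defs + impls] + [("bind", config)]

# The three modules declared in maze.mod.
maze_modules = [
    ("MazePrivate", "definitions", "forward"),
    ("MazeImpl", "program", "forward"),
    ("MazeGraphicsImpl", "program", "forward"),
]
```

For Maze this gives MazePrivate, then MazeImpl and MazeGraphicsImpl, and finally a bind of the Maze config.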
In order that the resulting bound file should be runnable in Cedar, you will have to arrange that it can get at the services of the Pascal Runtime package, available via the DF file
[Indigo]<Tex>Top>PascalRuntime.df.
The runtime services needed by a Pascal program have been divided up into various classes, with different interfaces for each class. Given that there is no way in Mesa to bind up several interfaces into a bigger interface, I couldn't figure out any better way to proceed. The issue is that different Pascal programs want to have different file systems under them, and some Pascal programs don't use Sets at all, while the rest of the runtime stuff is common to all Pascal programs. Hence, there is a PascalBasic with the basic stuff, three different file interfaces, and a Sets interface, along with implementations for each. The PascalNoviceFiles package tries to be really nice to the novice programmer. There is code to make text files avoid reading one character ahead (as most Pascal files do), so that terminal interaction can work correctly. In Cedar 4.4, there was code to have the PascalOutput file appear in a viewer on your screen as well as in your file system, so that you could see your program at work; but I fear that I didn't get that working in Cedar 5.0 (the Cedar 4.4 implementation used DribbleStreams, which went away). PascalWizardFiles is a much thinner layer on top of IO.STREAM; this is more to the liking of big applications programs like TeX, which generally open files and the like by calling Cedar procedures that are declared as external to the Pascal program in any case. PascalInlineFiles is an inline version of PascalWizardFiles. Be warned that using the inline version will make the modules into which you have broken your Pascal program somewhat less likely to make it past the size limits of the Mesa compiler.
If you choose to use the PascalNoviceFiles version of the runtime file support, you should set the InventFileNames switch in your ``.mod'' file to TRUE. You will then be working in a world with the following properties (at least, this is what Ed McCreight built at one time): Except for the standard files, all other files by default carry the name of the file variable. To alter the default, you call a procedure PascalAttachFile in PascalNoviceFiles with several arguments including a string containing the name of the file, and after that you RESET or REWRITE the file and everything proceeds normally. I usually insert this call on the Mesa side of the compilation, and it goes like this if f is an input text file, and you want it to be called "foo":
PascalAttachFile[file: @f.baseFile, name: "foo", dir: input, itemSize: byte];
The other value of the dir parameter is output. The itemSize parameter is byte for file of character and word for anything else. The name parameter defaults to NIL, and itemSize defaults to byte. If you omit the name parameter, it will prompt you. There is no built-in editing: if you make a mistake, you start over.
The text file T is a record with a 6-bit control field, T.control, whose definition can be found in PascalNoviceFiles. Its purpose is to control how T will handle upper/lower case and control characters. The default is that a file will read upper and lower case but will read all control characters except CR as blank, and that the file will write upper and lower case, and all control characters exactly as specified. The default is applied only at module START time. After that you can change these control fields in Mesa to your heart's delight.
Refs versus Long Pointers
Suppose that PasMesa is compiling some Pascal program into Mesa code intended to run within Cedar. At first blush, you might suspect that the right thing to do would be to have PasMesa use REF's and Cedar safe storage as the implementation of Pascal pointer data types. It even seems at first blush that there might be a reasonable chance of making PasMesa's output ``safe'' in the Cedar sense. Unfortunately, these hopes are torpedoed by the presence of VAR parameters in Pascal. After all, if VAR parameters were easy to implement within the current Cedar, they would have been implemented. There is a plan to implement them someday, even though they are hard, and perhaps PasMesa's output can be made SAFE at that time.
Why are VAR parameters hard? First of all, note that one can't implement VAR parameters simply by using REF's, since the actual supplied for a VAR parameter may be a component of a record or array, while REF's can only point at top level objects. This problem could be solved by passing both a REF and an offset as the way of referring to a supplied actual by VAR. But there is a more subtle problem. Under normal conditions, REF's only point at objects in the heap, that is, in counted zones. But the value being passed by VAR might be in a counted zone, or in an uncounted zone, or in a frame. Even if we were willing to stretch the rules somewhat, and use LOOPHOLE's to acquire REF's to uncounted storage, we wouldn't know in general when to increment and decrement reference counts and when to leave them alone.
Hence, even in Cedar, PasMesa uses LONG POINTER's rather than REF's. This means that Pascal programs are unsafe, in the technical Cedar sense.
There is one place, however, where Pascal files are allowed to play with REF's, and that is in the various implementations of Files. The Pascal runtime, given the Pascal record that represents a Pascal file, has to be able to get its hands on an IO.STREAM, the Cedar object that implements the file. And an IO.STREAM is REF-containing; we don't have any choice about that. The safest way to do this association would be to let the Pascal program (which is unsafe code, after all) deal only with integer indices, called JFN's in TENEX. The runtime would have a data structure to map these JFN's quickly into the appropriate IO.STREAM. But I was too lazy to write the necessary code, so I played a little faster and looser. The Pascal record that represents a file actually contains a LONG POINTER to a record in counted storage, one field of which is a REF to the IO.STREAM. Because this record is referred to by LONG POINTER's, its reference count is not reliable. The Pascal runtime deals with this by linking all of these records onto a single linked list with REF's, so that they will have reference count 1 in perpetuity; they are never freed. This isn't a big problem, since the records are not large.
Restrictions and Limitations
Mesa 6 has several annoying restrictions and limitations that are not present in most Pascal implementations. For some of these I have provided a detour; for others you will need either to edit the Pascal program or to edit the resulting Mesa.
1. Identifiers. For comparison purposes, Pascal identifiers are capitalized and all characters are compared. This can cause (and has caused) some identifiers that were treated as the same by some other Pascal compilers to be treated as different by this one (e.g., if their first difference is in the 15th character). This happens so often that I put in a special feature that tries to figure out what was probably meant. It looks for any exact match in the context stack, or failing that, the longest match among all identifiers in the context stack with the correct first eight letters. There may still be mismatches in the translated Mesa, but at least PasMesa itself doesn't blow up.
In the Mesa translation, the first letter is capitalized and all others are in lower-case. This avoids stepping on Mesa's reserved words. The character 137B (underscore, or ← in Xerox' version of Ascii) is not translated, but rather causes the next translated letter to be capitalized.
You may find yourself frustrated after a while by the necessity of dealing with two different syntaxes for identifiers. In Pascal, case is not significant and ``←'' is used to mark word boundaries; while in Mesa, case is significant, and is itself used to mark word boundaries. This can be quite annoying when, for example, you are debugging the running Mesa from the Pascal source, and you want to find the value of some variable. To ameliorate this situation, Mike Plass wrote a special hack program called CapsArrows, which is available through PasMesa.df. Running CapsArrows adds a button to the system window at the top of the screen. When this button is left-clicked, the currently selected text is translated from caps (Mesa) format to arrows (Pascal) format; right-clicking does the reverse. CapsArrows is a useful hack, offered for whatever it is worth, but don't expect perfection. For example, it will blow up if the selected text spans node boundaries.
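The two identifier rules in this item (the Pascal-to-Mesa spelling, and the lookup that guesses what was probably meant) can be sketched in Python as follows. This is not PasMesa's code. The function names are invented, the boundary character is written as "_" rather than ←, and reading "longest match" as "longest common prefix" is an assumption, as is breaking ties in favor of the earliest entry in the list.

```python
def pascal_to_mesa(name, boundary="_"):
    """Spell a Pascal identifier the way the translated Mesa would.

    The first letter is capitalized, other letters are lower-cased, and the
    boundary character is dropped while capitalizing the next letter.
    """
    out = []
    capitalize_next = True  # the first letter is always capitalized
    for ch in name:
        if ch == boundary:
            capitalize_next = True
        elif capitalize_next and ch.isalpha():
            out.append(ch.upper())
            capitalize_next = False
        else:
            out.append(ch.lower())
    return "".join(out)


def common_prefix_length(a, b):
    """Number of leading characters that a and b share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def resolve(name, context_stack, prefix_len=8):
    """Guess which declared identifier a use refers to, or return None.

    An exact match wins; otherwise, among identifiers agreeing on the first
    prefix_len characters, the one with the longest common prefix is chosen.
    """
    key = name.upper()  # Pascal identifiers are compared capitalized
    candidates = [ident.upper() for ident in context_stack]
    if key in candidates:
        return key
    best, best_len = None, 0
    for ident in candidates:
        n = common_prefix_length(key, ident)
        if n >= prefix_len and n > best_len:
            best, best_len = ident, n
    return best
```

For example, pascal_to_mesa("other_vars") gives "OtherVars", and resolve("printoutputline", ["PRINTOUTPUTLINES", "PRINTOUTFILE"]) picks "PRINTOUTPUTLINES".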
2. Record length. Mesa puts a limit of 4096 words on the length of a record, and construes local and global frames to be records. Most Pascal systems have no such limitation.
For an array variable declared in the global frame or a local frame, a situation that seems to arise all the time, the solution is to declare that array to be allocated from the heap instead.
A general solution for records would be really complicated, because it involves introducing new pointers within the record, and one is confronted with allocating, freeing, and copying indirectly-referenced sub-records. This situation seems to arise mostly when the programmer wants to read/write a heterogeneous binary file from/to an earlier/later Pascal program execution. He does this by declaring a huge record type, and a file whose records are of that type, and reading/writing one record. At this point your only hope is that he localized this activity in one place, so you can change it easily.
3. Integer size. In the Cedar PasMesa, the type PascalInteger is defined to be INT by the PascalBasic portion of the runtime package, where INT is a 32-bit integer. Be warned that INT's are not yet fully supported by the Mesa compiler. For example, you can't have a subrange type of INT whose representation would exceed 16 bits.
4. Precision of integer arithmetic. Be warned also that the current Mesa compiler behaves somewhat unpleasantly as regards arithmetic. If you add two 16-bit quantities, the Mesa compiler will compile a 16-bit add, blithely assuming that the sum will fit into 16 bits as well. Furthermore, there is no overflow checking at runtime. This caused us difficulties during the port of TeX, so I implemented the following dodge. PasMesa now keeps track of the compile-time upper and lower bound of all scalar quantities. When PasMesa is compiling an arithmetic expression, it does interval arithmetic on these bounds. If the result can be guaranteed by this interval arithmetic to fit into 16 bits, PasMesa compiles Cedar code that will compile a 16-bit operation; else, PasMesa coerces one of the arguments to INT to force the arithmetic to be done long.
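The interval-arithmetic dodge can be illustrated with a small Python sketch. It is a simplification under stated assumptions, not PasMesa's code: bounds are (low, high) pairs, only addition, subtraction, and multiplication are shown, and "fits into 16 bits" is taken to mean the signed range -32768..32767. All of the names are invented for the illustration.

```python
INT16_MIN, INT16_MAX = -32768, 32767  # assumption: signed 16-bit range


def add(a, b):
    """Bounds of x + y when x lies in interval a and y in interval b."""
    return (a[0] + b[0], a[1] + b[1])


def sub(a, b):
    """Bounds of x - y when x lies in interval a and y in interval b."""
    return (a[0] - b[1], a[1] - b[0])


def mul(a, b):
    """Bounds of x * y: the extremes occur at products of the endpoints."""
    products = [x * y for x in a for y in b]
    return (min(products), max(products))


def fits_16_bits(bounds):
    """True if every value in the interval fits in a signed 16-bit word."""
    low, high = bounds
    return INT16_MIN <= low and high <= INT16_MAX


def plan(op, a, b):
    """Decide whether an operation may be done short or must be done long."""
    return "short" if fits_16_bits(op(a, b)) else "long"


small_sum = plan(add, (0, 1000), (0, 1000))    # result lies in 0..2000
big_sum = plan(add, (0, 30000), (0, 30000))    # result may reach 60000
product = plan(mul, (-200, 200), (-200, 200))  # result may reach 40000
```

Here small_sum is "short", while big_sum and product are "long", which corresponds to coercing one argument to INT.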
5. Sets. I have implemented Pascal sets as Mesa packed arrays of BOOLEAN. There are three basic sizes: 16 elements, 64 elements, and 256 elements. PasMesa chooses the next larger size. Most of the 16-bit operations are inline single-word BOOLEAN operations, and should be quite fast. You are hereby warned that Alto Mesa 6 as of this date mis-compiles references to packed array of BOOLEAN beyond the 256th word of a record. (But the bug is apparently fixed in the Alto Mesa 6.1 compiler.) [I don't know about the Cedar compiler in this respect.] Moral: keep sets close to the beginning of records/frames.
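As a rough Python illustration of the size choice, assuming that "next larger size" means the smallest of the three sizes that holds every element. The function name is invented, and what PasMesa does with sets of more than 256 elements is not stated here, so the sketch simply raises an error in that case.

```python
SET_SIZES = (16, 64, 256)  # the three basic sizes named above


def set_size_for(element_count):
    """Pick the smallest basic set size that can hold element_count elements."""
    for size in SET_SIZES:
        if element_count <= size:
            return size
    # Assumption: the behavior beyond 256 elements is unknown, so refuse.
    raise ValueError(f"{element_count} elements exceed the largest set size")
```

Under this reading a set of 10 elements takes the 16-element form, and a set of 17 elements takes the 64-element form.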
6. Go To's. (Ramshaw) The GOTO statement in Pascal is of the traditional variety; but the GOTO statement in Mesa is quite restricted, since it always forces an exit from some enclosing block. Let us first consider local GOTO's, that is, GOTO's that do not jump out of a procedure body. PasMesa will insert additional blocks and loops into the translated Mesa program as necessary, in order to handle most combinations of forward and backward local GOTO statements in the Pascal input. There is one thing that PasMesa won't handle, however. Call a pair of GOTO statements head-to-head if one of them jumps backward while the other jumps forward into the loop formed by the backward jump; that is, the two GOTO statements of a head-to-head pair jump into each other's interiors. As long as the Pascal input does not include any head-to-head pairs of GOTO's, PasMesa will output correct and equivalent Mesa code. Any head-to-head pairs that do occur in the Pascal source will engender translation failures in PasMesa that will reveal themselves as error messages from the Mesa compiler.
Some Pascal implementations also allow non-local GOTO's, that is, GOTO statements that jump all of the way out of a procedure body to a label defined in some enclosing block. Consider a non-local GOTO from the point of view of the target label L and the block B in which it is defined. The GOTO statement itself doesn't appear as a statement of the block B (that would be a local GOTO). Instead, some statement in B calls a procedure P, and the GOTO is either a statement in the body of P or in the body of a procedure that P calls. We will define a non-local GOTO to be either forward or backward depending upon whether control goes forward or backward in the block B after it manages to work its way out of the invocation of P. PasMesa handles forward non-local GOTO's by using the Mesa signal/error machinery. But backward non-local GOTO's are a problem. They will be translated into Mesa code that will compile without error, but any attempt to execute the backward non-local goto will result in an uncaught signal. I suspect that handling horrible errors is the only good excuse for non-local GOTO's in any case, and error-handling GOTO's in general jump forward. Thus, I hope that PasMesa's current problems with backward non-local GOTO's will not prove to be a major difficulty.
If you would like to learn more about how PasMesa achieves these wonders with gotos, you might enjoy reading my blue-and-white report on the topic, entitled ``Deftly Replacing goto Statements with exit's,'' which is, I believe, number CSL-83-10.
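The head-to-head condition for local GOTO's can be checked mechanically. The Python sketch below is one reading of the definition, not the algorithm from the report. It assumes each GOTO is given as a (source, target) pair of statement positions within one procedure body, treats a jump with target at or before its source as backward, and takes the loop formed by a backward jump to run from its target through its source.

```python
def head_to_head_pairs(gotos):
    """Return the (backward, forward) pairs of gotos that are head-to-head.

    gotos is a list of (source, target) statement positions (hypothetical
    representation). A pair qualifies when the forward jump starts before
    the loop formed by the backward jump and lands strictly inside it.
    """
    backward = [(s, t) for s, t in gotos if t <= s]
    forward = [(s, t) for s, t in gotos if t > s]
    pairs = []
    for b_source, b_target in backward:
        for f_source, f_target in forward:
            # The forward jump starts before the loop head and lands past
            # the head but no later than the backward goto itself.
            if f_source < b_target < f_target <= b_source:
                pairs.append(((b_source, b_target), (f_source, f_target)))
    return pairs


bad = head_to_head_pairs([(10, 4), (2, 7)])   # forward jump into the loop 4..10
fine = head_to_head_pairs([(10, 4), (5, 8)])  # forward jump stays inside the loop
```

In the first call the forward jump from 2 to 7 enters the loop from outside, so the pair is reported; in the second the forward jump starts inside the loop, so nothing is reported.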
7. Local Files. I don't implement local file variables (that is, a file variable that is local to a procedure invocation); all files must be global.
8. Subranges. In most Pascals, it is permissible to send an integer declared to be in the range [2..3] to a VAR parameter of type INTEGER. This is illegal in Mesa for two reasons. The first is that it permits unconscious breaching of the Mesa type system, and bounds checking is nearly impossible. Worse yet, Mesa represents its subranges in biased form, so you would get the wrong answer. This Pascal feature is used fairly often, so I compile a Mesa LOOPHOLE if the biases are compatible. This disables tight type checking, and you may want to fix the Pascal program so that it only sends variables of type X to VAR parameters of type X.
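To see why the biased representation gives the wrong answer, consider this small Python model. It assumes that a subrange is stored as its value minus its lower bound, that a plain integer has bias 0, and that "compatible" biases means equal biases; the function names are invented for the illustration.

```python
def encode(value, bias):
    """Stored representation of a value in a type with the given bias."""
    return value - bias


def decode(raw, bias):
    """Value denoted by a stored representation under the given bias."""
    return raw + bias


def biases_compatible(actual_bias, formal_bias):
    """Assumption: a LOOPHOLE is only harmless when the biases are equal."""
    return actual_bias == formal_bias


raw = encode(3, bias=2)            # the value 3 in a variable of type [2..3]
as_integer = decode(raw, bias=0)   # what a VAR INTEGER parameter would see
as_subrange = decode(raw, bias=2)  # what the caller actually meant
```

Here raw is 1, so the callee reading it as an unbiased integer sees 1 rather than 3. A subrange such as [0..10] has bias 0 under these assumptions, and could be passed through a LOOPHOLE without changing its value.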
9. Program Headers. I ignore the files specification part of the program header. In particular, I don't do an automatic RESET or REWRITE on any files mentioned in the header.
10. Character set. The character set is full Ascii. That means that string constants in your program will appear in the case in which you typed them (except for the CAPITALIZE option; see above), and that when reading an input file you will see every non-control character exactly as it appears in full Ascii. Control characters are turned into blanks except for CR, which turns into a blank coupled with the EOLN condition (I/O is modified by the control field of the record representing a text file; see above).
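The treatment of input characters can be summarized by the following Python sketch. It models only the rule stated in this item; the function name is invented, and treating exactly the codes below 32 as the control characters is an assumption.

```python
CR = "\r"


def pascal_chars(text):
    """Yield (char, eoln) pairs as a Pascal program would see the input."""
    for ch in text:
        if ch == CR:
            yield (" ", True)   # CR reads as a blank with EOLN set
        elif ord(ch) < 32:      # assumption: codes below 32 are the controls
            yield (" ", False)  # other control characters become blanks
        else:
            yield (ch, False)   # everything else passes through unchanged


seen = list(pascal_chars("a\tB\rc"))
```

Note that the upper-case B passes through unchanged, since the character set is full Ascii.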
11. Text I/O. PasMesa only supports text I/O of integers, reals, characters, strings, and booleans. It doesn't support other enumerated types, or sets, or arrays, or records. There are fairly simple ways of getting around this in either Pascal or Mesa.
12. Depth of Procedure Nesting. There were three bits free in a word in the symbol table, so Mesa limits this depth to 8, minus a few. I have encountered programs that exceed this limit by two, and I suppose worse is possible. I know this is a silliness in Mesa, but I suggest that you fix the Pascal source code.
13. Global Variables. (Ramshaw) Pascal programs frequently include many global variables, and the current Mesa compilers place fairly tight limits on the number of variable interface items that are allowed in a DEFINITIONS-type .bcd. PasMesa's modularization mechanisms discussed above are enough to handle this problem, since you may declare more than one DEFINITIONS module in the .mod file, and parcel out the global variables among them by name. But this is sometimes tedious. If you are willing to be courageous, there is an easier way: importing a PROGRAM module. (This is the nefarious process associated with the name POINTER TO FRAME.) It is legal in Mesa, although unusual, to have one PROGRAM module import another PROGRAM module instead of a DEFINITIONS module. At the price of introducing a compilation dependency between the two PROGRAM modules involved, this allows the latter module to get at all of the variables, types, and procedures of the former one. Furthermore, PROGRAM-type .bcd's are allowed to have more variable items than DEFINITIONS-type .bcd's. To avail yourself of this technique, simply declare a PROGRAM module in the .mod file to which all of the variables are sent (with $other←vars$, presumably), and then list this PROGRAM module as an import when you declare the rest of the PROGRAM modules.
If you decide to go this route, be warned that some Mesa debuggers get confused about this kind of importation; if you ask about an unqualified global variable name, you may get an incorrect answer. In conversations with such debuggers, you will have to qualify all global variable names with the name of the module that defines them.
14. External procedures. For performance reasons, or simply for convenience, it is sometimes helpful to implement some of the basic operations of a Pascal system directly in Mesa, rather than in Pascal. For example, you might want to invoke Cedar graphics primitives from a Pascal program, or you might want to do input and output more efficiently than the standard Pascal runtime support allows. To do so, PasMesa allows you to declare procedures in the outermost block of the Pascal source simply as ``external''. PasMesa will generate a correct Mesa header for each external procedure, and put that header into the DEFINITIONS module that you specify. It will also put a copy of the header without the required body into the PROGRAM module that you specify. You are expected to take this PROGRAM module and edit it as necessary to supply a body for each external procedure.
It would be quite tedious if PasMesa turned around and smashed your carefully hand-written procedure bodies with another copy of the header hints the next time that it was run. To avoid this, I suggest that you specify a funny extension, such as ``.hint$'', for the PROGRAM modules onto which PasMesa is to write these header hints.