MICRO
MACHINE-INDEPENDENT
MICROASSEMBLER
11 July 1980
by
Edward Fiala
Peter Deutsch
Butler Lampson
Xerox Palo Alto Research Center
3333 Coyote Hill Rd.
Palo Alto, CA. 94304
Filed on: [Maxc1]<AltoDocs>Micro.Press
Sources on: [Ivy]<DoradoSource>MicroMemo.Dm
This manual describes a machine-independent microassembly language originally developed for the Maxc1 computer and since used for the Maxc2, Dorado, and D0 computers as well as for several smaller projects.
This manual is the property of Xerox Corporation and is to be used solely for evaluative purposes. No part thereof may be reproduced, stored in a retrieval system, transmitted, disseminated, or disclosed to others in any form or by any means without prior written permission of Xerox.
TABLE OF CONTENTS
1. Introduction
2. Assembly Procedures
3. Error Messages
4. Assembly Listings
5. Cross Reference Listings
6. Comments
7. Statements
    7.1 Builtins
    7.2 Defining Symbols
    7.3 Tokens
    7.4 Neutrals and Tails
    7.5 Clause Evaluation
    7.6 Treatment of Arguments
    7.7 Undefined Symbols
        7.7.1 Destination Addresses
        7.7.2 Octal Numbers
        7.7.3 Literals
8. Integers
9. Macros
10. Neutrals
11. Fields, Assignments, and Preassignments
12. Conditionals
13. Memories, Addresses, and Stores
    13.1 Target Memory
    13.2 Default Statement
    13.3 Post Macros
14. Repeat and While
15. Select
16. Bit Tables
17. Multi-Statement Conditionals
18. Trace Mode
Appendix 1: Micro Error Messages
Appendix 2: Limitations of the Language
Appendix 3: Binary Output Format
Table 1: Builtins
Table 2: Top Level Evaluation
Table 3: Conditionals
Table 4: Micro Binary Output File Format
1. Introduction
This document describes MICRO, originally implemented in 1971 for NOVA in Algol to assemble microprograms for the Maxc1 microprocessor. It has since been reimplemented for Alto in Bcpl and is now used to assemble microprograms for Maxc1, Maxc2, Dorado, and D0. Its output format is compatible with the MIDAS loader/debugger, for which there are versions on each of these four machines.
Micro is a rather unspecialized one-pass assembler. It does not know anything specific about the target machine, but instead has a general facility for defining fields and memories, a standard string-oriented macro capability, and a rather unusual parsing algorithm which allows setting fields in memories in a natural way by defining suitable macros and neutrals with properly chosen names.
This document will be of interest primarily to someone who is going to define a new assembly language for some machine. There are a number of complications inside Micro that this person must be aware of when defining the language. However, once the language has been appropriately defined, the interface seen by someone writing programs for a target machine is natural and simple.
In other words, if you were going to write microprograms for Dorado or D0, for example, you would need to read "The Dorado Microassembler" or "The D0 Microassembler", which define languages for those machines, but would probably not require this document.
2. Assembly Procedures
To assemble microprograms on your Alto, you must obtain [Maxc]<Alto>Micro.run or [IVY]<Alto>Micro.run. In addition, you will need to get the definition file(s) for the particular microlanguage that you will be using (see other relevant documentation).
Micro flushes Bravo trailers, so you can use Bravo formatting in the preparation of microprograms. However, MCross, a Maxc program that produces cross-reference listings of Micro programs, does not ignore Bravo trailers, so you may not use any Bravo formatting features if you are going to use MCross. In addition, error messages produced during assembly have line numbers that will be more difficult to correlate with source statements if automatic Bravo line breaks occur in the source text rather than explicit carriage returns.
We recommend use of GACHA8 (i.e., a relatively small fixed pitch font) for printing hardcopy microprogram listings, and the use of GACHA10.AL for editing microprograms with Bravo. Bravo tab stops should be set at precisely 8 character intervals for identical tabulation in Bravo and MCross.
The two relevant lines in USER.CM for Bravo are:
FONT:0 GACHA 8 GACHA 10
TABS: Standard tab width = 1795
You will probably want to delete the other Font lines for Bravo in USER.CM.
Suppose that you have prepared a language definition file LANG.MC and a number of source files for assembly by Micro. Then a microassembly is accomplished by the following dialog with the Alto Executive:
MICRO/L LANG SRC0 SRC1 ... SRCn
This causes the source files "LANG.MC", "SRC0.MC", ... , "SRCn.MC" to be assembled. The binary output and symbol table at the end of assembly are written onto "SRCn.MB" and "SRCn.ST", the error messages onto "SRCn.ER", and an assembly listing onto "SRCn.LS".
In other words, Micro assembles a sequence of source files with default extension ".MC" and outputs four files whose extensions are ".MB", ".ER", ".LS", and ".ST". The default name for these is the name of the last source file assembled. Direct output to particular files as follows:
MICRO SYS/L/B LANG SRC0 SRC1 ... SRCN
This would cause listing output to be put on "SYS.LS" and symbol table and binary output onto "SYS.ST" and "SYS.MB".
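The default naming rule above can be illustrated with a small sketch. This is only a Python model of the rule as described in the text, not Micro's actual code; the function name `default_outputs` is hypothetical, and the sketch ignores the local flags that redirect individual outputs.

```python
def default_outputs(sources):
    """Return the default output file names for a list of source names.

    Models the rule in the text: the four outputs take the name of the
    last source file assembled, with extensions MB, ER, LS, and ST.
    """
    stem = sources[-1]
    # Source names carry the default extension ".MC"; drop it if present.
    if stem.upper().endswith(".MC"):
        stem = stem[:-3]
    return {ext: f"{stem}.{ext}" for ext in ("MB", "ER", "LS", "ST")}


outputs = default_outputs(["LANG", "SRC0", "SRC1"])
print(outputs["MB"], outputs["ST"], outputs["ER"], outputs["LS"])
```

Note that the listing file is only produced when a /L flag is given, so the "LS" entry is the name the listing would receive, not a guarantee that it is written.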
A summary of the local and global flags for Micro is as follows:
Global:
    /L    produces an expanded listing of the output
    /N    suppress binary output
    /O    suppress symbol table output
    /U    convert text in all source files to upper case
Local:
    /R    recover from symbol table file.
    /L    put expanded listing on named file
    /B    puts binary output and symbol table output on named file with extensions .MB and .ST, respectively. Default error listing to named file.
    /E    put error listing on named file
    /S    put symbol table on named file
    /U    convert text in named file (and any file which it INSERT’s) to upper case
Local flags override global ones.
INSERT[file];
statements may be put into source files so you don’t have to type as many source names on the command line. An INSERT statement is exactly equivalent to the text of file.mc appearing in its place. INSERT’s may be nested to a reasonable depth. However, although INSERT saves typing, it is slower than putting the file names on the command line: Micro uses a fast file-lookup routine to get handles on every file named in the command line in about 1 second, whereas each INSERT adds an additional 1 second for file name lookup.
Another shortcut is to define a command file MI containing "Micro/O/U LANG" or whatever and then type "@MI@ SRC0 ... SRCN", which avoids some typing.
The SETMBEXT[.ext] builtin allows the binary output file extension to be changed from .MB to something else. This declaration has to be assembled before defining any memories (else the output file will have already been opened with extension .MB). The Dorado and D0 microassemblers use this to change the extension to .DIB, as expected by the postprocessor, MicroD.
Micro creates a temporary file Micro.fixups and deletes it at the end of assembly. If you abort assembly with shift-swat, you may delete it yourself.
Micro’s binary output is generated in one pass and consists of memory definitions, store directives to memories, forward and external reference fixup directives, and new or changed address symbols for each memory. The block types written on the output file are given in Appendix 3.
Micro assembles declarations at a rate of about 60 statements/second and, with typical microlanguages, assembles microinstructions at about 7 statements/second. On very large assemblies this rate slows slightly as the symbol table grows larger. The assembly time for the Maxc system microcode is about 7 minutes (~2000 72-bit microinstructions, ~500 36-bit words in other memories, ~500 definitions, and ~1400 addresses).
Comments are flushed very quickly by the prescan, so do not worry about a profusion of comments slowing assembly.
Presently, the Micro-Midas system has no provision for relocating independently assembled source programs. However, the Micro symbol table is dumped onto a file at the end of the assembly. Later, assembly can be continued at that point onto another binary output file, thereby reducing assembly time. For example, you can build a LANG.ST file as follows:
MICRO LANG
Then do all further assemblies as follows:
MICRO/OU LANG/R SYS/B SRC0 ... SRCN
This saves a little assembly time but still does not allow several people to independently maintain sources used in a common system.
To avoid reassembling unchanged files, one would have to partition the program into separate assemblies, each of which used absolute location counters for the various memories. This would be difficult and probably not as good as reassembling everything. However, if this were done, Midas could link external references between the different modules at load time.
The MicroD program, used to post-process Micro assemblies for Dorado and D0, has limited provisions for relocation. Programmers using the Dorado or D0 microlanguages should read the relevant documentation.
3. Error Messages
During assembly, any error messages are output both to the terminal and to the error file. If an assembly listing is being printed, the error messages are also printed there.
As Micro churns through the source files it prints the name of each on the error file (and terminal), and when INSERT[file] statements appear it outputs "* FILE file ..." and "* RETURN to file" messages. These will pinpoint any error message to a particular source file.
Micro error messages are in one of two forms, like the following:
statement
218...error message
statement
TAG+39...error message
The first example indicates an error in a statement beginning on the 218th line of the source file. This form is used for errors that precede the first label in the source file. The second form is used afterwards, indicating an error on the 39th line after the label "TAG". Micro also prints the source statement causing the error before printing the error message.
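The two location forms can be sketched as follows. This is a minimal Python illustration, assuming a hypothetical helper `error_location` that is told the current line number and, when a label has already been seen, the label name and the line on which it appeared. The numbers in the examples of the text (218, 39) contain the digits 8 and 9, so the sketch prints line counts in decimal.

```python
def error_location(line, label=None, label_line=None):
    """Format the location prefix of an error message.

    Before the first label in a source file the absolute line number is
    used; afterwards the location is given relative to the latest label.
    """
    if label is None:
        return f"{line}..."
    return f"{label}+{line - label_line}..."


print(error_location(218) + "error message")
print(error_location(139, label="TAG", label_line=100) + "error message")
```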
Note that the line count measures carriage returns in the source, so if you are using Bravo formatting in the source files, you may have trouble distinguishing carriage returns from line breaks inserted by Bravo’s hardcopy command.
ER is the builtin by which a Micro program outputs fatal and non-fatal errors, warnings, and other messages.
ER[message,stopcode,value]
Blanks are squeezed out of the message argument (a literal string) by the prescan so "-" "." or other printing characters should be used instead of blanks in the message.
A stopcode of 0 is used for progress messages, 1 for fatal errors, 2 for ordinary errors, and 3 for warnings. A null string in the stopcode defaults to 0. Assembly continues from the error except when a fatal error (stopcode=1) is evaluated. Ordinary errors are counted together with Micro builtin errors; warnings are counted separately. At the end of assembly the error and warning counts are printed on the Alto display; only when these counts are non-zero do you have to look in the .ER file for details.
ER first prints the source statement and then the message in one of the two forms given above; then, if the value argument is present, it is evaluated (e.g., it may be an IP or other arithmetic expression) and printed in octal; then, if stopcode is 1, assembly is aborted.
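The stopcode handling described above can be modelled roughly as below. This is a Python sketch under stated assumptions, not Micro's implementation: the names `Counts`, `FatalAssemblyError`, and `er` are hypothetical, the source statement and location prefix are omitted, and the sketch assumes that a fatal error is not added to the ordinary error count (the text does not say).

```python
from dataclasses import dataclass, field


class FatalAssemblyError(Exception):
    """Stands in for aborting the assembly on stopcode 1."""


@dataclass
class Counts:
    errors: int = 0
    warnings: int = 0
    log: list = field(default_factory=list)


def er(counts, message, stopcode="", value=None):
    """Model ER[message,stopcode,value]."""
    code = 0 if stopcode == "" else int(stopcode)  # null string defaults to 0
    # The message is printed first, then the value (if present) in octal.
    line = message if value is None else f"{message} {value:o}"
    counts.log.append(line)
    if code == 2:
        counts.errors += 1        # ordinary error
    elif code == 3:
        counts.warnings += 1      # warning, counted separately
    elif code == 1:
        raise FatalAssemblyError(message)  # assembly is aborted


counts = Counts()
er(counts, "Pass-1-done")                 # progress message
er(counts, "Bad-field-value", "2", 0o34)  # ordinary error with a value
er(counts, "Odd-address", "3")            # warning
print(counts.errors, counts.warnings, counts.log)
```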
When the assembly is not aborted, assembly of the statement in which the error occurred will continue from the point of the error. This may result in more error messages if the assembler gets confused by an undefined symbol or some other condition. The location counter gets incremented iff at least one store is done by the statement, so a statement with an error may still generate an output word, or it may not.
A summary of Micro builtin error messages is given in Appendix 1.
4. Assembly Listings
An expanded listing is produced only when either the global or local /L option is selected. When the listing file is being produced, the information output is controlled independently for each memory by the LIST builtin.
LIST[memory,mode]
controls assembly listing output for all stores to the selected memory. The value of mode is bit-encoded as follows:
1     enable listing of stores in the memory as octal numbers; by default these are divided into 12-bit groups starting at the right-most bit of the value; the bit of value 20 and the LISTFIELDS builtin modify the form of the octal printout.
2     list stores in the memory as field assignments;
4     produce a numerically-ordered list of symbols at end of assembly;
10    produce an alphabetically-ordered list of symbols at end of assembly;
20    makes the octal printout divide stores into 16-bit groups.
The actions of these bits are or’ed. LIST may be given many times during the assembly, to enable/disable listing output for code sections with difficult bugs. The value of mode at the end of assembly determines whether or not numerically or alphabetically-ordered address lists are printed.
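Since mode is a set of or'ed octal bits, a value such as 21 means "octal listing" plus "16-bit groups". A small Python sketch of the decoding follows; the table `LIST_BITS`, the function `decode_list_mode`, and the short descriptions are illustrative names chosen here, not part of Micro.

```python
# Bit values are octal, as in the list above.
LIST_BITS = {
    0o1: "octal stores",
    0o2: "field assignments",
    0o4: "numeric symbol list",
    0o10: "alphabetic symbol list",
    0o20: "16-bit groups",
}


def decode_list_mode(mode):
    """Return the listing options enabled by a LIST mode value."""
    return [name for bit, name in LIST_BITS.items() if mode & bit]


print(decode_list_mode(0o21))  # octal stores, divided into 16-bit groups
print(decode_list_mode(0o3))   # octal stores and field assignments
```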
When a statement of the form:
ANAME[(TAG: mumble)];
is assembled, the listing output would be as follows:
302    (TAG)    NNNN NNNN ... NNNN          for mode 1
302    (TAG)    NNNNNN . . . NNNNNN         for mode 21
302    (TAG)    F1←3, F2←34, F3←20;         for mode 2
302    (TAG)    NNNN NNNN ... NNNN          for mode 3
                F1←3, F2←34, F3←20
Mode equals 0 disables all listing for the specified memory.
F1, F2, and F3 in the above example represent all the fields to which explicit assignments were made during the assembly of (mumble). Fields which have non-zero values due to the action of a DEFAULT statement for the memory are not listed, nor are preassignments listed. Also, fields filled in by forward references will be erroneously listed as containing their default value.
Error messages are printed on the line after the listing of the memory word or between memory words if no field assignments were completed in the statement.
LIST[,mode], where the memory name is null, AND’s mode with the listing mode for all memories other than the target, e.g. LIST[,0] suppresses listing of all non-target memories and LIST[,3] restores.
The LISTFIELDS builtin can be used to control the assembly listing more precisely. Micro assembles
LISTFIELDS[MNAME,(clauselist)];
as a word for memory MNAME and then notes the positions of all the 1-bits in the result. Thereafter, in the octal listing for that memory, rather than each field being precisely 12 or 16 bits wide, 1-bits in the word given to LISTFIELDS are taken as the rightmost bits of the fields. For example, if the word contains 1-bits only in positions 2, 5, and 6, the octal listing will show a 3-bit field (bits 0:2), another 3-bit field (bits 3:5), a 1-bit field (bit 6), and then the rest of the word chopped up into groups of 12 or 16 bits.
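The field boundaries implied by a LISTFIELDS word can be worked out as in the following Python sketch. It numbers bits from 0 at the left, as the example above does. How the remainder of the word is grouped is not spelled out in the text; the sketch assumes it is chopped into 12- or 16-bit groups counted from the right-most bit, as in the default listing format. The function name `listfields_groups` is hypothetical.

```python
def listfields_groups(width, one_bits, group=12):
    """Return (first_bit, last_bit) pairs for the octal listing.

    width    -- number of bits in a word of the memory
    one_bits -- positions of the 1-bits in the LISTFIELDS word
    group    -- 12 by default, 16 when the mode bit of value 20 is set
    """
    fields = []
    start = 0
    for pos in sorted(one_bits):
        fields.append((start, pos))  # each 1-bit ends a field
        start = pos + 1
    # Assumption: the rest of the word is grouped from the right.
    rest = []
    end = width - 1
    while end >= start:
        first = max(start, end - group + 1)
        rest.append((first, end))
        end = first - 1
    rest.reverse()
    return fields + rest


# The example from the text, for a hypothetical 36-bit memory.
print(listfields_groups(36, [2, 5, 6]))
```

With no 1-bits at all the function reduces to the default format, for example three 12-bit groups for a 36-bit word.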
The mode argument to LIST determines whether or not the stores are printed, but LISTFIELDS controls the format of the numerical printout whenever that is turned on by the bit of value 1 in mode.
5. Cross Reference Listings
A Tenex program called MCross will parse source files according to Micro syntax and produce cross-reference listings. Several simple files must be prepared to tailor MCross for the language file being used. These files eliminate the garbage tokens that would otherwise clutter the cross-reference listing.
A cross-reference listing is not very useful for small microprograms but becomes increasingly valuable for large systems. Consequently, if you are maintaining a large system, you will probably wish to obtain an account on our Maxc timesharing system. Occasionally, you will dump the sources on your Tenex directory and run MCross over them.
A typical dialog with MCross is given below. The program is more-or-less self-documenting and will give you a list of its commands if you type "?".
@MCross
Output file: LPT:GACHA8
Machine: D      (selects Dorado syntax--M for Maxc, 0 for D0)
Action: N       (read def’s, no printout)
File: LANG<cr>
Action: CL      (read def’s, produce cross ref.)
File: SRC1<cr>
Action: CL
File: SRC2<cr>
. . .
Action: P       (print operation usage statistics)
Action: G       (print global cross reference)
Action: E
@
6. Comments
Micro ignores all non-printing characters and Bravo trailers. This means that you can freely use spaces, tabs, and carriage returns to format your file for readability without in any way affecting the meaning of the statements.
Comments are handled as follows:
"*" begins a comment terminated by carriage return.
"%" begins a comment terminated by the next "%". This is used for multi-line comments.
";" terminates a statement. Note that if you omit the ";" terminating a statement, and for example, put a "*" to begin a comment, the same statement will be continued on the next line.
Micro has a now-obsolete method of producing conditional comments which is discussed here; at one time this was used for multi-statement conditional assembly, but now conditional assembly builtins discussed later are used instead.
The COMCHAR builtin provides conditional assembly of a large block of instructions by altering the interpretation of comments.
COMCHAR[char]
makes *char be a comment bracket similar to %. Micro will discard everything from an occurrence of *char through the end-of-line following the next occurrence of *char. Note that this is not quite like % because % stops discarding immediately at its matching occurrence.
You can disable this feature with
COMCHAR[]
which is Micro’s initial state. As an example, suppose you want to assemble one of two code sequences depending upon whether some integer symbol X is zero. You could write the following:
IFE[X,0,COMCHAR[#],COMCHAR[=]];
*= here is some code to skip if X neq 0 (assemble if X eq 0)
...
*= end of X eq 0 code
...
*# here is some code to skip if X eq 0 (assemble if X neq 0)
...
*# end of X neq 0 code
COMCHAR[]; *Disable feature
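The difference between a COMCHAR bracket and % can be seen in a small model. The following Python sketch only illustrates the discarding rule stated above (drop everything from *char through the end of the line that follows the next *char); the function name `strip_comchar` is hypothetical, line ends are represented by "\n" rather than carriage return, and discarding the rest of the text when no closing *char is found is an assumption.

```python
def strip_comchar(text, char):
    """Discard the regions bracketed by '*' + char, as COMCHAR[char] would."""
    marker = "*" + char
    kept = []
    i = 0
    while True:
        start = text.find(marker, i)
        if start < 0:
            kept.append(text[i:])   # no more brackets: keep the remainder
            break
        kept.append(text[i:start])
        close = text.find(marker, start + len(marker))
        if close < 0:
            break                   # unterminated bracket (assumption)
        eol = text.find("\n", close)
        if eol < 0:
            break                   # closing line is the last line
        i = eol + 1                 # resume after that end-of-line
    return "".join(kept)


source = (
    "A;\n"
    "*= code for X eq 0\n"
    "B;\n"
    "*= end of X eq 0 code\n"
    "*# code for X neq 0\n"
    "C;\n"
    "*# end of X neq 0 code\n"
)
# With COMCHAR[=] in effect (X neq 0 in the example above), B is skipped.
print(strip_comchar(source, "="))
```

The lines beginning with *# that survive are then ordinary "*" comments, so C is assembled and the bracket lines themselves are ignored.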
7. Statements
After comments and non-printing characters are stripped out, the rest of the text forms STATEMENTS. There is no level of program structure superior to the statement (e.g., conditionals cannot span more than one statement) except for the COMCHAR kludge.
Statements are terminated by ";". You can have as many statements as you want on a text line, and you can spread statements over as many text lines as you want. Statements may be indefinitely long.
However, the size of Micro’s statement buffer limits statements to 500-decimal characters at any one time. If this is exceeded at any time during assembly of a statement, an error message is output. Since horrendous macro expansions occur during instruction assembly, overflow is a possibility, and care is required when defining complicated macros.
The special characters in statements are:
"[" and "]"    for enclosing builtin, macro, field, memory, and address argument lists;
"(" and ")"    for causing nested evaluation;
"←"            as the final character of the token to its left;
":"            to put the address to its left into the symbol table with value equal to the current location in the current memory;
","            separates clauses or arguments;
";"            separates statements;
"#"            #1, #2, etc. are the formal parameters inside macro definitions;
"01234567"     are number components (all arithmetic in octal).
All other printing characters are ordinary symbol constituents, so it is perfectly ok to have symbols containing "+", "-", "&", etc. which would be syntactically significant in other languages. Also, don’t forget that blanks, carriage returns, line feeds, and tabs are syntactically meaningless (flushed by the prescan), so "P+Q" = "P + Q", each of which is a single symbol.
Micro handles all code generation by table lookup and minimal use of conditionals. In particular, it does not evaluate P+Q+1 but rather looks it up in the symbol table. Since P + Q + 1 is the same for a human, we have chosen to suppress all blanks. Other non-printing characters are suppressed so that control characters don’t appear invisibly in print names.
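The effect of the prescan on blanks and on the "*" and "%" comments of section 6 can be modelled as below. This is a Python sketch for illustration only, with a hypothetical function name `prescan`; it does not handle Bravo trailers or COMCHAR brackets, it treats "\n" or "\r" as the end of a line, and the way it resolves a "%" inside a "*" comment (the "*" comment wins) is an assumption.

```python
def prescan(text):
    """Strip comments and non-printing characters from source text."""
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    out = []
    i, n = 0, len(text)
    while i < n:
        ch = text[i]
        if ch == "*":                    # comment to end of line
            j = text.find("\n", i)
            i = n if j < 0 else j + 1
        elif ch == "%":                  # comment to the next "%"
            j = text.find("%", i + 1)
            i = n if j < 0 else j + 1
        elif ch.isspace() or not ch.isprintable():
            i += 1                       # blanks, tabs, line ends, controls
        else:
            out.append(ch)
            i += 1
    return "".join(out)


print(prescan("FOO ← P + Q + 1, * no semicolon, so this continues\nBAR;"))
print(prescan("P+Q") == prescan("P + Q"))
```

The first call shows the point made in section 6: a statement without its terminating ";" simply continues on the next line after the comment is removed.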
Note that name length is limited only by the size of the statement buffer. However, avoid defining address symbols longer than about 13 characters because of problems you will encounter with the debugger Midas.
Statements are divided into clauses by commas. An indefinite number of clauses may appear in a statement.
Examples of clauses are:
NAME,
NAME[ARG1, ARG2, ..., ARGN],
FOO←FOO1←FOO2←P+Q+1        P+Q+1 is a "source" while FOO, FOO1, and FOO2 are "destinations" or "sinks."
P ← STEMP
NAME[N1[N2[ARG]],ARG2]←FOO[X].
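The division of a statement into clauses can be sketched as follows. The Python function `split_clauses` is a hypothetical illustration, not Micro's parser; it assumes that a comma inside "[...]" or "(...)" separates arguments or nested clauses and therefore does not end a top-level clause, and that the input has already had blanks and the terminating ";" removed.

```python
def split_clauses(statement):
    """Split a statement at commas that are not inside [] or ()."""
    clauses = []
    current = []
    depth = 0
    for ch in statement:
        if ch in "[(":
            depth += 1
        elif ch in "])":
            depth -= 1
        if ch == "," and depth == 0:
            clauses.append("".join(current))  # end of a top-level clause
            current = []
        else:
            current.append(ch)
    clauses.append("".join(current))
    return clauses


print(split_clauses("NAME,NAME[ARG1,ARG2],FOO←FOO1←FOO2←P+Q+1,P←STEMP"))
print(split_clauses("NAME[N1[N2[ARG]],ARG2]←FOO[X]"))
```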
7.1 Builtins
All of the predefined operations of Micro are called builtins. With the exception of the BUILTIN and INSERT builtins, none of them have a priori names but instead are assigned names by the programmer. Names are assigned to builtin operations by declaration statements of the form:
BUILTIN[BUILTIN,1];
where the second argument is the intrinsic operation number and the first argument is the name by which it is referred to.
All builtins are called using this same syntax:
NAME[ARG1, ARG2, ..., ARG9];
The all-inclusive list of builtins is given in Table 1. Note that the only print-names assembled into Micro are BUILTIN and INSERT; i.e., the other names in Table 1 are chosen by convention.
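The naming scheme amounts to a table from programmer-chosen names to intrinsic operation numbers, seeded with the two built-in print names. A minimal Python sketch follows; `PREDEFINED` and `declare_builtin` are illustrative names, and the sketch assumes the operation number is written in octal, as all Micro numbers are.

```python
# The only names assembled into Micro, with their numbers from Table 1 (octal).
PREDEFINED = {"BUILTIN": 0o1, "INSERT": 0o24}


def declare_builtin(names, name, number_text):
    """Model BUILTIN[name,number]: bind a name to an operation number."""
    names[name] = int(number_text, 8)
    return names


names = dict(PREDEFINED)
declare_builtin(names, "M", "2")     # conventional short name for MACRO
declare_builtin(names, "IFE", "16")  # if-integers-equal conditional
print(names)
```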
7.2 Defining Symbols
The builtins BUILTIN, MACRO, NEUTRAL, MEMORY, FIELD, and SET are used to define symbols of different types, as discussed later. The name of a defined memory can then be used to define addresses in that memory, and addresses are also defined when labels appear in statements being assembled for storage in a memory. Once a symbol has been defined, it is an error to redefine it as any other type of symbol.
It is legal to change the value of a symbol of type integer.
Redefining a macro is legal (but Micro prints a warning message).
When an address is defined by a label, any attempt to change its value is illegal, but when defined by MEMNAME[symbol,value] it is legal to change the integer part of the value (illegal to change the memory part of the value).
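The redefinition rules above can be summarised in a small model of a symbol table. This Python sketch is an illustration under assumptions, not Micro's data structure: the class and method names are hypothetical, an address value is represented as a (memory, integer) pair, and same-type redefinitions that the text does not mention (for example of a field) are simply allowed.

```python
class SymbolError(Exception):
    """Raised for a redefinition that the text calls an error."""


class SymbolTable:
    def __init__(self):
        self.symbols = {}   # name -> (kind, value, defined_by_label)
        self.warnings = []

    def define(self, name, kind, value, by_label=False):
        old_by_label = False
        old = self.symbols.get(name)
        if old is not None:
            old_kind, old_value, old_by_label = old
            if old_kind != kind:
                raise SymbolError(f"{name}: already defined as {old_kind}")
            if kind == "macro":
                self.warnings.append(f"{name}: macro redefined")
            elif kind == "address":
                if old_by_label and value != old_value:
                    raise SymbolError(f"{name}: label value cannot change")
                if value[0] != old_value[0]:
                    raise SymbolError(f"{name}: memory part cannot change")
            # Integers may change freely. Other same-type redefinitions are
            # not covered by the text; allowing them is an assumption.
        self.symbols[name] = (kind, value, by_label or old_by_label)


table = SymbolTable()
table.define("X", "integer", 1)
table.define("X", "integer", 2)              # legal: integer value changes
table.define("A", "address", ("IM", 0o5))
table.define("A", "address", ("IM", 0o6))    # legal: integer part changes
print(table.symbols["X"][1], table.symbols["A"][1])
```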
Table 1: Builtins

Builtin No.   Name          Discussion
1             BUILTIN       Section 7.1
2             MACRO         Macro definition (usually the short name "M" is used), section 9
3             NEUTRAL       Neutral definition (usually the short name "N" is used), sections 7.3, 10
4             MEMORY        Memory definition, section 13
5             TARGET        Target memory declaration, section 13.1
6             DEFAULT       Default value of memory bits, section 13.2
7             FIELD         Field definition, section 11
10            PF            Field preassignments, section 11
11            SET           Integer definition and set, section 8
12            ADD           Section 8
13            IP            Integer part of an address, section 8
14            IFSE          If-string-equals conditional, section 12
15            IFA           If-field-assigned conditional, section 12
16            IFE           If-integers-equal conditional, section 12
17            IFG           If-integer-greater conditional, section 12
20            IFDEF         If-symbol-defined conditional, section 12
21            IFME          If-memory-part-of-address-equals-string conditional, section 12
22            ER            Print error message, section 3
23            LIST          Control assembly listing, section 4
24            INSERT        Insert file, section 2
25            NOT           Section 8
26            REPEAT        Repeat evaluation, section 14
27            OR            Section 8
30            XOR           Section 8
31            AND           Section 8
32            COMCHAR       Multi-statement conditionals, section 6
33            BITTABLE      Define bit table, section 16
34            GETBIT        Section 16
35            SETBIT        Section 16
36            FINDBIT       Section 16
37            MEMBT         Section 16
40            LSHIFT        Left-shift integer, section 8
41            RSHIFT        Right-shift integer, section 8
42            FVAL          Get value in field, section 11
43            SELECT        Switchon integer, section 15
44            SETPOST       Define post-evaluation macro for memory, section 13.3
45            --            Deimplemented
46            LISTFIELDS    Control assembly listing, section 4
47            SETMBEXT      Set binary output file extension, section 2
50            SUB           Section 8
52            ASMMODE       Multi-statement conditional assembly, section 17
53            TRACEMODE     Macro expansion tracing, section 18
54            WHILE         Repeat evaluation, section 14