CirioReverseEngineering.tioga
Copyright © 1991 by Xerox Corporation. All rights reserved.
Spreitze, January 21, 1993 4:40 pm PST
Introduction
Herein are recorded all the deep, dark, unpublished secrets that Cirio uses to get its job done.
Mimosa
Static Links & Body Table Structure
For an exceedingly normal procedure nested inside a procedure, there appears in the local context of the inner procedure a variable (in both the .mob and the .c2c.c) whose contents (after the first few instructions of the procedure have been executed) are the address of the outer procedure's frame extension.
Sometimes Mimosa misplaces the frame extension variable in the .mob --- it appears in the NIL context rather than the appropriate procedure's context; in the C code, however, the variable appears amongst the procedure's locals. Global link variables and static link variables are also sometimes misplaced in the .mob.
For a procedure nested inside a procedure, when the inner procedure has no local symbols, Mimosa misplaces the static link variable in the .mob --- it appears in the chain of id SE's linked under the NIL context. [Note also that all static link variables have their context backpointer set to the NIL context.] The C variable appears as in the normal case.
For an ENABLE scope nested inside a procedure, the protected scope and the catch series each appear in the .c2c.c as separate procedures and in the .mob's body table as separate callable bodies, siblings of each other, children of the containing body. If the ENABLE scope declares any local variables, an internal body, child of the ENABLE scope body, is created to contain them; the ENABLE scope body itself always has a NIL local context. That child seems to always appear last amongst the children of the ENABLE scope body. The static link for an ENABLE scope never appears in the .mob. In the .c2c.c, it is the sole parameter to the ENABLE scope's C procedure.
For a catch series nested inside a procedure, there are exactly 5 parameters to the C procedure, and none of them appear in the .mob. Carl Hauser knows what these five parameters are.
For a procedure nested inside an ENABLE scope, the procedure's body appears as a child of the smallest enclosing non-catch/ENABLE body, rather than as a child of the ENABLE scope body. ...?
For an ENABLE scope nested inside an ENABLE scope...?
For a catch series nested inside an ENABLE scope...?
For a procedure nested inside a catch series...?
For an ENABLE scope nested inside a catch series...?
For a catch series nested inside a catch series...?
Sun's SUNOS cc (data structures)
Sun's SUNOS cc & ld (symbols)
In an elementary a.out (i.e., one produced by compiling a single source), all the linker symbols precede all the debugger stabs. The exported and non-exported linker symbols are intermixed.
In a compound a.out (i.e., one produced by ld from several others), all the exported linker symbols appear at the end. This final sequence is the concatenation of the exported linker symbol subsequences of the component a.outs. The rest of the symbol sequence in the compound a.out (the symbols that are non-linker or non-exported) is likewise the concatenation of the corresponding subsequences of the component a.outs.
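The ordering described above can be modelled in a few lines. This is only a sketch of the observed behaviour, not of how ld is implemented; the function name and the (name, is_exported_linker_symbol) tuple representation are made up for illustration.

```python
def compound_symbol_order(components):
    """Model the symbol order observed in a compound a.out.

    components: one list per component a.out, each a list of
    (name, is_exported_linker_symbol) tuples in that component's own order.
    (This representation is hypothetical, chosen only for the sketch.)
    """
    others = []    # non-linker or non-exported symbols, in component order
    exported = []  # exported linker symbols, in component order
    for symbols in components:
        for symbol in symbols:
            name, is_exported = symbol
            if is_exported:
                exported.append(symbol)
            else:
                others.append(symbol)
    # All exported linker symbols appear at the end of the compound table.
    return others + exported
```

For example, components [[("_main", True), ("local1", False)], [("stab1", False), ("_helper", True)]] come out in the order local1, stab1, _main, _helper.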
a.out (gcc & Sun's bundled SUNOS cc)
The relation between the C source structure and the stabs is obscure.
For gcc, the SLINE stabs all appear before the FUN stab. For Sun's bundled cc, they are intermixed with other stabs in a less surprising way. The rest of this section ignores the SLINE stabs.
Sun's bundled cc omits an LBRAC & RBRAC pair that does not directly contain declarations; gcc omits an LBRAC & RBRAC pair that does not recursively contain declarations. Both compilers emit declarations before the LBRAC & RBRAC pair that contains them.
The parameters to a procedure are declared after the FUN stab. For parameters put into registers, Sun's bundled cc emits a PSYM and an RSYM stab; gcc emits only the RSYM stab.
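A reader of these stabs might fold the two conventions together roughly as follows. This is a sketch under assumptions: it takes the PSYM and RSYM for one register parameter to be adjacent and to carry the same name (as the grammar later in this section suggests), it takes the name to be the text before the first `:' in the stab string, and it treats every lone RSYM in the parameter run as a parameter. The function name and the tuple representation are hypothetical.

```python
def group_parameters(param_stabs):
    """Group the PSYM/RSYM stabs that follow a FUN stab into parameters.

    param_stabs: list of (kind, string) tuples, kind being "PSYM" or "RSYM".
    Returns a list of (name, where, indices) tuples, where `where` is
    "stack" or "register" and `indices` lists the stabs that were merged.
    """
    def stab_name(string):
        # Assumption: the symbol name is everything before the first ':'.
        return string.split(":", 1)[0]

    params = []
    i = 0
    while i < len(param_stabs):
        kind, string = param_stabs[i]
        name = stab_name(string)
        has_next = i + 1 < len(param_stabs)
        if (kind == "PSYM" and has_next
                and param_stabs[i + 1][0] == "RSYM"
                and stab_name(param_stabs[i + 1][1]) == name):
            # Sun's bundled cc style: PSYM immediately followed by RSYM.
            params.append((name, "register", [i, i + 1]))
            i += 2
        elif kind == "PSYM":
            params.append((name, "stack", [i]))
            i += 1
        elif kind == "RSYM":
            # gcc style: the RSYM alone (assumed here to be a parameter).
            params.append((name, "register", [i]))
            i += 1
        else:
            raise ValueError("not a parameter stab: %r" % (kind,))
    return params
```

The stab strings used to exercise this (such as "n:p1" and "n:r1") are illustrative, not copied from compiler output.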
Local register variables provoke only one stab from both compilers.
Both compilers emit declarations for structure & union names and also for typedef-ed names --- both emit two declarations when a name is declared in both ways. Both compilers use LSYM stabs for both kinds of declarations. The two kinds of declarations can be distinguished by looking at the ``symbol descriptor'' (the first character after the `:') in the stab's string: `t' for one kind, `T' for the other.
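Extracting the symbol descriptor is mechanical. The sketch below does that and then labels the two kinds; the mapping of `t' to typedef names and `T' to structure/union tags follows the published stabs conventions and is not something established by the notes above, so treat it as an assumption. The function names are hypothetical.

```python
def symbol_descriptor(stab_string):
    """Return the first character after the first ':' in a stab string,
    or None if there is no ':' or nothing follows it."""
    _name, colon, rest = stab_string.partition(":")
    if not colon or not rest:
        return None
    return rest[0]


def classify_type_declaration(stab_string):
    """Classify an LSYM stab string as one of the two type-declaration kinds.

    Assumption (from general stabs conventions, not from these notes):
    't' marks a typedef-ed name and 'T' marks a structure/union tag.
    Returns None for anything else, e.g. an ordinary local variable.
    """
    descriptor = symbol_descriptor(stab_string)
    if descriptor == "t":
        return "typedef name"
    if descriptor == "T":
        return "struct/union tag"
    return None
```

A name declared both ways would then yield two LSYM stabs, one classified each way.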
When a type has to be constructed for a local variable, gcc uses a new typeref (number or number pair) for every distinct scope; Sun's bundled cc shares typerefs across a whole module.
The scope for an LSYM or RSYM is thus a bit tricky to determine, because such a stab could describe an argument or a global symbol [sic]. For an LSYM, one could search for the next bracket or FUN stab; if an LBRAC is found, the LSYM is local to that bracket pair's scope; if a FUN stab is found, the LSYM is global to the module; if an RBRAC is found, this documentation is buggy. For an RSYM without a matching PSYM, the ``symbol descriptor'' in the stab's string must be consulted.
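The forward search for an LSYM's scope might look like the following sketch. It works on a list of stab type names with the SLINE stabs already removed. Treating the end of the sequence like a FUN stab (so that a trailing LSYM is global) is an assumption, chosen to agree with the grammar below; the function name is hypothetical.

```python
def lsym_scope(kinds, index):
    """Decide the scope of the LSYM at kinds[index].

    kinds: list of stab type names ("LSYM", "FUN", "LBRAC", ...),
           with SLINE stabs already removed.
    Returns ("local", i) where i is the index of the enclosing LBRAC,
    or ("global", None) if the LSYM is global to the module.
    """
    if kinds[index] != "LSYM":
        raise ValueError("stab %d is not an LSYM" % index)
    for i in range(index + 1, len(kinds)):
        if kinds[i] == "LBRAC":
            # Declarations are emitted before the bracket pair holding them.
            return ("local", i)
        if kinds[i] == "FUN":
            return ("global", None)
        if kinds[i] == "RBRAC":
            # The case the text calls a documentation bug.
            raise ValueError("RBRAC found before LBRAC or FUN")
    # Assumption: running off the end means no function follows, so global.
    return ("global", None)
```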
Thus, here's how the sequence of non-SLINE stabs can be parsed:
<module> = <glob>* (<fun> <glob>*)*
<glob> = GSYM | LCSYM | STSYM | LSYM
<fun> = FUN (PSYM | RSYM | PSYM[name] RSYM[name])* <bracket>*
<bracket> = (STSYM | LSYM | RSYM)* LBRAC <bracket>* RBRAC
(Note the hack of using <bracket>* in a <fun> to summarize all the ways intermediate bracket pairs might be omitted.)
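The grammar above is small enough to parse by recursive descent with one bit of lookahead: a run of STSYM/LSYM/RSYM stabs starts a <bracket> only if the run ends at an LBRAC. The sketch below works on stab type names only (SLINE stabs already removed) and records indices into the input. It makes two simplifications that are assumptions of the sketch: every PSYM or RSYM directly after a FUN is taken as a parameter, and PSYM/RSYM names are not compared. All names in the code are hypothetical.

```python
GLOB = {"GSYM", "LCSYM", "STSYM", "LSYM"}
PARAM = {"PSYM", "RSYM"}
DECL = {"STSYM", "LSYM", "RSYM"}


class StabParseError(Exception):
    pass


def parse_module(kinds):
    """Parse a list of stab type names according to the grammar above."""
    pos = 0
    n = len(kinds)

    def peek():
        return kinds[pos] if pos < n else None

    def take_globs():
        nonlocal pos
        start = pos
        while peek() in GLOB:
            pos += 1
        return list(range(start, pos))

    def try_bracket():
        """Parse one <bracket> at pos, or return None without consuming."""
        nonlocal pos
        j = pos
        while j < n and kinds[j] in DECL:
            j += 1
        if j >= n or kinds[j] != "LBRAC":
            return None  # the run (if any) is not followed by LBRAC
        bracket = {"decls": list(range(pos, j)), "lbrac": j, "children": []}
        pos = j + 1
        child = try_bracket()
        while child is not None:
            bracket["children"].append(child)
            child = try_bracket()
        if peek() != "RBRAC":
            raise StabParseError("expected RBRAC at stab %d" % pos)
        pos += 1
        return bracket

    module = {"globals": take_globs(), "functions": []}
    while peek() == "FUN":
        fun = {"fun": pos, "params": [], "brackets": []}
        pos += 1
        while peek() in PARAM:  # greedy: see the assumptions in the lead-in
            fun["params"].append(pos)
            pos += 1
        bracket = try_bracket()
        while bracket is not None:
            fun["brackets"].append(bracket)
            bracket = try_bracket()
        module["functions"].append(fun)
        module["globals"].extend(take_globs())
    if pos != n:
        raise StabParseError("unexpected %s at stab %d" % (kinds[pos], pos))
    return module
```

Because the greedy parameter rule can swallow an RSYM that really belongs to the first bracket's declarations, a real consumer would still need the symbol descriptor check mentioned earlier to classify such stabs.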