C2CNamingDoc.tioga
Copyright © 1989 by Xerox Corporation. All rights reserved.
Christian Jacobi, May 9, 1989 5:46:03 pm PDT
Christian Jacobi, December 11, 1989 12:21:10 pm PST
Naming Scheme for C2C Generated C Files.
What we are trying to achieve
Naming is incredibly complicated! Names should be meaningful to debuggers; the debugger wants to know names in terms of the mob, but c2c does not read mob-files! On the other hand, c2c should put all the information it knows into names.
There are two debuggers; whenever naming gets improved, the change has to be staged with two debuggers and two already compiled worlds which should stay debuggable. A nearly impossible task.
Names must not clash; obviously. However, names may have been introduced at many different levels in the compilation. Each level in the compilation might rightfully introduce resources which need to be named somehow.
The levels of compilation which introduce resources are
0) Mesa names used by the programmer
1) Front end internal resources
2) Intermediate optimization phases internal
3) C2C itself.
Each of these levels might be split into sub levels. Also, there are some minor complications, like programmer generated names need not be unique; intermediate phases do not have knowledge of mob files...
Besides the compilation process there are some more sources of names. The compilation process has to be careful to avoid name clashes with those external name sources. Namely,
0) C reserved words.
1) Names for resources provided by the runtime support.
2) Names used in included C header files. Open-ended!
3) Names explicitly required by the programmer. [The mesa programmer can request C2C to use a particular name for a resource in the generated .c2c.c file; machine code procs...]
It is very important to have a naming scheme which allows large freedom for names used in included C header files. That is the place where we do have the largest unknowns; we might have to include program files which have been written without knowledge of mesa, mimosa and c2c. We have to find a naming scheme which restricts compiler generated names and does not restrict arbitrary include files too much.
Minor considerations are also
1) Keep names relatively short.
2) Some C compilers have a limit for name length.
How we are achieving it
The format of a name suggests where it came from.
No underscore:
Private temporaries at discretion of C2C. Probably of no interest to a debugger.
One underscore:
This is the general form for names a debugger might be interested in.
The general form is
[identifier][underscore][letter][number]
all four parts must occur and not be empty.
Two underscores:
The "emergency" form for names. This form is generated in a few cases
1) The name in general form was not a legal C name.
2) The name in general form was not unique. Due to the number part, this should never occur.
3) The name in general form does not fit into a maximum length C identifier.
The "emergency" form is
[identifier][underscore][letter][number][underscore][any]
The first four fields try to approximate the general form; the second underscore and the any serve to make the name unique. The value of any might give more information, which I hesitate to document. As a general rule for "emergency" names, the number part is still correct, but the identifier part might be shortened or have illegal letters replaced.
Other:
Probably some weird C program includes this.
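The four name shapes above can be told apart mechanically. The following is only a sketch, not what C2C or either debugger does: the function name classify and the regular expressions are made up for illustration, and it assumes that the identifier and the any field are plain alphanumerics without underscores.

```python
import re

# Assumption (not from the C2C sources): identifier and "any" are alphanumeric,
# the letter is a single letter, and the number is a run of decimal digits.
GENERAL = re.compile(r"([A-Za-z][A-Za-z0-9]*)_([A-Za-z])([0-9]+)")
EMERGENCY = re.compile(r"([A-Za-z][A-Za-z0-9]*)_([A-Za-z])([0-9]+)_([A-Za-z0-9]*)")

def classify(name):
    """Guess where a C name came from, judging by its shape only."""
    if "_" not in name:
        return "private"      # no underscore: private temporary of C2C
    if GENERAL.fullmatch(name):
        return "general"      # [identifier][underscore][letter][number]
    if EMERGENCY.fullmatch(name):
        return "emergency"    # general form plus [underscore][any]
    return "other"            # probably from an included C file
```

For example, classify("Foo_P12") yields "general", while a typical C library name such as size_t falls into "other" because no number follows the letter.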
More details for the interesting part with one underscore
[identifier][underscore][letter][number]
This form should be sufficiently "special" that it can be "forbidden" in C include files.
-the identifier is the mesa name.
-the letter gives information about what C2C knows about this name
-the number is the number given by the front end. Numbers are decimal.
=positive values are encoded as number.
=negative values are encoded as a leading zero followed by the absolute value.
=the number zero is encoded as just a zero.
[**rumor mongering: negative values for variable numbers might mean: variable is not described in mob file.]
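The number encoding is small enough to write down as a sketch. The function names encode_number and decode_number are invented here; only the three encoding rules come from the description above.

```python
def encode_number(n):
    """Encode a front end number as the decimal text used in a name."""
    if n < 0:
        # negative: a leading zero followed by the absolute value
        return "0" + str(-n)
    # positive values are written as is; zero comes out as just "0"
    return str(n)

def decode_number(text):
    """Inverse of encode_number, as a debugger might need it."""
    if text == "0":
        return 0
    if text.startswith("0"):
        return -int(text[1:])
    return int(text)
```

With these rules 12 is written as "12", -12 as "012", and 0 as "0". The leading zero is unambiguous because positive numbers never start with a zero.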
Oops, I'm a level off; your interest is not documentation of C2C but how to write a debugger...
C2C gets a hash table with names and it gets a code tree as input. I sometimes forget that the code tree comes from the front end too. That leaves the following cases.
-Hash table contains number and name.
-Hash table contains number and no name.
-Hash table contains no number but name: impossible.
-Number appears in code tree but not in hash table.
C2C uses the letter to differentiate those cases, and to pass on whatever additional information is available.
Letter
P  Name for a lambda; name and number appear in hash table.
  Probably a procedure.
Q  Name for a lambda; number appears in hash table, but no name.
  Rare?
L  Name for a lambda; neither name nor number appears in hash table.
  But number is in code tree nevertheless!
  Probably a label.
v  Name for a variable; name and number appear in hash table.
w  Name for a variable; number appears in hash table, but no name.
  Rare?
c  Name for a variable; neither name nor number appears in hash table.
  But number is in code tree nevertheless!
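The letter table can be read as a small decision function. This is a sketch under assumptions: the name choose_letter, the kind strings "lambda" and "variable", and the two boolean flags are illustrative and not C2C's actual interface.

```python
def choose_letter(kind, number_in_table, name_in_table):
    """Pick the letter for a name from what the hash table knows.

    kind is "lambda" or "variable" (hypothetical labels for this sketch).
    """
    if name_in_table and not number_in_table:
        # The hash table cannot hold a name without a number.
        raise ValueError("impossible: name without number in hash table")
    with_name, number_only, tree_only = {
        "lambda": ("P", "Q", "L"),
        "variable": ("v", "w", "c"),
    }[kind]
    if number_in_table and name_in_table:
        return with_name      # name and number appear in hash table
    if number_in_table:
        return number_only    # number in hash table, but no name
    return tree_only          # number appears in the code tree only
```

Upper case letters are used for lambdas and lower case letters for variables, so a debugger can tell the two groups apart from the case alone.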
IMPORTANT
I will fix problems when they are reported; there is no need to document all internal details, but maybe not enough details are documented.
Please refrain from relying on experimental findings without reporting them to me; not documenting is used as a method to keep room free for future changes and improvements.
Expected changes
The code tree recently got enhanced to contain more information about lambdas. I would like to encode all information I have, but can't do it now because I must not disrupt SSU's debugging. The information I would like added is knowledge of
normal outer-level procedures
normal inner-level procedures
installation procedures
initialization procedures
catch procedures
enable scope procedures
fork base procedures
--------------------------------
OLD documentation, kept around in case clarification is needed.
Features
Name space is cut in two to separate names available for include files from names assigned by the compiler.
Name contains number which is given by front end.
Name contains mesa identifier if one is available.
Name cut to maximum length allowed by C compiler.
Names carry information whether named object is a variable or a procedure.
Names carry information whether name was assigned from Mimosa front end or not.
If name is cut to fit the number of characters, the cut will leave the (sci?) number unmodified, but will cut the other part.
If name is cut or other conflict occurs, name will carry that information.
Name space is cut in two to separate names available for include files from names assigned by the compiler.
C reserved words: avoided by C2C
Names which have no underscore: Private temporaries at discretion of C2C
Names which look like
IdentifierUnderscoreLetterNumber and
IdentifierUnderscoreLetterNumberUnderscoreAny are used by C2C
Names which look like
IdentifierUnderscoreNumber and
IdentifierUnderscoreNumberUnderscoreAny are reserved by C2C
Names which look different are reserved for include files and external names.
Names carry information whether named object is a variable or a procedure.
Names carry information whether name was assigned from Mimosa front end or not.
IdentifierUnderscoreLetterNumber
P Procedure with mesa name: Identifier = mesa name
Q Procedure or label without mesa name: Identifier at discretion of C2C
L Procedure or label without mesa name: Identifier at discretion of C2C
v Variable with mesa name: Identifier = mesa name
w Variable without mesa name: Identifier at discretion of C2C
c Variable without mesa name: Identifier at discretion of C2C
If name is cut to fit the number of characters, the cut will leave the number unmodified, but will cut the other part.
If name is cut or other conflict occurs, name will carry that information.
IdentifierUnderscoreLetterNumber regular name
IdentifierUnderscoreLetterNumberUnderscoreAny cut or conflict occurred
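The cutting rule can be illustrated as follows. This is a guess at the shape of the rule, not C2C's code: the function name fit_name, the max_len parameter and the placeholder value "x" for the any field are all assumptions, since the real content of any is deliberately undocumented.

```python
def fit_name(identifier, letter, number_text, max_len, any_part="x"):
    """Build a name, cutting only the identifier if the name is too long.

    any_part is a placeholder; what C2C really puts there is not documented.
    """
    general = f"{identifier}_{letter}{number_text}"
    if len(general) <= max_len:
        return general                  # regular name, nothing was cut
    suffix = f"_{letter}{number_text}_{any_part}"
    room = max_len - len(suffix)        # characters left for the identifier
    if room < 1:
        raise ValueError("max_len too small to keep the number unmodified")
    # Cut the identifier only; letter and number stay intact, and the
    # second underscore marks that a cut occurred.
    return identifier[:room] + suffix
```

With a limit of 12 characters, fit_name("AVeryLongMesaIdentifier", "P", "7", 12) gives "AVeryLo_P7_x": the number 7 survives and the second underscore tells the reader that the identifier is no longer the full mesa name.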
Review
Programmer introduced names are now clearly recognizable as such, but on the other side: IntCodeTwig introduced names look like Mimosa introduced names and like C2C preprocessing introduced names.