C2CInterLanguageDoc.tioga
Copyright © 1988, 1993 by Xerox Corporation. All rights reserved.
Christian Jacobi, July 13, 1988 8:58:50 pm PDT
Christian Jacobi, April 20, 1993 2:13 pm PDT
Calling C procedures from Cedar
Trusted Machine Code Procedures
Usage of a prefix and underscore is recommended; otherwise the name could clash with a compiler-generated name.
Also, don't use names with exactly one underscore followed by a single letter and then digits; the compiler generates those too.
The ← character in Cedar is an underscore.
Warning: not all features have been ratified by the language design committee; some features might change, although that seems unlikely.
It might be good to stick to the "Simple", the "Complex" and the "ExternalNames" examples. The "Crazy" examples and the "More complete syntax" contain features of questionable value.
Simple examples
1) Include just a plain C procedure
PutChar: PROC [ch: CHAR] = TRUSTED MACHINE CODE {
"Mumble"
};
generates
Mumble(ch)
However, normally you have to get a declaration of the procedure you are calling, so this case might be a little bit oversimplified.
2) Include a C procedure with an include file from the standard Cedar place
PutChar: PROC [ch: CHAR] = TRUSTED MACHINE CODE {
"Foo.Mumble"
};
generates
#include <cedar/Foo.h>
...
Mumble(ch)
This case is made to look like a Cedar external procedure; e.g. the extension .h of the include file need not be specified.
3) Include a C procedure with an include file from the standard Unix place
PutChar: PROC [ch: CHAR] = TRUSTED MACHINE CODE {
"<Foo.h>.Mumble"
};
generates
#include <Foo.h>
...
Mumble(ch)
Here we are thinking Unix; specify a Unix .h file, including the .h.
4) Include a C procedure with an include file from the working directory
PutChar: PROC [ch: CHAR] = TRUSTED MACHINE CODE {
"""Foo.h"".Mumble"
};
generates
#include "Foo.h"
...
Mumble(ch)
Here we are thinking Unix; specify a Unix .h file, including the .h.
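As a concrete instance of form 3, a machine code procedure whose text is "<ctype.h>.toupper" would be expected to produce an #include <ctype.h> and an inline call to toupper. The sketch below shows roughly what the C side might look like. The name Caller and the word typedef are assumptions made for illustration; this is not actual C2C output.

```c
#include <ctype.h>   /* comes from the prefix part "<ctype.h>" */

/* Assumption: stand-in for the machine word type used in generated code. */
typedef unsigned long word;

/* Hypothetical caller. The call to toupper is what the procedure name
   part "toupper" would expand to at the point of use. */
static word Caller(word ch)
{
    return (word) toupper((int) ch);
}
```

With this in place, Caller('a') yields the code for 'A' in the default C locale.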
Complex examples
DRealAddI: UNSAFE PROC [ret, x, y: PDREAL] = UNCHECKED MACHINE CODE {
"+extern void XR←DRealAddI (ret, x, y) W2 *ret, *x, *y; {\n";
" DRealPtr(ret) = DRealPtr(x) + DRealPtr(y);\n";
" };\n";
".XR←DRealAddI";
};
Note that machine code procedures can use multiple lines.
The "+" denotes that the text that follows is included after the type declarations of the module.
In this example DRealPtr needs to be defined:
DefDRealPtr: PROC = TRUSTED MACHINE CODE {
"+#define DRealPtr(x) (*((double *) (x)))\n."
};
Here we note that there is no actual procedure to be called: the machine code text has nothing following the period. This machine code procedure is meant to be called in the module initialization to provide common definitions.
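For readers more used to ANSI C, the text that these two machine code procedures place into the C file amounts to roughly the following. This is a sketch only: the definition of W2 is an assumption (a two-word cell that can hold a double), and the function is written with an ANSI prototype instead of the old-style parameter list used in the machine code text.

```c
/* Assumption: W2 is a two-word cell large enough to hold a double.
   Making it a union keeps the pointer cast in DRealPtr well defined. */
typedef union { unsigned int w[2]; double d; } W2;

/* From the DefDRealPtr machine code procedure. */
#define DRealPtr(x) (*((double *) (x)))

/* From the DRealAddI machine code procedure; the Cedar source writes
   the underscore in the name as a left arrow. */
void XR_DRealAddI(W2 *ret, W2 *x, W2 *y)
{
    DRealPtr(ret) = DRealPtr(x) + DRealPtr(y);
}
```

A caller fills two W2 cells with doubles, passes their addresses, and reads the sum back from the third cell.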
Crazy examples
Using assignments and declarations
Avoid using this feature for assignments and declarations; it is quite baroque.
E.g. for the Unix errno feature [It's brain damaged to use global variables, but Unix users have no choice]. BTW: it is better to use the runtime feature than to try to do this with machine code procedures. The runtime feature is much more correct; e.g., it knows how to deal with multiple threads, whereas this method does not.
ErrNo: PROC [] RETURNS [INT] = TRUSTED MACHINE CODE {
--call a variable, and, make sure it is declared (extern)
"!$errno"
};
ClearErr: PROC [] = TRUSTED MACHINE CODE {
--include this statement as is
"@errno = 0"
};
EvenMoreUgly: PROC [] RETURNS [INT] = TRUSTED MACHINE CODE {
--make a declaration as is, e.g. to fool C2C's types
--and, make a call to this variable
"+extern int evenMoreUgly;.$evenMoreUgly"
};
P: PROC [] = {
i, k: INT;
ClearErr[];
i ← ErrNo[];
k ← EvenMoreUgly[];
};
generates
extern word errno;
extern int evenMoreUgly;
...
errno = 0;
i� = (word) errno;
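The effect of ClearErr and ErrNo can be written directly in C as below. This is a hedged sketch, not C2C output: the word typedef is an assumption, and errno is taken from <errno.h> instead of being declared extern by hand, because on many current systems errno is a macro and a hand-written extern declaration may not work.

```c
#include <errno.h>

/* Assumption: stand-in for the machine word type used in generated code. */
typedef unsigned long word;

/* Counterpart of "@errno = 0": the statement is included as is. */
static void ClearErr(void)
{
    errno = 0;
}

/* Counterpart of "!$errno": read the variable, no parentheses. */
static word ErrNo(void)
{
    return (word) errno;
}
```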
More complete syntax
You wouldn't believe how archaic the parsing of machine code procedures is. It is better to stick to the well supported examples than to understand the complete parsing algorithm. When new features need to be included I intend to keep the simple examples correct, but I can't make any guarantees about the general algorithm.
First, split the text into the piece for the declarations [prefix part] and the piece to be included inline [procedure name part]; this is done at the position of the rightmost dot. To the left of the dot is the prefix part, to the right of the dot is the procedure name part.
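The split at the rightmost dot can be sketched in C as follows. The type McSplit and the function mc_split are hypothetical names, and treating a text without any dot as having an empty prefix part is an assumption based on the first simple example; the real parser may differ.

```c
#include <string.h>

/* Hypothetical result type: the prefix part is text[0 .. prefix_len),
   the name part is the NUL-terminated remainder after the dot. */
typedef struct {
    size_t prefix_len;
    const char *name;
} McSplit;

static McSplit mc_split(const char *text)
{
    McSplit s;
    const char *dot = strrchr(text, '.');   /* rightmost dot, or NULL */
    if (dot == NULL) {
        /* Assumption: no dot means there is no prefix part. */
        s.prefix_len = 0;
        s.name = text;
    } else {
        s.prefix_len = (size_t) (dot - text);
        s.name = dot + 1;
    }
    return s;
}
```

For "<Foo.h>.Mumble" this gives the prefix part <Foo.h> and the name Mumble; for a text ending in a period the name part is empty.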
prefix part syntax:
We introduce the entry; that's the unit used in caching. Each entry is cached and handled only once; this is used to get only single includes. Users must be careful to use exactly the same spelling if two machine code procedures have the same entries.
consume leading %: it says that each line is a separate entry
other leading letter: this starts the single entry for the whole prefix part
entry:
entries are cached: the same entry is included only once into a C file. An entry may span multiple lines, separated by either \n or actual line breaks in the source.
leading ": makes an #include "....
leading <: makes an #include <....
leading *: include the rest of the entry before the type declarations
leading +: include the rest of the entry after the type declarations
leading ~: include .h file from standard Cedar place
leading =: include .h file from standard Xr place
alphabetic leading letter: think cedar module name; adds include .h file from standard Cedar place.
"adds include .h file" means: Rope.Cat["<standard—place/", entry, ".h>"]
procedure name part syntax:
leading !: make line to declare the name [extern]
leading &: don't use (void), even if procedure has no return parameters
leading $: no parentheses, e.g. for constants, variables
leading @: no parentheses, don't use (void) e.g. for constants, variables
leading :: stop consuming further leading letters [so reserved characters can be used in the name]
alphabetic leading letter: this starts the name
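The two leading-character tables above can be sketched in C like this. All names are hypothetical, the handling of a leading % is left out, and the assumption that the ":" itself is consumed is mine; this illustrates the rules as listed and is not the actual parsing code.

```c
#include <ctype.h>
#include <stdbool.h>

/* What a prefix-part entry asks for, judged by its first character. */
typedef enum {
    ENTRY_INCLUDE_QUOTED,  /* leading double quote                  */
    ENTRY_INCLUDE_ANGLE,   /* leading <                             */
    ENTRY_BEFORE_TYPES,    /* leading *                             */
    ENTRY_AFTER_TYPES,     /* leading +                             */
    ENTRY_CEDAR_HEADER,    /* leading ~ or an alphabetic character  */
    ENTRY_XR_HEADER,       /* leading =                             */
    ENTRY_UNKNOWN
} EntryKind;

static EntryKind classify_entry(const char *entry)
{
    switch (entry[0]) {
    case '"': return ENTRY_INCLUDE_QUOTED;
    case '<': return ENTRY_INCLUDE_ANGLE;
    case '*': return ENTRY_BEFORE_TYPES;
    case '+': return ENTRY_AFTER_TYPES;
    case '~': return ENTRY_CEDAR_HEADER;
    case '=': return ENTRY_XR_HEADER;
    default:
        return isalpha((unsigned char) entry[0]) ? ENTRY_CEDAR_HEADER
                                                 : ENTRY_UNKNOWN;
    }
}

/* Flags collected from the leading characters of the name part. */
typedef struct {
    bool declare_extern;   /* '!'         */
    bool no_void;          /* '&' or '@'  */
    bool no_parens;        /* '$' or '@'  */
    const char *name;      /* rest of the text after the flags */
} NamePart;

static NamePart parse_name_part(const char *s)
{
    NamePart p = { false, false, false, s };
    for (;;) {
        switch (*s) {
        case '!': p.declare_extern = true; break;
        case '&': p.no_void = true; break;
        case '$': p.no_parens = true; break;
        case '@': p.no_parens = true; p.no_void = true; break;
        case ':': p.name = s + 1; return p;  /* assumption: ':' is consumed */
        default:  p.name = s; return p;      /* the name starts here */
        }
        s++;
    }
}
```

Applied to "!$errno" this yields the name errno with the extern declaration and no parentheses; applied to "@errno = 0" it yields the text errno = 0 with neither parentheses nor (void).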
Calling Cedar procedures from C
The "problem"
1) Make the generated C procedure have a C name under programmer control.
2) Make sure the "signature" calling sequence of the Cedar procedure matches the C procedure.
There are currently two methods to force programmer defined procedure names.
a) Use TRUSTED MACHINE CODE to specify inside the program what the names should be
b) Use an external file to define the names of the procedures.
As of today method b) has fallen out of my favor. Avoid it, because I might retract it once I can make sure all uses are fixed.
Warning
This mechanism makes Cedar procedures available for extern use without enforcing the module initialization to run first. However, correct behaviour of Cedar procedures may depend on doing the module initializations.
a) Name definition using TRUSTED MACHINE CODE procedures
Example:
ExternalNames: PROC [] = TRUSTED MACHINE CODE {
"^ExternalNames\n";
"Xyz  XR←Xyz\n";
"Foo  XR←Foo\n";
};
In this example, the trusted machine code procedure ExternalNames must be used, so C2C will actually see it in the code tree.
The line "^ExternalNames\n"; specifies that this machine code procedure has the purpose of defining C names.
This makes the Cedar procedure Xyz have the C name XR←Xyz, and the Cedar procedure Foo have the C name XR←Foo.
It is required [but not tested] that every CedarProcedureName is an existing top level procedure.
It is an error if multiple procedures [even if not top level] with name CedarProcedureName exist.
Design rationale
The external name can be explicitly specified [and is not simply a translation of the Mesa name]. This makes it possible to create "exportable" names with prefix and underscore.
b) Name definition using external file
Avoid this because I might retract it. The problem is that using an external file requires a special switch and support in several programming environment tools.
Uses -R switch to mimosa to make exteRnal procedure declarations.
This causes reading in a modulename.externalProcs file describing external names of procedures. Only the procedures described in .externalProcs file will get external names.
Syntax of .externalProcs files
File is line oriented
Lines starting with "-" are comments
The first non comment line must be
extern procs
All other non comment lines have the form
CedarProcedureName CProcedureName
The CedarProcedureName must have the syntax of a Cedar name.
The CProcedureName must have the syntax of a C name.
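A line-level reader for this format could look roughly like the sketch below. The names LineKind and parse_line, the 63-character name limit, and the treatment of blank lines as bad are all assumptions of this sketch. It does not check that the names are valid Cedar or C names, and it leaves the rule that "extern procs" comes first to the caller. Lines are expected without their trailing newline.

```c
#include <stdio.h>
#include <string.h>

typedef enum { LINE_COMMENT, LINE_HEADER, LINE_PAIR, LINE_BAD } LineKind;

/* Classifies one line. For LINE_PAIR the two names are copied out;
   both buffers must hold at least 64 bytes. */
static LineKind parse_line(const char *line, char cedar[64], char c[64])
{
    char extra[2];
    if (line[0] == '-')
        return LINE_COMMENT;
    if (strcmp(line, "extern procs") == 0)
        return LINE_HEADER;
    /* Exactly two whitespace-separated names; a third token is an error. */
    if (sscanf(line, "%63s %63s %1s", cedar, c, extra) == 2)
        return LINE_PAIR;
    return LINE_BAD;
}
```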
Syntactical Restrictions
It is required [but not tested] that every CedarProcedureName is an existing top level procedure.
It is an error if multiple procedures [even if not top level] with name CedarProcedureName exist.
Module name is used instead of file name to find .externalProcs file. [As long as C2C does not know file names]
Questionable
Are the names of features part of the mesa program or not? Logically they might not have anything to do with the mesa program, but then why did we care in the first place?
Comparison with Custer [see Custer's own doc]
Custer allows a C program to simulate the Cedarish import mechanism; it requires the C program to be specially written for this purpose. Custer has the advantage of full Cedarish re-importing possibilities.
This mechanism simply exports Cedar procedures the C-ish way; no version checking or reloading is possible. Its advantages are that it is less work and that it works for unmodified C programs that do not know they are calling Cedar procedures. Because the caller need not be modified, this mechanism may also be used from languages other than C.
Other Interoperability Hints
See the modules UXProcs, UXStrings, GCRoots.
UXProcs contains procedures to transfer between Mesa procedure values and C procedure values. This module hides the fact that procedure values in Mesa have one level of indirection more than procedure values in C.
UXStrings contains procedures to transfer between Cedar ropes and Unix strings. The problems are the immutability requirements of Cedar ropes and the null termination of Unix strings.
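To picture the extra level of indirection, here is a toy model in C. It is not the UXProcs interface: ProcCell, MesaProc, to_c and call_mesa are hypothetical names, and the real layout of a Mesa procedure value may hold more than a code pointer.

```c
/* A C procedure value: a pointer to the code. */
typedef int (*CProc)(int);

/* Hypothetical model of a Mesa procedure value: a pointer to a cell
   that in turn holds the code pointer, i.e. one more indirection. */
typedef struct { CProc code; } ProcCell;
typedef ProcCell *MesaProc;

/* Going from the modeled Mesa value to a C value strips one level. */
static CProc to_c(MesaProc m)
{
    return m->code;
}

/* Calling through the modeled Mesa value dereferences the cell first. */
static int call_mesa(MesaProc m, int arg)
{
    return m->code(arg);
}

/* Stand-in procedure used only to exercise the model. */
static int twice(int x)
{
    return 2 * x;
}
```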
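The two problems can be illustrated with a toy conversion in C. This is not the UXStrings interface: FlatRope is a hypothetical stand-in for a rope (a length plus bytes without a terminator), and rope_to_cstring is an invented name. Copying into a fresh buffer leaves the original bytes untouched and makes room for the terminating null.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a flat rope: length plus bytes, no terminator. */
typedef struct {
    size_t len;
    const char *bytes;
} FlatRope;

/* Returns a fresh NUL-terminated copy that the caller must free,
   or NULL if the allocation fails. The rope itself is never modified. */
static char *rope_to_cstring(FlatRope r)
{
    char *s = malloc(r.len + 1);
    if (s == NULL)
        return NULL;
    memcpy(s, r.bytes, r.len);
    s[r.len] = '\0';
    return s;
}
```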
ANSI C Headers
Use the ^ArgumentCast feature
If a trusted machine code procedure has a line beginning with ^ArgumentCast, the rest of this line is used as a cast for an argument. The nth occurrence of ^ArgumentCast describes the cast for the nth argument.
Example.
FooBar: PROC [i: INT, p: REF INT] RETURNS [x: INT] = TRUSTED MACHINE CODE {
"^ArgumentCast (int)\n";
"^ArgumentCast (int*)\n";
"FooBar"
};
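The call that results from this example presumably looks something like the sketch below, with each cast applied to the argument in the same position. The generated text is not shown in this document, so the word typedef, the name CallFooBar, and the body of FooBar are all assumptions made to keep the sketch self-contained.

```c
#include <stdint.h>

/* Assumption: a machine word wide enough to carry a pointer. */
typedef uintptr_t word;

/* Stand-in for the C procedure; only its ANSI prototype matters here. */
static int FooBar(int i, int *p)
{
    return i + *p;
}

/* Hypothetical call site: the first ^ArgumentCast line supplies (int),
   the second supplies (int*). */
static word CallFooBar(word i, word p)
{
    return (word) FooBar((int) i, (int*) p);
}
```

Without the casts, a compiler that checks the ANSI prototype would complain about passing word values where int and int* are expected.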