Heading:qjk40(635)SUBJECTy756qjk40\1bi6BIPage Numbers: Yes  X: 527  Y: 10.5"qjk40DRAFT - DRAFT - DRAFT - DRAFTz18592l4445y748c\f5bDRAFT - DRAFT - DRAFT - DRAFTz18592l4445y14c\f5bInter-Office Memorandumz18592l4445y762\f5bTo	Mesa Programmers	Date	November 20, 1980z18592l4445d2998e21(0,65535)(1,4445)(5,11684)(6,14146)\f1 2f0t2 1t0 16t6 1f1t0 4f0t7 1t0From	Dick Sweet	Location	Palo Altoz18592l4445d2998y716e25\f1 4f0t2 1t0 10t6 1f1t0 8f0t7 1t0Subject	Writing Mesa programs for efficiency	Organization	OPD/SDD/SSz18592l4445d2998e25\f1 7f0t2 1t0 36t6 1f1t0 12f0t7 1t0Filed on: <Sweet>EfficientCoding.bravoe30(2116)e10The topic of program efficiency is receiving a lot of attention these days.  Facts and/or rumors abound on how to write source programs that generate efficient code.  This memo is an attempt to write down some guidelines that you can follow to make the compiler's life easier.  I probably won't think of everything the first time, so I would appreciate your comments and suggestions.  Please send them to me rather than broadcasting them to the world of Mesa programmers.  This serves two ends: it cuts down on the mail traffic, and it allows me to clarify half-truths before they become part of the folk tradition.e12jk40General Commentse18jk40\b16BThe Mesa virtual machine was designed after a statistical analysis of a large collection of Mesa(like) programs.  Thus, operations that were likely to occur frequently were given short representations in the object program, similar to Huffman coding.  As the language has matured, and the applications of the language have changed, the instruction mix of programs has changed as well.  There has already been one re-analysis of code, accompanied by some adjustments in the virtual machine.  Another such re-analysis is in the works.  For this reason, I will try to be careful to note those guidelines that are artifacts of the current instruction set.e12jk40Compiler Commentse18jk40\b17BThe Mesa compiler generates surprisingly good code for many constructs in the language; this is partly because of the good match of the instruction set to the language.  It has no global optimizer in the classical sense, and only rudimentary flow analysis.  On the other hand, there are some language features, such as "=" initialization, that the programmer can use to tell the compiler things that it cannot deduce.e12jk40Some of the guidelines are to get around simple minded algorithms that may be improved in later releases.  Some may not result in better code with the current compiler, but will make your programs well suited for planned improvements in code generation.e12jk40Coding Guidelinese18jk40\b17BThe annotations on the guidelines have the following meanings:e12jk40[basic]	Good Mesa coding practice, unlikely to become inoperative by subsequent changes in the instruction set or compiler.l6368d4269e6jk40(0,6368)(1,65535)(5,65535)(6,65535)\b7B[inst set]	Instruction set shortcoming, may change after the upcoming instruction set analysis and readjustment to the architecture.l6368d4269e6jk40\b10B[compiler]	Compiler shortcoming, may change with improvements to the compiler.l6368d4269e6jk40\b10B[These need to be reordered before the memo gets wide distribution]e12jk40(2116)Keep your global frames small. [basic]e12jk40\i30I1b7BThis was particularly important on the Alto since Global frames were resident.  It is also important in Pilot as they premanently take up MDS space.  See also the comment on small local frames.l4269e6jk40Keep your local frames small. [basic] [compiler]e12jk40\i29I1b18B[basic] Local (and global) variables are sorted by static frequency of use and the most popular ones are at the beginning of the frame.  For the very earily ones, there are single byte load and store instructions.  Somewhat less popular ones take two bytes.  Local and global variables that are "way out in the Sticks" take up to seven bytes to load or store.l4269e6jk40\b8B[compiler] Most temporary variables are allocated after the frame has been laid out and hence fall at the end.  This leads to really awful code when temporaries are needed in procedures with large frames.l4269e6jk40\b10BAvoid very complicated expressions. [compiler] [inst set]e12jk40\i35I1b21B[compiler] Expressions are evaluated using a stack.  Unlike some conventional architectures, there is a fixed size stack, usually in microprocessor registers, and it is the responsibility of the compiler to keep it from overflowing.  The compiler does some rearrangement of commutative operations to minimize stack requirements (but not AND or OR, which are guaranteed to be left to right).  Once it decides on an order, it generates code in a one pass fashion, keeping temporary results on the stack.  If the stack gets too full, it is dumped and partially reloaded by a rather simpleminded (read ugly) algorithm.l4269e6jk40\b11B[inst set] Part of the reason for the simpleminded algorithm for dealing with stack overflow was that it "didn't happen."  That was before the days of long pointers and long numeric types.  It appears that the adjusted architecture will have a deeper stack than that of Mesa 6.  If so, this problem might go away again.l4269e6jk40\b10BPass large data structures by reference when possible. [basic] [inst set]e12jk40\i54I1b18B[basic] Mesa has only call-by-value semantics.  If you need to call a procedure with a fifty word array, the simple minded approach would require copying all fifty words in both the caller and callee.  It is far better to change the parameter to be a POINTER TO ARRAY or a DESCRIPTOR FOR ARRAY.  Better still, make it a POINTER TO READONLY ARRAY if the callee isn't going to alter its elements.l4269e6jk40\b8B243f7 16f0 5f7 21f0 27f7 25f0[inst set] It is common knowledge that in Mesa 6, five words of parameters are passed on the stack, and larger argument records are passed less efficiently.  With a deeper stack, larger records will be passed on the stack.l4269e6jk40\b10BTell the compiler when you don't plan to change a variable. [basic]e12jk40\i59I1b7BI said above that the compiler has only rudimentary flow analysis.  With the presence of LOOPHOLE and POINTER TO UNSPECIFIED, accurate global flow analysis is probably impossible for most non-compiler-generated variables.  For this reason, it is important to help the compiler whenever you can.  Thus, if you want to evaluate a complicated expression at the beginning of a block of code and put it into a simple variable to use therein, you should declare the variable at that point and give it "=" initialization.  Likewise, if you have a procedure that takes a POINTER TO Foo as one of its parameters, and it doesn't change the value of the Foo, you should make the parameter a POINTER TO READONLY Foo.l4269e6jk40\89f7 8f0 5f7 22f0 439f7 10f0 1i3I66i3I34f7 19f0 1i3IFor example, this allows the compiler to avoid copying variables into temporaries when calling INLINE procedures.l4269e6jk40\f1 95f7 6f1 12f0Initialize variables (particularly records) at point of creation. [basic]e12jk40\i65I1b7BVariables are created either by declaration or by the NEW operator.  The most obvious merit of the guideline is that the variable cannot be used before it is initialized in this case.  A somewhat more subtle reason is that the compiler has much more flexibility in generating the code for record (or array) constructors if it knows that initializing expressions cannot involve the current contents of the record.  The Mesa 6 compiler does not take advantage of this fact, but the next one almost surely will.l4269e6jk40\54f7 3f0Use inline procedures wisely. [compiler]e12jk40\i29I1b10BSections 5.6 and 7.3.4 of the Mesa Language Manual, Version 5.0 contain hints on the use of INLINE procedures.  In particular, section 7.3.4 gives circumstances under which "the inline expansion is particularly efficient."  Indiscriminate use of inline procedures that do not meet those criteria can lead to rather ugly code.  l4269e6jk40\30i33I29f7 6f0Use cross jumping. [basic]e12jk40\i18I1b7BCross jumping is a peephole optimization technique of sharing common tails in the various branches of conditional statements.  Cross jumped programs don't execute any fewer instructions, but the code bodies can be significantly smaller.  While cross jumping does degrade the source/object correspondence, it is still possible to set entry and exit breaks on procedures.  I have been using cross jumping for at least the last year and have had trouble debugging only once or twice.  In addition, there are some other optimizations that degrade the source/object mapping that are applied only when the "/j" switch is specified.l4269e6jk40\601f8 2f0If you know that you will be cross jumping your code, you can sometimes make it more readable by letting the compiler merge common tails rather than doing it by hand with GO TO's.l4269e6jk40\f1 171f7 5f1 2f0Dont use small constant packed arrays. [inst set] [compiler]e12jk40\i38I1b21BThe Compiler generates reasonable code for loading elements of constant packed arrays, but not as good as it does for the non-packed case.  For small arrays (say ten elements) it is better to declare the array not packed.  This generates faster accesses at the cost of a larger array constant.  The size of the code segment is likely to be no bigger since the larger constant is offset by smaller access code.l4269e6jk40Don't use NIL checking in released code. [compiler]e12jk40\i10f7I3f0i27I1b10Bcompiler doesn't use magic instructions when NIL checking, next compiler will take into account mapped out page zero to do better job.l4269e6jk40\45f7 3f0Use NEW for allocating dynamic storage. [basic]e12jk40\i4f7I3f0i32I1b7Bolder loophole (unspecified) allocation techniques bad idea, see also initialize at creation.l4269e6jk40\70i22IUse OPEN of complicated expressions cautiously. [basic]e12jk40\i4f7I4f0i39I1b7Bsubstitute by name, not by value.  important if value of expression can legitimately change within open.  WITH as an open.l4269e6jk40\106f7 4f0Use POINTER TO ARRAY for large arrays. [basic] [compiler]e12jk40\i4f7I16f0i18I1b18B[basic] if dynamically allocated, keeps frames small; generates as good or better code than array in frame (usually better).l4269e6jk40\b8B[compiler] temporary allocationl4269e6jk40\b11BSource of dynamic storage vs lifetime. [basic]e12jk40\i38I1b7Bany ideas?  Storage.pages vs z.NEW, etc. sandbars, private ZONEs vs system-provided ones.l4269e6jk40\59f7 4f0e12jk40e12j