Designs for Multiple Values in Interlisp-D Larry Masinter, Bill van Melle, Last reviesed 7-Mar-86 17:34:15 Introduction Common Lisp supports the notion that a function (or form) can return ``multiple values'', rather than the usual one value. The specification is based largely on experience with ZetaLisp, with a couple of spots cleaned up. Multiple values are produced by a special form (values v1 v2 ...). There are several forms that ``receive'' multiple values and spread them in some way that they can be manipulated as regular values: e.g., the receiver can bind one or more variables to the values, can set variables to the values, or can build a list of the values. For consistency, (values) returns zero values, although some receivers cannot distinguish this from the single value NIL. If the receiver is not expecting multiple values, then exactly one value (the first) is returned. If the receiver is expecting multiple values, but only a conventional single value is returned, that value is treated as one ``multiple value''; in fact, one ``multiple value'' is completely indistinguishable from a conventional single value from the program's point of view. Thus, roughly speaking, you only get multiple values if you are expecting them, and as a creator of multiple values you need not care whether your caller actually wanted multiple values. This is a nice feature that Common Lisp makes use of in many places: functions that principally return one value are defined to actually return several values in case the caller is interested in them all. An example of this might be a division function, which returns the quotient as its (first) value, but also returns the remainder as a second value, since it had to compute it anyway, and some callers actually do want the remainder (in Common Lisp, the function floor is so defined). The main complication in multiple values arises from the fact that most subforms that appear to transparently pass thru a value in the conventional way are also required to transparently pass thru multiple values as well. Roughly speaking, progn and anything that is in some way an implicit progn must return multiple values if its final form returns multiple values. Multiple values cannot be assigned to a variable, nor passed as arguments to functions; such cases always act as single-valued. There exist special forms for passing multiple values thru a prog1, and for spreading multiple values as arguments to a function. Unfortunately, this transparency requirement violates the notion that you can always tell by looking at a piece of code in isolation, whether multiple values are involved. It means the implementation needs to be clever enough that ordinary call and return are not substantially penalized by needing to account for the possibility that multiple values might somehow get involved. It also means that certain compiler optimizations cannot be made, although I believe this has relatively minor impact. We've made a couple of designs for Multiple Values. The current system as of 7-Mar-86 17:34:40 implements design I-A, Very Cheap Implementation of Multiple Values. DESIGN I: A cheap implementation of multiple values This is based on a message Masinter sent last year plus van Melle's earlier design for a high-performance version. The main idea is that most of the work is in the multiple-value returner. It looks inside the caller to see in what form, if any, the caller wants to see the values. The compiled code in the caller is arranged so that if multiple values are not being returned, the code following a multiple-value call computes the appropriate null values; if multiple values are being returned, the returner takes care of the values and adjusts the caller's PC to skip that code. Values are passed transparently thru tail-recursive calls by virtue of the multiple-value returner seeing the opcode RETURN following the call; in such cases, it iterates the procedure on that frame's caller. Compiler Architecture There are three interesting "compilation contexts" for multiple values, only two of which are really needed: Multiple Values as a listfor the special form multiple-value-list. The caller expects to see a list of values. Exactly n multiple values spreadfor the special forms multiple-value-bind and multiple-value-setq. The caller expects to see exactly n values (n is known at compile time) pushed on the stack. When a "return value" form is compiled in one of these contexts, special code is executed to assure that the value is in the desired form. If the form is a function call (the usual case), a special opcode is emitted after the function call whose purpose is to munge a single value into multiple form, or be recognized (and skipped) by a multiple value returner. If the form is a multiple-value returning form, the values are produced in the desired form. If the form is any other non-multiple-value returning form, its single value is coerced into a "single multiple value" of the desired form. In order to avoid hair in unbinding, any binding form (that really requires a dynamic bind, e.g., of a special variable) encountered while in the "spread" context must be compiled as a separate frame. A third context might be desired: Multiple values spread plus a countfor the special form multiple-value-call. The caller expects to see some number of values (not known at compile time) plus a count of those values. However, this context is not really needed if we choose to define multiple-value-call in the following obvious but inefficient fashion: (defmacro multiple-value-call (fn . forms) (apply ,fn (nconc @,(for f in forms collect (multiple-value-list ,f))))). Multiple Value Receiving Opcodes I am describing this as an opcode, though initially it could as well be a function call. It is only executed in the case where a single value was returned to it. MVLISTreceives values as a list. The opcode's action is "CONSNIL", i.e., (lambda (x) (list x)). Compilation of Multiple Value forms (multiple-value-list form) Compile form in multiple-value-list context. That is, any terminal function call in form is followed with MVLIST. Any terminal single-valued form is compiled as list of one element. Any terminal (values v1 ... vm) form compiles directly as list. (multiple-value-setq v1 ... vn form) Compile form in Spread[n] context. That is, any terminal single-valued form is followed by an explicit push of n$1 NIL's. Any terminal (values v1 ... vm) form simply compiles (pushes) the m values, then pops excess values or pushes n$m NIL's. Any terminal function call in form is followed with MVSPREAD[n], which is defined as MVLIST; n$1*[COPY CAR SWAP CDR]; CAR All of the above cases are then followed by SETQ vn; POP; SETQ vn-1; POP; ... SETQ v1 Net stack level effect: one push. Alternatively, can optimize the function call case by doing the spread and the SETQ's together: MVLIST; for i from 1 to n-1 compile [COPY CAR SETQ vi POP CDR]; CAR; SETQ vn; v1 the final push of v1 being needed because the value of the multiple-value-setq form is the first value. (multiple-value-bind (v1 ... vn) form . other-forms) Compile form in Spread[n] context, then BIND [v1; ...; vn]; (progn . other-forms) followed by the appropriate unbind or return. Larry indicates possible fakery by compiling it as (let ((v1 (magic-spread n form)) (v2 (nothing)) ... (vn (nothing))) . other-forms), where magic-spread does the spreading as described above, and (nothing) compiles as something that pushes nothing but fools the compiler into thinking it had. Of course, the compiling context of the multiple-value-bind itself is transmitted to the final subform of the inside progn. If none of the bound variables is special, this whole mess could just be a multiple-value-setq with the local bindings established earlier. (multiple-value-prog1 form . other-forms) (values-list (prog1 (multiple-value-list form) . other-forms)). (values v1 ... vn) If in return context, compile closed, followed by RETURN (function VALUES is described below). If not in one of the multiple value contexts, compile as prog1. (values-list list) If in return context, compile closed, followed by RETURN. If not in one of the multiple value contexts, is no-op. If in multiple-value-list context, compile as the identity (i.e., list itself is the desired list). If in spread context, spread list by the appropriate MVSPREAD[n], described above. The function VALUES VALUES is a normal Interlisp lambda* function. It is called by interpreted code and by the values special form in return context. VALUES examines its caller's caller (hereafter referred to as the returnee) to see whether and how multiple values are desired. Specifically, it fetches the byte at the pc indicated in the returnee's frame, and handles one of 3 cases: RETURNThe returnee executed a tail-recursive function call; thus multiple values must be transmitted transparently. Set returnee to be the returnee's caller and iterate. MVLISTThe returnee expects values as a list. Increment the returnee's pc (so that it skips the MVLIST instruction). Construct and return (normally, i.e., from VALUES to its caller) a list of VALUES's arguments. Anything elseThe returnee does not expect multiple values, so just return (normally) the first argument to VALUES. Notes: VALUES could be unwinding the stack as it goes, but by returning normally we can take advantage of regular RETURN microcode to do the unwinding quickly. The function VALUES-LIST is analogous to VALUES, except it takes only one argument, which is the list already bundled up. This operation of VALUES makes it important that the compiler not emit a RETURN where multiple values are expressly inhibited, and also not rearrange instructions in such a way that an MVLIST is moved away from immediately after the function call it follows. Examples of the former: (let (a) (setq a (foo)) a) (values (foo)) Neither of these expressions is equivalent to (foo), as both are required to return a single value. It might be necessary for the compiler to insert a NOP before a RETURN that follows either of those forms. CONS-less version One could imagine the following version of "spreading" that did not cons. However, the stack munging involved is almost surely more expensive than the consing. Here it is for the record, in case some part of it seems suitable for microcode boost: Distinguish spread from MVLIST by another opcode: MVSPREAD.Nreceives N values spread. The opcode pushes N$1 NIL's on the stack. Instead of the hairy destructuring sequence of opcodes, the spread context compiles as the single opcode MVSPREAD.N (with N the alpha byte). Then the function VALUES has another case to worry about: MVSPREAD.NThe returnee expects n values spread on stack. This is the least pleasant case, as we need to extend returnee's frame. Free the frames between here and returnee (preparing to return from here to returnee). Fetch the instruction's alpha byte n, and increment its pc beyond the whole instruction. Then perform the following "atomically", probably in misc context: Ensure that there is space to extend returnee's frame by n cells, i.e., a free block of sufficient length follows frame; if not, need to copy returnee as in stack overflow. Lengthen returnee's frame by n cells, and blt n values from VALUES's arguments to those last n cells; if VALUES received fewer than n arguments, store NIL's for the remaining values. Finally, return to returnee. It is unpleasantly inefficient to use the misc context to extend returnee's frame in the MVSPREAD case. If \FREESTACKBLOCK didn't think guard blocks were fair game, we could optimize the most common case, viz., returnee immediately followed by its callee, which frame is larger than the set of values to be returned: uninterruptably splice returnee's callee out of the stack, turn it into a guard block, store a free block header in the place following what will be the end of callee's frame, lengthen callee's frame, blt values. The stack remains parseable at all times during this operation. The ufn for MVSPREAD.N bears great similarity to the MVSPREAD case of VALUES, but without the handiness of having a destroyable frame following the frame being extended. We could optimize MVSPREAD.N in the common case of some small number of values by defining the ufn as a function of, say, 6 arguments. The microcode would then make space for 6 NIL's to be pushed as a side effect of calling the ufn, and the ufn need only adjust its BF to be smaller (and store NIL into its first arg). On the other hand, MVSPREAD.N ought to be easily microcoded. DESIGN IA- A VERY CHEAP IMPLEMENTATION OF MULTIPLE VALUES A little of what is described as "compiler work" is done instead by the macros. There are no compilation contexts. Instead of a MVLIST "opcode" there is just a function call, to the function \MVLIST, whose definition is (LAMBDA (X) (LIST X)). (multiple-value-list form) usually just computes (\MVLIST form)). The macro looks at form, however, and will open code (multiple-value-list (values 1 2 3)) => (list 1 2 3). All the other macros are defined in terms of multiple-value-list, e.g. (multiple-value-setq v1 ... vn form) compiles as (let((var (multiple-value-list form))) (setq v1 (car var))(setq v2 (cadr var)) ...), (multiple-value-bind (v1 ... vn) form . other-forms) compiles as (destructuring-bind (v1 ... vn) (multiple-value-list form) . other-forms) (values v1 ... vn) and (values-list list) always compile closed (noting, however, that multiple-value-list will "spread" them where necessary), but otherwise work as described above. A few places in the Interlisp interpreter were modified to really pass back values where appropriate, and interpreted RETURN was changed to return back multiple values to the enclosing PROG if the form itself returned multiple values. DESIGN II: A complex implementation of multiple values Outline of Design The design presented here is based on the implementation for Spice Lisp, another microcoded implementation. The general idea is that multiple values appear on the stack spread as several values, with the top of stack being a distinguished marker that identifies the values as multiple and says how many there are. Forms that create multiple values push the values and then this marker. Forms that accept multiple values have special code that tests for the marker, push or pop values to get the stack into the state where it has the number of values it is expecting, then does something with the values. Everything is so arranged that only the few instructions that ``know about'' multiple values are ever executed when the top of stack might be a multiple values marker. The Multiple Values Marker. In a tagged architecture, the marker is easy to define: the marker is an immediate datatype whose tag is ``Multiple Values Marker'' and whose immediate datum is the number of values. We do not have the luxury of tags in the present implementation (though we may wish to plan on them for future implementations). Hence, we need to pick a reserved address or range of addresses not otherwise used. Thus, the choices are (a) choose a single address (say, all ones) and implement the marker as two cells, the top one being this distinguished address and the next being the number of values, as a smallp; or (b) choose a range of addresses, the low (or high) bits of which encode the number of values. For (b) the most likely choice is to choose the address {377,177400}+nvalues, since that page of virtual memory is currently forbidden. Unfortunately, the test for multiple-value-markerp is non-trivial. Encoding the number of values in the high half, with a constant low half, would be a possibility but for the problem that there exist some high halves (litatom, smallpos, smallneg) for which any low half is a reasonable value. In what follows, I will use MVMarker(n) to refer to a Multiple Values Marker for n values, independent of the final representation we choose, be it one cell or two. My current preference is for the one-celled {377,177400}+n representation, which yields a multiple-values-limit of 255. Multiple Values Calling Forms. Every end receiver of multiple values is known at compile time, by virtue of being one of the special receiving forms. A receiver makes known its desire to receive multiple values from a function call by using instructions that set a bit (in somebody's frame) signifying that it is willing to accept multiple values (the bit is tested by return). On return from the call, the caller executes instructions that coerce the top of stack to multiple values (treating no multiple values as exactly one), adjust the stack if necessary (popping excess values, pushing NILs), and finally perform the appropriate operation, depending on the calling form. There are several ways of calling functions (direct function call, apply, apply*, eval, and potentially the ufn's for instructions that could produce multiple values), so either there need to be multiple-valued counterparts to each such instruction (e.g., MVFNX, MVAPPLYFN, MVEVAL), or there needs to be an instruction to compound with the conventional instructions to obtain the desired effect. Open issue: where is the ``Multiple Values Acceptable'' (MVA) flag stored? The choices appear to be as follows: (a) Set the bit in the callee's frame; in other words, treat the bit as ``it is ok for me to return multiple values''. This, I believe, is how it is done in SpiceLisp. It is appealing in that it would set a flag in a location that function call has to write anyway (flag word of the new frame), and would not require any explicit action to clear the bit following the return. However, it might not make sense in the spaghetti stack architecture. In particular, we would have to prohibit spaghetti constructs (e.g., RETEVAL) from returning multiple values, since they could have no way of knowing whether the frame being returned to is willing to accept multiple values. This is a restriction I would not like to arbitrarily impose. (b) Set the bit in the caller's frame; in other words, treat the bit as ``I am prepared to have multiple values returned to me''. This seems to make the most logical sense, works with spaghetti without any changes in the current code, and in addition allows the possibility of having a special instruction to set the flag, rather than require several new instructions that act like old instructions but also set the flag. Its principal drawbacks are the slight performance hit (writing a word that otherwise wouldn't need it) and the fact that explicit code is then needed to turn off the bit in the code following the function call (impact: one extra instruction for multiple-value-prog1). In what follows, I shall assume the flag is in the caller's frame. I ask the microcoders to evaluate the expense. The design is easily changed to allow flag in callee's frame instead if we decide we prefer that. The opcode sequences do not change substantially, except for the ones that would set or clear the flag. Transparent Calling Forms. There is only one way to handle multiple values without explicitly being aware of it: returning directly, without any intermediate processing, the result of a function call. The ``returning'' part can be either a return from the entire function, or a return (via unbinding) from an internal binding form (LET or PROG). In terms of the code executed, the former is a RETURN, the latter an UNBIND. To handle these cases, all tail-recursive function calls (those whose values are to be returned from the current function or to a multiple-value receiver; affected opcodes are FNx, APPLYFN and EVAL) must be compiled as an instruction or instructions that set the MVA flag, in case multiple values are to be transmitted. In addition, the returning and unbinding instructions need to be aware of the possible existence of multiple values. Compiler Architecture. I believe the major constraints for the compiler can be handled by introducing a new compilation context. It is currently the case, I believe, that the compiler has a notion of ``compile expression for value''. This notion can be more finely discriminated. One special kind of value is ``compile expression for return value'', i.e., the value of the expression is to be returned from the function; such a context might even already exist as a means of detecting tail recursion to be eliminated. I propose to add another kind of value context, ``compile expression for multiple values'', meaning that multiple values are explicitly requested from the expression being compiled. Multiple value receiving forms compile one or more expressions in value context MULTIPLE. I shall write ``WithMV[form]'' to mean ``compile form in context MULTIPLE''. In terms of implementation, if there is currently a compile-for-value argument to the expression compiler, its range of values could be expanded to include RETURN and MULTIPLE; both imply for value. RETURN and MULTIPLE are mutually exclusive (MULTIPLE can only arise from a receiver within the function, and RETURN is never transmitted through the receiving form). In addition, transmission of MULTIPLE value context is largely like the transmission of RETURN context. The top-level expression of a function is compiled in RETURN context. The final subform of a progn (or implicit progn) is compiled in the same context as its parent. A return in a prog is compiled in the same context as the prog. Those are all true for both MULTIPLE and RETURN context. There are some specific cases where RETURN context is not transmitted. If a prog1 is compiled in RETURN context, then its first subform will indeed be the value returned from the function, but I do not consider this RETURN context, because this is not tail-recursive (a function call here could not be turned into a jump). Similarly, only the final subform of and or or can be compiled in RETURN context, and a singleton cond clause can never be compiled in RETURN context. Fortunately, these rules agree exactly with the rules regarding when multiple values can be transmitted. With all this in hand, I shall define an abstract compilation context ``multiple values possible'' (MVP), which means ``the form I am compiling is permitted to produce multiple values''. MVP is true when the compilation context is MULTIPLE or RETURN. The compiler's new job is to treat any form of function call in MVP context specially (issuing the instructions that set the MVA flag) In addition, any RETURN or UNBIND in MVP context is potentially special, as it must do the right thing if the top of stack is a set of multiple values. In the case of RETURN, all values must get returned. In the case of UNBIND, the compiler does not know what the actual stack depth is, although it knows the logical stack depth (treating the potentially multiple values as one); whatever does the unbinding has to move potentially multiple values, while decrementing the stack depth the logical amount required. These can be handled either by defining new opcodes MVRETURN and MVUNBIND, which are identical to their companions except for doing something special when the top of stack is MVMarker(n), or by extending the existing RETURN or UNBIND to do the right thing in those cases. For the remainder of the discussion, I shall assume that the compiler emits the MVRETURN and MVUNBIND opcodes in MVP context, although the choice of new opcodes vs. extending the old ones is arbitrary and has no effect on the rest of the design. Specific constructs for producing Multiple Values (values v1 ... vn) This is the basic multiple-value returning form. It compiles as a push of its arguments (as with any other function), then push MVMarker(n). This must be followed by an MVRETURN, for the usual case of a function returning multiple values, or an MVUNBIND, if the form is returning multiple values within the same function, e.g., from an internal LET or PROG to an enclosing multiple-value catcher. If values is compiled in other than MVP context (does not make much sense as a program, but each case must be covered) then instead of pushing MVMarker(n), it compiles as a pop of the n-1 excess values, or a push of NIL if n = 0. (values-list list) This is the only other way to return multiple values; it is the same as (apply (function values) list). It can be compiled open, in much the same form as we compile the central part of EVAL. The expression compiles as something like '0; BIND[temp]; list; lp: LISTP; NFJUMP exit; COPY; CAR; SWAP; CDR; temp; '1; IPLUS2; POP; JUMP lp; exit: temp; MVMARKER with a suitable MVRETURN or MVUNBIND following as needed. To protect against circularity, one could replace the POP with a range check against the constant multiple-values-limit. It is likely that we want the function values-list itself to compile in this fashion, and internal uses of values-list in the interpreter surely want to compile open with no range check (it is convenient to pass values around using multiple-value-list and values-list, and the value of multiple-value-list is guaranteed to be in range). However, it is an open question whether user calls should compile open or closed. The MVMARKER opcode This opcode is used to perform the ``push MVMarker(n)'' operation described above. Definition: If TOS is not a smallposp less than multiple-values-limit (255 following the current proposal), then punt; else replace TOS with MVMarker(TOS). The MVRETURN opcode MVRETURN is the same as RETURN except that it has to notice whether it is returning multiple values. So extend RETURN as follows: if top of stack is MVMarker(n), then check MVA flag (in returnee). Two cases: If flag is off, then multiple values are not acceptable. If n = 0, replace TOS with NIL. Otherwise, pop the MVMarker and n-1 values off of returner's stack, leaving TOS containing the first of the n values. Proceed with normal return. If flag is on, then multiple values are acceptable. Proceed with normal return, but remember where TOS-n, the location of the first of the multiple values, is stored. At end of return, go into a block transfer loop, transferring n values, in order, from returner's stack to top of returnee's stack, finally pushing the same MVMarker(n) as the returnee's new TOS (thus, if TOS is a register, it is preserved, just as it is with normal return). Remember than n can be zero, in which case the block transfer loop terminates without doing anything. Note that the MVA flag is never tested unless actually returning multiple values. No special action is required if returning a single value to a multiple-value receiver. If MVRETURN = RETURN, then the minimum microcode requirement for the first implementation is that RETURN test for TOS = MVMarker(n), and punt if it is. A second stopping point could be to handle the flag is off case in microcode, where all the microcode needs to do is adjust the returner's stack before doing the normal return. The MVUNBIND opcode Similarly, MVUNBIND has to notice whether the top of stack is a set of multiple values, and if so, preserve more than one value as it pops items from the stack. This seems fairly unpleasant, but intrinsically necessary to compile such forms as (multiple-value-bind & (let -- (foo)) --). Thus, MVUNBIND is the same as UNBIND unless TOS = MVMarker(n), in which case it needs to note TOS-n, the location of the first of the multiple values, is stored. It then does a regular UNBIND, but at the end, goes into a block transfer loop, transferring n values, in order, from the old position on the stack to the frame's new top of stack, finally pushing the same MVMarker(n) as the frame's new TOS. Specific constructs for receiving Multiple Values A receiver of multiple values generally does one of two things: (1) It expects exactly n values (n known at compile time), and it will spread them one at a time into some other structure, e.g., into local variables. In this case, the receiver needs to first coerce the stack into having exactly n multiple values, either by popping excess values or by pushing extra NILs. (2) It expects an arbitrary number of values, and merely needs to know how many it is about to do something with. The following new opcodes are defined: MVPREPARE Sets the MVA flag in the current frame. This instruction is issued immediately preceding any calling instruction (FNx, APPLYFN, EVAL) in MVP context. It is desirable that this be a fast, single-byte opcode, since it needs to be issued for all tail-recursive calls, even for code that never heard of multiple values. In a sampling from a random full.sysout with only a few extra modules loaded, approximately 8% of all calling instructions were tail-recursive; MVPREPARE before each of those opcodes would have added some 3800 bytes (7.5 pages) to that sysout. This opcode is not necessary, of course, if we instead define new calling opcodes. MVCLEAR Clear the MVA flag in the current frame. Used only for multiple-value-prog1. MVFORCE ``Tell me how many values I have.'' If TOS = MVMarker(n), then replace it with the smallposp n; else push the small constant 1. In either case, clear the MVA flag in the current frame. STACKADJUST TOS is a small integer. Pop it off stack and call it n. If n is positive, pop stack n more times; if n is negative, push n NILs. Both MVFORCE and STACKADJUST (not to mention multiple-value function calls themselves) have an indeterminate affect on stack depth, although they are always used in sequences whose net effect is determinate. If necessary, introduce composite opcodes or pseudo-opcodes to help the compiler keep track. With these opcodes in hand, the various receivers are described as follows: (multiple-value-setq v1 ... vn form) This sets the n variables v1 thru vn to the values returned by form and returns as value the first of the multiple values (what v1 gets set to). It compiles as WithMV[form]; MVFORCE; SIC n; IDIFFERENCE; STACKADJUST; SETQ vn; POP; SETQ vn-1; POP; ... SETQ v1 Net stack level effect: one push. The sequence of instructions MVFORCE; SIC n; IDIFFERENCE; STACKADJUST has stack level effect of +n-1 (if you assume that the mv-producing form itself contributed exactly 1 to stack depth, as normal forms do). (multiple-value-bind (v1 ... vn) form . other-forms) This evaluates (progn . other-forms) with variables v1 thru vn bound to the values returned by form. It compiles as WithMV[form]; MVFORCE; SIC n; IDIFFERENCE; STACKADJUST; BIND [v1; ...; vn]; (progn . other-forms) followed by the appropriate unbind or return. Of course, the compiling context of the multiple-value-bind is transmitted to the final subform of the inside progn. (multiple-value-prog1 form . other-forms) This is the same as conventional prog1 except that it can return multiple values if form does. In other words: (a) if the multiple-value-prog1 is not in MVP context, then compile as prog1. (b) if the multiple-value-prog1 is in MVP context, then compile form in context MULTIPLE, the remaining forms for effect. In addition, if form's value is computed by function call, the call must be followed by MVCLEAR to clear the MVA flag (which ordinarily would be cleared by MVFORCE, not used in this case). Of course, any return or unbind following the multiple-value-prog1 must be of the mv variety (since you're in MVP context). Note that the ordinary function prog1 never transmits its compile context to its first subform; i.e., prog1 never returns multiple values. For an inefficient implementation, multiple-value-prog1 could be compiled as (values-list (prog1 (multiple-value-list form) . other-forms)). (multiple-value-list form) This evaluates form and returns a list of all the values. It compiles as WithMV[form]; MVFORCE; 'LIST; APPLYFN i.e., call the function LIST with the n arguments that come back from form. SpiceLisp has a LIST opcode that does this, but I don't think we want that. (multiple-value-call fn form1 ... formn) This is effectively an apply* (i.e., funcall) of fn with arguments being the result of spreading all the multiple values returned by form1 thru formn. It compiles as '0; BIND[temp]; WithMV[form1]; MVFORCE; temp; IPLUS2; SETQ temp; POP; WithMV[form2]; ... ; SETQ temp; POP; WithMV[formn]; MVFORCE; temp; IPLUS2; fn; CHECKAPPLY*; APPLYFN followed by a suitable unbind or return. The temporary variable keeps track of how many values are accumulating on the stack. Eventually, APPLYFN is called to do the function call with the total number of arguments. The temporary binding is not needed if n = 1, in which case multiple-value-call can be compiled more simply as WithMV[form1]; MVFORCE; fn; CHECKAPPLY*; APPLYFN Of course, in either case, if fn is a side-affectable form instead of a constant, a temporary binding will be needed for it as well. And if the multiple-value-call form itself is in MVP context, then the CHECKAPPLY* must be preceded by MVPREPARE. For an inefficient implementation, multiple-value-call can be compiled as (apply fn (append (multiple-value-list form1) (multiple-value-list form2) . . . (multiple-value-list formn))). Interpreter Of course, interpreted versions need to be written for all of the multiple value forms described above. In addition, the macros for SPREADAPPLY*, SPREADAPPLY, .CALLAFTERPUSHINGNILS., and .EVALFORM., which hand-compile special versions of function call and APPLYFN in the interpreter, may need to be modified to transmit MVP context to their call or APPLYFN if the compiler doesn't catch those automatically. The function RETURN needs to be an NLAMBDA that manually evaluates its argument and saves the potentially multiple values that come back, before smashing its return link to the appropriate PROG frame. The function \INTERPRETER1, which implements calls to interpreted fns, needs to be careful to return the evaluation of its final subform directly, without setting an intermediate value. There is a similar problem with \EVPROGN and interpreted OR (AND and COND are already written correctly, though). The ufn for the opcode CHECKAPPLY*, which intercepts compiled calls to APPLY* of NLAMBDA nospreads, which currently returns `(LAMBDA NIL (QUOTE ,(SPREADAPPLY* FN (\CKAPPLYARGS)))), needs to instead return `(LAMBDA NIL (VALUES-LIST (QUOTE ,(MULTIPLE-VALUE-LIST (SPREADAPPLY* FN (\CKAPPLYARGS)))))), and similarly for its FAULTAPPLY (udf) case. The PROG1 in EVALHOOK needs to be a MULTIPLE-VALUE-PROG1. Actually, we might want to rip out the current EVALHOOK, since it does not meet the CommonLisp spec. Its inclusion in the Harmony release (and notes) was a violation of our usual policy on new features. The spaghetti functions (RETEVAL, ENVEVAL, etc) appear to be okay as they stand. Break and Advise Advising a function should, of course, not interfere with the transmission of multiple values. The simplest way to insure this is for the advising form to wrap a (multiple-value-list &) around the value coming back, set a new variable !VALUES to this list, and finally return (values-list !VALUES) following any ``after'' advice. For backward compatibility, also set !VALUE to (CAR !VALUES). The same is true for the break package, except it is now BREAK1 and its underlings that must gather up all the values into a list and spread the list on exit. The break command VALUE could be extended to print the values in some nice fashion when the number of values is not 1. Additional Compiler Considerations Stack Depth. While multiple values are on the stack, the compiler does not know the absolute stack depth. This is usually unimportant, since multiple values do not usually remain on the stackthey are usually immediately returned, or passed up within a function by doing an unbind. So I believe the only effect is to prohibit the compiler from using COPY.N and STORE.N optimizations while compiling: (a) the final n-1 subforms in a multiple-value-prog1; (b) all but the first of the forms in a multiple-value-call (and, of course, the temporary variable introduced must be a pvar); In addition, the compiler cannot do an unbind by performing one or more POPs in a place where MVUNBIND is called for (also, any variables to be unbound in MVP context must be pvars). Nasty question: what do you do with a GO inside of one of the arguments to multiple-value-call, or in the extra subforms of a multiple-value-prog1? I think these are the only places you could write GO where the stack depth is indeterminate, and I would not be at all disturbed to forbid it. In the case of multiple-value-prog1, you could handle it by popping everything you know to be on the stack after the first value, then do MVFORCE; STACKADJUST. I can't think of a way to handle the multiple-value-call case, however, unless you really do want to rely on UNBIND blowing away the stack up to the bind mark, something we were trying to get away from. Tail Recursion Removal. I believe this is unaffected. If the call to function FOO permits multiple values to be returned, then whether you perform a tail-recursive call to FOO as a function call with MVA set, or as a jump, the right thing happens. Other Compiler Optimizations. Larry suggests a possible problem with optimizing away the binding of a variable to the value of a function call by replacing the single use of that variable with the call. For example, transforming (prog ((a (foo))) (return a)) into (foo). The latter might return multiple values, whereas the semantics of the first dictate only a single value. Larry: is this an important transformation, and is the transformation at the source level or at the intermediate code level? If this optimization occurs after the first pass compilation of this expression, then there is no problemthe call to foo is not in MVP context, so is compiled as a regular call, and will not return multiple values then, nor will it when it is rearranged to remove the extraneous bind. Larry, are there any other problem optimizations? Summary of Requirements Microcode. Instructions are MVRETURN, MVUNBIND, MVPREPARE, MVCLEAR, MVMARKER, MVFORCE, STACKADJUST. The instructions have been designed so that an initial (very inefficient) implementation is possible with no microcode support at all. Of course, some of the ufns are rather unpleasant; most will require the same ugly trick as we use for the unboxed floating ufnsgrab caller's stack pointer and explicitly manipulate the stack. Most of them need to set NoPush bit so that the ufn returns no value (we can't return a value if it is the MVMarker, since this would confuse RETURN (if MVRETURN is considered the same as RETURN)). Popping caller's stack is much easier than pushing, since the latter bumps into the next frame (usually the ufn itself!). Breakdown as follows: MVUNBIND simulates UNBIND and copies values downward on stack. Net pop, so not too ugly. MVRETURN requires pushing onto returnee, so have to make sure there is space. If returner is what follows returnee, we can smash it out of the way; if free block of sufficient size follows, shrink the free block; else have to move returnee, much as \HARDRETURN does. I'm inclined to have MVRETURN ufn to \HARDRETURN and fold all this garbage into one place. MVPREPARE grabs caller's FX and sets the MVA flag in it; MVCLEAR clears the flag. (Do these confuse the compiler? They are zero-argument opcodes that return zero values.) MVMARKER is fairly easy, since it just requires smashing contents of TOS. MVFORCE either smashes TOS or pushes a value. Do this by having zero-argument ufn (as all the others are), which examines caller's TOS. If it is the MVMarker, smash it and set NoPush bit as you return; if it is not, return constant 1 in conventional manner. STACKADJUST can actually have a ufn of one argument (n). If n is positive, there is a net pop, which is relatively easy; if n is negative, need to push NILs, which gets into the same hairiness as MVRETURN and would probably require doing something in the misc context if there is a frame blocking the way. This is perhaps the least pleasant opcode. Compiler. Compiler needs to add notion of Multiple Values Possible context, Lap code for each of the constructs described above. Need to inhibit certain optimizations (COPY.N and STORE.N in certain situations). Interpreter. Need to implement interpreted versions of all of the Multiple Values forms listed above, and make sure the existing interpreted fns transmit multiple values when required to.  ($( (00100H($((( (00(00(((GACHA TIMESROMAN TIMESROMAN GACHA  TIMESROMAN  TIMESROMAN  TIMESROMAN  +B   / \ w  .@  4 s HV m//.% 0 V  "9m B3+$ ! 5 $ IU  X$+#  ,  r     '      c      , : K.  (  2 P  2~<V!  [; le(  K1j] .e %  #i %"   ;Y    q V*  : t   4 - U    7  @!     6S t q   ? 53   P+ 93 3A"4(  $#q!t  0Tf x2  D\-   HT-    M) ' 9 r s * OH  R! 1 =K& h~^ {W  N  )  %W@p 2 @  F r '  sJ [ R  *  . %\  6   L  =     @  o   !      W3 !. ( ! 3D=I: A  # ( 7   K R        o    q)   #         ;VB )  %$ 6  @    / ( 2"G  9s`# V   (E HQ &# 5kg (5X :[I      ^ 2      A    + 0m B  )   z