2. What really happens
Nitty gritty details
The patch space for each breakpoint contains code to save the global state of the machine, call your procedure, and restore the global state. This code, called the closure caller, has the same form for each breakpoint, though the details of the particular procedure to call, and the client data for the call, are ``compiled into'' the code, so each patch space is unique. There are, so far, two different types of closure callers: one for speed, and one to help out the debugger. The only difference is that the one that helps out the debugger sets the return address register to the address of the breakpoint, so the debugger can make sense of the extra frame that the breakpoint code pushes onto the register window and memory stacks to hold the saved state around the closure call.
The closure caller looks like (this one has the setting of the return address register):
save %sp,-(stackPointerOffset+stackAllocationForCallee+registerSaveArea),%sp
-- allocate a register window and a register save area.
call ←save←regs
add %sp, stackPointerOffset+stackAllocationForCallee, %o0
-- save←regs(@registerSaveArea).
sethi %hi(returnAddress), %i7
or %i7, %lo(returnAddress), %i7
-- set up ``return address''.
sethi %hi(clientData), %o0
call clientProc
or %o0, %lo(clientData), %o0
-- clientProc(clientData).
call ←restore←regs
add %sp, stackPointerOffset + stackAllocationForCallee, %o0
-- restore←regs(@registerSaveArea).
restore %sp,+(stackPointerOffset+stackAllocationForCallee+registerSaveArea),%sp
-- deallocate the register window and register save area.
where the constants stackPointerOffset and stackAllocationForCallee are defined and described in SPARCArchitecture.mesa, and registerSaveArea is defined in either SPARCBreakpoint.mesa or CirioBreakpoint.mesa, depending on which closure caller you use. Note that registerSaveArea really depends on the runtime routines for saving and restoring registers (available through CirioThings-Suite.df), and should be obtained by calling ←reg←state←size, but I don't do that because I generate the code on one side of the wall, and ←reg←state←size lives on the other side of the wall. There are actually several register saving and restoring procedures available. The most general one, ←save←regs, saves the floating point registers, the in registers of a new register window, and the global registers. If you know you aren't going to be touching, say, the floating point registers in your breakpoint procedure, you could make up a closure caller that calls ←save←regs←mini, which saves just the in and the global registers. Then you need to call ←restore←regs←mini to restore only those registers. Not saving and restoring the floating point registers saves more than half the time of the breakpoint code.
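The %hi/%lo pairs in the closure caller split a 32-bit constant into the 22 bits loaded by sethi and the 10 bits merged in by or. The following is a minimal Python sketch of that split and of the two instruction encodings. The helper names are made up for illustration and are not taken from SPARCBreakpoint.mesa or CirioBreakpoint.mesa.

```python
# Illustrative sketch only: helper names are hypothetical, not from the Mesa sources.

O0 = 8    # %o0 is integer register 8
I7 = 31   # %i7 is integer register 31

def hi(value):
    """Upper 22 bits of a 32-bit value, as loaded by sethi %hi(value)."""
    return (value & 0xFFFFFFFF) >> 10

def lo(value):
    """Lower 10 bits of a 32-bit value, as merged in by or %lo(value)."""
    return value & 0x3FF

def encode_sethi(imm22, rd):
    """sethi imm22, rd (format 2: op=00, op2=100)."""
    return (rd << 25) | (0b100 << 22) | (imm22 & 0x3FFFFF)

def encode_or_imm(rs1, simm13, rd):
    """or rs1, simm13, rd (format 3: op=10, op3=000010, i=1)."""
    return ((0b10 << 30) | (rd << 25) | (0b000010 << 19)
            | (rs1 << 14) | (1 << 13) | (simm13 & 0x1FFF))

def load_constant(value, rd):
    """The sethi/or pair that leaves a 32-bit value in register rd."""
    return [encode_sethi(hi(value), rd), encode_or_imm(rd, lo(value), rd)]
```

For example, load_constant(clientData, O0) would produce the two words used around the call to clientProc, with the or sitting in the delay slot of the call.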
Following the closure caller in the patch is the code, called the manger, to simulate the instruction that was replaced with the branch to the patch. So far I've distinguished five different classes of instructions on which breakpoints can be set, and each has its own manger. Each class of instruction also needs a different instruction to transfer to the patch code, so those two issues are discussed together. The different mangers come in different sizes, though for reasons I'll explain below, I always allocate the maximum size.
The simplest manger is for an instruction which does not have a delay slot. All that is needed to transfer to the patch space is a ba,a instruction (assuming the patch space is within 16M bytes of the breakpoint address). The manger for the normal case looks like:
instruction' -- relocated instruction.
ba,a continue -- branch to the instruction after the breakpoint.
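As a rough illustration of the normal case, here is a Python sketch that encodes a ba,a and assembles the two-word manger. It assumes the relocated instruction is identical to the original (no PC-relative operand), and the function names are hypothetical. The ba,a displacement is a signed 22-bit word count, so the reach is about 8M bytes in either direction, a 16M-byte window around the branch.

```python
# Illustrative sketch only: function names are hypothetical.

BA_A = 0x30800000  # op=00, annul bit set, cond=1000 (always), op2=010

def encode_ba_a(from_addr, to_addr):
    """Encode a ba,a located at from_addr that branches to to_addr."""
    delta = to_addr - from_addr
    if delta % 4 != 0:
        raise ValueError("SPARC instructions are word aligned")
    disp = delta // 4
    if not -(1 << 21) <= disp < (1 << 21):
        raise ValueError("target is out of reach of a ba,a")
    return BA_A | (disp & 0x3FFFFF)

def build_simple_manger(instruction, breakpoint_addr, manger_addr):
    """Relocated instruction followed by a branch back to breakpoint_addr + 4."""
    continue_addr = breakpoint_addr + 4
    # The ba,a is the second word of the manger, hence manger_addr + 4.
    return [instruction, encode_ba_a(manger_addr + 4, continue_addr)]
```

The same encode_ba_a helper would also produce the word written over the original instruction, using the breakpoint address as from_addr and the start of the patch as to_addr.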
Most control transfers on the SPARC have a delay slot, in which the instruction after the transfer instruction executes before the transfer takes effect. Conditional control transfer instructions must be relocated because they use PC-relative addressing of the destination. We must not execute the instruction in the delay slot during the transfer to the patch space, but must execute it in the delay slot of the relocated transfer instruction in the patch space. The manger for conditional branch instructions (generically, bicc) looks like:
bicc destination -- relocated conditional branch instruction.
delaySlot -- the instruction in the delay slot of the original branch.
ba,a continue -- branch to the instruction after the original delay slot.
The transfer to the patch is accomplished with a ba,a instruction, as in the normal case.
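Relocating a bicc amounts to recomputing its 22-bit word displacement so that the branch still reaches the same destination from its new address in the patch. A minimal Python sketch of that adjustment, with hypothetical function names, might look like this:

```python
# Illustrative sketch only: function names are hypothetical.

def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)

def branch_target(instr, addr):
    """Destination of a Bicc instruction located at addr."""
    return addr + 4 * sign_extend(instr & 0x3FFFFF, 22)

def relocate_bicc(instr, old_addr, new_addr):
    """Re-encode a Bicc moved from old_addr to new_addr, keeping its target."""
    if (instr >> 30) != 0b00 or ((instr >> 22) & 0b111) != 0b010:
        raise ValueError("not a Bicc instruction")
    delta = branch_target(instr, old_addr) - new_addr
    if delta % 4 != 0:
        raise ValueError("addresses must be word aligned")
    disp = delta // 4
    if not -(1 << 21) <= disp < (1 << 21):
        raise ValueError("destination is out of reach from the patch")
    # Keep op, annul bit, condition and op2; replace only the displacement.
    return (instr & 0xFFC00000) | (disp & 0x3FFFFF)
```

The condition code and annul bit are preserved, so the delay slot instruction copied after the relocated branch behaves as it did after the original one.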
The call instruction is a delayed control transfer which also writes the address of the instruction (the return address) into the return address register (register %o7). For various reasons we need the return address pointing at the real return address, so we can't just use an ordinary call instruction from the patch space. However, we know the address that is being called (by decoding the instruction). The technique we use is to use a call instruction to transfer to the patch space, and have the patch space transfer to the called procedure with a jump. The instruction in the delay slot of the original call is executed in the delay slot of the call to the patch space. That's okay: the instruction in the delay slot of a call is usually part of the argument setup, because the caller can't know the first instruction of the callee. The manger for the patch of a ``call Foo'' contains:
sethi %hi(Foo), %g1 -- set up %g1 with called procedure address.
jmpl %g1+%lo(Foo), %g0 -- transfer to Foo, without saving a return address.
nop -- needed, since the jmpl has a delay slot.
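The direct call case can be sketched in Python as follows: decode the destination from the original call, then emit the sethi/jmpl/nop sequence. The function names are hypothetical and this is an illustration of the encoding, not the code in the Mesa sources.

```python
# Illustrative sketch only: function names are hypothetical.

G0, G1 = 0, 1
NOP = 0x01000000  # sethi 0, %g0

def call_target(instr, addr):
    """Destination of a call instruction located at addr."""
    if (instr >> 30) != 0b01:
        raise ValueError("not a call instruction")
    # disp30 << 2 spans all 32 bits, so wrapping modulo 2**32 handles the sign.
    return (addr + 4 * (instr & 0x3FFFFFFF)) & 0xFFFFFFFF

def encode_call(from_addr, to_addr):
    """Encode a call located at from_addr that transfers to to_addr."""
    return 0x40000000 | (((to_addr - from_addr) >> 2) & 0x3FFFFFFF)

def build_call_manger(call_instr, breakpoint_addr):
    """sethi %hi(Foo),%g1 ; jmpl %g1+%lo(Foo),%g0 ; nop"""
    foo = call_target(call_instr, breakpoint_addr)
    sethi = (G1 << 25) | (0b100 << 22) | (foo >> 10)
    jmpl = ((0b10 << 30) | (G0 << 25) | (0b111000 << 19)
            | (G1 << 14) | (1 << 13) | (foo & 0x3FF))
    return [sethi, jmpl, NOP]
```

Here encode_call(breakpoint_addr, patch_addr) would give the call that replaces the original instruction, so %o7 still ends up holding the breakpoint address.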
Calls can also be made with the address of the procedure in a register (or at an offset from a register). These are handled similarly to the direct call case above, with a call to the patch space. The manger for a patch of a ``jmpl %r+offset,%o7'' contains:
jmpl %r+offset, %g0 -- transfer to %r+offset, without saving a return address.
nop -- needed, since the jmpl has a delay slot.
where the ``%r+offset'' operand is just copied from the original jmpl instruction. As in the direct call case, the instruction in the delay slot of the original call is executed in the delay slot of the call to the patch space.
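Copying the operand while dropping the return address comes down to clearing the destination register field of the original jmpl, since register 0 is %g0. A small Python sketch under that reading, with hypothetical function names:

```python
# Illustrative sketch only: function names are hypothetical.

NOP = 0x01000000       # sethi 0, %g0
RD_MASK = 0x1F << 25   # destination register field of a format 3 instruction

def is_jmpl(instr):
    """True if instr is a jmpl (op=10, op3=111000)."""
    return (instr >> 30) == 0b10 and ((instr >> 19) & 0x3F) == 0b111000

def discard_return_address(jmpl_instr):
    """Same jmpl with rd forced to %g0, so no return address is saved."""
    if not is_jmpl(jmpl_instr):
        raise ValueError("not a jmpl instruction")
    return jmpl_instr & ~RD_MASK & 0xFFFFFFFF

def build_indirect_call_manger(jmpl_instr):
    """jmpl %r+offset,%g0 ; nop"""
    return [discard_return_address(jmpl_instr), NOP]
```

Both the register-plus-immediate and register-plus-register forms survive unchanged, because only the rd field is touched.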
Returns are handled similarly, since they are almost indistinguishable from indirect jumps, except that they don't save a return address. Since they don't save a return address, the transfer to the patch space is done with a ba,a instruction, and the instruction in the delay slot of the return is relocated to the delay slot of the return in the patch space. The manger for a patch of a ret instruction (which is a pseudo-op for a ``jmpl %i7+8,%g0'') contains:
jmpl %i7+8, %g0 -- transfer to %i7+8, without saving a return address.
delaySlot -- the instruction in the delay slot of the original return.
The case of a retl instruction (which is a pseudo-op for a ``jmpl %o7+8,%g0'') is handled similarly, by changing the source operand of the jmpl instruction in the manger.
The different types of mangers are distinguished by writing an extra word at the end of the manger which happens to be a ``sethi %hi(tag), %g0'', where tag is a member of the enumeration SPARCManger.MangerVariant. This instruction is never actually executed, but it is an executable instruction for historical reasons. When it comes time to uninstall a breakpoint, the original instruction can be reconstructed by examining the tag and doing the appropriate reconstruction from the instructions in the manger. Patches are also prefixed (by the Shepherd code) with the address of the breakpoint, so that completes the information needed to uninstall a breakpoint.
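To make the uninstall step concrete, here is a Python sketch that reads the tag word and reconstructs the original instruction for two of the variants. Everything specific in it is an assumption: the member names and ordinals of MangerVariant are invented (the real ones live in SPARCManger), the tag ordinal is assumed to sit directly in the imm22 field of the sethi, and the tag is assumed to be the last word of the list passed in.

```python
# Illustrative sketch only: variant names, ordinals and tag layout are assumptions.
from enum import IntEnum

class MangerVariant(IntEnum):
    NORMAL = 0         # hypothetical names and ordinals
    BICC = 1
    CALL = 2
    INDIRECT_CALL = 3
    RETURN = 4

SETHI_G0 = 0x01000000  # sethi <imm22>, %g0 with an empty immediate

def encode_tag(variant):
    """Tag word: a sethi to %g0 carrying the variant in imm22 (assumed layout)."""
    return SETHI_G0 | (int(variant) & 0x3FFFFF)

def read_tag(manger_words):
    """Recover the variant from the last word of the manger."""
    word = manger_words[-1]
    if (word & 0xFFC00000) != SETHI_G0:
        raise ValueError("last word is not a sethi to %g0")
    return MangerVariant(word & 0x3FFFFF)

def original_instruction(manger_words, breakpoint_addr):
    """Rebuild the instruction that the breakpoint replaced."""
    variant = read_tag(manger_words)
    if variant == MangerVariant.NORMAL:
        return manger_words[0]
    if variant == MangerVariant.CALL:
        # Destination comes from the sethi/jmpl pair at the start of the manger.
        foo = ((manger_words[0] & 0x3FFFFF) << 10) | (manger_words[1] & 0x3FF)
        return 0x40000000 | (((foo - breakpoint_addr) >> 2) & 0x3FFFFFFF)
    raise NotImplementedError("variant not covered by this sketch")
```

The breakpoint address passed in corresponds to the address that prefixes the patch, which is why that prefix and the tag together are enough to undo the installation.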