Chapter 6 Complete Navigator Algorithms This chapter describes the Navigator implementation, expanding on the overview given in the previous chapter. It presents the Navigator compiler and debugger algorithms for handling inline procedure expansion and cross-jumping, including the complications caused by the language, the system environment, and multiple optimizations. Chapter 7 analyzes the compiler and debugger path determination algorithms, which distinguish among multiple source alternatives in regions merged by cross-jumping. Chapter 7 demonstrates the correctness of the algorithms, analyzes their efficiency for repeated cross-jumping, and discusses some possible improvements to them. 6.1 Cedar compiler This section describes the overall organization of the Cedar compiler and its data structures. Later sections will explain the additional data structures used by the Navigator compiler. The Cedar compiler is a six-pass compiler with a typical organization. Figure 6-1 shows the compiler's overall structure, along with the responsibilities of each pass. Input scanning and parsing. Pass 1 scans and parses the input program, creating an abstract syntax tree representation of the program. Declaration processing. Passes 2 through 4 operate on the abstract syntax tree. Pass 2 builds a symbol table. Pass 3 performs type-checking and inline procedure expansion. These operations may require consulting symbol tables or abstract syntax trees created by previous compilation of definitions modules. Pass 4 allocates variables to stack frame positions according to their static frequency of use, since the first few local variables can be accessed more efficiently than the rest. fig61.press leftMargin: 2.3125 in, topMargin: 0.25 in, width: 4.625 in, height: 8.5 in Figure 6-1. Structure of the Cedar compiler. Code generation and improvement. Pass 5 generates object code for a fairly simple stack machine, one procedure at a time. It has two phases. Pass 5A generates an abstract code stream from the decorated abstract syntax tree. Pass 5B performs a number of object code optimizations [Wulf+75], computes branch lengths, and does final opcode selection. Output. Pass 6 builds the statement maps and writes the object code, the symbol table, and the statement maps to the output file. 6.1.1 Data structures The compiler uses three major data structures: the abstract syntax tree, the symbol table, and abstract code streams. It also produces a fourth data structure, the statement map, for use at runtime by the debugger. The abstract syntax tree, produced by the parser in Pass 1, is doubly-linked so that it is possible to access a node's parent or children with equal ease. Each abstract syntax tree node that corresponds to a statement in the Cedar language, as defined by the Cedar grammar, will be called a statement node. The subtree rooted at a statement node contains a complete description of that source statement. The symbol table, constructed from the abstract syntax tree in Pass 2, is also tree-structured. Its nesting captures the static name-inheritance of Cedar scopes. Each symbol table entry contains a complete description of the meaning of some name within the enclosing scope: the name and its type are important components. Storage allocation information also resides in the symbol table; it is added in Pass 4. Symbol table entries are themselves tree-structured, to express structured types such as procedures, arrays, and records. Abstract code streams, created for each procedure by the code generator in Pass 5A, are represented as doubly-linked linear lists of two types of cells: instruction cells and marker cells. Each instruction cell contains a complete description of one machine instruction (either a jump instruction or a code instruction). A marker cell does not correspond to any object code; it is either a label marker or an info marker. Each jump instruction has a pointer to its destination cell, which is a label marker. All jumps to the same label are linked together so that given either a jump or a label, all jumps to that label can be found. Info markers are used to record information for statement maps and symbol tables. An abstract code stream may also be referred to as just a code stream. Statement maps are created from the abstract code stream in Pass 6. In the Cedar compiler, a statement map is a table of entries, each consisting of a source statement number and the object code location corresponding to the start of its execution. The statement map is stored as a list of (source increment, object increment) pairs, so it allows only monotonically increasing values for statement numbers and object locations. This structure is space-efficient, but it cannot represent one-to-many and many-to-one source-to-object mappings. For the Navigator compiler, the statement map representation was changed to allow full source statement numbers and full object code addresses and to allow lists of statement numbers for a single object code location. 6.1.2 Keeping track of source correspondences The movement of source correspondence information through the compiler is crucial to truthful debugging. There are three major steps. From input program to abstract syntax tree. The compiler's input scanner, in Pass 1, records the source position of the beginning of each token. As the parser constructs the abstract syntax tree, also in Pass 1, it fills the sourcelink field of each statement node with the recorded token start position of the first token in the statement. Passes 2 through 4 modify the abstract syntax tree, but do not alter the sourcelink fields (except for the effects of inline procedure expansion in Pass 3, which will be discussed shortly). From abstract syntax tree to code streams. Pass 5A adds source information fields to every instruction cell, copying the sourcelink information from the corresponding parse tree statement node. The source information for the first instruction in every statement also carries a special isStart designation. [Note: for descriptive ease, the algorithms described in this chapter assume that each instruction cell has its own source information. In the actual Navigator implementation, sourcelink marker cells are interspersed with the instruction cells to carry the source information more efficiently. Section 6.4.5 discusses the changes necessary to handle these separate sourcelink cells.] From code streams to output statement maps. In Pass 6, the output routines step linearly through the abstract code streams for each procedure, writing the contents of the instruction cells to the object code file. Marker cells do not appear in the output object code. Based on the source information in the instruction cells, the compiler records the source and object correspondences in the statement map. After all of the object code has been emitted, the completed statement map and the symbol table are also written to the object code file or to a related file. 6.2 Inline procedure expansion The compiler performs inline procedure expansion on the abstract syntax tree representation of the program, as part of Pass 3. For each call to an inline procedure, the compiler assigns the actual parameters of the call to the formal parameters of the inline procedure and replaces the call by the statements from the body of the inline procedure. This section describes how the Navigator compiler and debugger handle the effects of inline procedure expansion. Section 5.7.1 gave an overview of this process; this section presents more detail. Additional changes are needed to handle nested inline procedures, which are inline procedures that are called from other inline procedures, and imported inline procedures, which are inline procedures declared in separate modules. In addition, this section briefly considers the issues of further optimizations on the expanded statements. Recall that in Cedar, a programmer explicitly marks procedures that should be expanded; the compiler does not select which procedures or procedure calls to expand. This has three consequences. First, procedures that are not marked as INLINE procedures will not be expanded, even if they are as small and called as frequently as marked procedures. Second, every call to a marked procedure is expanded, rather than only calls that are expected to execute frequently (such as calls inside loops). Third, the Cedar compiler does not generate stand-alone object code for inline procedures. As mentioned in Chapter 5, inline procedure expansion affects the following primary debugging functions: setting a breakpoint on a statement in an inline procedure, reporting the execution location at such a statement, and providing a procedure traceback that logically includes an expanded inline procedure. Inline procedure expansion also affects several secondary debugging functions: setting breakpoints at the entry or exit of inline procedures, calling inline procedures from the debugger's interpreter, displaying the values of parameters and variables when inside expanded inline procedures, and single-stepping with coarse granularity. The Navigator compiler's inline procedure expansion algorithm (Algorithm NI1, shown below) handles the primary debugging functions. Extensions to handle the secondary debugging functions are discussed at the end of this chapter. The changes from the original Cedar algorithm appear in italics. The simpler changes (not underlined) were explained in Section 5.7.1. The underlined changes are explained following the statement of the algorithm. These additional changes are needed to handle nested inline procedures and imported inline procedures. fig62.press leftMargin: 1.75 in, topMargin: 1 in, width: 5.6875 in, height: 8.5 in Figure 6-2. Inline procedure expansion in the abstract syntax tree form. Figure 6-2 shows the abstract syntax tree of the program from Figure 5-3 before and after the application of Algorithm NI1. The value of a statement node's sourceLink field (which indicates the corresponding source statement) is shown at the lower right of the node; the value of the inlineCallId field appears as a superscript on the statement number. 6.2.1 Algorithm NI1: Navigator compiler inline procedure expansion algorithm Data structures: A new Inline Table with fields calleeName, callerStmt, callerCallId, and declFileId. The table entry number (callId) is also used. The debugger uses the calleeName field to report the name of the innermost procedure when it displays the current execution location. It uses the callerStmt field to report the statement number in the caller when it displays a procedure traceback. If the caller's caller is also an inline procedure (i.e., an inline procedure calls another inline procedure), the debugger uses the callerCallId field to report location information for the caller's caller. See Section 6.3.2.1. If the callee is declared in a separate module, the debugger uses the declFileId field to access the source text for the callee's statement number. See Section 6.3.2.2. A statement node's sourceLink field is expanded to contain a new field inlineCallId. The information from the sourceLink field will be propagated into the statement maps. Statement map entries for expanded statements contain the statement number inside the callee and the new inlineCallId field, which the debugger uses to access the information in the inline table. 1. Walk the abstract syntax tree of the module. 2. Whenever a call node to an inline procedure (determined by looking in the symbol table entry for the procedure identifier) is encountered: 2A. Create a new inline table entry. 2A1. Set calleeName = the name of the called procedure. 2A2. Set callerStmt = statement number of the calling statement (i.e., the sourceLink field of the call node). 2A3. Set callerCallId = the number of the inline table entry of the caller, if any. 2A4. Set declFileId = a pointer to the recorded filename and timestamp of the interface module in which the called procedure is declared, if any. 2B. Create a new block node as follows: 2B1. Allow a new naming scope for the procedure statements. Copy the parameter and declaration subtrees of the inline procedure's declaration node into the declaration subtree of the block node. 2B2. Substitute actual parameters for formal parameters. 2B2a. Insert statements assigning the actual parameters of the call node to the formal parameters of the inline procedure as the first elements of the block's statement list. Set the sourceLink fields of these assignment statements to the sourceLink field of the inline procedure's declaration.1 This allows program exceptions that occur during parameter storing to be reported in the callee. This distinction is more important for languages that delay parameter type-checking until runtime. 2B2b. Inside each assignment statement, insert an eval node above the evaluation of each actual parameter. Set its sourceLink field equal to the sourceLink field of the call node. This allows program exceptions that occur during parameter evaluation to be reported in the caller. 2B3. Expand the statements of the procedure. 2B3a. Copy the statement list subtree of the inline procedure's declaration node into the block's statement list. 2B3b. 2B3c. For each copied statement node, set the sourceLink field's inlineCallId = the index of the new inline table entry. 2C. Replace the call node by the newly-formed block node. Notes: 1 In the unmodified Cedar compiler, step 2B2a set the sourceLink fields of these assignment statements to the sourceLink field of the call node. There was no step 2B2b. 2 In the unmodified Cedar compiler, step 2B3b replaced the sourceLink field of each copied statement node by the value of the sourceLink field of the call node. 6.2.2 Two additional complications: nested and imported inline procedures The Cedar language allows an inline procedure to call another inline procedure [of course, an inline procedure cannot call itself]. Inline procedures can also be declared in separately-compiled modules. This section discusses the additional information that must be recorded to handle these language features. 6.2.2.1 Nested inline procedures In general, a procedure has multiple potential callers. The runtime procedure call chain records (indirectly, via the return address) which procedure was the caller at any given procedure invocation. In the case of a call to an inline procedure, the caller is fixed at compile-time. As so far described, the statement maps record the statement number in the callee, and the inline table records the name of the callee and the statement number in the caller. The name of the caller of an inline procedure is retrieved from the runtime procedure call chain. However, an inline procedure can call other procedures, which may or may not also be declared to be INLINE. When an inline procedure calls another inline procedure, neither the caller nor the callee appear in the procedure call chain. Therefore there is another field in an inline table entry, callerCallId, which gives information about the caller. If callerCallId is 0, the caller is in the procedure call chain. If callerCallId is not 0, this call is a nested inline call and the callerCallId entry of the inline table describes the caller. Figure 6-3 shows an example of a nested inline procedure. The procedure SetX is called from the procedure SetXY. Note that the subsumption optimization [see Section 6.2.3] was performed after the call to SetX was expanded; the actual parameter j in SetXY was substituted directly for the formal parameter i in SetX. SetX is also called from the ordinary procedure AdjustXY. 6.2.2.2 Imported inline procedures The separate compilation feature of the Cedar language adds another complication. In particular, an inline procedure is a special case of a compile-time constant. Inline procedures can be declared in definitions modules, and can be imported and called in separate program modules. In this case the inline procedure bodies must appear in the definitions module. Compiling a definitions module produces a symbol table containing an abstract syntax tree for each inline procedure declared in it. The optimization process is not materially altered. When a call node for an imported inline procedure is encountered in the abstract syntax tree of an definitions module, the compiler finds the appropriate file, opens it, and copies the procedure's abstract syntax tree from the symbol table in the same way as before. The compiler records the filename and timestamp of the definitions module in the implementation module's symbol table for version-checking purposes. However, imported inline procedures present several additional debugger difficulties. When an imported inline procedure is called, the source statements for the inline procedure are in a different file than the statements of the calling procedure. To determine the execution location inside an imported inline procedure, the statement numbers in the statement maps refer to the interface module rather than the implementation module. This fact is indicated by an additional field declFileId in the inline table entry for a call. If declFileId is not nil, it points to the recorded filename and timestamp of the definitions module. fig64.press leftMargin: 1.5 in, topMargin: -0.3125 in, width: 6 in, height: 8.5 in Figure 6-3. Nested inline procedure expansion with subsumption. The harder problem for imported inline procedures is setting breakpoints. When a user requests a breakpoint on a statement inside an imported inline procedure, the resulting breakpoint(s) must be set in an unknown group of program modules. During compilation, the compiler records the dependence from program modules to definitions module in the object code file; it does not record all uses of definition information (from definitions module to program modules). When the binder forms a program from a group of modules, it records this use information. Furthermore, since many programs can be loaded and executing at one time that use the same imported inline procedure, the loader records another level of use information. The combination of binder and loader information of this sort is called the runtime model. The debugger queries the runtime model to discover all currently loaded programs that import a given inline procedure, and sets breakpoints in each module as before. A system without a runtime model could use a partial solution to set breakpoints inside imported inline procedures. If the user specified a program module M and a statement s inside an inline procedure, the debugger would set breakpoints at all expansions of s inside M. The resulting behavior would not always be expected, but this approach would provide the necessary functionality. Figure 6-4 shows an example of an inline procedure declared in a definitions module. The procedure SetX is declared in the definitions module D, and is imported and called in the program module M (from the procedure SetX). 6.2.3 Applying further optimizations to the expanded statements Additional optimizations can be applied to the expanded statements. These additional optimizations fall into two categories: statement simplification based on the actual parameters to the call, and statement simplification and reordering based on the statements surrounding the call. In many cases it is safe to substitute the actual parameters directly for the formal parameters, rather than inserting assignment statements. This optimization, called subsumption [Loveman77], is a form of propagation (see Chapter 3). Subsumption makes it difficult for a debugger to display the current value of a formal parameter inside an expanded inline procedure [a secondary debugging function only peripherally considered by Navigator]. The Cedar compiler applies parameter subsumption only in very restricted cases: when the actual parameter is either a compile-time constant or a fig65a.press leftMargin: 1.4375 in, topMargin: 0.59375 in, width: 6 in, height: 8.5 in Figure 6-4. Inline procedure declared in definitions module. variable value and the compiler can guarantee that the call by name semantics implemented by this method is the same as the ordinary call by value semantics. The user can determine these values either by examining the text of the procedure call or by requesting the value of some variable in the caller from the debugger (except when the procedure has side effects). This behavior is not expected behavior; in some cases it may not be truthful behavior. Section 6.5.1.1 discusses how to extend Navigator to handle this problem. An expanded statement from the body of the inline procedure or a portion of one can be deleted if its results are not needed for this particular call. An expanded statement or a portion of one can also be reordered, possibly even moved outside of the expanded procedure to mingle with statements from the calling procedure. The Cedar compiler applies statement-based optimizations in the form of cross-jumping and the other object code optimizations. The effects of these later optimizations will be discussed in Section 6.5.2. 6.3 Cross-jumping In Pass 5A, the compiler generates the abstract code stream for a procedure by walking the procedure's abstract syntax tree. In Pass 5B, the compiler repeatedly performs several object code optimizations, including cross-jumping, on the code stream until the optimizations are no longer applicable. The other optimizations in this iterative process are: replacing a conditional jump around an unconditional jump by an opposite-sense conditional jump, removing branch chains and jumps to the next location, and examining groups of adjacent instructions for opportunities to delete instructions (notably instructions that pop the computation stack) or to combine them into a more powerful single instruction. All of the optimizations preserve the actual ordering of computations along any execution path, although the control flow may be altered. As a result of this compiler organization, the cross-jumping optimization can be applied repeatedly to a single section of the code stream. Furthermore, cross-jumping can be applied to sections of code stream that have already been transformed by other optimizations, and vice versa. This section first describes the details of the simple cross-jumping and path determination algorithm sketched in Section 5.7.2. The simple version of the algorithm does not correctly handle the application of cross-jumping to a previously cross-jumped region. It then considers several problems that manifest themselves even on the simple version of the algorithm. These problems include two problems that arise during compilation (ambiguous programs and externally-accessible labels) and two problems that arise during debugging (multiple processes and/or recursion, and determination cell allocation). Then the full version of the cross-jumping and path determination algorithm is presented. The full version collects additional information to track cross-jumping's control-flow transformations through repeated applications to the same region. The section concludes with a description of how to perform the path determination algorithm using a modified abstract code stream that represents source information more efficiently. The next section discusses the additional effects of the interactions between inline procedure expansion, cross-jumping, and the other object code optimizations. 6.3.1 Graph notation and terminology In a code stream, an instruction x is the immediate static predecessor of an instruction y if x is the instruction positioned before y in the stream. An instruction x is an immediate dynamic predecessor of an instruction y if x can execute immediately preceding y. That is, either x is a jump to y, or x is the immediate static predecessor of y and not an unconditional jump to some other instruction. An instruction has exactly one immediate static predecessor; it may have many immediate dynamic predecessors. Static and dynamic successors are defined similarly. A code sequence is one or more statically adjacent instructions. A sentry of a code sequence is any instruction outside the sequence that is an immediate dynamic predecessor of an instruction inside the sequence. An entrance of a code sequence is any instruction inside the sequence that is an immediate dynamic successor of a sentry of the sequence. That is, a sentry of a code sequence is the last instruction executed outside the sequence on some control-flow path, and an entrance is the first instruction executed inside the sequence on that path. An extended basic block is a maximal code sequence that has the following property: there may be multiple exit points, but there is a single entry point at the beginning of the block; there is no internal non-sequential control flow. If two instructions x and z are in the same extended basic block and z is a static successor of x, then if z executes, x must have executed in the same iteration through the block. An extended basic block is more general than an ordinary basic block because of the possibility of multiple exits. 6.3.2 Simple cross-jumping and path determination algorithm To ensure that readers have the basic idea firmly in mind before more complications are added, this section presents the simplified cross-jumping and path determination algorithm whose effects were described in Chapter 5. This version of the algorithm does not correctly handle the application of cross-jumping to a previously cross-jumped region. 6.3.2.1 Discussion of the algorithm Ignoring the collection of path determination information for a moment, the Navigator compiler's version of the cross-jumping algorithm examines the abstract code stream sequentially. When it finds a label, each immediate dynamic predecessor of the label defines the end of a path to that label. The algorithm compares every pair of joining paths for identical tail code sequences. It finds identical code sequences by searching the two paths in reverse order until it encounters unequal instructions, an unconditional jump, a backward conditional jump, or a label. The first of these conditions is clearly necessary for any form of cross-jumping: only equal instructions can be merged. The other three conditions are necessary to ensure that the path determination algorithm can reconstruct the original control flow from the optimized program and the path determination information. Terms are needed to identify key items of the merging sequences: A path is a code sequence that ends in an immediate dynamic predecessor of the joining label. A path is called a fall-through path if it precedes the label both statically and dynamically; otherwise it is called a jump-in path. A jump-in path ends in a joining jump. If the label's immediate static predecessor is an unconditional jump to some other label, there is no fall-through path. The link cell of a path is the label cell if the path is a fall-through path; otherwise it is a jump cell whose jump destination is the label. All paths whose link cells are conditional jumps are removed from consideration because the static successor cells must execute when the condition is false. The bottom cell of a path is the immediate static predecessor of the link cell. Continuing with the cross-jumping algorithm, when the compiler finds identical code sequences, it designates one code sequence as the to-delete sequence, and the other as the to-remain sequence. To avoid introducing extra jumps, if one of the paths is the fall-through path, it is selected as the to-remain path. The compiler replaces the to-delete sequence by a crossing jump from the beginning of the to-delete sequence to a crossing label at the beginning of the to-remain sequence. This effectively merges the identical parts of the two sequences. Each of the identical code sequences currently under consideration is called a merging sequence. Given one of the sequences, the other sequence that participates in the merge is called its opposing sequence. The main features of the path determination algorithm are the following: 9 Whenever two instructions are merged, the algorithm retains source information for both original copies of an instruction in the instruction that remains. 9 Each of the two separate source alternatives for a merged instruction is marked with its sequence's path determiner identifier. 9 Every sentry of each sequence is marked as a path determining instruction for that sequence. When more than two paths join at a label, the order of path comparisons is important. The algorithm compares paths in the order that the jumps to the joining label were generated (which is essentially arbitrary), rather than based upon any properties of the merging sequences (which are as yet unknown). As long as the fall-through path is never a to-delete path, the algorithm produces the same object code regardless of the choice of ordering. However, a poor choice of ordering can require more iterations of cross-jumping to completely merge a set of paths that join, because embedded labels are excluded from a merging sequence. A poor choice of ordering has worse consequences for the path determination algorithm; Section 7.2 considers this issue in detail. Figure 6-5 shows five execution paths that join at the label L and describes one possible ordering of the path comparisons. It also describes what matching sequences are found and what path determiners are assigned to each sequence. Figure 6-6 labels the merge of paths 1 and 2 from Figure 6-5 with the terms defined above. fig65c.press leftMargin: 1.8125 in, topMargin: 0.84375 in, width: 5.5 in, height: 8.5 in Figure 6-5. Ordering of path comparisons for cross-jumping. fig65b.press leftMargin: 1.75 in, topMargin: 0.34375 in, width: 5.8125 in, height: 8.5 in Figure 6-6. Anatomy of a merge: paths 1 and 2 of Figure 6-6. Data structures for cross-jumping and path determination The essential properties of the compiler's abstract code stream for the application of the cross-jumping and path determination algorithms described in this chapter are: 9 The algorithms process only instruction and label cells in the code stream, ignoring any other items in the code stream. 9 Each instruction cell has source correspondence information embedded in it. 9 The dynamic connectivity of the code stream is explicit. That is, labels appear at all destinations of non-sequential control-flow, and labels and jumps are linked together in both directions so that dynamic predecessors and successors can be found easily. Instruction y can execute after instruction x if and only if 9 y is the immediate static successor of x (and x is not an unconditional jump) or 9 y is the immediate static successor of a label L, and x is a jump to L. 9 There are never two label cells in a row. This condition allows jump destinations to be compared simply; it must be enforced by the (centralized) routines that insert labels or delete cells. These routines may need to alter the name of a jump's destination. We now describe the modifications necessary to the Cedar compiler's data structures to record the path determination information. Figure 6-7 shows the format of the three types of code stream cells. The fields nextCell and prevCell link the code stream statically, while the fields jumpDest, nextJump, and firstJump link the code stream dynamically. The shaded boxes indicate the fields that are added for Navigator. The boxes with the dark borders indicate the fields that are needed only for repeated cross-jumping; these fields are described with Algorithm NCJ2, in Section 6.3.4. An instruction cell's source information is expanded from a single source descriptor (with fields stmtNum and isStart) to a list of source descriptors (via a sourceList field). The isStart field is true if and only if this instruction represents the start of execution of the statement. Each source descriptor is expanded to contain a new field pdToUse. If the value of a source descriptor's pdToUse field is p, we say that the instruction, or more specifically the source descriptor, uses determiner p. We may also say that the source descriptor has a determiner use for p. fig61a.press leftMargin: 1.5 in, topMargin: .375 in, width: 5.1825 in, height: 8.5 in Figure 6-7. Data structures of the abstract code stream cells. The pdToUse field ends up as a path determiner use in a source descriptor in the object table. A path determiner use looks backward along some execution path to a dynamically preceding path determiner assignment (see below). The debugger utilizes the determiner use in two ways: first, to tell which path determiners to activate when a primary breakpoint is set at that instruction, and second, to distinguish between the different statement numbers in the source descriptors when that primary breakpoint is reached. Each instruction cell is also expanded to contain new sentry information (via a pdToAssignList field). Each sentry descriptor has a single field pdToAssign. If the value of a sentry descriptor's pdToAssign field is p, we say that the instruction, or more specifically the sentry descriptor, assigns determiner p. We may also say that the sentry descriptor has a determiner assignment for p. A nonempty pdToAssignList indicates that this is a sentry instruction for some merged region(s). Therefore the debugger should collect dynamic control-flow information at this instruction, via a determining breakpoint, whenever the user places a breakpoint in an associated merged region (the association is shown by the information in the object table). The determining breakpoint will collect information for each path determiner that has a determiner assignment in the pdToAssignList. Each pdToAssign field (determiner assignment) becomes an entry in the path determiner table. This entry contains the path determiner identifier mentioned in the pdToAssign field and the object location of this sentry instruction. The debugger uses the object location of the sentry in two ways: to place determining breakpoints when a path determiner is activated, and to ascertain which determination cells hold the timestamp value of a path determiner when a primary breakpoint is being processsed. Implications of the cross-jumping algorithm's termination conditions Notice that the choice of conditions on which to stop the code sequence comparison implies that all of the instructions in a given sequence are in the same extended basic block. The path determination algorithms described in this chapter rely heavily on this restriction. It is a slightly unusual restriction: the Bliss-11 cross-jumping algorithm allowed both unconditional jumps and labels embedded in a merging sequence [Wulf+75]. Chapter 7 discusses the additional path determination processing necessary to relax this restriction. In fact, the termination condition is a bit stronger than leaving the extended basic block, because unconditional jumps are not permitted inside either merging sequence (except, of course, for the joining jump at the end of the sequence). Unconditional jumps are not permitted for two reasons. First, if there were an unconditional jump inside a merging sequence, the statically following instruction(s) in the sequence would be unreachable, because labels are not permitted inside a merging sequence and all non-sequential control flow is assumed to be marked explicitly by labels. The unreachable instructions can be removed by another object code optimization. Second, if two unconditional jumps match and therefore could be included in the merging sequence, then they jump to the same label and can be cross-jumped directly later. Note that this situation is not true of embedded conditional jumps, because both branch destinations must be identical for a match to occur: a conditional jump as the joining jump of one merging sequence does not match an identical instruction as the joining jump of the opposite sequence, while a conditional jump embedded in a merging sequence does match an identical conditional jump in the opposite sequence. To avoid undeterminable control flow, backward conditional jumps are not permitted inside merging sequences. [This discussion of conditional jumps assumes that they have only two immediate dynamic successors. Multi-way conditional branches are a bit different.] We now show the details of the simple cross-jumping and path determination algorithm that was sketched in Section 5.7.2. The following high-level presentation of the algorithm, shown as Algorithm NCJ1, is intended to be read. The subsequent material in this chapter discusses its implications and fine points. Steps that were added to the original Cedar cross-jumping algorithm to collect the information for path determination appear in italics. 6.3.2.2 Algorithm NCJ1: Simple cross-jumping and path determination algorithm Define a cell match to occur between two instruction cells whenever both cells 1) have the same opcode and operand fields, 2) are not unconditional jumps, and 3) are not backward conditional jumps. Two forward conditional jumps are equal if they have the same condition, the same operand(s), and the same destination label. A label cell does not match any other cell. 1. Step forward linearly through the code stream of the procedure. 2. Whenever a label cell is encountered: 2A. Find the bottom cell in each unconditional path to the label. 2B. For every pair of paths to the label, neither of which was previously a to-delete path: 2B1. If the two bottom cells match, BEGIN Cross-jump 2B1a. Designate one path as the to-remain path, and designate the other path as the to-delete path. To avoid introducing extra jumps, if one of the paths is the fall-through path, it is selected as the to-remain path. 2B1b. Assign a new path determiner identifier to each of the two paths. 2B1c. Delete the joining jump (i.e., the link cell on the to-delete path). 2B1c1. Keep the joining jump's source information (see Section 6.3.2.3, below). 2B1d. On each path, set the current cell pointer to the bottom cell. 2B1e. As long as the current cells on the two paths match (note that all cells in each merging sequence are in the same extended basic block): BEGIN Merge one instruction 2B1e1. [Keep source information.] On each path, set the current cell's pdToUse field (in the single source descriptor for the cell) to the path determiner of that path. Add the to-delete cell's source descriptor to the to-remain cell's sourceList, making two source descriptors for the to-remain cell. 2B1e2. Delete the current cell on the to-delete path from the code stream. 2B1e3. On each path, move the current cell pointer backward in the code stream to its immediate static predecessor cell. END Merge one instruction 2B1f. Mark the sentry (or sentries) for the to-remain path as follows: 2B1f1. Find all immediate dynamic predecessors of the top merged cell. (Note: the top merged cell is on the to-remain path; it has not yet been connected to the to-delete path.) 2B1f2. Add a determiner assignment for the to-remain path determiner to the pdToAssignList of each such cell. 2B1g. Make the top merged cell jump-addressable by inserting a label cell before it if there isn't already one there. Call the label cell the crossing label. 2B1h. Insert the crossing jump (an unconditional jump cell to the crossing label) after the current cell on the to-delete path. 2B1h1. Create source information for the crossing jump: If its immediate static predecessor is not an unconditional jump, copy its source information. Otherwise, copy the source information from the to-delete path of the top merged cell. Do not mark it as the start of a source statement. 2B1i. Mark the sentry for the to-delete path as follows: 2B1i1. Add a determiner assignment for the to-delete path determiner to the crossing jump's pdToAssignList. END Cross-jump 6.3.2.3 Step 2B1d1: Keeping the joining jump's source information Because the joining jump (i.e., the link cell on the to-delete path) may not have a corresponding instruction on the to-remain path, keeping its source information can be difficult. This section describes the method Navigator uses to deal with this problem. If the to-remain path is a jump-in path, there is no problem: the joining jump has a corresponding instruction on the to-remain path (the link cell), and thus these two instructions can be treated like any other instructions in the merging sequences. The algorithm adds the appropriate determiner use to each instruction's source descriptor, and then adds the joining jump's source descriptor to the source descriptor from the to-remain path's link cell (as in step 2B1e1). If the to-remain path is a fall-through path, then there will be no remaining instruction cell corresponding to the joining jump. But if the joining jump does not represent the beginning of a source statement, the user cannot request a breakpoint there anyway. In this case, the joining jump can safely be deleted without keeping its source information. However, if the to-remain path is a fall-through path and the joining jump represents the beginning of a source statement, then the user might request a breakpoint on the joining jump. The goal is to have such a breakpoint appear in the expected place in the program's execution. Possible solutions to this situation include: moving the joining jump's source information backward to a dynamically preceding instruction, which could produce less than expected behavior when fielding a breakpoint; swapping the to-remain and to-delete paths, which could add as many as two additional jump instructions to the resulting code; and inhibiting cross-jumping altogether when the joining jump corresponds directly to a source statement, resulting in an unknown increase in program size. Navigator instead replaces the joining jump with a null instruction, then inserts a corresponding null instruction in the to-remain sequence. The to-delete null instruction gets the source information of the joining jump, and the to-remain null instruction gets the source information of its immediate dynamic predecessor. The path determination algorithm treats the source information as before (see steps 2B1f1 and 2B1f2). Statically, the object code contains one more instruction than the fully optimized version (the null instruction). Both execution paths through the merged region would contain one more instruction than they would otherwise have: the null instruction. This change slows the program down slightly, but since null instructions are usually small, fast, and do not cause pipe turbulence, this solution seems acceptable. 6.3.2.4 Fine points of the debugger algorithms A single sentry instruction can assign multiple determiners, and multiple sentry instructions can assign a single determiner. The resulting determining breakpoints and determination cells can be organized either by object code location or by path determiner. Navigator organizes them by object code location: a single determining breakpoint is placed on any path determining instruction. A reference count indicates how many determiners the determining breakpoint is currently collecting information for; the reference count is used to ascertain when the determining breakpoint can be removed (which is when all of the primary breakpoints that use its history information are removed). To find a path determiner's current timestamp, the debugger searches the path determiner table to find all object locations corresponding to that determiner. It then finds the determination cell for each such object location (for the current procedure invocationexplained below in Section 6.4.3). The timestamp for that determiner is the latest of the timestamps recorded in these determination cells. An organization by determiner would make determining breakpoints slower and primary breakpoints faster. Each determiner would have one determination cell (per procedure invocation). When a single instruction is a sentry for more than one merging sequence (that is, if a single instruction assigns multiple determiners), this new organization could use multiple distinct determining breakpoints, or it could use a single determining breakpoint that can set multiple determination cells. In this organization, care must be taken that two logically distinct determining breakpoints at the same location receive the same timestamp, because that is how the debugger detects ambiguity. 6.3.3 Fundamental difficulties with the path determination algorithm Several complications can arise from the cross-jumping algorithm. These complications fall into three categories, which are, in order of decreasing seriousness: 9 The problem causes incorrect debugger behavior. 9 The problem allows only truthful, rather than expected, debugger behavior. 9 The problem causes inefficient compiler and/or debugger behavior. The problems in the first category must be dealt with in some waypossibly by converting them into problems of the second or third type. The problems in the second category can be ignored if necessary to reduce complexity or avoid inefficiency. The problems in the third category must be considered seriously if the resulting system is to be of practical use, but can be ignored in some instances to reduce complexity. When a problem causes incorrect or truthful behavior, there are several general solution techniques to consider. Insert a null instruction. If the problem can be identified during the application of cross-jumping, the compiler could insert an extra null instruction. This extra instruction can hold some necessary source information, provide a place that corresponds to a user-visible location, or distinguish one execution path from another. This change eliminates the problem, but the resulting program is slightly larger (one more instruction) and slower than the fully optimized version. However, null instructions are small and fast on most machines. The benefit of this solution over the others depends upon the size of the merging regions. The tradeoff of decreasing the program's speed slightly may be acceptable for fairly large regions; it is definitely not appropriate when the size of the region equals the size of the null instruction. Limit cross-jumping slightly. If the problem can be identified reasonably narrowly during the application of cross-jumping, the compiler can choose to inhibit cross-jumping for that region (or possibly just to decrease the size of that region). "Reasonably narrowly" means that the condition used to identify the problem includes only a small number of cases that do not have the problem. (It must include all cases that do have the problem.) This solution eliminates the problem, but results in a less-optimized program than that produced by the original Cedar compiler. Since the cross-jumping optimization saves only program space, the resulting program will be larger but will execute at the same speed (ignoring secondary size effects, such as paging). Do nothing. This technique is allowable only when the problem leads truthful rather than incorrect behavior. In addition, the problem should either arise infrequently or be expensive to fix. It does not solve the problem. Two fundamental difficulties with the path determination algorithm arise at compile-time. Some source programs are essentially ambiguous with respect to path determination: two entire joining paths generate the same object code, causing the two opposing determiner assignments to collide at some sentry of the merged region. If no special compensation is made, the debugger can give only truthful behavior, rather than expected behavior. In other cases, a sentry of a merged region can be outside the procedure containing the merged region; that is, an instruction in another procedure branches directly to a label at the top of a merged region. This second situation arises as a result of Cedar's exception-handling mechanism. It is crucial to do something about this second compile-time problemotherwise the debugger will give incorrect behavior for some computations. Difficulties also arise at runtime. Two paths that contain a directly or indirectly recursive procedure call can be cross-jumped; also, multiple processes can execute a cross-jumped procedure. Of course, in order to record the flow of control through each concurrently active invocation of the procedure, there must be a separate timestamp for each such invocation. A second problem concerns the debugger's runtime storage efficiency: where and when should determination cells be allocated (and de-allocated)? 6.3.3.1 Compile-time problem 1: essential ambiguity (path determiner collision) If an entire path generates the same object code as some other joining path, the same instruction will assign both a path determiner and its opposing path determiner. Figure 6-8 illustrates this situation. Such programs may not occur frequently in practice, but if they occur, the path determination algorithm can at least provide truthful behavior by correctly identifying the source alternatives. Actually, repeated optimizations must occur for this situation to arise, because the cross-jumping algorithm always replaces the extra path by a jump instruction, which can then be used for disambiguation. See Section 6.5.2.5 for a description of the steps needed to create this problem. source program fragment after cross-jumping 1 b _ 5; b _ 5; pd1, pd2 2 IF cond THEN 3 a _ 1 ELSE 4 a _ 1; a _ 1; source: 31, 42 Figure 6-8. Program fragment with undeterminable paths. In fact, the ambiguity may not be apparent in the source program. The two constants "1" in Figure 6-8 might have been expressed as different constant identifiers or elements of enumerated types that happened to have the same underlying numeric representation. Or statement 3 might have been a call to an inline procedure that happened to expand to "a_1". Just above we suggested three possible ways to handle this ambiguity: adding a null instruction, inhibiting the optimization enough to leave at least one instruction out of the merge, or doing nothing special. The first two preserve expected behavior, while the last preserves only truthful behavior. The Navigator compiler adds the null instruction as a placeholder for the path determiner. Different code generation templates can also lead to ambiguity. If a SELECT statement is implemented as an indexed jump into a table of jump destinations, path determiner assignments for two identical cases will collide at the indexed jump (recall that a path determiner assignment, because it is implemented as a determining breakpoint, must be an executable instruction). But the same source code will not cause path determiner collision if the compiler implements the SELECT statement as an indexed jump into a table of jump instructions or as a series of tests and jumps. Figure 6-9 illustrates this distinction. Of course, a special runtime mechanism could be designed to monitor indexed jumps (and also conditional jumps) so that different determiners could be assigned on each path away from the jump. The debugger could achieve this result reasonably easily by interpreting complex jumps that are path determining instructions. fig66a.press leftMargin: 2 in, topMargin: 1.5 in, width: 5.3125 in, height: 3.9375 in Figure 6-9. Two different code generation templates for SELECT statements. 6.3.3.2 Compile-time problem 2: externally-accessible labels The Cedar language allows a statement to signal an exception, which will be caught by a dynamically-scoped catch phrase. The catch phrase may be in a different procedure or a different module than the signaller. In effect, this mechanism allows statements in one procedure to branch indirectly to certain labels inside other procedures. These branches pass through the runtime exception handler, which searches backward through the procedure call chain for blocks or procedure calls that have enabled a catch phrase for that exception. Internal labels that can be accessed from outside their procedure are not unusual in Algol-like languages. For example, Pascal allows a nested procedure to contain a branch to a label in a containing procedure [Jensen+74]. The SAIL language allows any label to be declared INTERNAL [Reiser76], allowing it to be accessed from externally compiled programs. Externally-accessible labels cause problems for cross-jumping. If an externally-accessible label's instruction is merged (and hence the labelled instruction is an entrance of a merged region), the proper operation of the path determination algorithm requires that a path determiner assignment be placed at each immediate dynamic predecessor of the label. But in this case, the predecessor instructions are outside the current procedure and may even be outside the current module. [A similar problem might arise if the first instruction in a procedure were inside a merged region. This cannot occur in many implementations, including Cedar, because procedure prologue instructions are unique.] The path determination algorithm cannot place path determiner assignments at these instructions for three reasons. First, the algorithm deals only with the code stream for a single procedureinstructions in other procedures are not available. Second, deciding which instructions might be immediate dynamic predecessors can be difficult. For example, any SIGNALX instruction that can execute in the dynamic scope of the CATCHX is potentially an immediate dynamic predecessor. Third, when a SIGNALX is in another module, its association with the CATCHX may not be known until runtime. One solution to this difficulty is to insert a new null instruction and a new label between the externally-accessible label and the top instruction in the corresponding merging sequence. After the insertion, the new label denotes the top instruction in the merging sequence, and the externally-accessible label denotes the new (null) instruction. Therefore the sentry instruction for the merging sequence is now the new instruction rather than an external instruction. [Note: it is crucial that later optimizations not remove this added null instruction; see Section 6.5.2.] An arguably better approach is simply to inhibit the merging algorithm enough to omit from any candidate sequence the top instruction reached by a jump to an externally-accessible label. Both of these solutions reduce the effect of the optimization, but they leave the runtime path determination algorithm unchanged (others that have been proposed would require additional effort and provide at best truthful behavior). These situations are expected to occur rarely enough that a search for better solutions seems fruitless. 6.3.3.3 Runtime problem 1: multiple processes and/or recursion Two paths that contain a directly or indirectly recursive procedure call can be cross-jumped. In order to match timestamps and execution paths properly, a single determining breakpoint must record timestamps in one of several determination cells, each containing a timestamp for one invocation of the procedure. This solution also takes care of multiple processes executing a cross-jumped procedure. The mechanism for associating determination cells and procedure invocations is described in the following paragraphs. 6.3.3.4 Runtime problem 2: allocation and de-allocation of determination cells The choice of where to allocate determination cells affects the efficiency of the debugger. If the compiler were to allocate any necessary determination cells in the procedure's local frame, the association between a determination cell and a particular procedure invocation would be implicit. However, this strategy would be quite expensive, as runtime space would be consumed even for procedures in which no breakpoints had been set. Furthermore, recursive procedures or multiple processes would consume multiple copies of this unused space. It is likely that this extra data space would more than offset the space saved by the cross-jumping optimization. Instead, when a breakpoint is placed in a merged region, the debugger allocates determination cells in a debugger-managed history pool. The timestamp value in each determination cell is associated with a single procedure invocation by having a pointer to the activation frame for that procedure invocation. Similarly, the choice of when to allocate and de-allocate determination cells also affects the efficiency of the debugger. The Navigator debugger allocates a determination cell at the determining breakpoint if there is not already a cell for the current procedure invocation (i.e., with a pointer to the current activation frame). All determination cells for a determining breakpoint are de-allocated when the determining breakpoint is removed. In addition, the Navigator debugger garbage-collects determination cells corresponding to de-activated frames at a determining breakpoint if there is not sufficient space in its history pool. Unfortunately, the checks for allocation and de-allocation add complexity to the determining breakpoint processing. A fine point: the Cedar runtime environment allocates activation frames on a heap, from a pool of frames of the proper size. Therefore, a procedure could: be called (allocating a frame), set some determination cells associated with that frame, return, be called again, and be assigned the same activation frame. This means that old determination cells with valid but obsolete timestamps would be reactivated. The timestamps for the new invocation would always be later than those for the old invocation, so this scenario could cause trouble only if the region were entered via a permanently-true determiner (see Section 6.4.3, above) in the new invocation. In that case, the non-catch phrase source alternative would be the extra one, rather than vice versa (a small problem for advanced user documentation). It would be better all around if each procedure invocation had a truly unique identifier that could be used instead of the activation frame pointer. If procedures are long or contain loops (so that procedure entry and exit happen much less frequently than determining breakpoints), it would be profitable to allocate determination cells corresponding to activated determiners at procedure entry and de-allocate them at procedure exit. This could be done as part of the procedure entry and exit instructions (the checking for which would slow down all procedure calls), or by means of extra invisible breakpoints at procedure entry and exit, activated by breakpoint requests as before. For safety, such a scheme must allocate and de-allocate properly in the presence of exception-raising (exits from the middle of a procedure) and exception-handling (externally-accessible labels, or entries to the middle of a procedure). All of this has one major ramification for timestamp-checking: an activated determining breakpoint may not have a corresponding determination cell when the primary breakpoint is reached. The timestamp value for that determining breakpoint is considered to be time zero. 6.3.4 Complete cross-jumping and path determination algorithm (repeated merging) If cross-jumped regions were never considered for further application of the cross-jumping algorithm, the path determination procedure described in the previous section would be adequate. However, not only can a single pass through the generated object code create multiply-merged regions, but the compiler applies the cross-jumping algorithm and other object code optimizations to the code stream repeatedly, until no changes occur. To keep track of the effects of these iterated optimizations, some additional static and dynamic information is needed. 6.3.4.1 Summary of the complete cross-jumping and path determination algorithm The problems arise when a candidate sequence for a new merge (either the to-delete sequence or the to-remain sequence) contains either a portion of a previously merged region or a sentry of a previously merged region. Given a path determiner d, we refer to the determiner of its opposing sequence as d. If a candidate sequence contains a portion of a previously merged region, the candidate sequence contains at least one instruction cell that already has more than one source descriptor. Each source descriptor has a determiner use for a previously merged sequence's path determiner in its pdToUse field. The complete algorithm records the instruction cell's use of the new path determiner (for the candidate sequence) by allowing a single source descriptor to have multiple pdToUse fields. That is, each source descriptor from the candidate sequence uses the candidate path determiner in addition to its previous path determiners. As before, the compiler merges the candidate sequence sourcelist with the sourcelist from the corresponding instruction cell in the opposing candidate sequence. Figure 6-10 shows an instance of repeated merging. In the flowgraph form of the source fragment, the instructions are shortened to their first identifier. The numbers at the lower right of an instruction letter refer to its source information: path determiner uses appear here as subscripts. An instruction's path determiner assignments appear at its upper left. For example, after the second merge, source descriptor 15 uses path determiners 1 and 4. The debugger interprets this information as follows: path determiners 1 and 4 must both have later timestamps than their opposing determiners (that is, path determiners 2 and 3) in order for statement 15 to be executing. All possible execution path traces for the final merged form, along with the path determiner assignments and source alternatives that they imply, are shown at the bottom of the figure. Now consider the case in which a candidate sequence contains a sentry of a previously merged region; that is, the candidate sequence contains an instruction cell with a non-empty pdToAssign list. It is tempting to suppose that the compiler could remove these path determiner assignments from their current location and place a copy of them at each sentry of the candidate sequence. By this action, the compiler would represent the fact that the previously merged code sequences actually merge earlier than they did before. Unfortunately, there are programs for which this strategy fails to provide full path determination. These programs are characterized by multiple control-flow paths that are wholly contained in a merged region. Figures 6-11a and 6-11b show the correct repeated cross-jumping of such a program. To see how the proposed handling of embedded sentries would fail, suppose that the fourth cross-jumping step (in Figure 6-11b) were fig67.press leftMargin: 1.875 in, topMargin: 1 in, width: 5.375 in, height: 8.5 in Figure 6-10. Simple repeated cross-jumping. fig68a.press leftMargin: 1.9375 in, topMargin: 1 in, width: 5.625 in, height: 8.5 in Figure 6-11a. Repeated cross-jumping with embedded control-flow, part 1. fig68b.press leftMargin: 1.625 in, topMargin: 1 in, width: 5.875 in, height: 8.5 in Figure 6-11b. Repeated cross-jumping with embedded control-flow, part 2. to move determiner assignments 1, 2, and 5 up to the location of determiner assignment 7, and similarly move determiner assignments 3, 4, and 6 up to the location of determiner assignment 8. It would then be impossible for the debugger to distinguish between the execution of statements 25 and 27, or between statements 34 and 36. Instead, the instruction's sentry information is made to depend upon the candidate path determiner, in much the same way as a source descriptor is made to depend upon the path determiner for the sequence that contains it. To record the fact that a path determiner assignment depends on the value of the new candidate determiner, each of the instruction's sentry descriptors is given an indirect determiner use for the candidate path determiner. As a result, the debugger will collect timestamp values for the two opposing determiners at runtime, and will check these values to determine on behalf of which of the two sentry descriptors the instruction is executing. Finally, if the candidate sequence is the to-delete sequence of this merge, the algorithm adds the updated sentry information to the current cell in the to-remain sequence. To see that this algorithm works, suppose that an object location x has a possible path determiner assignment d3 whose pdToUse list contains (only) d1. At runtime, the execution of x corresponds to a sentry d3 of an inner merged region if and only if the outer merged region was entered via path determiner d1 more recently than it was entered via path determiner d1. Dealing with the added complexity of repeated cross-jumping also requires some runtime modifications. The following path determiners are activated as a result of a breakpoint request: first, all path determiners that are used by any source alternative for the breakpoint's object location, and second, (recursively) for any path determiner d that is indirectly used by a determiner activated in step 1, both d and d. Timestamp checking at a primary breakpoint must also be altered slightly: a list of determiner uses of the form (d1, d2) on either a single source descriptor or a single sentry descriptor is satisfied if and only if latest timestamp (d1) > latest timestamp (d1) and latest timestamp (d2) > latest timestamp (d2). Ascertaining the latest timestamp value for a single path determiner d at a primary breakpoint also reflects determiner nesting: a timestamp value in a determination cell is valid only if the fig69.press leftMargin: 1.8125 in, topMargin: 1 in, width: 5.25 in, height: 8.5 in Figure 6-12. Repeated cross-jumping of Figure 6-10 with a different cross-jumping ordering. latest timestamp values of the determiners that d uses indirectly are not earlier than the latest timestamp values of their opposing determiners. These timestamp-checking algorithms thus need to be able to find the opposing determiner for any determiner. A simple scheme, which works as long as cross-jumping introduces exactly two determiners for each merge, is to use an even/odd distinction. Notice that the order of consideration of joining paths for cross-jumping can affect the final placement and number of path determiners. Figure 6-12 shows the same program as Figure 6-10, but a different cross-jumping ordering has caused more determiners and more complex runtime behavior. (Figure 6-12 has indirect determiner uses, while the earlier figure did not need them.) We now show the full cross-jumping and path determination algorithm. Changes from the original Cedar algorithm are italicized; changes from Algorithm NCJ1 are italicized and underlined. 6.3.4.2 Algorithm NCJ2: Complete cross-jumping and path determination algorithm Data structures: An instruction cell's source information is expanded from a single source descriptor (with fields stmtNum and isStart) to a list of source descriptors (via a sourceList field). Each source descriptor is now permitted to have a list of path determiner uses rather than just one path determiner use. Each instruction cell also gets a new list of sentry descriptors (pdToAssignList). Each sentry descriptor has one path determiner assignment (pdToAssign) and a list of indirect path determiner use (pdToUseList). Define a cell match to occur between two instruction cells whenever both cells 1) have the same opcode and operand fields, 2) are not unconditional jumps, and 3) are not backward conditional jumps. Two forward conditional jumps are equal if they have the same condition, the same operand(s), and the same destination label. A label cell does not match any other cell. 1. Step forward linearly through the code stream of the procedure. 2. Whenever a label cell is encountered: 2A. Find the bottom cell in each unconditional path to the label. 2B. For every pair of paths to the label, neither of which was previously a to-delete path: 2B1. If the two bottom cells match, BEGIN Cross-jump 2B1a. Designate one path as the to-remain path (to avoid introducing extra jumps, the fall-through path is always designated as the to-remain path), and designate the other path as the to-delete path. 2B1b. Assign a new path determiner identifier to each of the two paths. 2B1c. Delete the joining jump (i.e., the link cell on the to-delete path). 2B1c1. Keep the joining jump's source and sentry information: 2B1c1a. If the to-remain path is a fall-through path: 2B1c1a1. If the isStart field is TRUE for any source alternative of the joining jump, insert a null instruction immediately preceding the link cell on the to-remain path. Copy source information for the new null instruction from the bottom cell in the to-remain path. Do steps 2B1e1 and 2B1e2 with the to-delete current cell = the joining jump and the to-remain current cell = the new null instruction. 2B1c1a2. Otherwise, throw the joining jump's source information away. Unless it would create ambiguity, move the joining jump's path determiner assignments backward to its immediate static predecessor (see Section 6.4.4.3, below). 2B1c1b. If the to-remain path is a jump-in path, do steps 2B1e1 and 2B1e2 with the to-delete current cell = the joining jump and the to-remain current cell = the to-remain link cell. 2B1d. On each path, set the current cell pointer to the bottom cell. 2B1e. As long as the current cells on the two paths match (note: all cells in each merging sequence are in the same extended basic block), BEGIN Merge one instruction 2B1e1. [Keep source information.] On each path, add a determiner use for that path's determiner to each source descriptor for the current cell. Add the to-delete cell's source information to the to-remain cell's source information. There may be multiple source descriptors in each list. 2B1e2. [Keep sentry information.] On each path, add an indirect determiner use for that path's determiner to each sentry descriptor for the current cell. (Either or both cells may lack sentry information.) Add the to-delete cell's sentry information to the to-remain cell's sentry information. 2B1e3. Delete the current cell on the to-delete path from the code stream. 2B1e4. On each path, move the current cell pointer backward in the code stream to its immediate static predecessor cell. END Merge one instruction 2B1f. Mark the sentry (or sentries) for the to-remain path as follows: 2B1f1. Find all immediate dynamic predecessors of the top merged cell. (Note: the top merged cell is on the to-remain path; it has not yet been connected to the to-delete path.) 2B1f2. Add a determiner assignment for the to-remain path determiner to the pdToAssignList of each such cell. Each new sentry descriptor has no indirect determiner uses. 2B1g. Make the top merged cell jump-addressable by inserting a label cell before it if there isn't already one there. Call the label cell the crossing label. 2B1h. Insert the crossing jump (an unconditional jump cell to the crossing label) after the current cell on the to-delete path. 2B1h1. Create source information for the crossing jump. If its immediate static predecessor is not an unconditional jump, copy its source information. Otherwise, copy the source information from the to-delete path of the top merged cell. Do not mark it as the start of a source statement. 2B1i. Mark the sentry for the to-delete path as follows: 2B1i1. Add a determiner assignment for the to-delete path determiner to the crossing jump's pdToAssignList. The new sentry descriptor has no indirect determiner uses. END Cross-jump 6.3.4.3 Step 2B1c1a2: Keeping the joining jump's sentry information As in the case of keeping the joining jump's source information [step 2B1c1a1; see Section 6.4.2.4], keeping the joining jump's sentry information is only a problem when the joining jump has no corresponding instruction on the to-remain paththat is, when the to-remain path is a fall-through path. These two problems are similar, but not identical. The joining jump's source information must be kept only when the joining jump represents the start of execution of a source statement, so as to permit a statement breakpoint there. Furthermore, since a statement breakpoint is visible to the user, the new instruction holding the source information must execute in the proper order with respect to its surrounding instructions: that is, after the bottom instruction of the merged sequence and before the instruction following the joining label. In contrast, the joining jump's sentry information must always be kept, so as to properly distinguish among merged instructions that execute after it. Since a determining breakpoint is invisible to the user, there is a bit more flexibility in its placement, but we would like it to fall on the new sentry of the previously-merged region. [Recall that the sentry instruction for a merged region is always an immediate dynamic predecessor of the region.] If the joining jump is a sentry for a previously-merged region, the joining jump is the immediate dynamic predecessor of a previously-merged instruction. Since the joining jump is an unconditional jump, that merged instruction can only be the instruction following the joining label. So the new sentry instruction that will take the joining jump's place is the new immediate dynamic predecessor of that merged instruction after the current merge. If step 2B1c1a1 inserted a null instruction to hold the joining jump's source information, the new immediate dynamic predecessor is that null instruction when it executes on behalf of the to-delete sequencethat is, qualified by the to-delete path determiner. If no null instruction was inserted (and the source information was discarded), the new immediate dynamic predecessor is the bottom instruction of the new merge, qualified by the to-delete path determiner. Moving the joining jump's path determiner assignments backward in this way could include more execution flow paths to the previously-merged region than the joining jump did originally. Because the bottom instruction of the to-delete sequence is a matching instruction and is therefore neither a label nor an unconditional jump, that bottom instruction is guaranteed to be the single immediate dynamic predecessor of the joining jump. However, under unusual circumstances, the bottom instruction might be a conditional jump into the same previously-merged region, in which case the bottom instruction would be marked with the opposing determiner of one that marks the fig60a.press leftMargin: 1.625 in, topMargin: .9375 in, width: 5.6875 in, height: 8.5 in Figure 6-13. Keeping the joining jump's sentry information. joining jump. Figure 6-13 illustrates this situation. In this unusual case, moving the joining jump's path determiners back would create ambiguity: when control comes through the bottom instruction, if the to-delete determiner of the new merge is true, then both the to-delete and the to-remain determiners of the previously-merged region are true. To avoid this ambiguity, solutions similar to the proposed solutions for the joining jump's source information can be used. To retain expected behavior: 1. Insert a null instruction on the to-remain path in this case also. This would make the program slightly slower, but hopefully smaller. 2. Abort the cross-jump of the new region. The resulting program would be larger than the fully optimized version, but the same speed as the unoptimized version. To fall back to truthful behavior: 3. Continue with cross-jumping, moving the joining jump's path determiners back to mark the bottom instruction in the to-delete path. The Navigator compiler uses solution 1 because it matches the chosen solution for the joining jump's source information. Would these unusual circumstances ever arise? It seems very unlikely. The limited form of GOTO statements in Cedar ensures that the source version of a Cedar program can never pose this problem. But it is quite difficult to predict the effects of repeated applications of cross-jumping and other object code optimizations. Because I cannot prove that situations like this do not arise from repeated optimization, the Navigator compiler must be prepared to handle them correctly. 6.3.5 Efficiency: separate sourcelink cells Since the compiler usually generates several object code instructions for each source statement, the abstract code stream is actually organized a bit differently than I have previously indicated. Rather than embedding source information in each instruction cell, the code generator places this information in separate sourcelink cells; a sourcelink cell becomes the first cell in each statement. Therefore every code stream cell between two sourcelink cells has the relationship to the source text that is described by the statically preceding sourcelink cell. No object code is emitted for a sourcelink cell. This organization saves space but adds processing complexity when compared with marking each instruction. The structure of a sourcelink cell mirrors the structure of the source information shown in Figure 6-7. At the start of the object code optimization phase, all sourcelink cells have a single source descriptor in their sourcelists, with isStart true. Because the source table is concerned only with the object location for the beginning of a source statement (for setting breakpoints), retaining information for later construction of the source table is straightforward. When the compiler encounters a sourcelink cell in the to-delete code sequence during the backward scan, it need only move the sourcelink cell to the current position in the to-remain sequence (sourcelists of resulting adjacent sourcelink cells are merged). However, retaining information for the object table (for reporting the current execution location) is more complicated, for two reasons. First, source statements need not change at the same point on two merging paths. For example, if f1 and f2 are integer fields in a record rec, and a and b are integers, the two statements a _ rec.f1; b _ rec.f2 generate the same code as the single statement [f1:a, f2:b] _ rec. Second, cross-jumping may merge only a portion of a statement. A typical example of this case is merging a path that ends with c _ 1 with a path that ends with c _ 2. If bounds-checking is enabled, only the check and the store to c are merged. If the check fails, either statement could be executing. The compiler retains sufficient information in the sourcelink cells to create both the source table and the object table. The following cases specify the bookkeeping necessary when the backward scan encounters a sourcelink cell in either code sequence. 1. If a sourcelink cell is encountered in the to-remain code sequence, the to-delete code sequence is searched for the nearest sourcelink cell preceding the current point. The sourcelist from the to-delete code sequence is appended to the sourcelist from the to-remain code sequence. If the search crossed an instruction cell, the isStart field of each element of the to-delete code sequence sourcelist is set to false as it is appended. 2. If a sourcelink cell is encountered in the to-delete code sequence only, the to-remain code sequence is searched for the nearest preceding sourcelink cell. That cell is copied to the current point in the to-remain code sequence, setting isStart to false for each element of the sourcelist. The sourcelist from the to-delete code sequence is then appended to that sourcelist. To reflect the full relationship between the source text and the newly merged object code, there must be sourcelink cells that correctly describe the first piece of object code in the new merged sequence (sourcelink cell expansion), the first piece of object code following the to-remain sequence (sourcelink cell contraction), and the first piece of object code following the to-delete sequence (sourcelink cell completion). If these locations do not already have sourcelink cells, appropriate sourcelink cells must be created. Similarly, since only a few instructions in a code stream become path determining instructions, the sentry information is represented as separate sentry cells, thereby reducing the size of most code stream cells by one field. In general, separate sentry cells are not much harder to handle than embedded sentry information. However, later object code optimizations must take care to treat a separate sentry cell and its associated instruction as a single unit. To make things easier to understand, we now return to the assumption that source and sentry information are embedded in each code stream cell. 6.4 Interactions between optimizations This section describes how Navigator ensures that the debugging solution for one optimization is not destroyed by a later optimization. Inline procedure expansion occurs first. Then there is an object code optimization loop in which a few object code optimizations, followed by cross-jumping, are applied repeatedly. Finally, a few more object code optimizations are applied. 6.4.1 Cross-jumping after inline procedure expansion Because inline procedure expansion occurs earlier in the compilation process than the object code optimizations, an expanded inline procedure can be cross-jumped with statements outside that inline procedure (but inside the calling procedure). Because the algorithms were designed with these two optimizations in mind, no additional mechanisms beyond those described in Algorithms NI1 and NCJ2 are needed to handle this situation. The debugger can easily set and field breakpoints and report the current execution location. However, the possibility that a program region can be copied and then merged constrained the compile-time solution for providing a procedure traceback: each source descriptor must have a separate inline table pointer, since cross-jumping may move that statement away from the other statements that came from the same expansion. Figure 6-14 shows an example of inline procedure expansion followed by cross-jumping. fig610.press leftMargin: 2.125 in, topMargin: 0.90625 in, width: 5.3125 in, height: 8.5 in Figure 6-14. Cross-jumping after inline procedure expansion. 6.4.2 Other object code optimizations after cross-jumping A more serious problem is that other control-flow optimizations, such as the Cedar compiler's peephole optimizations, can occur after cross-jumping. One of these later optimizations could alter the code stream so that path determiner assignments no longer mark each of the ways to enter the merged region. This section explains how Navigator ensures correct determiner placement in the face of later optimization: lacking a general method for updating path determiner assignments, we must ascertain how each optimization affects the assignment of path determiners, then either reassign them or inhibit the optimizations that make reassignment impossible. 6.4.2.1 Why path determiners must be updated Recall that a path determiner assignment is intended to identify a given execution path at runtime, so that when execution reaches a merged region, the path that was traversed to get there can be used to distinguish among the source alternatives. A path determiner assignment appears on a particular instruction only as a placeholder on that execution path; path determiner assignments always mark executable instructions because a path determining instruction is replaced by a trap instruction for the determining breakpoint. The cross-jumping algorithm locates all of the sentries, or path determining instructions, of two sequences that are being merged. When such an instruction is moved or deleted, or when instructions are inserted immediately following a path determining instruction, other instructions become sentries of the merged region. Navigator identifies these new sentries by considering the local execution flow properties of the old path determining instruction, rather than by examining new dynamically following merged region(s) for unmarked sentries. 6.4.2.2 Navigator peephole optimizations The peephole optimization phase improves a single procedure's object code in two stages. The first stage repeats until the opportunities for its optimizations are exhausted. First stage optimizations, in the order of their application, are: 1. Delete jumps to the next instruction, unreferenced labels, and multiple consecutive labels. 2. Redirect unconditional jumps whose destinations are also unconditional jumps to remove intermediate hops; this process is known as branch chaining. 3. Delete unreachable code. All instructions that lie between an unconditional jump and the statically following label are unreachable. 4. Replace a conditional jump around an unconditional jump by an opposite-sense conditional jump; this process is called condition reversal. 5. Perform cross-jumping. In the second stage, a collection of optimizations that affect instructions within a single basic block is applied once [although the finished code might profit from repeating these steps as well]. In these optimizations, statically adjacent instructions are examined for opportunities to delete instructions or to combine them into a more powerful single instruction [Lamb80]. 6.4.2.3 Primitive dynamic transformations The dynamic position of an instruction is determined by the order in which the instruction executes, not by its physical placement in the linear code stream. Another way of stating the present problem is that any dynamic transformation that alters the dynamic position of a path determination instruction can affect the correct placement of the path determiner assignment. For example, it makes no difference to the placement of path determiner assignments whether an instruction is deleted from an execution path by physically removing the instruction from the code stream, or by rerouting instructions (that is, altering jump destinations) so that the execution path no longer passes through that instruction. The primitive dynamic transformations are deletion, insertion, and replacementin other words, generalizations of the corresponding primitive static transformations. Dynamic deletion is the removal of an instruction from a runtime execution path. Several static transformations cause dynamic deletion: deleting an instruction from the code stream, bypassing an instruction, or moving an instruction from one location in the code stream to another. An instruction x is bypassed when an execution path that passes through x is rerouted so that it no longer passes through x. Dynamic insertion and replacement are similarly defined. In order to discuss the dynamic positioning of instructions in the linear code stream, the following terminology is useful: A simple instruction is an instruction that has exactly one dynamic predecessor and exactly one dynamic successor. A collector instruction is an instruction that has more than one dynamic predecessor, such as the instruction following a label that is the destination of multiple jumps. A selector instruction is an instruction that has more than one dynamic successor, such as a conditional jump or an indexed jump. A collector-selector instruction has both attributes. 6.4.2.4 Effects of dynamic transformations on path determiner placement Dynamic deletion Suppose that there is an execution path containing instructions x;y;z, where z is an entrance of a merged region R. Then y is a sentry of R that assigns path determiner d1. If a dynamic deletion removes y from that execution path (giving execution path x;z), the new sentry instruction of z along this path is x. fig611.press leftMargin: 2.125 in, topMargin: 2 in, width: 5 in, height: 2 in Figure 6-15. Bypassing: path determiner movement for dynamic deletion. If y is physically deleted, it is removed from all of its execution paths; in this case there may be multiple new sentries, corresponding to all of the dynamic predecessors of y. To update the sentry relationships when y is deleted from an execution path, each such new sentry instruction gets an additional determiner assignment for d1. fig612.press leftMargin: 2.125 in, topMargin: 1.9375 in, width: 5.0625 in, height: 2 in Figure 6-16. Path determiner movement for physical code deletion. The following discussion analyzes the effects that the dynamic deletion of y has on the correctness of runtime path determination. Case 1. If no new sentry is a selector instruction, the new sentries represent exactly the same execution paths as y does (that is, some new sentry executes if and only if y does). In this case the deletion does not alter the sentry information. It is possible that some new sentry si already assigns a different determiner d2. This situation can arise because another merged region R( directly precedes R, as shown in Figure 6-17, or because three or more paths join to form R, as illustrated in Figure 6-18. Even though si assigns both d1 and d2 after the deletion of y, these two determiners do not conflict and path determination still works properly. Case 2. If some new sentry sk of y is a selector instruction, the new sentries represent different execution paths than y does, because execution can pass from sk to some instruction other than y. Normally this situation does not create difficulties, but if two further conditions hold, deleting y from that execution path can introduce ambiguity: (1) sk is already a sentry of R that assigns a different determiner d3, and (2) some instruction in R has one source descriptor that uses d1 (and possibly some other group D of path determiners), and a different source descriptor that uses d3 (and D). That is, d1 and d3 are the only differences between the determiner uses of two source descriptors for some instruction in R. In this case, deleting y will make it impossible to distinguish between those two source alternatives whenever R is entered via sk. If the system designers are willing to sacrifice a small amount of optimization for better information, y should not be deleted when these conditions hold (this is the approach that Navigator takes). If an instruction that is to be deleted has multiple determiner assignments, all of its determiner assignments are added to the sentry information of each new sentry. When several dynamically consecutive instructions are deleted in one step, there are two cases. If the deletion is only dynamicthat is, it is a bypassing transformationthen by definition there can be only one new sentry along that execution path: it is the dynamic predecessor of the first bypassed instruction. For physical deletion, a sequence of instructions can be treated as a unit as long as it has a single entrance. Otherwise, it must first be partitioned into sections each of which has a unique entrance. fig614a.press leftMargin: 2.5 in, topMargin: .5 in, width: 3.5625 in, height: 8.5 in Figure 6-17. Merged regions can abut. fig614b.press leftMargin: 2.5 in, topMargin: .5625 in, width: 3.5625 in, height: 8.5 in Figure 6-18. Merged regions can overlap. Dynamic insertion To ensure that no execution path can sneak into a merged region wthout passing a determiner assignment, a dynamically inserted instruction x must copy or move all of its predecessors' determiner assignments down to it. However, when the insertion is between the top instruction of a merged region and its preceding label, the inserted instruction is logically part of the merged region and receives no determiner assignments for that region. Dynamic replacement If a path determining instruction x is replaced in an execution path by a new instruction y, x's determiner assignments can be moved to y as long as y has the same dynamic successors in the same semantic situations as x. An example of a replacement instruction that has the same dynamic successors in different (in fact, opposite) semantic situations is JumpIfEqual L replaced by JumpIfNotEqual L. If a replacement causes a change in the successor relationships, it can be decomposed into its constituent deletion and insertion. 6.4.2.5 Analysis and examples of Navigator peephole optimizations First stage optimizations The optimization that removes jumps to the following instruction acts as either a dynamic deletion or a dynamic replacement. An unconditional jump to the following instruction is physically deleted. A conditional jump to the following instruction is replaced by a POP, which removes the conditional jump's operand from the stack (recall that the Cedar virtual machine is a stack machine). This optimization performs the final step in creating an ambiguous region when both parts of an IF-THEN-ELSE statement are identical. As suggested earlier, this step could be inhibited in order to provide expected behavior rather than truthful behavior. Figure 6-19 shows the steps in the creation of the ambiguity. Branch chaining removes intermediate hops in chains of unconditional jumps by redirecting the first jump so that its new destination is the first instruction in the chain of destinations that is not an unconditional jump. The branch chaining optimization acts as a dynamic deletion. Unreachable code deletion also acts as a dynamic deletion, but since the code to be deleted has no dynamic predecessors (by definition of unreachable code), the instructions and their determiner assignments can be deleted with no further processing. source fragment after cross-jumping 10 Const1: INT = 7; 11 Const1: INT = 7; 9 9 9 60 b _ 5; b _ 5; 61 IF cond THEN IF cond THEN pd1 62 a _ Const1 pd2 63 ELSE ELSE 62 a _ Const2; a _ 7; 621 642 after deleting jump to next instruction after deleting unused test b _ 5; b _ 5; pd1, pd2 IF cond THEN pd1, pd2 a _ 7; 621 642 a _ 7; 621 642 Figure 6-19. Steps in the creation of essential ambiguity. The optimization, called condition reversal, that replaces a conditional jump around an unconditional jump by an opposite-sense conditional jump, acts as a dynamic deletion of the unconditional jump. Repeated cross-jumping is described in Section 6.4.4. Second stage optimizations: intra-basic block pattern-matching The second stage optimizations are pattern-matching optimizations in which fixed sequences of two to five instructions, all in the same basic block, are deleted or replaced by faster sequences of instructions. In some cases the new sequence of instructions is longer than the original sequence. These optimizations act as dynamic deletions, dynamic insertions, or dynamic replacements. For simplicity, the current Navigator implementation inhibits these optimizations whenever any instruction in a pattern has determiner assignments. Because no control flow is altered, and no sequence is completely deleted, these changes cannot cause problems. Therefore, it would be reasonably straightforward to funnel all of the changes through a single routine that would push all determiner assignments in a sequence to be replaced to the bottom of the new version of the sequence. Extensions: other peephole optimizations A more extensive array of peephole optimizations than those implemented in the Cedar compiler is possible. For example, the Bliss-11 compiler includes peephole optimizations that can modify instructions that are arbitrarily far apart in the code stream (both statically and dynamically) to combine literal constants as operands, to exploit the PDP-11's auto-increment and auto-decrement instructions, to use faster addressing modes, and to remove redundant stores. For correctly updating the path determiner assignments, these optimizations could be decomposed similarly into dynamic insertions, deletions, and replacements. 6.4.2.6 Improvements to the Navigator path determiner updating scheme We really want to mark an execution point corresponding to a sentry of a merged region, but we need to put the determiner assignment somewhere once the cross-jumping and path determination algorithms identify the spot. At runtime, a determiner assignment must mark an executable instruction so that it can be replaced by a determining breakpoint. The complicated transformations on determiner assignments that are needed to track other peephole optimizations can be viewed as the result of binding sentries to instructions too early, since later optimizations can alter sentry relationships. It might be possible to create region sentry marker cells in the code stream during cross-jumping, then propagate determiner assignments back to all predecessors of each sentry node laterfor example, during code emission. Determiner assignments would mark sentry nodes until code emission. Then what happened outside the region during later optimizations would not affect path determination. 6.5 Extensions to Navigator for secondary debugging functions Navigator provides effective debugging for inline procedure expansion and cross-jumping for the primary location- and history-oriented debugging functions: setting breakpoints at program statements, determining the current execution location, and reporting the procedure traceback. This section describes ways to extend Navigator to provide effective debugging for the secondary debugging functions also. 6.5.1 Inline procedure expansion The secondary debugging functions that are affected by inline procedure expansion are: displaying the values of parameters and variables when inside expanded inline procedures, setting breakpoints at the entry or exit of inline procedures, calling inline procedures from the debugger's interpreter, and single-stepping with coarse granularity. 6.5.1.1 Displaying the values of local variables or parameters of inline procedures As alluded to in Section 6.2.3, step 2B2a of the inline procedure expansion algorithm, which describes the substitution of actual parameters for formal parameters, is an oversimplification. If the evaluation of an actual parameter involves runtime computation, the substitution is performed similarly to an ordinary procedure call: nodes to perform the computation and assign the result to the formal parameter are generated. However, if an actual parameter is simple, such as a compile-time constant expression or a simple variable, it is substituted directly for the formal parameter throughout the expanded procedure (parameter subsumption). Of course, this substitution is subject to a consistency check: the call by name semantics produced by each instance of subsumption must be the same as the ordinary call by value semantics. One alternative would be to disallow parameter subsumption. This would result in a less efficient program. To handle parameter subsumption, Navigator would have to record the value of the compile-time constant or the name of the actual parameter in the symbol table entry for the formal parameter, so that requests to display the formal parameter get the expected response. A problem would arise for debuggers that allow modification of variable values: the user could not change the value of such a parameter at runtime. These considerations are outside the scope of Navigator, which concentrates on location-oriented debugging functions. Local variables present a minor way in which Navigator does not provide expected behavior. Upon request, the Cedar debugger will list all of the local variables and parameters of the current procedure. With respect to the symbol table, an inline procedure becomes an inner block of its enclosing ordinary procedure. Therefore, when the user requests the local variables inside an inline procedure, the debugger reports the variables for the inline procedure, the inline procedure's parameters (which appear as ordinary variables), as well as the variables for the enclosing ordinary procedure (and any enclosing inline procedures). The extra variables could be removed by recording inline procedure information in the symbol table as well as the statement maps (possibly through the same inline table). A request to list the current local variables would only list the contents of those blocks belonging to the inline procedure. 6.5.1.2 Breakpoints at procedure entry and exit The Cedar debugger also allows a user to place breakpoints at the entry or exit of a procedure, by specifying the procedure's name (as opposed to statement breakpoints, which use a statement number and module name). These entry/exit breakpoints are normally implemented by looking in the symbol table entry for the procedure name, where the first and last object code locations for the procedure's code are recorded. The Cedar compiler ensures that all source-level RETURN statements jump to a single procedure exit at the end of the procedure's object code. The code extent information in the symbol table is part of the debugger's mapping from object code location to procedure name for ordinary procedures. If the Navigator debugger is to provide entry/exit breakpoints, there are two possible solutions. The first solution would be to record a list of pairs of first and last object code locations in the symbol table entry for an inline procedure. For an imported inline procedure, each implementation module would keep its own list. Break entry and exit would set breakpoints at all of the indicated object locations. The second and preferable solution would be to record the first and last statement numbers of each inline procedure in its symbol table entry. Imported inline procedures would need no special processing. In this case, break entry and exit use the source table to translate the indicated statement numbers to the corresponding object locations. Either of these two solutions has a potential problem: it is important that later optimizations do not remove source information corresponding to the entry or exit of an inline procedure. It is typically easier to ensure proper behavior of later optimizations if indicators are present in whatever code form will be transformed, so that undesirable optimizations can be inhibited or symbol table information can be updated. One way to handle this problem is to flag an expanded statement number as the first (or last) statement in an inline procedure. Among the Cedar optimizations that could endanger the entry or exit information are unreachable code deletion (entry or exit) and branch chaining (exit). Of course, code reordering optimizations would create a difference between semantic and syntactic mappings for the entry or exit of an inline procedure. [Recall that the Cedar compiler does not perform code reordering optimizations.] 6.5.1.3 Calling an inline procedure from the debugger's interpreter The Cedar compiler does not emit stand-alone object code for an inline procedure. As a result, the Cedar debugger does not permit users to call an inline procedure at runtime. This capability can be provided in several ways: 9 The Navigator compiler could emit stand-alone object code for inline procedures. 9 The Navigator debugger could interpret an inline procedure's abstract syntax tree. 9 The Navigator debugger could use the compiler's code generator interactively to create an executable representation of an inline procedure when necessary. The Navigator compiler could emit stand-alone object code for inline procedures. In two situations, this solution is particularly inefficient: imported inline procedures and one stylistic use of inline procedures. Since definitions modules are not permitted to contain executable code, each program module that uses an imported inline procedure would need its own copy of the object code. Some Cedar programmers use the inline procedure mechanism as a way of structuring their programs without paying an execution penalty: they separate an ordinary procedure into logical parts (parts that are nevertheless specific to this procedure) and declare each part to be an inline procedure. The result is that each of the inline procedures is called exactly once. If the compiler were to create a stand-alone copy of the code for each inline procedure, the code size of a module constructed in this way could double. A better solution is to leave the abstract syntax tree for an inline procedure in the symbol table. (For inline procedures declared in definitions modules, this is the status quo: it is the communication method from definitions module to program module. For inline procedures declared in program modules, the Cedar compiler deletes the tree before writing the object file.) Either the debugger could interpret the abstract syntax tree when a user calls an inline procedure from the debugging interpreter, or it could use the code generator interactively to create an executable representation of an inline procedure when necessary. 6.5.1.4 Single-stepping with coarse granularity The major problem with single-stepping is keeping the granularity correct from the source viewpoint. This means that whenever a user requests single-stepping with coarse granularity inside a procedure P, the debugger must not report the execution of any statement in an inline procedure called from P or its callees. Single-stepping with fine granularity is not affected. 6.5.2 Cross-jumping The secondary debugging functions that are affected by cross-jumping are single-stepping inside merged regions and displaying the values of variables inside merged regions. 6.5.2.1 Single-stepping Actually, single-stepping inside merged regions works reasonably correctly with the algorithms described. Because determining breakpoints collect control-flow information in a way that can be used by any debugging activity, as long as a region's determiners are activated early enough, single-stepping can use the information also. This property need not have been true. Another way to implement path determination would have been to have a determining breakpoint collect control-flow information for a specific future point. Then it would have been harder for some other place, even though inside the same merged region, to use that information. The "early enough" issue also works properly. Since we require that a path determining instruction is a sentry for the merged region, and a statement-level single-step always executes at least one instruction, it is "early enough" as long as execution has not already entered the merged region. Obviously, if single-stepping is requested after the merged region has already been entered and the determiners were not activatedthat is, the user has neither set breakpoints inside the region nor issued a SUSPECT command for the enclosing procedurethen the debugger can report only a list of source alternatives at the next stopping point. In fact, since statement boundaries need not occur in the same place on two merging paths, in this situation the next stopping point might be inside a statement on one path but at a statement boundary in another. The debugger would have to stop at all such locations. 6.5.2.2 Displaying values of variables If two merged paths each declare local variables, the runtime path determination algorithm's choice of the correct source statement must be used to access the proper variable identifiers and type information for the correct path. As previously noted, the association of variable identifiers with variables is dependent on the correct source statement, while the association of variables with storage locations is dependent on the correct object code location. If the debugger were to use the object location to access the symbol table information for a variable, it might find the wrong variable declaration. 6.6 Summary The description of the implementation details and rationale behind the Navigator system is now complete. The next chapter discusses the correctness and performance of the Navigator path determination algorithms. 2CHAPTER 6: COMPLETE NAVIGATOR ALGORITHMS Ê>–"thesis" style–;(firstPageNumber) 129 .def (firstHeadersAfterPage) 129 .def˜Iblock•Mark centerHeaderš ÐlsÏsžœžœžœž ™)J™chapheadšœ)˜)I thesisbodyšœÌ˜ÌMšœ-Ïiœˆ˜Ç—headšœ˜M˜ºMšœ¨˜¨MšŸœk˜ˆMšŸœÒ˜ëI pagebreakšœ˜Icenter• ArtworkClass IncludePressšœZ˜ZI bigcenteršÏb-˜-MšŸ"œùž œ=˜áMšŸ œ{˜„˜MšœÙ˜ÙMšœ¤Ÿœe˜—Mšœ™˜™Mš œ™Ÿ œ ŸœdŸœŸœUŸœŸœù˜˜Mšœû˜û—˜-Mšœ†˜†MšŸ-œê˜—MšŸ)œöÏfœž˜¶MšŸ-œŒ˜¹——˜MšœÌŸœPŸœ¨˜öMšœjŸœžœÚ˜ÌMšœ…˜…Mšœ¦˜¦P– IncludePressšœV˜VQš I˜IMšœ¡ œv¡ œ8˜á˜Ltalg1šœ˜talgš ŸÐfi Ÿ¢ ŸÑfiz Ðiz£ Ÿ¢Ÿ˜ƒSšŸ¢ Ÿe˜…SšŸ ¢ Ÿ[˜qSš¤…£ ¤T˜åSš¤F£ ¤Y˜©— šŸ¢ Ÿ*¢ Ÿ˜TSšŸ¢ Ÿ¢ ŸN˜š——Ršœ/˜/ šœ˜ šŸ$˜$SšŸ ¢ Ÿ$˜7SšŸ ¢ Ÿ8¢ Ÿ˜nSš¤ £ ¤>˜SSš¤ £ ¤~˜‘— šœ'˜'Sšœ 6œˆ˜Ã šœ 3œ˜:Sšœ°Ÿ¢ Ÿ.¢ Ÿ-ÐiuŸÆ˜îSšŸt¢ Ÿ¢ Ÿ}˜™— šœ 'œ˜.Sšœq˜qSšŸ¢ Ÿ"¥Ÿ˜GSšŸ.¢ Ÿ ¢ Ÿ+˜x——S˜9—S˜—RšÏz˜SšÏuœ5¡ œ.¡ œ1˜©Sš§œ9¡ œ9¡ œ˜¡˜IMšœOž3œµ˜·˜ Mšœ¯˜¯Mš œdÏkœ¾¡ œ0¡ œ6¡ œ5¡ œ0˜£MšœI¡œ¡œ*žœ!¡œ$¡œ¡œ3¡œ¡œ¡œ,¡œ˜ø—˜"Mšœñ˜ñMšœÕ˜ÕM˜øMšœé¡ œ+¡ œX˜€O˜P– IncludePressšœV˜VQš @˜@Mšœ…ŸœŸ œ¨˜ÛMš œœŸœŸœUŸœŸœt˜‚Mšœd¡œq¡œ˜ß——˜@M˜œMšœµž œ²žJœ“˜ÏO˜P– IncludePressšœZ˜ZQš =˜=Itblockšœ’˜’M˜’——šœ˜MšœÏ˜ÏMšœœ˜œMšœ¯ ˜¯ šœ$˜$Mš'œ!Ÿ œŸœŸœŸœ&Ÿœ Ÿ œŸœŸœŸœ#ŸœŸœŸœŸœ(Ÿœ½Ÿ œŸ œ5ŸœŸœÉ˜åMšœŸœè¡œ¡œ*¡œ¡œ ¡œ ¡œ°˜“O˜—˜;MšœÜ˜Ü˜#Mšœù˜ùMšœB˜BIdsindentš œŸœIŸ œŸ œYŸœ!Ÿ œ{˜†UšœŸœ¤˜¬UšœŸœE˜OMš œ†Ÿ œ Ÿ œµŸ œ3ŸœÀŸœ_Ÿœ˜üM˜IItenumšÑmvxœ›˜œVš©œ€˜Vš©œ]˜^Mšœ€˜€MšœÅ˜ÅO˜P– IncludePressšœ\˜\Qš <˜M˜ˆ—˜NM˜¥T˜£Mšœ²Ÿœ§˜öMšœ“Ÿ%œ‹˜ÃM˜†MšœŽ˜Ž——˜PM˜«˜NMšœŸœÇ¡œ9Ïx¡œ˜°Mšœ¡¡œ³¡œ¸˜šMšœ  œµŸœ›˜ßMšœ³¡ œü˜¹O˜P– IncludePressšœV˜VQš ,˜,P– IncludePressšœX˜XQš I˜IP– IncludePressšœW˜WQš Hœ˜ITšœË˜ËMšœƒŸœ¯˜ÉMšœB¡ œ*¡Ðdfœ¡œ¡²œ ¡œ¡²œb¡²œ7±¡²œ˜ñMšœÕ¡œC¡œ±¡œ˜¢šœq¡²œ¡²œ`˜×Uš œ¡²œÏmœ±¡²œ˜.Ušœ˜Uš œ¡²œ³œ±¡²œ˜/—MšœE¡œz˜ÀO˜P– IncludePressšœV˜VQš \˜\Tšœ0¡œ`˜‘Mšœù˜ùM˜ûM˜º—˜O šœ˜Sš ŸU¢Ÿ ÑfijŸ´¢Ÿ'´ Ÿ ¤x˜ªSšŸB¢Ÿ>¢ Ÿ¤,£ ¤˜Ô—RšœŸœß˜òR˜B ˜(SšœA˜A ˜[ ˜$Sšžœ  ˜SšœÈ˜ÈSšŸG˜G šœJ˜J šŸ&¤ Ÿ ˜= šŸ5˜5O˜SšŸ´Ÿ ®Ÿø¤ Ÿn˜”SšŸG¤ ˜ç—SšŸ@¤ Ÿm˜¶——S˜D šœŠ˜ŠSšžœ ˜SšŸ0¤7Ÿƒ¤6˜ Sš¤§˜§SšœJ˜JS˜xSšžœ ˜— šŸF˜FSšŸ³˜³SšŸL¢Ÿ¤;˜ª—SšœŸ œ˜ž ˜SšŸ¤˜¤— šŸ8˜8SšŸ\¢Ÿ¤:˜§—Sšžœ  ˜O˜————šœ Ÿœ/˜CMšœ@ž®žœë˜ÎMšœÔžrœÌŸœÃ˜Ü Mšœœ˜œO˜P– IncludePressšœ\˜\Qš <˜¡²œD¡²œd¡²œ¡²œ¡²œiŸœ¡œh¡²œk¡œ^˜¥Mšœ¦˜¦Mšœ‡˜‡O˜P– IncludePressšœX˜XQš &˜&P– IncludePressšœ[˜[Qš )˜)—˜Mšœ‹¡œ®˜º—˜Mšœ"¡œ7¡œ¡œ*¡œ ¡œD¡œˆ¡ œ ¡œ…˜’——˜A˜MšœŠ¡œy˜†Mšœ`ž œÑ˜½Mšœ›˜›M˜ùKš¦˜Zšœ¦œ¦˜.Yšœ:¡ ÑdfmÐfv¶·¶œ\Ðpsœ3¸œKžÏdž¹˜³Y˜Zšœ¦œ¦˜MYšœ"¸œ¸œž¹ž¹œž¹ž¹œž˜‡Pš ;˜;Kš¦˜MšœŸœœ˜ÇMšœ5˜5—˜>Mšœì˜ìO˜—˜(M˜ò——˜EM˜ß———˜=Mšœ•˜•O˜˜ Mšœ×˜×˜SMšœÅ˜ÅMšœk˜kMšœ–˜–M˜ÉO˜T˜Û—˜/MšœÔžœð˜ÊM˜ûMšœážP˜±—˜Cšœâ˜âWš©œQ˜RWš©œS˜TWš©œ›˜œ—Mšœ×˜×Mšœ®˜®MšœŠ˜ŠMšœú˜úO˜—˜/MšœÊ¡œa¡œH˜õ——N˜˜¬˜MšœúŸœƒžœŽ˜š O˜—˜&M˜â———˜ M˜Ô——…—ª¾Œ