<<CHAPTER 8:  CONCLUSIONS AND FUTURE WORK>>

Chapter 8

Conclusion

This dissertation has examined the conflict between optimization and debugging, culminating in the design and implementation of 
Navigator, a prototype debugging system that handles two simple but nontrivial optimizations.  The Navigator system demonstrates 
that effective interactive source-level debugging can be provided for control-flow optimized programs in real production 
programming environments.  Solution techniques developed in Chapter 4 suggest that this result can be extended to the general class 
of optimized programs in the future.
8.1 Summary of results
Optimization destroys the straightforward correspondence between a program's source text and its object code.  The techniques 
described in general in Chapter 4 and expanded in the Navigator chapters provide for the correct implementation of interactive 
source-level debuggers in the presence of many common optimizations.  The main job of such a debugger is to help users debug 
optimized programs without forcing them either to deduce the effects of the optimizations or to disable optimizations by 
recompiling their programs.  Furthermore, the techniques used to accomplish this job must not nullify the improvements achieved by 
the optimizations.
As discussed in Chapter 1, this dissertation addresses this need in three ways:

programs.
programming environment.
The remainder of this section elaborates on these contributions.
8.1.1 Interactions between optimization and debugging
Chapters 2 and 3 investigate the essential conflict between optimization and debugging.  Chapter 2 focuses on the debugging half: 
what does a user expect the debugger to tell her about an executing program?  Two important levels of debugger behavior for 
optimized programs are defined: expected behavior, in which the debugger hides the effects of the optimizations from the user (by 
doing extra behind-the-scenes processing), and truthful behavior, in which the debugger admits to the user that the executing 
program differs from the source program and that the answer to a debugging query is not completely known.  When the debugger 
provides truthful behavior, partial information may be available and the user may be able to participate in deducing the correct 
answer.  This characterization is a useful tool when designing a debugger for optimized programs.  The debugging support suggested 
by most previous researchers fell short of even truthful behavior.
Chapter 3 focuses on the optimization half: how do various optimizations affect the aspects of a computation that are of interest 
to the user?  First of all, Chapter 3 provides a thorough catalogue of the problematical effects of optimizations on debugging 
capabilities.  This catalogue will be a useful reference for future research in this area.  Another important contribution of this 
chapter is the distinction between syntactic and semantic definitions of the mapping between source and object code locations for 
computation-reordering optimizations.  Neither mapping alone is sufficient to answer all common debugging queries.  [Recall that 
the syntactic mapping reflects a statement's position with respect to other surrounding statements and with respect to the degree 
of progress through the entire computation, while the semantic mapping reflects the position of the actual computations specified 
by the statement.]
8.1.2 General solution techniques
Chapter 4 provides a variety of solution techniques that can be combined to combat the problems identified in Chapter 3.  The 
techniques fall into three categories: basic techniques, techniques that support truthful behavior, and techniques that support 
expected behavior.
The basic techniques support both truthful and expected behavior.  The compiler can collect static information about optimizations 
in improved symbol tables, statement maps, and special-purpose tables.  Static information of this kind is mandatory for providing 
effective debugging for optimized programs.  In addition, the debugger can collect dynamic information in the form of old values of 
variables or control-flow history.  The compiler can use improved variants of optimizations that impact debugging less.  Two 
fallback positions are restricting debugging slightly and recompiling a small portion of the program.
To provide truthful behavior, the compiler can show the user the effects of the optimizations in a new version of the entire source 
program or the debugger can present a localized report of the changes affecting a given program point.  To support debugger 
variable modification, the new source must be quite detailed and the user must be very motivated.  Another approach to truthful 
behavior is to provide the user with specialized debugging commands that can be applied to optimized programs.  For example, 
syntactic and semantic versions of breakpoints are essential to providing the functionality of a conventional debugger in the 
presence of computation-reordering optimizations.
To provide expected behavior, the debugger must process the additional static (and possibly dynamic) information to hide the 
effects of the optimizations.  The necessary algorithms often resemble flow analysis algorithms.
8.1.3 Navigator: demonstration of practicality in a limited setting
Chapter 5 presents an overview of Navigator.  Navigator restricts its view to two control-flow optimizations and a few basic 
location-oriented debugging capabilities.  Its design goal was to provide expected debugger behavior.  For inline procedure 
expansion, Navigator provides expected behavior for the selected location-oriented debugging functions.  For cross-jumping, 
Navigator provides expected behavior for breakpoints and high-quality truthful behaviora list of the possible source 
alternativesfor unplanned debugging activity (that is, program exceptions and procedure tracebacks).  This reduction in behavior 
was necessary to meet the system performance goals:  program execution space and speed are almost totally unaffected when no 
debugging requests are active.  When debugging is requested, Navigator provides its added functionality without noticeable 
degradation in debugger response time for most programs.
Chapter 6 describes the Navigator implementation, including the complications caused by the language, the system environment, and 
multiple optimizations.
Chapter 7 proves the correctness of Navigator's compiler and debugger path determination algorithms.  It analyzes the efficiency of 
these algorithms on multiply-merged regions and discusses improvements to them.
8.2 Lessons learned from Navigator
Constructing an interactive source-level debugger for optimized programs is possibleit is not easy.  The basic ideas behind 
Navigator are quite simple.  The compiler collects static information about "deleted" calls to inline procedures and about path 
determiner locations for regions merged by cross-jumping.  It also maintains improved statement maps describing these changes.  The 
debugger collects dynamic control-flow information at path determiner locations via breakpoints that are invisible to the user.  
However, the implementation is less than straightforward.  Many details must be considered, including features of the language, 
such as nested inline procedures and imported inline procedures, and system environment details, such as recursion and multiple 
processes.
The interactions between optimizations caused two major problems.  First, it was necessary to ensure that the debugging solution 
for one optimization was not destroyed by a later optimization.  Ideally, the effort of designing debugging solutions to 
optimizations should be roughly linear in the number of optimizationsthat is, the solution for each optimization should be 
separable from both the application of the remaining optimizations and the debugging solutions for the remaining optimizations.  
This can be done by designing appropriate ways to propagate debugging information through the compiler, particularly the existence 
of key instructions or locations such as path determiners.  In Navigator, this responsibility was concentrated in the routines for 
inserting and deleting code cells, jump cells, and label cells from the codestream.
Second, the interactions between optimizations made it hard to determine what programs could look like during the application of an 
optimization.  As a result, it was difficult to arrange for the Navigator algorithms to report the correct source alternative in 
every cross-jumped region and to prove that they do so.  In general, it was necessary to assume that the input to the cross-jumping 
and other object code optimizations was an arbitrary flowgraphnice graph properties that the source program satisifies (such as 
the reducibility of the flowgraph) could not necessarily be guaranteed.
Preliminary tests of the completed Navigator system indicate that the modified compiler and debugger work reasonably well.  
However, as described in Chapter 7, the path determination algorithm as implemented can be quite inefficient for programs with many 
nearly identical code sequences that join.  Although rare, such programs do occur in practice: the actions for common table-driven 
parsing algorithms often have this structure.  The improvements suggested in Sections 7.3 and 7.4, particularly adding a path 
determiner minimization pass or grouping determiners into equivalence classes, would handle these unusual cases better.
Attempting to provide expected debugger behavior for an optimization gives a much clearer insight into the precise effects of the 
optimization.  Cross-jumping is a very easy optimization to implement: it requires only the connectivity of the flowgraph and the 
ability to compare instructions.  Nevertheless, it crosses basic block boundaries and can have surprisingly complex effects upon a 
program's control flow.  These effects had not been studied before.
Attempting to provide expected debugger behavior for an optimization also gives a different view of the optimization.  Classical 
optimization research characterizes each optimization by the amount of information that is needed to establish its preconditions 
and correctness and by whether that information must cross basic block boundaries.  The ability to debug programs resulting from an 
optimization is an aspect that should receive more attention in the future.
It is interesting to note that truthful behavior for an optimization like cross-jumpingeven high-quality truthful behavioris much 
easier to provide than expected behavior for that optimization.  Additional experience with users might reveal that truthful 
behavior is sufficient, and that therefore the compiler and debugger need not work so hard.  Nonetheless, the implemented 
combination has a satisfying feel.  When the user requests a breakpoint, Navigator guarantees that the exact source alternative 
will be known (unless control has already entered the merged region).  When a program exception is encountered, the user can try to 
determine the exact source alternative from the list she is given.  If this is impossible, the user must abandon the current error 
context but need not recompile the program to get expected behavior: setting a breakpoint at the exception location will do the 
trick.  If the user anticipates problems with a particular procedure in advance, the special SUSPECT command will ensure expected 
behavior without requiring specific breakpoints.
8.3 Future Work
This dissertation has only scratched the surface of the problem of debugging optimized programs.  Two areas of further research 
hold promise: studying and extending Navigator, and investigating broader aspects of the problem.

8.3.1 The short term: studying and extending Navigator
Time constraints and other practical considerations limited the amount of experimentation and measurement that could be performed 
on the completed Navigator system.  As a result, exact  measurements of compilation and debugging response time for a broad range 
of programs (including some particularly hard cases) could not be collected.  Preliminary tests showed that the techniques 
performed satisfactorily, but the theoretical analysis presented in Chapter 7 points out problems with excessive path determiner 
growth in complex multiply-merged regions.  Further study is needed to determine the effects of this growth in real programs.  The 
path determiner algorithm would be more efficient if some of the improvements described in Chapter 7 were implemented, particularly 
adding a path determiner minimization pass (which has yet to be completely specified) or grouping determiners into equivalence 
classes.  The compiler and debugger performance, as well as the sizes of the debugging tables, could then be measured over a wide 
class of programs and debugging scenarios.
One could also study the effects of different methods of handling ambiguity and other troublesome situations.  [These different 
methods were described in Chapter 6.]  How often does the cross-jumping algorithm have to insert extra null instructions to handle 
the joining jump's source or sentry information and how much does this cost at runtime?  [Preliminary tests show that these numbers 
are small.]  How much of cross-jumping's benefit would be lost by inhibiting cross-jumping whenever one of these troublesome 
situations arose?
Navigator could also be extended to handle the secondary debugging functions, as suggested at the end of Chapter 6.  For inline 
procedure expansion, these functions include displaying the values of local variables or parameters of inline procedures, setting 
breakpoints at procedure entry and exit, calling an inline procedure from the debugger's interpreter, and single-stepping with 
coarse granularity.  For cross-jumping, the only affected secondary debugging function is displaying the values of variables.
8.3.2 The long term: investigating broader aspects of the problem
The solution techniques used in the Navigator implementation are also applicable in other situations.  It is likely that macros in 
source programs could be handled by a method similar to Navigator's handling of inline procedures.  Furthermore, as discussed in 
Chapter 4, the invisible breakpoint technique for collecting dynamic information could be used to collect data information for 
handling the code motion optimization.  Alternatively, if the code motion optimization were altered to save the old value every 
time, dynamic control-flow information similar to the path determiner information could be used.
In addition, the Navigator system tested only a few techniques from the rich assortment proposed in Chapter 4.  The remaining 
techniques deserve attention, both individually and in concert.
There are several unexplored points regarding static information: exactly what to save for a given set of optimizations, how to 
generate and store it efficiently in the compiler, and how to access it efficiently from the debugger.  The trick of storing some 
of it and recreating the rest on demand through recompilation could prove useful.
Regarding optimization restrictions, special attention should be paid to the development of new variants of the classical 
optimizations that provide similar optimization benefits but impair debugging less, such as the new variant of code motion 
discussed above.  Once designed, implementations of these new variants should be measured to determine their compile-time 
efficiency and their relative effectiveness at reducing code size and execution time.
There are also unanswered questions concerning the provision of truthful behavior:
the entire program would be less usefuland more overwhelmingthan a localized view of an area of current interest, such as a 
proposed breakpoint location.  User studies would shed light on this issue.
breakpoints and a few other examples.
The level of user acceptance of high-quality truthful behaviorthat is, a localized description of the effects of the 
optimizations, lists of possible alternatives for variable values and program locations, and good debugging commandsshould also be 
studied.  High-quality truthful behavior is generally much easier to provide than expected behavior, as Navigator illustrates.  It 
may be sufficient for the construction of a very effective debugging tool.
Overall, it is clear that there is a useful middle ground between fully optimized programs with no debugging capabilities on the 
one hand and completely debuggable programs with no optimization on the other.  The selection of a suitable level of use for each 
technique, hiding the effects of some optimizations and displaying the effects of others, thus providing the optimal mix of 
optimization and debugging capabilities, is a particularly important area for future research.
Finally, theoretical studies of the effects of limited groups of optimizations (using specific models of optimization) upon 
debugging information or computational behavior would provide useful support for the more practical studies suggested above.  
Interesting optimization groups include local optimizations using a quad model, local optimizations using a dag model, and code 
motion alone.
8.4 Closing remarks
This dissertation has shown that although optimization and debugging conflict, program development systems can provide effective 
debugging for many forms of optimized programs.  Such debuggers can reduce debugging time and programmer confusion.  These benefits 
are especially important given the increasing availability of optimizing compilers.