1. Specifying New Notation
Problem: Specifying new notation is difficult and tedious.
I believe the major underlying problem is that it is hard to imagine, specify, and verify generic rules which work in all situations. I think that instead of attempting to create bullet-proof all-purpose rules, it may be much better to facilitate the classification and specification of exceptions. I believe a class hierarchy with inheritance is one interesting way to approach this problem.
Each expression class (i.e. operator; e.g. product, integration) would be placed in a hierarchy which contains all expression classes. A subclass should inherit information from its superclass, yet be allowed to differ somewhat from it. (Sound like Smalltalk?) For example, class product may have binaryOp as its superclass. A product is indeed a binary operation, but in some (most) cases the operator is implied. Thus for formatting purposes, a product would behave much like all binaryOps, yet it would exhibit some subclass-specific behavior. The construction of such a class hierarchy in a reasonable way seems to be a non-trivial problem. Once most common notations have been specified, however, it should be relatively painless to subclass expression types which are similar.
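As a rough illustration of the subclassing idea, here is a minimal Python sketch. The class names Op, BinaryOp, Sum and Product, the method names, and the rule for when a product's operator is implied are all assumptions made for this example; they are not taken from MEDDLE.

```python
class Op:
    """Root of a hypothetical expression class hierarchy."""

    def __init__(self, *args):
        self.args = args

    def format(self):
        # Generic fallback: ClassName(arg, arg, ...)
        inner = ", ".join(fmt(a) for a in self.args)
        return f"{type(self).__name__}({inner})"


def fmt(x):
    """Format a sub-expression, or a leaf such as a name or a number."""
    return x.format() if isinstance(x, Op) else str(x)


class BinaryOp(Op):
    symbol = "?"

    def separator(self):
        # Default behavior shared by all binary operations.
        return f" {self.symbol} "

    def format(self):
        left, right = self.args
        return fmt(left) + self.separator() + fmt(right)


class Sum(BinaryOp):
    symbol = "+"  # inherits everything else from BinaryOp


class Product(BinaryOp):
    symbol = "*"

    def separator(self):
        # Subclass-specific behavior (an assumed rule): the operator is
        # written only between two numeric literals, otherwise it is implied.
        both_numbers = all(isinstance(a, (int, float)) for a in self.args)
        return super().separator() if both_numbers else ""


print(Sum("a", Product(2, "x")).format())
print(Product(2, 3).format())
```

Product overrides only the separator; the rest of its formatting is inherited unchanged from BinaryOp.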
Furthermore, each argument in each expression class should be required to have a set of constraints which express sizing and positioning information for every possible expression class that may appear as that argument. For example, in a summation, each argument in (summand lowerlimit upperlimit) must have constraints specifying what happens when the expression type of that argument is each of (sum, product, summand, integration, fraction, etc.). The obvious problem with this scheme is that the number of sets of constraints grows at least as fast as n^2 in the number of expression classes n (each of the n classes has one or more arguments, and each argument needs a set of constraints for each of the n classes), which is simply ridiculous. However, if you think of the expression class hierarchy as a tree, each argument in (summand lowerlimit upperlimit) only needs enough sets of constraints to "cover" the entire tree. Suppose, for example, that it was necessary for the lowerlimit argument to have special formatting behavior for relations (like =, >, <, etc.) and products. Then only three sets of constraints would need to be specified -- one for the root class "op", one for class "relation", and one for class "product". In fact, if these sets of constraints were mostly similar except for a few details, then inheritance could be used to make specification even easier: the "product" specification could inherit from "op" those constraints that weren't locally overridden by "product"-specific constraints. (I realize my examples here are horrible, but I hope the point is still clear.)
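The "covering" idea for the lowerlimit example can be sketched in Python as follows. This is only an illustration under stated assumptions: the constraint names (scale, gap, parenthesize) and their values are invented, and the lookup simply walks from the root class down to the class of the argument, letting more specific sets override inherited ones.

```python
# A hypothetical fragment of the expression class tree.
class Op: pass
class Relation(Op): pass
class Equals(Relation): pass
class Product(Op): pass
class Fraction(Op): pass

# Constraint sets for the lowerlimit argument of a summation. Only three
# sets are given: one for the root, one for relations, one for products.
# The keys and values are made up for illustration.
LOWERLIMIT = {
    Op:       {"scale": 0.7, "gap": 0.1, "parenthesize": False},
    Relation: {"gap": 0.2},
    Product:  {"scale": 0.6},
}


def resolve(table, cls):
    """Merge constraint sets from the root down to cls.

    Entries for more specific classes override inherited entries, so a
    class with no entry of its own is covered by its nearest ancestor.
    """
    merged = {}
    for ancestor in reversed(cls.__mro__):  # object, Op, ..., cls
        merged.update(table.get(ancestor, {}))
    return merged


print(resolve(LOWERLIMIT, Equals))    # covered by Relation, then Op
print(resolve(LOWERLIMIT, Product))   # overrides only "scale"
print(resolve(LOWERLIMIT, Fraction))  # covered by the root set alone
```

With this arrangement, adding a new expression class requires no new constraint sets unless it needs special treatment in some argument position.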
Another problem is that it is necessary to know not only which class of expression appears for each argument, but also quite a few other metrics, such as the size and maybe the "complexity" of the expression. Perhaps these could be handled in a similar hierarchical fashion; I don't know what the best approach would be (it seems like a reasonably good research topic).
At the beginning of this section I mentioned that it was also difficult (and tedious) to imagine and verify sets of constraints. I believe that an interactive graphics-oriented constraint specification tool would greatly reduce these difficulties. The tool should allow the user defining a new set of constraints to display sample expressions formatted with those constraints as they evolve. Perhaps a library of both "common" and known "hard to deal with" expressions could be maintained, and the tool could allow the user to quickly browse through many sample expressions that use these library expressions as arguments to the notation being defined. This would certainly make creating and maintaining a library of notation much more enjoyable and reliable than a non-interactive or non-graphical constraint specification system.
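The browsing part of such a tool could start from something as simple as the following Python sketch, which enumerates every way of filling the arguments of a notation with expressions from a sample library. The library contents, the function name, and the textual stand-in for a rendered expression are all assumptions for illustration; a real tool would display each sample graphically using the constraints under development.

```python
from itertools import product

# A hypothetical library of sample expressions, written here as plain strings.
LIBRARY = {
    "common": ["x", "n", "i = 1"],
    "hard": ["a/b", "x^(y^z)"],
}


def sample_expressions(notation, arg_names, library):
    """Yield one sample per combination of library expressions as arguments."""
    pool = [expr for group in library.values() for expr in group]
    for combo in product(pool, repeat=len(arg_names)):
        filled = ", ".join(f"{name}={expr}" for name, expr in zip(arg_names, combo))
        yield f"{notation}({filled})"


samples = list(sample_expressions(
    "sum", ["summand", "lowerlimit", "upperlimit"], LIBRARY))
print(len(samples))
print(samples[0])
```

Even this small library yields many samples for a three-argument notation, which is why quick browsing, rather than checking by hand, seems important.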
Overall, I believe that the problem of specifying notation is one of the most important issues in the design and implementation of a math expression editor.
MEDDLE's simplistic approach works adequately in many cases, but as the number of expression classes increased, rules became increasingly difficult to specify correctly. It is also true that the kind of constraint specification system described here is probably beyond the scope of a typical summer project.