Generation of static CMOS layout from boolean expressions Abstract: We present a new way to generate CMOS layout from a set of boolean equations. This layout generation, that we call Alps, makes use of an original tree-structured representation of arbitrary boolean expressions. This representation is usually more compact than the classic disjunctive form, is suitable for fast symbolic manipulation, and maps naturally into silicon. We describe an original way to do this mapping, and to produce static CMOS layout using the cascode switch style. Alps generation is a new alternative to PLAs. Keywords: Boolean Algebra, Boolean expressions, Disjunctive Normal Form, VLSI, PLA, CMOS, DCVS, Cascode, Static, Automatic Layout Generation. 1. Representation of Boolean Expressions 1. Conventional representations Throughout all this proposal, we represent the AND operator (product) by ".", the OR operator (sum) by "+", and the NOT operator (Complement, Negation) by "~". The exclusive OR is denoted XOR. The constants are "0" (false) and "1" (true). An expression is made of operators, sub-expressions, constants and variables. We will often use the IF operator (Multiplexor) which selects one of two sub-expression. In terms of AND and OR, IF can be defined by the relation: IF [a, b, c] = a . b + ~a . c . A traditional way to represent expressions is Disjunctive Form. Any expression can be viewed as a sum of products, each term of the sum being made of a product of either an input variable or its complement. The decomposition is unique if each product contains all variables (possibly complemented). Without this condition, the decomposition is far from unique and many papers describe various techniques for minimizing the number of product terms. This minimized representation is usually compact enough for a large variety of problems, especially if XORs are avoided. It is the representation used for generating PLAs and to some extent Weinberger's Arrays. Another well-known way to represent a function of N variables is just to decompose it into 2 partial functions of N-1 variables, with a boolean choice operator between those partial functions. This gives: F [x1, x2, ..., xn] = x1 . F1 [x2, ..., xn] + ~x1 . F0 [x2, ..., xn] We see that this is exactly the IF operator introduced previously: F [x1, x2, ..., xn] = IF [x1, F1 [x2, ..., xn], F0 [x2, ..., xn]] Assuming that an order in the variables has been chosen, we will abbreviate as: F = x . F1 + ~x . F0 This simple representation is called in the literature Binary Decision Tree (see for example [Akers78]). The operations on binary decision trees are simple. However the representation has a major drawback: its size, since the size of any expression of N inputs is 2N. 2. Alps normal form The major originality of Alps representation is to distinguish special cases of the partial sub-expressions F1 and F0. These two sub-expressions can be 0, 1, identical or opposite. That leads us to 9 cases (conditions are to be evaluated sequentially): F1=F0=1 OR F1=F0=0 [Constant] => F=0 OR F=1 F1=~F0=1 OR F1=~F0=0 [Identity] => F=x OR F=~x F1=1 [OR] => F= x + F0 F0=0 [AND] => F= x . F1 F0=1 [ORNOT] => F= F1 + ~x F1=0 [ANDNOT] => F= ~x . F0 F1=F0 [NoDependence] => F= F1 F1=~F0 [XORNOT] => F= x . F1 + ~x . ~F1 ELSE [IF] => F= x . F1 + ~x . F0 When this distinction between all different cases is made, the decomposition is still unique (given an ordering of the inputs), and the composition laws are unaffected. We thus have a normal form (Alps normal form). The size of the representation is no longer 2N, as in pure binary decision trees, because in all cases, apart from [IF], at most one sub-expression is needed. For example, NOR, NAND, AND, OR of N inputs all take a size N, as in the sum of product form. However, the sum of product form of a N-input XOR is of size 2N, versus size N for the Alps representation. An example is shown in figure Example. The Alps normal form is dependent on the ordering of inputs. In order to have an efficient representation we therefore have to find a good permutation of the inputs. The exhaustive search of all possible permutations is a theoretical answer to that minimization problem. In practice, a local search of better permutations, done by permuting only 2 inputs at a time, seems to be sufficient. Figure Representations shows a comparison between optimized sum of products and permuted Alps representations for some examples. It is possible to generalize Alps form by allowing sharing of sub-expressions, that is to manipulate a DAG instead of a tree. This is the representation proposed by Bryant [Bryant85]. It is slightly more compact than Alps, but the mapping to layout is then more complex, because it is the mapping of a graph in 2 dimensions instead of the mapping of a tree. 2. Mapping into silicon The mapping of an Alps expression tree into a two-dimension plane is easily done by fixing all inputs to run in the same direction (say vertical), and all sub-expressions or expressions in the opposite direction (horizontal). Conceptually, at each intersection of an expression with an input there is a gate implementing one of the 9 possible cases of node. Figure Mapping gives a general schematics of the mapping. Even though every gate realizes a simple function, a direct implementation in MOS technologies will be prohibitive in area without any further enhancement. We are now going to concentrate on CMOS technology, and show that using Differential Cascode Voltage Switch (Cascode) leads to very simple gates (usually 2 transistors per gate) and a fully static design. 1. Overview of different CMOS styles There is not an infinite range of styles, in CMOS, for realizing combinatorial circuits. Figure CMOS Styles assembles the four basic styles, i.e.: true CMOS, pseudo-NMOS, precharged and cascode. A precise discussion of those techniques can be found in many books, for example in [Hodges83]. Cascode, if used in a regular manner, offers the static quality of true CMOS, but with a smaller area, because there are only 2 p-devices per gate. It does not require important power-supply, contrarily to pseudo-NMOS, nor clocks and care for timing races found in precharged style. It has some additional nice features as fast switching time and free supply of both output and complemented output. But it is only valuable for large or medium-size gates (at least 3 or 4 inputs), and so we will describe in the next paragraph a way to group elementary Alps gates into larger units. 2. Cascode mapping In Cascode style all signals are carried on dual rails, i.e. both the signal and its complement are needed and produced. The mapping of logical values to voltage levels is not trivial in order to minimize the number of n-transistors per IF gate: 1 means pulled down to Gnd via a n-transistor, 0 means not pulled down to Gnd. Figure Gates Representation shows the schematics for the IF gate. All gates can be obtained by using the Constant gate (the one generating 1 and 0) and the IF gate, but one can spare one transistor in the cases [ORNOT], [AND], [OR] and [ANDNOT], using the fact that: F = x . F1 + ~x can be rewritten F = F1 + ~x and similarly for the other cases. The schematics for the ORNOT is shown on figure Gates Representation. Layout for this 2-transistor gate can be made very compact, and this is important because most of the gates used in the control part of a chip belong to the {[ORNOT], [AND], [OR], [ANDNOT]} set. In addition to the 2 pull-down chains, the cross-coupled pull-ups are needed to compute the output. Very long pull-down chains made of transistors in series may give a quadratic delay. To avoid this, cross-coupled pull-ups are inserted at regular intervals (every 5 or 6 stages is our current choice). The schematic is given in figure Amplification. These cross-coupled devices make the delay time truly linear in the number of inputs. The algorithm used to choose where to put repeaters is slightly better than just dropping them at regular intervals: the interval is a fixed number of used inputs. 3. Cascode Layout details Many strategies have been tried to find the most compact way to layout Alps expressions. This discussion is very technology-dependent, but seems reasonable for any double metal CMOS technology. Because n-transistors have gates perpendicular to source and drain, it seems best to have inputs run in one dimension in poly (stitched to with metal2), and outputs run in the other dimension in metal (for minimizing delays) and diffusion (where transistors are needed). N-wells are inserted are regular intervals, for making the amplifier cascodes, which means some extra Vdd and Gnd running parallel to the inputs. We chose to have Gnd run in metal alongside the outputs. 3. Results On the set of examples we used, the area comparison with PLAs is in favor of Alps for small or medium size problems (between 10 and 20 inputs and between 10 and 20 outputs). For large problems Alps-generated layouts seem to be around 50% larger. Alps is denser than PLAs for laying out finite-state machines. 4. Originality Binary decision trees for representing boolean expressions is known since a long time. The optimization we make in distinguishing all different cases was unknown previously, according to our knowledge. The representation proposed by Bryant [Bryant85] is a generalization of the proposed representation that is less suitable for generating layout. The use of Cascode in CMOS circuits is well-known (for example [Yoffa85]), but the mapping of our boolean representation is original (since the representation is original). It is somehow analog to the mapping of disjunctive normal form into PLAs. 5. Summary We have presented a new layout generator which uses a novel representation of boolean expressions. This representation is in general more compact than the sum of product form, and facilitates fast symbolic manipulation. Our experience is that Alps representation is a data structure universal enough to be used whenever one needs symbolic manipulation of boolean expressions. The original mapping of Alps structures into layout that we presented is straightforward, particularly in the context of CMOS Cascode. The generated cells are comparable in area and speed to PLAs, but offer the advantage of being completely static. Alps layout flexibility can be used by the designer for generating small combinatorial cells, and the combination of Alps and routing might be an alternative to standard cells systems. 6. Bibliography [Akers78] : Binary Decision Diagrams Sheldon Akers, June 78, IEEE Transactions on computers, Vol C-27, No 6, pp. 509-516 [Bryant85] : Symbolic Manipulation of Boolean Functions Using a Graphical Representation Randal Bryant, June 85, Proceedings 22nd DAC, pp. 688-694 [Chen84] : Considerations for implementing CMOS processors Chih-Liang Chen & Ralph Otten, October 84, Proceedings of the IEEE International Conference on Computer Design, pp. 48-53 [Hodges83] : Analysis and design of digital integrated circuits David Hodges & Horace Jackson, McGraw-Hill series in electrical engineering, ISBN 0-07-029153-5 [Yoffa85] : ACORN: A local customization approach to DCVS physical design Ellen Yoffa & Peter Hauge, June 85, Proceedings 22nd DAC, pp. 32-38 Ê–i(firstHeadersAfterPage) {1} .cvx .def (firstPageNumber) {1} .cvx .def (oneSidedFormat) {.false} .cvx .def˜ItitlešœÏkœ ˜9IabstractšÐbs œ#ÐbkÏbœ œ( œ œ± œœ> œ˜Lš ž œAœœœœ/˜Žheadšœ(˜(˜bodyšœ/œ œ œ8œœ˜œNœœœ ˜ÖNšœ˜ —Nšœ.Ïiœíœi˜—šœÍ˜ÍNšœD˜Dšœ œ ˜Bšœœ)˜AMšœO˜O—Nšœ˜——Nšœ7¡œÀÏuœ˜—˜˜ÿItable2šœœœ˜+Ošœ œœ˜.Ošœœ˜Ošœœ˜Ošœœ˜Ošœœ˜Ošœ˜Ošœ œ˜)Ošœœ˜"—Nšœ4ÏsœÒ¢œFœ7œœœœnœ ¢œLÐisœ˜íNšœ‡¡œý˜ˆNšœ¤œj˜€Nšœgœý˜ç——šœ˜Nšœï¤œ{œoœ§˜Žšœœ˜$Nš œ-œ0Ñiks¤œ-œ œz˜¤NšœHœŠœï˜É—˜šœîœ_¤œœbœ6œœœœ˜ÓNšœP˜PNš œœ¤œ¢œœœœ˜‰—NšœÒ¤ œñ¡œ˜Ü—˜Nšœ²œé˜Ÿ——šœ˜Nšœ¶˜¶—šœ˜NšœÜ˜ÜNšœœß˜ù—šœ ˜ Nšœ6 +œ˜˜ùNšœ /œFœ¶˜³—šœ˜bibliographyš $˜$Pš¡Ðik¡7˜S—š X˜XPš¡)¦¡ ˜9—š +Ÿ ˜:Pš¡>¦¡7˜y—š ?˜?Pš¡M¦¡˜_—š Ÿ $Ÿ ˜IPš¡5¦¡˜C———…—,n0ƒ