Automatic Layout of Silicon-On-Silicon Hybrid Packages Abstract In high speed computer systems, performance and density limits are being set more by interconnect and packaging constraints than by transistor switching speeds. The most severe limitation comes from the single-chip packages that carry the VLSI circuits. Multi-chip, silicon-on-silicon hybrid packages can significantly improve performance by eliminating this level of packaging. A system has been developed to automatically generate hybrid layouts given a schematic description of the required interconnection and layouts of the VLSI circuits to be included. This paper describes the hybrid technology, the design automation system foundation, and the hybrid layout system. The layout system is implemented as three layout generation procedures: die mount pattern generation, a general cell-like placement and routing procedure, and a pad frame generator. State-of-the-art algorithms are described but the primary contribution describes how the algorithms are combined to form an automatic layout system for this improved technology. System operation is illustrated through a comprehensive example: a high performance workstation IO subsystem containing 8 VLSI circuits implemented on a 2.6" by 3.2" hybrid package. This layout method, in combination with the improved technology, produces layouts that are 5 to 8 times more dense than the same circuit implemented with single-chip packages on a printed circuit board. Simulations show that clock speeds can be increased by a factor of two. 1.0 Introduction In high performance computer systems, the contribution of packaging to machine performance is becoming increasingly important. System performance and density limits are now being set by interconnect and packaging constraints. Signal delays due to transistor switching speeds have decreased dramatically while delays due to packaging have decreased only slightly. 
Demand for growing IO pin counts and greater packing densities, along with increased performance requirements, is making traditional packaging methods (single-chip packages on glass-epoxy printed circuit boards) obsolete. The most severe limitation comes from single-chip packages. Thus, multi-chip packages can significantly improve system performance by eliminating this level of packaging. Multi-chip packages can both increase system density and reduce signal delays due to interconnect and packaging. At Xerox PARC, researchers have developed a high density, multilayer interconnect technology using thin film metallization and polyimide dielectric on a wafer-size silicon substrate. Silicon integrated circuit (IC) fabrication equipment is used to lay down metal and dielectric layers on a silicon wafer. Eight to twenty (unpackaged) VLSI circuits can be mounted and interconnected on a rectangular area inscribed on a silicon wafer. A single cover provides environmental protection for all of the chips. The combination of multiple interconnection layers, small features, ease of testing, reliability, and high performance makes this silicon-on-silicon hybrid technology very attractive. Automatic layout synthesis at this packaging level is fully as important as the automatic synthesis of VLSI circuits. Layout synthesis for these hybrid packages has the characteristics of both printed circuit boards (PCBs) and silicon ICs. However, existing layout systems for printed circuit boards and for integrated circuits were both unable to lay out these circuits. PCB layout systems were unable to handle the high pin densities and the complexity of the circuits, while IC layout systems were unable to generate layouts in the exact geometric shapes required. For these reasons, and because of the similarity of this layout problem to general cell layout, we decided to enhance our existing general cell layout system to produce silicon-on-silicon hybrid layouts. 
The combination of this new fabrication technology, a relatively rich design automation environment, and an existing general cell layout system allowed us to provide a new capability for silicon-on-silicon hybrid layout. Section 2 provides an overview of this technology and discusses the aspects that are important for automatic layout. Since the silicon-on-silicon packaging technology is relatively new, it requires more discussion than would normally appear. The existing design automation system provides a foundation on which to build automatic layout tools and greatly influences the design of the tools which the foundation supports. This foundation is described in Section 3. Section 4 describes the automatic layout system for silicon-on-silicon hybrid circuits. The system is composed of three layout procedures. The lowest level layout procedure produces die mount patterns (the leaf cells for this system) for the ICs. Given the physical design of an IC (especially the bonding pad locations) and the binding of the hybrid interconnections to the bonding pads, patterns for die attachment and wire bonds are automatically created. The intermediate level layout procedure places and interconnects the die mount patterns. General cell layout techniques are used; this task is greatly simplified by exploiting topological domain algorithms. The highest level layout procedure connects the main layout (defined by the intermediate level layout procedure and called the inner object) to the hybrid bonding pads. A minimal amount of routing overhead is required to translate from the topological domain layout to the geometric domain of the pads. The IO subsystem from a high performance workstation (composed of eight VLSI circuits) is used as an example throughout the paper. Results from the layout of this circuit on a hybrid are presented in Section 5. 
2.0 The Silicon-On-Silicon Hybrid Technology Packages provide the connection between ICs and the PCBs as well as environmental protection and a thermal path from the chips to the environment. Hybrid packaging can greatly increase performance of computer systems, but imposes special demands on layout systems. This section describes both the hybrid technology and the advantages of this technology over traditional packaging techniques. One goal of improved packaging is to fit more performance into the same volume [Bog87]. Silicon-on-silicon hybrids achieve this improvement through higher density (more integrated circuits for a given area), increased clock frequency (through lower interconnect capacitance and higher phase velocity media), and increased parallelism (more circuits per unit time are accessible [Bal85]). Traditional computer system packaging uses single-chip packages mounted on glass-epoxy PCBs. The PCBs are interconnected using backplanes (which often use PCB technology). The most severe limitation on system performance results from the use of single-chip packages. As the number of IO pins per chip increases, the maximum bond length (and hence the inductance) increases. Furthermore, single-chip packages cover correspondingly more area compared to the chip area. The geometric effects can be measured by packaging efficiency. Consider a chip carrier package with an IO pin pitch of Ppackage. Let the chip within the package have a bonding pad pitch of Pchip. In a pad limited package [...] If the IC bonding pad pitch is 180 µm [...] Pin grid arrays (PGAs) offer a high packaging efficiency for large IO pin counts (where n is the number of IO pins), but this efficiency is not achieved for today's chips. For example, a 400 pin PGA with Ppackage = 0.1 inches (2.54 mm) and Pchip = 180 µm [...] mounted on hybrid die mount patterns approaches 100% because Pchip = Ppackage. (The pad pitch of the die mount patterns is the same as the IC bonding pad pitch.) This high packaging efficiency supports the claim that a level of packaging is skipped by silicon-on-silicon hybrid packaging. (A note on units: package dimensions are typically given in English units while IC dimensions are typically given in metric units. Silicon-on-silicon hybrids must be discussed in both worlds. We chose to follow convention at the expense of forcing the reader to deal with two systems of units. We provide metric translations of English units where appropriate.)
Another measure of the quality of a packaging technology is interconnect density, M / Ptrace, where there are M layers of interconnection with pitch Ptrace. When Ptrace is in inches, the density is in inches of trace per square inch of surface area. A high density PCB may have 10 signal layers with a trace pitch of 25 mils (635 µm), giving 400 inches of trace per square inch. The silicon-on-silicon hybrid technology, which has 2 signal layers with a pitch of 50 µm, can have 1016 inches of trace per square inch of area. Wafer scale integration represents an ideal solution at this level of packaging. The interconnections among the active circuits are made using the same interconnect levels as the intra-circuit connections. However, wafer scale integration requires yields and process controls that are beyond those which are now available. Silicon-on-silicon hybrid technology [Joh86] can be viewed as a step toward wafer scale integration. This technology uses a wafer-size interconnect pattern to provide connections among ICs. Standard metallization methods are used to deposit the interconnects over a silicon wafer. Separately processed and tested integrated circuits are glued onto the silicon substrate at the appropriate places. Wire bonds connect the integrated circuit bonding pads and the appropriate pads on the hybrid. A single cover protects the entire hybrid package. Bonding pads around the periphery connect to a PCB through TAB (tape automated bonding) or a specially constructed socket. 
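The interconnect density figures quoted in this section can be checked with a few lines of arithmetic. The sketch below assumes that density is the number of signal layers divided by the trace pitch, and that the hybrid pitch of 50 is in microns; both assumptions are ours, chosen because they reproduce the quoted values of 400 and 1016 inches of trace per square inch. The function name is hypothetical.

```python
MICRONS_PER_INCH = 25_400.0


def interconnect_density(signal_layers: int, trace_pitch_in: float) -> float:
    """Inches of trace per square inch, assuming each of the M layers
    carries 1 / P_trace parallel traces per inch of width."""
    return signal_layers / trace_pitch_in


# High density PCB: 10 signal layers at a 25 mil (0.025 inch) pitch.
pcb = interconnect_density(10, 0.025)

# Hybrid: 2 signal layers at an assumed 50 micron pitch, converted to inches.
hybrid = interconnect_density(2, 50.0 / MICRONS_PER_INCH)

print(round(pcb), round(hybrid))  # prints: 400 1016
```

On these assumptions the hybrid offers roughly two and a half times the trace length per unit area of the PCB while using one fifth as many signal layers.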
Fabrication of silicon-on-silicon hybrids involves four levels of metallization separated by a polyimide dielectric. The lowest metal level forms a ground plane and covers the entire surface. Vdd is supplied through the second metal layer. This Vdd layer covers the entire surface except under the die mount patterns. The top two metal layers are used for signal interconnection. The signal traces have 25 µm [...] from a 5" wafer. Such a package can hold 8 to 20 VLSI circuits and allows 420 IO pins around the periphery of the hybrid with 635 µm [...] 3.0 The Design Automation Environment The Xerox PARC design automation system provides both the user interface and the programming foundation for the hybrid layout system described in this paper. The characteristics of this design system had a large impact on the design and implementation of the hybrid layout system and thus merit a separate discussion. Emphasis is placed on the layout synthesis portions of the design system. The user's primary interface with the design system is via an interactive graphics editor through which the user describes the circuit as an annotated schematic [Bar88b]. A schematic describes the hierarchical structure of a cell (expressed in terms of subcells). The user is free to intermix graphical and textual (procedural) specifications and to choose the most appropriate way to express each portion of the design. These intermixed specifications can be viewed as net list generators [Bar88a]. When a net list generator is evaluated (code is executed and/or a schematic is extracted) it returns the net list of a primitive device or merges results from subsidiary generators into a net list and returns it. Thus, evaluation of a single net list generator may cause many levels of schematic extraction and code execution. Layout generation is strictly a batch process. The layout of a cell is generated recursively from the layout of its subcells. 
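The recursive, batch character of layout generation can be illustrated with a toy sketch. Everything here is hypothetical: the `Cell` class, the `generate_layout` function, and the use of a string as a stand-in for a geometric object are ours, not part of the design system. The sketch only shows the control flow: a cell's layout is built from its subcells' layouts, and a repeated cell is laid out once and then reused.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Cell:
    """Hypothetical schematic cell: a name plus its subcells."""
    name: str
    subcells: Tuple["Cell", ...] = ()


_layout_cache: Dict[str, str] = {}   # cell name -> layout (toy stand-in)
generated: List[str] = []            # records which cells were actually laid out


def generate_layout(cell: Cell) -> str:
    """Return the layout of `cell`, recursing into its subcells first."""
    if cell.name in _layout_cache:
        return _layout_cache[cell.name]
    generated.append(cell.name)
    child_layouts = [generate_layout(child) for child in cell.subcells]
    layout = cell.name
    if child_layouts:
        layout += "(" + ", ".join(child_layouts) + ")"
    _layout_cache[cell.name] = layout
    return layout


pad = Cell("pad")
row = Cell("row", (pad, pad))
top = Cell("top", (row, row))
result = generate_layout(top)
print(result)      # prints: top(row(pad, pad), row(pad, pad))
print(generated)   # prints: ['top', 'row', 'pad']
```

Although `pad` appears four times in the hierarchy, it is generated only once; every later request is served from the cache.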
These concepts are illustrated by an example: the layout shown in Figure 1 was generated entirely from the schematics shown in Figures 2, 3, and 4. The programmer's view of the physical design system is best understood by concentrating on layout generation procedures [Bar88c]. When a layout generator is invoked on (the schematic of) a cell it returns the object (geometric layout) of that cell. Invoking the layout procedure at the top of the design hierarchy starts a chain of layout calls since generating layout at one level usually requires recursively obtaining the layout of the cell's children. For efficiency, a cache is maintained so successive invocations of a layout generator with the same cell will return the same layout. Each cell in a schematic for which layout is to be generated is annotated with a layout key. Cells without a layout key are flattened. (The cell's net list is merged with that of its parent.) Layout keys have AttributeProcs, LayoutProcs and DecorateProcs associated with them. The AttributeProc defines the user interface of the layout generator by gathering parameters from the graphical information in the schematic. For example, an AttributeProc could use schematic decorations and properties to determine relative placement of subcells or the side on which a port becomes public. The layout generator next invokes the LayoutProc to construct the geometric object based on the net list and parameters obtained by the AttributeProc. Generally, a LayoutProc lays out the subcells by invoking their layout generation procedures. The subcell layouts can then be used as part of the layout of the subject cell. As the final step, a layout generation procedure invokes the DecorateProc to cache the object as a property on the cell and decorate the interface ports with graphical layout properties. These layout properties describe the position, size, and layer of public pins at the bounding box of the cell. 
This step also checks the consistency of the geometric layout and the cell definition. For example, every public port on the cell must have a corresponding geometric pin. At each level of the hierarchy, it is only necessary that a layout procedure understand the decorations of the cell's children. The current set of layout generation procedures varies widely in complexity. The most primitive procedures just return a copy of some predefined mask object (for example, a transistor, gate or flip-flop). Higher level layout procedures transform an object's orientation or abut objects together into rows, columns, or arrays. The next level of complexity for layout procedures comes with the addition of routing. Cells can be stacked with interspersed routing channels or arrayed together with interspersed switchboxes. A standard cell place and route procedure is available, as well as a datapath generator that allows multiple interleaving of individual paths. There is also a procedure to route an object within a pad ring and one to do a general cell place and route. The design system uses an object-oriented paradigm [Ser86]. A small number of data types is used to represent an annotated net list. These types have property lists to allow arbitrary annotation. Instead of a fixed set of device types, a class mechanism facilitates new abstractions. Procedural translation from one abstraction to another allows new abstractions to be added easily [Bar88a]. The hybrid layout system described in the next section is implemented as three layout generation procedures. The flexibility of this layout method made it quite easy to extend the notion of layout from an IC technology (2 metal layer CMOS) to a hybrid technology (4 layers of metal) while maintaining a common source schematic for all levels of simulation. 4.0 Hybrid Layout This section describes an automatic layout system for silicon-on-silicon hybrid packages. 
This system exploits the hybrid technology described in Section 2 and is built on the design automation system discussed in Section 3. A bottom-up description of the layout process follows. Layout is generated from a source schematic composed of at least three levels of hierarchy. A layout generation procedure is provided for the three levels. Figure 5 illustrates the areas of the layout of Figure 1 that were generated by the three layout generators. The lowest level schematic cells represent the VLSI circuits (the leaf cells) and are used to generate the die mount patterns. These patterns provide places to attach the ICs as well as bonding pads on the hybrid that will connect to the bonding pads on the IC. Section 4.1 describes this layout operation. Above the leaf cell generators is the layout procedure that generates the main placement and routing from one or more hierarchical levels in the schematic; intermediate schematic cells are flattened until only one hierarchical level exists between the top level and the leaf cells. This layout generator, described in Section 4.2, produces an inner object and is an adaptation of general cell placement and routing layout procedures. The highest level layout procedure is described in Section 4.3 and generates the bonding pads around the periphery of the hybrid and routes from these pads to the pins on the inner object. Separate layout procedures for the top level and second level schematics provide a clean separation of topological domain code from geometrical domain code. Only a minimal area penalty is required for this separation. 4.1 Die Mount Pattern Generation The DieMount layout procedure constructs a die mount pattern. In effect this pattern translates from the IC technology to the hybrid technology. This layout procedure is invoked once for each IC in the design. An example is shown in Figure 4. The icon summarizes the interface while the schematic defines the binding of the hybrid nets to the IC pads. 
The schematic and the IC layout are inputs to this layout procedure. A die mount pattern is the hybrid layout that corresponds to a particular IC. This rectangular region is large enough to hold the IC die and has around its perimeter one or two rows of die mount pads. As shown in Figure 6, bonding wires are used to connect the pads on the IC (die pads) to the pads on the mounting pattern (die mount pads). Although design rules allow two bonding pad rows, the pad row nearest the edge of the die was constrained to contain only Vdd and Ground connections in this example. The 4 hybrid metal layers are numbered 0 through 3. Layer 0 (the bottom layer) is always Ground and covers the entire surface. Layer 1 is usually Vdd except when used to connect signals to Ground as shown in Figure 6. Layers 2 and 3 are the signal layers. For the convenience of the router, all signal connections are provided at the periphery of the die mount in both signal layers. Die mount layouts in the hybrid technology can be acquired in two ways. The first is to simply specify a previously constructed die mount from a library. The second is to apply the die mount generator to an IC for which the die mount pattern is to be generated. The die mount generator first extracts the IC layout to find the position and size of all pads. This information, combined with the signal names, is then used to determine the number of rows required as well as the individual pad types and locations. Figure 6 shows the various combinations of pad types and relative sizes. A typical value of w is 0.1 mm. The die mount pads are elongated to provide additional space for new bonds in case the ICs need to be replaced. 4.2 Hybrid Placement and Routing The layout procedure that generates the main portion of the layout, the inner object, is adapted from existing general cell layout procedures [Pre85]. Automatic, graphical, and interactive placement techniques are described in Section 4.2.2. 
Global and detailed routing algorithms are described in Sections 4.2.3 and 4.2.4, respectively. The placement and routing algorithms operate in the topological domain; the layout is converted to geometric form as the final step. The topological model that serves as a foundation to the algorithms is described in Section 4.2.1. 4.2.1 The Topological Model The unifying concept throughout the hybrid placement and routing layout generator is the topological model [Cho85, Pre85]. In this abstraction only the relative positions of the IC primitives and the routing areas are important. The model automatically compensates for changes in the routing areas as they expand or contract to conform to the changing design. It is necessary to convert to the geometric domain only to compare alternative layouts during optimization or as the final step before generating the masks. The topological model reduces the complexity of hybrid placement and routing by abstracting out the geometrical information (geometry). Computation of geometry is deferred until needed. The advantages of this approach are the following: removing and adding both components and wiring are simplified; topological operations are well defined; mapping the layout problem into the geometric domain is well defined and fast; routing areas can always be made just large enough to accommodate the required wiring since size determination is made after routing; and the complexity of dealing with masses of geometric data is obviated. The routing areas are organized into channels, as shown in Figure 7, which are maintained throughout the layout process. Channels divide the space into topological holes, each of which contains an IC. Channel widths may vary depending on the number of wiring tracks required. The channel lengths are determined by the positions of the channels they intersect at their extreme points. Exactly one channel is associated with each side of each IC. 
Three graphs are used to model the hybrid surface: the horizontal channel position graph, the vertical channel position graph, and the channel intersection graph. The channel position graphs are used to map between the topological and geometric domains while the channel intersection graph is used in topology modification and global routing. The horizontal channel position graph is a directed acyclic graph where arcs represent the widths (horizontal dimensions) of vertical channels or the horizontal dimensions of the IC die mount patterns. Figure 8 shows the horizontal channel position graph corresponding to the layout of Figure 1. The vertical channel position graph is analogous. The channel intersection graph is an undirected graph where the nodes represent the channel intersections and the arcs represent segments of channels between two intersections. Figure 7 shows the arcs (in red) of the channel intersection graph. Three classes of operation are necessary for the topological model: operations which map geometrical domain problems into the topological domain; operations which compute geometrical aspects given a topology; and operations which modify or manipulate a topology. The DefineChannels operation maps a geometrical domain layout into the topological domain. The input is a set of rectangles representing the ICs on a plane. First an outline is constructed around each rectangle; the outline always contains the rectangle and is made as large as possible without intersecting or containing any other outlines. Next a primary direction is chosen and all outlines perpendicular to this direction are collapsed along the primary direction into channels. Finally, the remaining outlines are collapsed along the secondary direction. The ICs and channels may need to be repositioned (using the Geometerize operation described below). 
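A channel position graph of the kind described above can be turned into coordinates with a longest-path computation over the directed acyclic graph. The following is a minimal sketch under our own assumptions: arcs are `(from, to, dimension)` triples, node names are arbitrary boundary labels, and the Python function names merely echo the operation names used in this paper. It is not the system's implementation.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Arc = Tuple[str, str, float]  # (from_node, to_node, dimension of channel or IC)


def _adjacency(arcs: List[Arc]):
    """Return successor lists and a topological order of the graph."""
    succ: Dict[str, List[Tuple[str, float]]] = defaultdict(list)
    indegree: Dict[str, int] = defaultdict(int)
    nodes = set()
    for u, v, w in arcs:
        succ[u].append((v, w))
        indegree[v] += 1
        nodes.update((u, v))
    ready = sorted(n for n in nodes if indegree[n] == 0)
    order = []
    while ready:
        u = ready.pop()
        order.append(u)
        for v, _ in succ[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                ready.append(v)
    if len(order) != len(nodes):
        raise ValueError("channel position graph must be acyclic")
    return succ, order


def geometerize(arcs: List[Arc]) -> Dict[str, float]:
    """Position of each node = longest path length from the source side."""
    succ, order = _adjacency(arcs)
    position = {n: 0.0 for n in order}
    for u in order:
        for v, w in succ[u]:
            position[v] = max(position[v], position[u] + w)
    return position


def size(arcs: List[Arc]) -> float:
    """Overall dimension = length of the longest path through the graph."""
    return max(geometerize(arcs).values())


def critical_arcs(arcs: List[Arc], tol: float = 1e-9) -> List[Arc]:
    """Arcs that lie on at least one longest path."""
    succ, order = _adjacency(arcs)
    position = geometerize(arcs)
    total = max(position.values())
    remaining = {n: 0.0 for n in order}  # longest path from node to the far side
    for u in reversed(order):
        for v, w in succ[u]:
            remaining[u] = max(remaining[u], w + remaining[v])
    return [(u, v, w) for u, v, w in arcs
            if abs(position[u] + w + remaining[v] - total) <= tol]


# Toy horizontal graph: channel, two stacked ICs of different width, channel.
arcs = [("x0", "x1", 2.0), ("x1", "x2", 10.0), ("x1", "x2", 7.0), ("x2", "x3", 3.0)]
print(size(arcs))              # prints: 15.0
print(geometerize(arcs)["x2"]) # prints: 12.0
```

In the toy example the narrower IC (dimension 7) is not critical: shrinking it cannot reduce the overall width, whereas shrinking either channel or the wider IC can. Each pass visits every arc once.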
Operations are provided to compute aspects of geometry: Size finds the length of the longest path through a channel position graph (corresponds to height or width of hybrid). CriticalArcs finds arcs that lie on any longest path through a channel position graph. Geometerize determines positions for the channels and ICs. All of these operations are linear in the number of arcs and nodes [Deo74]. Operations which modify or manipulate the topology are also provided by the topological model and are used primarily by automatic placement algorithms [Cho85, Pre85]. Examples of such operations are Grow which adds an IC, Shrink which removes an IC from the surface, and TtoL which modifies the channel intersection topology. 4.2.2 Placement Placement procedures determine the relative positions of the ICs and the public pins of the inner object as well as organize the routing areas into channels. Hybrid placement is both simpler and more complex than general cell placement within ICs. Hybrid placement is simpler because fewer components are involved (8-20 ICs per hybrid compared to 10-100 general cells per IC), and the sizes and shapes of ICs on a hybrid are less variable than are general cells within an IC. Hybrid placement is more complex because many interrelated factors determine layout quality. Wide buses are very prevalent; propagation delays and even power distribution are much more important. Two placement options are available to the user. An automatic placement algorithm built on the topological model is available. This algorithm finds a constructive initial placement, then improves the placement by iteration. This algorithm is described in [Pre85]. The second option uses graphical hints from the schematic to determine a floorplan. This option is popular because it provides a straightforward way for the user to specify a (topological) placement for a small number of ICs. 
In hybrid layouts there is often only one correct placement and that placement is obvious to the user. IC pinouts as well as public pads on the hybrid are often defined to accommodate a particular bus structure that further constrains placement. The procedure to support this second placement option is minimal given the topological model and the design system foundation. The DefineChannels operation from the topological model is invoked on the geometric domain layout defined by the IC icons in the source schematic. After the routing channels have been determined, actual layouts for the IC die mounts are substituted for the iconic representations. Routing channel topology may be modified interactively using the topology manipulation primitives of the topological model. The placement for the layout shown in Figure 1 was obtained by the second placement option (without interactive modification). Note the correspondence of the positions of the IC icons in Figure 4 to the die mount patterns in Figure 1. 4.2.3 Global Routing Global routing assigns segments of each net to a set of routing channels so as to reduce hybrid area and shorten connection lengths. Global routing assigns net segments only to channels and does not specify coordinates within the channels. Global routing for silicon-on-silicon hybrid packages differs in several important ways from general cell global routing. Vdd and Ground signals do not complicate global routing because these signals are distributed on separate layers, and taps to the power supply layers are easy to make. However, some problems are more complicated: overall dimensions must be tightly controlled (to fit within the package) and propagation delays must be carefully controlled. Our global router operates on a channel intersection graph (CIG). The problem of determining the global routing for a single interconnection net is mapped into finding a minimum Steiner tree through the channel intersection graph. 
The interaction of this net with other nets and with the hybrid surface is captured through the use of appropriate arc weights. The algorithm to route a two-pin net is equivalent to finding a shortest path between two nodes in the CIG. For a multi-pin net the global routing of the net is more complex. The complexity of the problem grows rapidly as the number of pins to be connected is increased. The initial global route algorithm works as follows. Nets are sequentially selected for routing, and the pins of each net are mapped as new nodes onto the CIG. Arc weights are assigned according to the geometric lengths of the arcs. One pin is selected as the source and symmetric expansion is performed until any one of the other pins of the net is encountered. This subtree is then used as the source of the next expansion, and the process continues until all the pins in the net are visited. This approach guarantees that the path found between any two pins is of minimum length. However, it does not guarantee a minimum Steiner tree. In the case of electrically equivalent pins, we insert all the pins into the CIG, but as soon as one is included in the Steiner tree for the net, the rest are no longer considered as targets for expansion. In this way, the global router can make use of equivalent pins to minimize the routing length of a net. Notice that the initial global router need not address the order in which nets are processed since each net is routed ignoring all other nets in the design. However, this may lead to a layout that is larger than necessary. An improvement phase follows which minimizes hybrid area by selecting and rerouting critical nets. The idea is to start with the initial global route and improve it by repeatedly selecting critical nets, removing them, and reconnecting them by alternate paths. Critical nets are those nets which, if removed, make the layout smaller. These nets pass through positions of maximum density along critical channels. 
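The initial global route procedure described above (expand from the current subtree until some remaining pin is reached, then repeat) can be sketched with a multi-source Dijkstra search. This is an illustrative sketch, not the authors' code: the graph encoding, the function names `undirected` and `route_net`, and the representation of electrically equivalent pins as groups are all assumptions, and pins are assumed to have been inserted into the CIG as ordinary nodes already.

```python
import heapq
from typing import Dict, List, Set, Tuple

Graph = Dict[str, List[Tuple[str, float]]]


def undirected(arcs: List[Tuple[str, str, float]]) -> Graph:
    """Build an undirected adjacency list from (u, v, length) triples."""
    graph: Graph = {}
    for u, v, w in arcs:
        graph.setdefault(u, []).append((v, w))
        graph.setdefault(v, []).append((u, w))
    return graph


def route_net(cig: Graph, pin_groups: List[List[str]]) -> Set[Tuple[str, str]]:
    """Connect one pin from every group; returns the set of CIG arcs used.

    Each group holds electrically equivalent pins. The first pin of the
    first group is the initial source.
    """
    tree_nodes = {pin_groups[0][0]}
    tree_arcs: Set[Tuple[str, str]] = set()
    remaining = [set(group) for group in pin_groups[1:]]
    while remaining:
        targets = set().union(*remaining)
        # Expand from the whole subtree at once: every tree node starts at 0.
        dist = {n: 0.0 for n in tree_nodes}
        prev: Dict[str, str] = {}
        heap = [(0.0, n) for n in sorted(tree_nodes)]
        heapq.heapify(heap)
        reached = None
        while heap:
            d, u = heapq.heappop(heap)
            if d > dist.get(u, float("inf")):
                continue  # stale heap entry
            if u in targets:
                reached = u
                break
            for v, w in cig.get(u, []):
                nd = d + w
                if nd < dist.get(v, float("inf")):
                    dist[v] = nd
                    prev[v] = u
                    heapq.heappush(heap, (nd, v))
        if reached is None:
            raise ValueError("net cannot be connected in this graph")
        # Walk back to the subtree, adding the new path to it.
        node = reached
        while node not in tree_nodes:
            tree_arcs.add(tuple(sorted((prev[node], node))))
            tree_nodes.add(node)
            node = prev[node]
        # Equivalent pins of the reached pin are no longer targets.
        remaining = [g for g in remaining if reached not in g]
    return tree_arcs


cig = undirected([("a", "b", 1.0), ("b", "c", 2.0), ("b", "d", 1.0)])
three_pin = route_net(cig, [["a"], ["c"], ["d"]])
equivalent = route_net(cig, [["a"], ["c", "d"]])
```

With three distinct pins the route uses all three arcs. When `c` and `d` are declared equivalent, only the nearer pin `d` is connected, which illustrates how equivalent pins can shorten a net.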
A critical channel is a channel that will allow the size of the layout to be reduced if its width is decreased. The arcs corresponding to critical channels lie on a longest path through a channel position graph. The global route improvement algorithm is outlined in Figure 9.

FOR each net DO
    save current global information for this net;
    remove global route information for this net;
    compute ranges of maximum track density along all channels;
    determine channel widths based on maximum track densities;
    assign coordinates to channel position graphs (Geometerize);
    identify critical CIG arcs (CriticalArcs), and update their arc weights;
    insert pins of this net into CIG, and reroute it (ShortestPath);
    compute new hybrid area (Size);
    IF new hybrid area is reduced THEN accept;
    ELSE IF new hybrid area unchanged THEN
        IF number of critical channels is reduced THEN accept;
        ELSE IF number of critical channels unchanged THEN
            IF length of maximum density ranges along critical channels is reduced THEN accept;
            ELSE reject;
        ELSE reject;
    ELSE reject;
    IF accept THEN update hybrid area;
    ELSE IF reject THEN restore global route information for this net;
ENDFOR;

Figure 9. The global route improvement algorithm reroutes nets to reduce layout area. Topological model procedure names are shown in parentheses.

In order to determine the channel widths, we compute the maximum track density of net segments passing through the channel. We can consider the maximum track density to be an approximation of the actual width required for detailed routing. Arc weights for noncritical CIG arcs are the geometric lengths of the associated channel segments. For each critical CIG arc, the weight is the geometric length plus the half perimeter of the hybrid area. The idea is that, for most of the nets, the maximum Steiner distance between two pins is less than the half perimeter of the hybrid. CIG arc weights are dynamically updated from iteration to iteration. 
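The nested accept/reject test of Figure 9 amounts to a strict lexicographic comparison of three quantities: hybrid area, number of critical channels, and length of the maximum density ranges along critical channels. The sketch below expresses that test, the critical arc weighting, and a track density count in Python. All names are hypothetical, and the rule that two segments which merely touch at an endpoint may share a track is our assumption.

```python
from typing import List, NamedTuple, Tuple


class LayoutCost(NamedTuple):
    """Quantities compared in Figure 9, in order of priority."""
    area: float
    critical_channels: int
    max_density_length: float


def accept_reroute(old: LayoutCost, new: LayoutCost) -> bool:
    """Accept only a strict lexicographic improvement (tuple comparison)."""
    return new < old


def arc_weight(length: float, is_critical: bool,
               hybrid_width: float, hybrid_height: float) -> float:
    """Geometric length, plus the half perimeter for critical arcs."""
    penalty = hybrid_width + hybrid_height if is_critical else 0.0
    return length + penalty


def max_track_density(segments: List[Tuple[float, float]]) -> int:
    """Largest number of net segments overlapping at any point of a channel.

    Assumption: a segment ending at x and another starting at x do not
    overlap, so ends are processed before starts at the same coordinate.
    """
    events = []
    for start, end in segments:
        events.append((start, 1))
        events.append((end, -1))
    events.sort(key=lambda e: (e[0], e[1]))
    depth = best = 0
    for _, delta in events:
        depth += delta
        best = max(best, depth)
    return best


old = LayoutCost(area=100.0, critical_channels=3, max_density_length=5.0)
print(accept_reroute(old, LayoutCost(100.0, 3, 4.0)))      # prints: True
print(accept_reroute(old, LayoutCost(100.0, 4, 1.0)))      # prints: False
print(max_track_density([(0, 4), (2, 6), (4, 8)]))         # prints: 2
```

Penalising critical arcs by the half perimeter makes almost any detour through noncritical channels cheaper than crossing a critical one, which steers rerouted nets away from the channels that set the hybrid's size.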
The improvement phase is, however, sequential; a heuristic is employed to obtain the effect of a look ahead. Although rerouting of some critical net may not reduce the hybrid area, it may increase the probability of a future reduction. After all, there may be more than one critical path in the position graphs. Also, if we succeed in reducing the range of maximum track density along a channel, we may make one of the channel segments non-critical, or if we reduce this range to zero, we can save one track, thereby reducing hybrid area. We are investigating improvements to the global routing algorithm which use a dynamic path finding algorithm (the weights of the CIG arcs corresponding to channels are modified while finding the shortest path). Better results are also expected if a parallel global routing approach is used. 4.2.4 Detailed Routing The detailed router used in the hybrid layout system is based on the dogleg router [Deu76] but several extensions (both conceptual and in implementation) enhance the generality. Because so many channel routing algorithms have been published, only the important extensions are discussed here; an introduction to channel routing may be found in [Lor88]. The implementation extensions imply a more sophisticated model but are built on the theory presented in [Deu76]. The major conceptual advancement guarantees routing completion even in the presence of constraint loops. (The original dogleg router did not deal with constraint loops because they were removed by altering the placement of the standard cells. Such luxury is not available in the hybrid environment.) Implementation enhancements allow extra track ordering sequences (specifically alternating starting sides from one track to the next in addition to the starting corner specified in the original paper) and variable width branch and trunk wires. 
These variable width wires require that more sophisticated geometrical checks be made (both in determining constraints and in determining if a trunk wire will fit on a track). Actual design rules (metal spacings, metal widths, via-cut surround by metal, via-cut-to-via-cut) must be checked instead of the simplified rule checks that are possible when the wire widths on each layer are uniform. Variable width track wires are dynamically assigned to statically defined tracks. Full design rule checks must be made with other wires and with the channel edges.

The conceptual enhancement of this channel routing algorithm is the manner in which constraint loops are resolved. In summary, the algorithm resolves constraint loops by adding a non-pin dogleg to enough wire segments to break all of the constraint loops. Sometimes a position to insert a non-pin dogleg cannot be found within the channel. In these cases it is necessary to defer the dogleg to one of the intersecting channels.

The starting point is a constraint graph where the nodes represent wire segments between two pins and the arcs represent the vertical constraints. Pins along the channel sides are investigated for vertical constraints, and arcs are added to the constraint graph. If a cycle would be formed by adding an arc, then that cycle is broken by adding a non-pin dogleg to one of the wire segments participating in the constraint loop. Each such wire segment is investigated by considering trial positions for the non-pin doglegs at pin locations and between each pair of pins. The trial wire segment and trial position for the non-pin dogleg that add the least to channel density and wire length are chosen. This method can guarantee completion of all of the routing if the non-pin doglegs can be deferred (pushed off the ends of the channel and hence routed in the intersecting channel). The channels are routed in such an order as to guarantee that this deferring can be done [Pre79].
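The loop detection step described above, checking whether a new vertical constraint would close a cycle before adding it, can be sketched as follows. This is a minimal illustration, not the router's implementation: the class `ConstraintGraph` and its methods are hypothetical names, and the choice of where to insert the non-pin dogleg (the trial positions and the density and wire length cost) is deliberately left out.

```python
from collections import defaultdict


class ConstraintGraph:
    """Hypothetical vertical constraint graph: nodes are wire segments,
    an arc (above -> below) says `above` must be routed above `below`."""

    def __init__(self):
        self.succ = defaultdict(set)

    def reaches(self, start, goal):
        """Return True if `goal` can be reached from `start` along arcs."""
        stack, seen = [start], set()
        while stack:
            node = stack.pop()
            if node == goal:
                return True
            if node in seen:
                continue
            seen.add(node)
            # .get avoids creating empty entries in the defaultdict
            stack.extend(self.succ.get(node, ()))
        return False

    def add_constraint(self, above, below):
        """Add the arc unless it would close a constraint loop.
        A False result is where a router would have to break the loop,
        for example by inserting a non-pin dogleg in one of the segments."""
        if self.reaches(below, above):
            return False
        self.succ[above].add(below)
        return True
```

With segments `a`, `b`, and `c`, adding `a -> b` and `b -> c` succeeds, while a subsequent `c -> a` is refused because it would form the loop `a -> b -> c -> a`.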
While this algorithm has poor worst-case time performance, its average performance is quite good. Dogleg positions are usually found near their desired locations.

4.3 Hybrid Pad Frames

In the top-level schematic, the pad frame layout generator expects to see five schematic cells: one inner cell to be laid out as described in Section 4.2 and four cells corresponding to the four pad rows. Schematic descriptions of the pad rows are usually composed of abuts of hybrid bonding pad cells as shown in Figure 2. However, the full power of the design system is available for more complex pad row definitions. In practice, a schematic template is available to simplify pad ring definition.

The layout generation procedure uses a method similar to [McG87] to route from the pad frame to the inner cell. However, global routing at this level is not required since the layout generator described in Section 4.2 guarantees that each hybrid bonding pad has a pin on the inner cell just across the adjacent routing channel. A trapezoidal routing channel is constructed on each side of the inner cell by using the detailed router described in Section 4.2.3. Triangular routing obstructions are introduced at the ends of the channels to form the trapezoidal channels. The blue (hybrid bonding pads) and the yellow (trapezoidal routing channels) regions of Figure 5 are constructed by this layout generator. The position for the inner cell is determined by estimating the width of the four channels (as approximated by track density). Any space in excess of the required channel widths is divided equally among the channels.

5.0 Results

We have used the automatic layout system for the hybrid packages to lay out an IO subsystem of a high performance workstation. The function of this IO subsystem is to provide the interface among two unidirectional buses and a synchronous, high bandwidth bus designed to address the requirements of data consistency in a shared memory multiprocessor system.
This IO subsystem includes four bus interface chips, a cache memory, an address mapping cache, a display controller, and an IO bus translation chip. The relative placement of the ICs on the package is functionally constrained. For instance, the four bus interface chips connect to the hybrid IO pads and hence are constrained to be close to the boundary of the package. The pin positions and orientations of the ICs have been optimized so as to minimize the interconnection lengths. Two 68 bit buses connect the bus interface chips to the IO pads and several hundred nets connect pins on the ICs.

The package has a fixed area of 2.6" by 3.2" and a pad-to-pad spacing of 635 μm. It has two signal layers with a trace spacing of 50 μm. The ICs are of different sizes but are wire bonded to one of three standard-size die mounts. Each IC was generated by invoking a layout generator on its source schematic. The hybrid layout procedure used cached layout results for these individual ICs. The user specified the placement graphically through positions of the IC icons on the schematic.

The hybrid area was reduced by 5% as a result of global route improvement. This improvement is good considering the optimized placement and pin locations of the ICs on the hybrid package. The active cell area on the hybrid package accounts for 25% of the total area, which indicates a packaging efficiency of greater than 25%. (This figure does not include the contribution of wiring on the hybrid.) If we were to implement this circuit using single-chip packages on a PCB with a trace spacing of 25 mils (635 μm) and 10 signal layers, a total area of 4.7" by 9.8" would be required to implement the inner object. This area is 8.2 times the size of the inner object and 5.5 times the size of the entire hybrid package.

Note to reviewers: The final version of the paper will present other hybrid packages with higher packaging efficiencies.
6.0 Summary

In high speed computer systems, performance and density limits are being set more by interconnect and packaging constraints than by transistor switching speeds. Multi-chip silicon-on-silicon hybrid packages can significantly improve performance by eliminating single-chip packaging. This paper describes the hybrid technology, the design automation system foundation, and a hybrid layout system capable of automatically laying out circuits on hybrid packages. The layout system is implemented as three layout generation procedures: die mount pattern generation, a general cell-like placement and routing procedure, and a pad frame generator. System operation is illustrated through a comprehensive example: a high performance workstation IO subsystem containing 8 VLSI circuits implemented on a 2.6" by 3.2" hybrid package. This method produces layouts that are 5 to 8 times more dense than the same circuit implemented with single-chip packages on a printed circuit board. Simulations show that clock speeds can be increased by a factor of two.

References

[Bal85] Balderes, D., and M. L. White, "Package effects on CPU performance of a large commercial processor," Proc. of Electronics Components Conference, 1985.
[Bar88a] Barth, R., and B. Serlet, "A structural representation for VLSI design," Proc. of the 25th Design Automation Conference, June 1988.
[Bar88b] Barth, R., B. Serlet, and P. Sindhu, "Parameterized schematics," Proc. of the 25th Design Automation Conference, June 1988.
[Bar88c] Barth, R., L. Monier, and B. Serlet, "Patchwork: layout from schematic annotations," Proc. of the 25th Design Automation Conference, June 1988.
[Bog87] Bogatin, E., "Beyond printed wiring board densities: a new commercial multichip packaging technology," Raychem Corporation Internal Report, May 1987.
[Cho85] Chow, C. S., Phoenix: Interactive Hierarchical Topological Floorplanning Placer, Masters Thesis, Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, 1985.
[Deo74] Deo, N., Graph Theory with Applications to Engineering and Computer Science, Englewood Cliffs, New Jersey: Prentice-Hall, Inc., 1974.
[Deu76] Deutsch, D. N., "A `dogleg' channel router," Proc. 13th Design Automation Conference, pp. 425-433, June 1976.
[Joh86] Johnson, R. W., J. L. Davidson, R. C. Jaeger, and D. V. Kerns, "Silicon hybrid wafer-scale package technology," IEEE Journal of Solid-State Circuits, vol. SC-21, no. 5, pp. 845-851, October 1986.
[Lor88] Lorenzetti, M. J., and D. S. Baeder, in Physical Design Automation of VLSI Systems, Preas, B. T., and M. J. Lorenzetti, editors, Menlo Park, California: Benjamin-Cummings Publishing Company, Inc., 1988.
[McG87] McGehee, R. K., "A practical moat router," Proc. of 24th Design Automation Conference, pp. 216-221, June 28-July 1, 1987.
[Pre79] Preas, B. T., and W. M. vanCleemput, "Routing Algorithms for Hierarchical IC Layout," Proc. of the Intl. Symposium on Circuits and Systems, pp. 482-485, July 1979.
[Pre85] Preas, B. T., and C. S. Chow, "Placement and routing algorithms for topological integrated circuit layout," Proc. of the Intl. Symposium on Circuits and Systems, pp. 17-20, 1985.
[Ser86] Serlet, B., "Object-Oriented Programming in Cedar," Actees des Journees Languages Orientes Objet, Bigre Globule, pp. 64-68, January 1986.

Figure 1. This hybrid layout was automatically generated from the schematics shown in Figures 2, 3, and 4. This circuit is the IO subsystem of a high performance workstation and contains 8 VLSI circuits.

Figure 2. This top level schematic defines the hybrid pad ring and the pad ring routing requirements. The layout (shown in Figure 1) corresponding to the blue and yellow portions of Figure 5 is generated from this schematic.

Figure 3. This second level schematic defines the main placement and routing problem (called the inner cell) for the hybrid.

Figure 4. This example shows the icon (interface summary), the schematic, and the layout for one of the integrated circuit die mounts. The die mount layout was generated from the schematic and the layout of the integrated circuit.

Figure 5. This shows the areas of the layout of Figure 1 that were generated by the three layout generation procedures described in the text. The blue and yellow portions represent the top level layout. The green die mounts were generated by the lowest level layout procedure. The white area represents the routing of the inner cell.

Figure 6. This shows the detailed construction of the die mount patterns. A typical value of w is 0.1 mm. Taps to the Vdd and Ground planes are automatically inserted as required. Signal pins are brought out to the periphery of the pattern for routing.

Figure 7. This Channel Intersection Graph corresponds to the layout of Figure 1. The red lines represent the routing channels.

Figure 8. This horizontal channel position graph corresponds to the layout of Figure 1 and is used to compute horizontal dimensions of the layout.