Automatic Layout of
Silicon-On-Silicon Hybrid Packages 


Abstract
In high speed computer systems, performance and density limits are being set more by interconnect and packaging constraints than by 
transistor switching speeds.  The most severe limitation comes from the single-chip packages that carry the VLSI circuits.  
Multi-chip, silicon-on-silicon hybrid packages can significantly improve performance by eliminating this level of packaging.
A system has been developed to automatically generate hybrid layouts given a schematic description of the required interconnection 
and layouts of the VLSI circuits to be included.  This paper describes the hybrid technology, the design automation system 
foundation, and the hybrid layout system.  The layout system is implemented as three layout generation procedures: die mount 
pattern generation, a general cell-like placement and routing procedure, and a pad frame generator.
State-of-the-art algorithms are described, but the primary contribution is the way these algorithms are combined to form an 
automatic layout system for this improved technology.  System operation is illustrated through a comprehensive example: a high 
performance workstation IO subsystem containing 8 VLSI circuits implemented on a 2.6" by 3.2" hybrid package.  This layout method, 
in combination with the improved technology, produces layouts that are 5 to 8 times more dense than the same circuit implemented 
with single-chip packages on a printed circuit board.  Simulations show that clock speeds can be increased by a factor of two.
1.0  Introduction
In high performance computer systems, the contribution of packaging to machine performance is becoming increasingly important.  
System performance and density limits are now being set by interconnect and packaging constraints. Signal delays due to transistor 
switching speeds have decreased dramatically while delays due to packaging have decreased only slightly.  Demands for growing IO pin 
counts and greater packing densities, along with increased performance requirements, are making traditional packaging methods 
(single-chip packages on glass-epoxy printed circuit boards) obsolete.  The most severe limitation comes from single-chip packages. 
 Thus, multi-chip packages can significantly improve system performance by eliminating this level of packaging.  Multi-chip 
packages can both increase system density and reduce signal delays due to interconnect and packaging.
At Xerox PARC, researchers have developed a high density, multilayer interconnect technology using thin film metallization and 
polyimide dielectric on a wafer-size silicon substrate.  Silicon integrated circuit (IC) fabrication equipment is used to lay down 
metal and dielectric layers on a silicon wafer.  Eight to twenty (unpackaged) VLSI circuits can be mounted and interconnected on a 
rectangular area inscribed on a silicon wafer.  A single cover provides environmental protection for all of the chips.  The 
combination of multiple interconnection layers, small features, ease of testing, reliability, and high performance makes this 
silicon-on-silicon hybrid technology very attractive.
Automatic layout synthesis at this packaging level is fully as important as the automatic synthesis of VLSI circuits.  Layout 
synthesis for these hybrid packages has the characteristics of both printed circuit boards (PCBs) and silicon ICs.  However, 
existing layout systems for neither printed circuit boards nor integrated circuits were able to lay out these circuits.  PCB layout 
systems were unable to handle the high pin densities and the complexity of the circuits while IC layout systems were unable to 
generate layouts in the exact geometric shapes required.  For these reasons, and because of the similarity of this layout problem 
to general cell layout, we decided to enhance our existing general cell layout system to produce silicon-on-silicon hybrid 
layouts.
The combination of this new fabrication technology, a relatively rich design automation environment and an existing general cell 
layout system allowed us to provide a new capability for silicon-on-silicon hybrid layout.  Section 2 provides an overview of this 
technology and discusses the aspects that are important for automatic layout.  Since the silicon-on-silicon packaging technology is 
relatively new, it requires more discussion than would normally appear.  The existing design automation system provides a 
foundation on which to build automatic layout tools and greatly influences the design of the tools which the foundation supports.  
This foundation is described in Section 3.
Section 4 describes the automatic layout system for silicon-on-silicon hybrid circuits.  The system is composed of three layout 
procedures.  The lowest level layout procedure produces die mount patterns (the leaf cells for this system) for the ICs.  Given the 
physical design of an IC (especially the bonding pad locations) and the binding of the hybrid interconnections to the bonding pads, 
patterns for die attachment and wire bonds are automatically created.  The intermediate level layout procedure places and 
interconnects the die mount patterns.  General cell layout techniques are used; this task is greatly simplified by exploiting 
topological domain algorithms.  The highest level layout procedure connects the main layout (defined by the intermediate level 
layout procedure and called the inner object) to the hybrid bonding pads.  A minimal amount of routing overhead is required to 
translate from the topological domain layout to the geometric domain of the pads.  The IO subsystem from a high performance 
workstation (composed of eight VLSI circuits) is used as an example throughout the paper.  Results from the layout of this circuit 
on a hybrid are presented in Section 5.
2.0  The Silicon-On-Silicon Hybrid Technology
Packages provide the connection between ICs and the PCBs as well as environmental protection and a thermal path from the chips to 
the environment.  Hybrid packaging can greatly increase performance of computer systems, but imposes special demands on layout 
systems.  This section describes both the hybrid technology and the advantages of this technology over traditional packaging 
techniques.
One goal of improved packaging is to fit more performance into the same volume [Bog87].  Silicon-on-silicon hybrids achieve this 
improvement through higher density (more integrated circuits for a given area), increased clock frequency (through lower 
interconnect capacitance and higher phase velocity media), and increased parallelism (more circuits per unit time are accessible 
[Bal85]).  Traditional computer system packaging uses single-chip packages mounted on glass-epoxy PCBs.  The PCBs are 
interconnected using backplanes (which often use PCB technology).  The most severe limitation on system performance results from 
the use of single-chip packages.
As the number of IO pins per chip increases, the maximum bond length (and hence the inductance) increases.  Furthermore, 
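single-chip packages cover correspondingly more area compared to the chip area.  The geometric effects can be measured by packaging 
efficiency, the ratio of chip area to package area,
    \text{packaging efficiency} = \frac{A_{chip}}{A_{package}}
Consider a chip carrier package with an IO pin pitch of Ppackage.  Let the chip within the package have a bonding pad pitch of 
Pchip. In a pad limited package
    \text{packaging efficiency} = \left( \frac{P_{chip}}{P_{package}} \right)^2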
If the IC bonding pad pitch is 180 µm [...].  Pin grid arrays (PGAs) offer a high packaging efficiency for large IO pin counts
    \text{packaging efficiency} = \frac{n P_{chip}^2}{16 P_{package}^2}
(where n is the number of IO pins), but this efficiency is not achieved for today's chips.  For example, a 400 pin PGA with 
Ppackage = 0.1 inches (2.54 mm) and Pchip = 180 µm [...] mounted on hybrid die mount patterns approaches 100% because Pchip = Ppackage.  (The pad pitch of the die mount patterns is the 
same as the IC bonding pad pitch.)  This high packaging efficiency supports the claim that a level of packaging is skipped by 
silicon-on-silicon hybrid packaging.
1 Package dimensions are typically given in English units while IC dimensions are typically given in metric units.  
Silicon-on-silicon hybrids must be discussed in both worlds.  We chose to follow convention at the expense of forcing the reader to 
deal with two systems of units.  We provide metric translations of English units where appropriate.
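As a rough numerical illustration of packaging efficiency, the following Python sketch compares a pin grid array with a hybrid die mount pattern.  It assumes a square, pad limited die whose pads are spread evenly over its four sides, a fully populated square pin grid, and a bonding pad pitch of 180 µm; these geometric assumptions and all function names are illustrative and are not taken from the hybrid layout system.

```python
import math

def perimeter_side(n_pins: int, pitch: float) -> float:
    """Side of a square whose n_pins sit evenly along its four edges."""
    return n_pins * pitch / 4.0

def grid_side(n_pins: int, pitch: float) -> float:
    """Side of a square, fully populated pin grid (an assumed layout)."""
    return math.sqrt(n_pins) * pitch

def packaging_efficiency(chip_side: float, package_side: float) -> float:
    """Chip area divided by package area, for square outlines."""
    return (chip_side / package_side) ** 2

n = 400                 # IO pins
p_chip_mm = 0.180       # assumed bonding pad pitch (180 um)
p_package_mm = 2.54     # 0.1 inch package pin pitch

chip = perimeter_side(n, p_chip_mm)
pga = packaging_efficiency(chip, grid_side(n, p_package_mm))
# On the hybrid, the die mount pad pitch equals the chip pad pitch.
hybrid = packaging_efficiency(chip, perimeter_side(n, p_chip_mm))

print(f"PGA: {pga:.1%}   hybrid die mount: {hybrid:.0%}")
```

Under these assumptions the pin grid array uses only a small fraction of its area for the die, while the die mount pattern is limited only by the pad pitch it shares with the chip.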
Another measure of the quality of a packaging technology is interconnect density, 
    \text{interconnect density} = \frac{M}{P_{trace}}
where there are M layers of interconnection with pitch Ptrace.  When Ptrace is in inches, the density is in inches of trace per square inch of surface area.  A high density PCB may have 10 signal layers with a trace pitch of 25 mils (635 µm), giving 400 inches of trace per square inch.  The silicon-on-silicon hybrid technology, which has 2 signal layers with a pitch of 50 µm, would have 1016 inches of trace per square inch of area.
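The two interconnect density figures can be checked with a few lines of Python.  This sketch assumes that the density is the number of signal layers divided by the trace pitch and that the hybrid trace pitch is 50 µm; the function name is illustrative.

```python
MICRONS_PER_INCH = 25400.0

def interconnect_density(layers: int, pitch_inches: float) -> float:
    """Inches of trace per square inch of surface: layers / pitch."""
    if pitch_inches <= 0:
        raise ValueError("pitch must be positive")
    return layers / pitch_inches

# High density PCB: 10 signal layers at a 25 mil pitch.
pcb = interconnect_density(10, 0.025)
# Hybrid: 2 signal layers at an assumed 50 micron pitch.
hybrid = interconnect_density(2, 50 / MICRONS_PER_INCH)

print(round(pcb), round(hybrid))
```

With only two signal layers, the hybrid still offers roughly two and a half times the trace length per unit area of the ten layer board.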
Wafer scale integration represents an ideal solution at this level of packaging.  The interconnections among the active circuits 
are made using the same interconnect levels as the intra-circuit connections.  However, wafer scale integration requires yields and 
process controls that are beyond those which are now available.
Silicon-on-silicon hybrid technology [Joh86] can be viewed as a step toward wafer scale integration.  This technology uses a 
wafer-size interconnect pattern to provide connections among ICs.  Standard metallization methods are used to deposit the 
interconnects over a silicon wafer.  Separately processed and tested integrated circuits are glued onto the silicon substrate at 
the appropriate places.  Wire bonds connect the integrated circuit bonding pads and the appropriate pads on the hybrid.  A single 
cover protects the entire hybrid package.  Bonding pads around the periphery connect to a PCB through TAB (tape automated bonding) 
or a specially constructed socket.
Fabrication of silicon-on-silicon hybrids involves four levels of metallization separated by a polyimide dielectric.  The lowest 
metal level forms a ground plane and covers the entire surface.  Vdd is supplied through the second metal layer.  This Vdd layer 
covers the entire surface except under the die mount patterns.  The top two metal layers are used for signal interconnection.  The 
signal traces have 25 µm [...] from a 5" wafer.  Such a package can hold 8 to 20 VLSI circuits and allows 420 IO pins around the periphery of the hybrid with 635 µm [...]

3.0  The Design Automation Environment
The Xerox PARC design automation system provides both the user interface and the programming foundation for the hybrid layout 
system described in this paper.  The characteristics of this design system had a large impact on the design and implementation of 
the hybrid layout system and thus merit a separate discussion.  Emphasis is placed on the layout synthesis portions of the design 
system.
The user's primary interface with the design system is via an interactive graphics editor through which the user describes his or 
her circuit as an annotated schematic [Bar88b].  A schematic describes the hierarchical structure of a cell (expressed in terms of 
subcells).  The user is free to intermix graphical and textual (procedural) specifications and to choose the most appropriate way 
to express each portion of the design.  These intermixed specifications can be viewed as net list generators [Bar88a].  When a net 
list generator is evaluated (code is executed and/or a schematic is extracted) it returns the net list of a primitive device or 
merges results from subsidiary generators into a net list and returns it.  Thus, evaluation of a single net list generator may 
cause many levels of schematic extraction and code execution.
Layout generation is strictly a batch process.  The layout of a cell is generated recursively from the layout of its subcells.  
These concepts are illustrated by an example: the layout shown in Figure 1 was generated entirely from the schematics shown in 
Figures 2, 3, and 4.
The programmer's view of the physical design system is best understood by concentrating on layout generation procedures [Bar88c].  
When a layout generator is invoked on (the schematic of) a cell it returns the object (geometric layout) of that cell.  Invoking 
the layout procedure at the top of the design hierarchy starts a chain of layout calls since generating layout at one level usually 
requires recursively obtaining the layout of the cell's children.  For efficiency, a cache is maintained so successive invocations 
of a layout generator with the same cell will return the same layout.  Each cell in a schematic for which layout is to be generated 
is annotated with a layout key.  Cells without a layout key are flattened.  (The cell's net list is merged with that of its 
parent.)  Layout keys have AttributeProcs, LayoutProcs and DecorateProcs associated with them.  The AttributeProc defines the user 
interface of the layout generator by gathering parameters from the graphical information in the schematic.  For example, an 
AttributeProc could use schematic decorations and properties to determine relative placement of subcells or the side on which a 
port becomes public.
The layout generator next invokes the LayoutProc to construct the geometric object based on the net list and parameters obtained by 
the AttributeProc.  Generally, a LayoutProc lays out the subcells by invoking their layout generation procedures.  The subcell 
layouts can then be used as part of the layout of the subject cell.
As the final step, a layout generation procedure invokes the DecorateProc to cache the object as a property on the cell and 
decorate the interface ports with graphical layout properties.  These layout properties describe the position, size, and layer of 
public pins at the bounding box of the cell.  This step also checks the consistency of the geometric layout and the cell 
definition.  For example, every public port on the cell must have a corresponding geometric pin.  At each level of the hierarchy, 
it is only necessary that a layout procedure understand the decorations of the cell's children.
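The three-step sequence of a layout generator (AttributeProc, LayoutProc, DecorateProc) and its cache can be sketched in Python as follows.  This is a minimal illustration only: the class names, the registry, and the tuple-based "layout objects" are hypothetical stand-ins, and flattening of cells without a layout key is not modeled.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

@dataclass
class Cell:
    name: str
    children: List["Cell"] = field(default_factory=list)
    layout_key: Optional[str] = None
    properties: Dict[str, object] = field(default_factory=dict)

@dataclass
class LayoutKey:
    attribute_proc: Callable[[Cell], dict]          # gathers parameters
    layout_proc: Callable[[Cell, dict], object]     # builds the object
    decorate_proc: Callable[[Cell, object], None]   # caches and decorates

REGISTRY: Dict[str, LayoutKey] = {}

def layout(cell: Cell) -> object:
    """Return the layout object of a cell, generating it at most once."""
    if "layout" in cell.properties:                 # cache hit
        return cell.properties["layout"]
    key = REGISTRY[cell.layout_key]
    params = key.attribute_proc(cell)
    obj = key.layout_proc(cell, params)
    key.decorate_proc(cell, obj)
    return obj

calls: List[str] = []                               # records LayoutProc calls

def no_params(cell: Cell) -> dict:
    return {}

def leaf_layout(cell: Cell, params: dict) -> object:
    calls.append(cell.name)
    return ("leaf", cell.name)

def row_layout(cell: Cell, params: dict) -> object:
    calls.append(cell.name)
    # Recursively obtain the layouts of the children and abut them.
    return ("row", tuple(layout(child) for child in cell.children))

def cache_on_cell(cell: Cell, obj: object) -> None:
    cell.properties["layout"] = obj

REGISTRY["Leaf"] = LayoutKey(no_params, leaf_layout, cache_on_cell)
REGISTRY["Row"] = LayoutKey(no_params, row_layout, cache_on_cell)

a = Cell("a", layout_key="Leaf")
b = Cell("b", layout_key="Leaf")
top = Cell("top", children=[a, b], layout_key="Row")

first = layout(top)     # triggers the chain of layout calls
second = layout(top)    # served from the cache
```

The second invocation returns the identical object without running any LayoutProc again, which is the behavior the cache is meant to provide.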
The current set of layout generation procedures varies widely in complexity.  The most primitive procedures just return a copy of 
some predefined mask object (for example, a transistor, gate or flip-flop).  Higher level layout procedures transform an object's 
orientation or abut objects together into rows, columns, or arrays.  The next level of complexity for layout procedures comes with 
the addition of routing.  Cells can be stacked with interspersed routing channels or arrayed together with interspersed 
switchboxes.  A standard cell place and route procedure is available, as well as a datapath generator that allows multiple 
interleaving of individual paths.  There is also a procedure to route an object within a pad ring and one to do a general cell 
place and route.
The design system uses an object-oriented paradigm [Ser86].  A small number of data types is used to represent an annotated net 
list.  These types have property lists to allow arbitrary annotation.  Instead of a fixed set of device types, a class mechanism 
facilitates new abstractions.  Procedural translation from one abstraction to another allows new abstractions to be added easily 
[Bar88a].
The hybrid layout system described in the next section is implemented as three layout generation procedures.  The flexibility of 
this layout method made it quite easy to extend the notion of layout from an IC technology (2 metal layer CMOS) to a hybrid 
technology (4 layers of metal) while maintaining a common source schematic for all levels of simulation.
4.0  Hybrid Layout
This section describes an automatic layout system for silicon-on-silicon hybrid packages.  This system exploits the hybrid 
technology described in Section 2 and is built on the design automation system discussed in Section 3.  A bottom-up description of 
the layout process follows.  Layout is generated from a source schematic composed of at least three levels of hierarchy.  A layout 
generation procedure is provided for the three levels.  Figure 5 illustrates the areas of the layout of Figure 1 that were 
generated by the three layout generators.
The lowest level schematic cells represent the VLSI circuits (the leaf cells) and are used to generate the die mount patterns.  
These patterns provide places to attach the ICs as well as bonding pads on the hybrid that will connect to the bonding pads on the 
IC.  Section 4.1 describes this layout operation.
Above the leaf cell generators is the layout procedure that generates the main placement and routing from one or more hierarchical 
levels in the schematic; intermediate schematic cells are flattened until only one hierarchical level exists between the top level 
and the leaf cells.  This layout generator, described in Section 4.2, produces an inner object and is an adaptation of general cell 
placement and routing layout procedures.
The highest level layout procedure is described in Section 4.3 and generates the bonding pads around the periphery of the hybrid 
and routes from these pads to the pins on the inner object.  Separate layout procedures for the top level and second level 
schematics provide a clean separation of topological domain code from geometrical domain code.  Only a minimal area penalty is 
required for this separation.
4.1  Die Mount Pattern Generation
The DieMount layout procedure constructs a die mount pattern.  In effect this pattern translates from the IC technology to the 
hybrid technology.  This layout procedure is invoked once for each IC in the design.  An example is shown in Figure 4.  The icon 
summarizes the interface while the schematic defines the binding of the hybrid nets to the IC pads.  The schematic and the IC 
layout are inputs to this layout procedure.
A die mount pattern is the hybrid layout that corresponds to a particular IC.  This rectangular region is large enough to hold the 
IC die and has around its perimeter one or two rows of die mount pads.  As shown in Figure 6, bonding wires are used to connect the 
pads on the IC (die pads) to the pads on the mounting pattern (die mount pads).  Although design rules allow two bonding pad rows, 
the pad row nearest the edge of the die was constrained to contain only Vdd and Ground connections in this example.
The 4 hybrid metal layers are numbered 0 through 3.  Layer 0 (the bottom layer) is always Ground and covers the entire surface.  
Layer 1 is usually Vdd except when used to connect signals to Ground as shown in Figure 6.  Layers 2 and 3 are the signal layers.  
For the convenience of the router, all signal connections are provided at the periphery of the die mount in both signal layers.
Die mount layouts in the hybrid technology can be acquired in two ways.  The first is to simply specify a previously constructed 
die mount from a library.  The second is to apply the die mount generator to an IC for which the die mount pattern is to be 
generated.  The die mount generator first extracts the IC layout to find the position and size of all pads.  This information, 
combined with the signal names, is then used to determine the number of rows required as well as the individual pad types and 
locations.  Figure 6 shows the various combinations of pad types and relative sizes.  A typical value of w is 0.1 mm.  The die 
mount pads are elongated to provide additional space for new bonds in case the ICs need to be replaced.
4.2  Hybrid Placement and Routing
The layout procedure that generates the main portion of the layout, the inner object, is adapted from existing general cell layout 
procedures [Pre85].  Automatic, graphical, and interactive placement techniques are described in Section 4.2.2.  Global and 
detailed routing algorithms are described in Sections 4.2.3 and 4.2.4, respectively.  The placement and routing algorithms operate in 
the topological domain; the layout is converted to geometric form as the final step.  The topological model that serves as a 
foundation to the algorithms is described in Section 4.2.1.
4.2.1  The Topological Model
The unifying concept throughout the hybrid placement and routing layout generator is the topological model [Cho85, Pre85].  In this 
abstraction only the relative positions of the IC primitives and the routing areas are important.  The model automatically 
compensates for changes in the routing areas as they expand or contract to conform to the changing design.  It is necessary to 
convert to the geometric domain only to compare alternative layouts during optimization or as the final step before generating the 
masks.
The topological model reduces the complexity of hybrid placement and routing by abstracting out the geometrical information 
(geometry).  Computation of geometry is deferred until needed.  The advantages of this approach are the following: removing and 
adding both components and wiring are simplified; topological operations are well defined; mapping the layout problem into the 
geometric domain is well defined and fast; routing areas can always be made just large enough to accommodate the required wiring 
since size determination is made after routing; and the complexity of dealing with masses of geometric data is obviated.
The routing areas are organized into channels, as shown in Figure 7, which are maintained throughout the layout process.  Channels 
divide the space into topological holes, each of which contains an IC.  Channel widths may vary depending on the number of wiring 
tracks required.  The channel lengths are determined by the positions of the channels they intersect at their extreme points.  
Exactly one channel is associated with each side of each IC.
Three graphs are used to model the hybrid surface: the horizontal channel position graph, the vertical channel position graph, and 
the channel intersection graph.  The channel position graphs are used to map between the topological and geometric domains while 
the channel intersection graph is used in topology modification and global routing.  The horizontal channel position graph is a 
directed acyclic graph where arcs represent the widths (horizontal dimensions) of vertical channels or the horizontal dimensions of 
the IC die mount patterns.  Figure 8 shows the horizontal channel position graph corresponding to the layout of Figure 1.  The 
vertical channel position graph is analogous.  The channel intersection graph is an undirected graph where the nodes represent the 
channel intersections and the arcs represent segments of channels between two intersections.  Figure 7 shows the arcs (in red) of 
the channel intersection graph.
Three classes of operation are necessary for the topological model: operations which map geometrical domain problems into the 
topological domain; operations which compute geometrical aspects given a topology; and operations which modify or manipulate a 
topology.
The DefineChannels operation maps a geometrical domain layout into the topological domain.  The input is a set of rectangles 
representing the ICs on a plane.  First an outline is constructed around each rectangle; the outline always contains the rectangle 
and is made as large as possible without intersecting or containing any other outlines.  Next a primary direction is chosen and all 
outlines perpendicular to this direction are collapsed along the primary direction into channels.  Finally, the remaining outlines 
are collapsed along the secondary direction.  The ICs and channels may need to be repositioned (using the Geometerize operation 
described below).
Operations are provided to compute aspects of geometry: Size finds the length of the longest path through a channel position graph 
(corresponds to height or width of hybrid).  CriticalArcs finds arcs that lie on any longest path through a channel position graph. 
 Geometerize determines positions for the channels and ICs.  All of these operations are linear in the number of arcs and nodes 
[Deo74].
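The geometry operations reduce to longest path computations on a directed acyclic graph.  The Python sketch below shows one way to realize Size, CriticalArcs, and Geometerize in time linear in the number of arcs and nodes.  The graph encoding (nodes as channel or die edges, arcs as weighted triples) and the example dimensions are assumptions made for illustration.

```python
from collections import defaultdict, deque

def _topological_order(nodes, arcs):
    """Kahn's algorithm; arcs are (tail, head, width) triples."""
    indegree = {n: 0 for n in nodes}
    successors = defaultdict(list)
    for tail, head, width in arcs:
        successors[tail].append((head, width))
        indegree[head] += 1
    queue = deque(n for n in nodes if indegree[n] == 0)
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for head, _ in successors[node]:
            indegree[head] -= 1
            if indegree[head] == 0:
                queue.append(head)
    if len(order) != len(nodes):
        raise ValueError("position graph must be acyclic")
    return order, successors

def geometerize(nodes, arcs):
    """Position of each node: longest distance from any source node."""
    order, successors = _topological_order(nodes, arcs)
    position = {n: 0 for n in nodes}
    for node in order:
        for head, width in successors[node]:
            position[head] = max(position[head], position[node] + width)
    return position

def size(nodes, arcs):
    """Length of the longest path (overall width or height)."""
    return max(geometerize(nodes, arcs).values())

def critical_arcs(nodes, arcs):
    """Arcs that lie on at least one longest path."""
    order, successors = _topological_order(nodes, arcs)
    position = geometerize(nodes, arcs)
    total = max(position.values())
    tail_length = {n: 0 for n in nodes}     # longest distance to any sink
    for node in reversed(order):
        for head, width in successors[node]:
            tail_length[node] = max(tail_length[node], width + tail_length[head])
    return [(t, h, w) for t, h, w in arcs
            if position[t] + w + tail_length[h] == total]

# Hypothetical example: two stacked ICs of widths 10 and 7 share the same
# left and right channels, whose widths are 2 and 3.
nodes = ["west", "c1", "c2", "east"]
arcs = [("west", "c1", 2), ("c1", "c2", 10), ("c1", "c2", 7), ("c2", "east", 3)]
positions = geometerize(nodes, arcs)
width = size(nodes, arcs)
critical = critical_arcs(nodes, arcs)
```

In this example the narrower IC is not critical: shrinking it cannot reduce the overall width, whereas narrowing either channel or the wider IC can.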
Operations which modify or manipulate the topology are also provided by the topological model and are used primarily by automatic 
placement algorithms [Cho85, Pre85].  Examples of such operations are Grow which adds an IC, Shrink which removes an IC from the 
surface, and TtoL which modifies the channel intersection topology.
4.2.2  Placement
Placement procedures determine the relative positions of the ICs and the public pins of the inner object as well as organize the 
routing areas into channels.  Hybrid placement is both simpler and more complex than general cell placement within ICs.  Hybrid 
placement is simpler because fewer components are involved (8-20 ICs per hybrid compared to 10-100 general cells per IC), and the 
sizes and shapes of ICs on a hybrid are less variable than are general cells within an IC.  Hybrid placement is more complex 
because many interrelated factors determine layout quality.  Wide buses are very prevalent; propagation delays and even power 
distribution are much more important.
Two placement options are available to the user.  An automatic placement algorithm built on the topological model is available.  
This algorithm finds a constructive initial placement, then improves the placement by iteration.  This algorithm is described in 
[Pre85].
The second option uses graphical hints from the schematic to determine a floorplan.  This option is popular because it provides a 
straightforward way for the user to specify a (topological) placement for a small number of ICs.  In hybrid layouts there is often 
only one correct placement and that placement is obvious to the user.  IC pinouts as well as public pads on the hybrid are often 
defined to accommodate a particular bus structure that further constrains placement.
The procedure to support this second placement option is minimal given the topological model and the design system foundation.  The 
DefineChannels operation from the topological model is invoked on the geometric domain layout defined by the IC icons in the source 
schematic.  After the routing channels have been determined, actual layouts for the IC die mounts are substituted for the iconic 
representations.  Routing channel topology may be modified interactively using the topology manipulation primitives of the 
topological model.
The placement for the layout shown in Figure 1 was obtained by the second placement option (without interactive modification).  
Note the correspondence of the positions of the IC icons in Figure 4 to the die mount patterns in Figure 1.
4.2.3  Global Routing
Global routing assigns segments of each net to a set of routing channels so as to reduce hybrid area and shorten connection 
lengths.  Global routing assigns net segments only to channels and does not specify coordinates within the channels.  Global 
routing for silicon-on-silicon hybrid packages differs in several important ways from general cell global routing.  Vdd and Ground 
signals do not complicate global routing because these signals are distributed on separate layers, and taps to the power supply 
layers are easy to make.  However, some problems are more complicated: overall dimensions must be tightly controlled (to fit within 
the package) and propagation delays must be carefully controlled.
Our global router operates on a channel intersection graph (CIG).  The problem of determining the global routing for a single 
interconnection net is mapped into finding a minimum Steiner tree through the channel intersection graph.  The interaction of this 
net with other nets and with the hybrid surface is captured through the use of appropriate arc weights.  The algorithm to route a 
two-pin net is equivalent to finding a shortest path between two nodes in the CIG.  For a multi-pin net the global routing of the net 
is more complex.  The complexity of the problem grows rapidly as the number of pins to be connected is increased.
The initial global route algorithm works as follows.  Nets are sequentially selected for routing, and pins of each net are 
mapped as new nodes onto the CIG.  Arc weights are assigned according to the geometric lengths of the arcs.  One pin is selected as 
the source and symmetric expansion is performed until any one of the other pins of the net is encountered.  This subtree is then 
used as the source of the next expansion, and the process continues until all the pins in the net are visited.  This approach 
guarantees that the path found between any two pins is of minimum length.  However, it does not guarantee a minimum Steiner tree.  
In case of electrically equivalent pins, we insert all the pins into the CIG, but as soon as one is included in the Steiner tree for 
the net, the rest are no longer considered as targets for expansion.  In this way, the global router can make use of equivalent 
pins to minimize the routing length of a net.
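The expansion scheme for a single multi-pin net can be sketched in Python as below.  The sketch assumes the expansion is a Dijkstra-style shortest path search started from every node already in the subtree, with the pins already mapped onto CIG nodes; the graph encoding, the function name, and the example weights are hypothetical, and electrically equivalent pins are not handled.

```python
import heapq

def route_net(graph, pins):
    """Connect pins on an undirected weighted graph by repeated expansion.

    graph: {node: {neighbor: weight}}; pins: list of distinct nodes.
    Returns the list of arcs in the resulting (not necessarily minimum)
    Steiner tree.
    """
    tree_nodes = {pins[0]}
    tree_arcs = []
    remaining = set(pins[1:])
    while remaining:
        # Expand from the whole current subtree at distance zero.
        dist = {n: 0 for n in tree_nodes}
        prev = {}
        heap = [(0, n) for n in sorted(tree_nodes)]
        heapq.heapify(heap)
        target = None
        while heap:
            d, node = heapq.heappop(heap)
            if d > dist.get(node, float("inf")):
                continue                      # stale heap entry
            if node in remaining:
                target = node                 # first unconnected pin reached
                break
            for neighbor, weight in graph[node].items():
                nd = d + weight
                if nd < dist.get(neighbor, float("inf")):
                    dist[neighbor] = nd
                    prev[neighbor] = node
                    heapq.heappush(heap, (nd, neighbor))
        if target is None:
            raise ValueError("a pin is unreachable")
        # Walk back to the subtree, adding the path to the tree.
        node = target
        while node not in tree_nodes:
            tree_arcs.append((prev[node], node))
            tree_nodes.add(node)
            node = prev[node]
        remaining.discard(target)
    return tree_arcs

# Hypothetical channel intersection graph with geometric arc lengths.
graph = {
    "A": {"B": 2},
    "B": {"A": 2, "C": 2, "D": 1},
    "C": {"B": 2, "E": 1},
    "D": {"B": 1, "E": 4},
    "E": {"C": 1, "D": 4},
}
tree = route_net(graph, ["A", "E", "D"])
length = sum(graph[u][v] for u, v in tree)
```

Each new pin is joined to the existing subtree by a shortest path, so the tree grows greedily; as the text notes, this does not guarantee a minimum Steiner tree.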
Notice that the initial global router need not address the order in which nets are processed since each net is routed ignoring all 
other nets in the design.  However, this may lead to a layout that is larger than necessary.  An improvement phase follows which 
minimizes hybrid area by selecting and rerouting critical nets.  The idea is to start with the initial global route and improve it 
by repeatedly selecting critical nets, removing them, and reconnecting them by alternate paths.  Critical nets are those nets 
which, if removed, make the layout smaller.  These nets pass through positions of maximum density along critical channels.  A 
critical channel is a channel that will allow the size of the layout to be reduced if its width is decreased.  The arcs corresponding to 
critical channels lie on a longest path through a channel position graph.  The global route improvement algorithm is outlined in 
Figure 9.

FOR each net DO
   save current global information for this net;
   remove global route information for this net;
   compute ranges of maximum track density along all channels;
   determine channel widths based on maximum track densities;
   assign coordinates to channel position graphs (Geometerize);
   identify critical CIG arcs (CriticalArcs), and update their arc weights;
   insert pins of this net into CIG, and reroute it (ShortestPath);
   compute new hybrid area (Size);
   IF new hybrid area is reduced THEN accept;
   ELSE IF new hybrid area unchanged THEN
       IF number of critical channels is reduced THEN accept;
       ELSE IF number of critical channels unchanged THEN
           IF length of maximum density ranges along critical channels is reduced THEN accept;
           ELSE reject;
       ELSE reject;
   ELSE reject;
   IF accept THEN update hybrid area;
   ELSE IF reject THEN restore global route information for this net;
ENDFOR;

    Figure 9.  The global route improvement algorithm reroutes nets to reduce layout area.  Topological model procedure names are shown 
    in parentheses.
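The nested accept/reject test of Figure 9 amounts to a lexicographic comparison of three measures.  The Python sketch below expresses it that way; the tuple layout and the function name are illustrative assumptions, and the measures themselves are taken as already computed.

```python
def accept_reroute(before, after):
    """Decide whether a rerouted net should be kept.

    before, after: (hybrid_area, critical_channel_count,
                    max_density_range_length) for the layout without and
    with the new route.  Tuples compare lexicographically, so a later
    measure only matters when all earlier ones are unchanged.
    """
    return after < before

baseline = (100, 3, 5)
decisions = [
    accept_reroute(baseline, (90, 9, 9)),    # area reduced
    accept_reroute(baseline, (100, 2, 9)),   # same area, fewer critical channels
    accept_reroute(baseline, (100, 3, 4)),   # only the density range shrinks
    accept_reroute(baseline, (100, 3, 5)),   # nothing improves
    accept_reroute(baseline, (101, 0, 0)),   # area grows
]
```

Only strict improvements are accepted, so a reroute that leaves all three measures unchanged is rejected and the saved route is restored.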

In order to determine the channel widths, we compute the maximum track density of net segments passing through each channel.  We can 
consider the maximum track density to be an approximation of the actual width required for detailed routing.  Arc weights for 
noncritical CIG arcs are the geometric lengths of the associated channel segments.  For each critical CIG arc, the weight is the 
geometric length plus the half perimeter of the hybrid area.  The idea is that, for most of the nets, the maximum Steiner distance 
between two pins is less than the half perimeter of the hybrid.  CIG arc weights are dynamically updated from iteration to iteration.  
The improvement phase is, however, sequential; a heuristic is employed to obtain the effect of a look ahead.  Although rerouting of 
some critical net may not reduce the hybrid area, it may increase the probability of a future reduction.  After all there may be 
more than one critical path in the position graphs.  Also, if we succeed in reducing the range of maximum track density along a 
channel, we may make one of the channel segments non-critical, or if we reduce this range to zero, we can save one track, thereby 
reducing hybrid area. 
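
The arc weighting described above can be sketched as follows.  This is an illustrative Python example, not the system's ShortestPath procedure: it handles a single pin pair, uses Dijkstra's algorithm as a stand-in (this section does not say which shortest path method is used), and the arc encoding and function names are assumptions.

```python
import heapq


def arc_weight(length, is_critical, half_perimeter):
    """Noncritical arcs cost their length; critical arcs add the half perimeter."""
    return length + (half_perimeter if is_critical else 0.0)


def shortest_path(arcs, source, target, half_perimeter):
    """Dijkstra's algorithm over an undirected channel intersection graph.

    arcs: list of (node_a, node_b, length, is_critical) tuples
          (a hypothetical encoding of the CIG).
    Returns (cost, path), or (inf, []) if target is unreachable.
    """
    adjacency = {}
    for a, b, length, critical in arcs:
        w = arc_weight(length, critical, half_perimeter)
        adjacency.setdefault(a, []).append((b, w))
        adjacency.setdefault(b, []).append((a, w))

    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            break
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adjacency.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))

    if target not in dist:
        return float("inf"), []
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return dist[target], path[::-1]
```

With a half perimeter of 10, a direct critical arc of length 2 costs 12, so a detour of total length 6 through noncritical arcs is preferred.  The critical arc is still used when it is the only connection.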
We are investigating improvements to the global routing algorithm which use a dynamic path finding algorithm (the weights of the 
CIG arcs corresponding to channels are modified while finding the shortest path).  Better results are also expected if a parallel 
global routing approach is used.
4.2.3  Detailed Routing
The detailed router used in the hybrid layout system is based on the dogleg router [Deu76] but several extensions (both conceptual 
and in implementation) enhance the generality.  Because so many channel routing algorithms have been published, only the important 
extensions are discussed here.2  The implementation extensions imply a more sophisticated model but are built on the theory 
presented in [Deu76].  The major conceptual advancement guarantees routing completion even in the presence of constraint loops.  
(The original dogleg router did not deal with constraint loops because they were removed by altering the placement of the standard 
cells.  Such luxury is not available in the hybrid environment.)
2 An introduction to channel routing may be found in [Lor88].
Implementation enhancements allow extra track ordering sequences (specifically alternating starting sides from one track to the 
next in addition to the starting corner specified in the original paper) and variable width branch and trunk wires.  These variable 
width wires require that more sophisticated geometrical checks be made (both in determining constraints and in determining if a 
trunk wire will fit on a track).  Actual design rules (metal spacings, metal widths, via-cut surround by metal, via-cut-to-via-cut) 
must be checked instead of the simplified rule checks that are possible when wire widths for each layer are uniform.  Variable width 
track wires are dynamically assigned to statically defined tracks.  Full design rule checks must be made with other wires and with 
the channel edges.
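
To illustrate why variable widths complicate the fit test, the following Python sketch checks whether a trunk wire can be placed on a track.  It is a simplified illustration under stated assumptions: only a single metal spacing rule is modeled (the via rules listed above are omitted), and the function name, argument layout, and wire encoding are all hypothetical.

```python
def trunk_fits(track_y, width, x_span, placed, min_spacing,
               channel_bottom, channel_top):
    """Return True if a trunk wire centered on track_y violates no spacing rule.

    x_span: (x_lo, x_hi) horizontal extent of the candidate trunk.
    placed: list of (center_y, width, (x_lo, x_hi)) for trunks already assigned.
    Only metal-to-metal spacing is checked in this sketch.
    """
    lo = track_y - width / 2
    hi = track_y + width / 2

    # Spacing to the channel edges.
    if lo - min_spacing < channel_bottom or hi + min_spacing > channel_top:
        return False

    for center_y, other_width, (x_lo, x_hi) in placed:
        # Wires separated horizontally by at least min_spacing cannot conflict.
        if x_hi + min_spacing <= x_span[0] or x_span[1] + min_spacing <= x_lo:
            continue
        other_lo = center_y - other_width / 2
        other_hi = center_y + other_width / 2
        # Conflict if the vertical gap between the two wires is below min_spacing.
        if lo < other_hi + min_spacing and other_lo < hi + min_spacing:
            return False
    return True
```

With uniform wire widths this test reduces to checking whether the track is free over the trunk's horizontal span; with variable widths a wide wire on one track can block part of a neighboring track.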
The conceptual enhancement of this channel routing algorithm is the manner in which constraint loops are resolved.  In summary, the 
algorithm resolves constraint loops by adding a non-pin dogleg to enough wire segments to break all of the constraint loops.  
Sometimes a position to insert a non-pin dogleg cannot be found within the channel.  In these cases it is necessary to defer the 
dogleg to one of the intersecting channels.
The starting point is a constraint graph where the nodes represent wire segments between two pins and the arcs represent the 
vertical constraints.  Pins along the channel sides are investigated for vertical constraints, and arcs are added to the constraint 
graph.  If a cycle will be formed by adding an arc, then that cycle is broken by adding a non-pin dogleg to one of the wire 
segments participating in the constraint loop.  Each such wire segment is investigated by considering trial positions for the 
non-pin doglegs at pin locations and between each pair of pins.  The trial wire segment and trial position for the non-pin dogleg 
that adds the least to channel density and wire length is chosen.  This method can guarantee completion of all of the routing if 
the non-pin doglegs can be deferred (pushed off the ends of the channel and hence routed in the intersecting channel).  The 
channels are routed in an order that guarantees that this deferring can be done [Pre79].  While this algorithm has poor worst-case 
time performance, its average performance is quite good.  Dogleg positions are usually found near their desired locations.
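
The loop detection step described above can be sketched in Python as follows.  This is a minimal illustration, not the router's implementation: it only detects that a new arc would close a constraint loop and reports the wire segments on that loop, leaving the choice of segment and dogleg position to the caller.  The class and method names are hypothetical.

```python
class ConstraintGraph:
    """Vertical constraint graph: an arc u -> v means segment u must lie above v."""

    def __init__(self):
        self.succ = {}

    def path(self, start, goal):
        """Return the nodes on some directed path from start to goal, or None."""
        parent = {start: None}
        stack = [start]
        while stack:
            node = stack.pop()
            if node == goal:
                out = []
                while node is not None:
                    out.append(node)
                    node = parent[node]
                return out[::-1]
            for nxt in self.succ.get(node, ()):
                if nxt not in parent:
                    parent[nxt] = node
                    stack.append(nxt)
        return None

    def add_constraint(self, above, below):
        """Add the arc above -> below unless it would close a loop.

        Returns [] if the arc was added.  Otherwise the graph is left
        unchanged and the segments on the would-be loop are returned;
        these are the candidates for a non-pin dogleg.
        """
        loop = self.path(below, above)
        if loop:
            return loop
        self.succ.setdefault(above, set()).add(below)
        return []
```

For example, after adding A above B and B above C, a request for C above A is refused and returns the segments A, B, and C, each of which would then be examined for trial dogleg positions.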
4.3  Hybrid Pad Frames
In the top-level schematic the pad frame layout generator expects to see five schematic cells: one inner cell to be laid out as 
described in Section 4.2 and four cells corresponding to the four pad rows.  Schematic descriptions of the pad rows are usually 
composed of abuts of hybrid bonding pad cells as shown in Figure 2.  However, the full power of the design system is available for 
more complex pad row definitions.  In practice, a schematic template is available to simplify pad ring definition.
The layout generation procedure uses a method similar to [McG87] to route from the pad frame to the inner cell.  However, global 
routing at this level is not required since the layout generator described in Section 4.2 guarantees that each hybrid bonding pad 
has a pin on the inner cell just across the adjacent routing channel.  A trapezoidal routing channel is constructed on each side of 
the inner cell by using the detailed router described in Section 4.2.3.  Triangular routing obstructions are introduced at the ends 
of the channels to form the trapezoidal channels.  The blue (hybrid bonding pads) and the yellow (trapezoidal routing channels) 
regions of Figure 5 are constructed by this layout generator.
The position for the inner cell is determined by estimating the width of the four channels (as approximated by track density).  Any 
space in excess of the required channel widths is divided equally among the channels.
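
The positioning rule can be sketched in Python as follows.  This is an illustration under assumptions: net segments are modeled as closed intervals along the channel, the excess space is split per axis between the two opposing channels (one reading of "divided equally"), and the function names are hypothetical.

```python
def max_track_density(segments):
    """Maximum number of net segments that overlap at any point of a channel.

    segments: list of (lo, hi) closed intervals along the channel.
    """
    events = []
    for lo, hi in segments:
        events.append((lo, 0))  # 0 sorts before 1: starts are counted before ends
        events.append((hi, 1))
    events.sort()
    density = best = 0
    for _, kind in events:
        density += 1 if kind == 0 else -1
        best = max(best, density)
    return best


def place_along_axis(available, cell_size, need_low, need_high):
    """Offset of the inner cell along one axis.

    available: space between the two opposing pad rows (an assumption).
    need_low, need_high: estimated widths of the two opposing channels.
    Any slack is split equally between the two channels.
    """
    slack = available - cell_size - need_low - need_high
    if slack < 0:
        raise ValueError("inner cell and channels do not fit")
    return need_low + slack / 2
```

A channel width estimate would be the density multiplied by the track pitch.  With 100 units available, a 60 unit inner cell, and channel estimates of 10 and 20 units, the 10 units of slack give an offset of 15.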
5.0  Results
We have used the automatic layout system for hybrid packages to lay out an IO subsystem of a high performance workstation.  The 
function of this IO subsystem is to provide the interface among two unidirectional buses and a synchronous, high bandwidth bus 
designed to address the requirements of data consistency in a shared memory multiprocessor system.  This IO subsystem includes four 
bus interface chips, a cache memory, an address mapping cache, a display controller, and an IO bus translation chip.  The relative 
placement of the ICs on the package is functionally constrained.  For instance, the four bus interface chips connect to the hybrid 
IO pads and hence are constrained to be close to the boundary of the package.  The pin positions and orientations of the ICs have been 
optimized so as to minimize the interconnection lengths.  Two 68 bit buses connect the bus interface chips to the IO pads and 
several hundred nets connect pins on the ICs.
The package has a fixed area of 2.6" by 3.2" and a pad-to-pad spacing of 635 µm.  It has two signal layers with a trace spacing of 50 µm.  The ICs are of different sizes but are wire bonded to one of three standard-size die mounts.
Each IC was generated by invoking a layout generator on its source schematic.  The hybrid layout procedure used cached layout 
results for these individual ICs.  The user specified the placement graphically through positions of the IC icons on the 
schematic.  The hybrid area was reduced by 5% as a result of global route improvement.  This improvement is good considering the 
optimized placement and pin locations of the ICs on the hybrid package.  The active cell area on the hybrid package accounts for 
25% of the total area, which indicates a packaging efficiency of greater than 25%.  (This figure does not include the contribution 
of wiring on the hybrid.)  If we were to implement this circuit using single-chip packages on a PCB with a trace spacing of 25 mils (635 µm) and 10 signal layers, a total area of 4.7" by 9.8" would be required to implement the inner object.  This area is 8.2 times 
the size of the inner object and 5.5 times the size of the entire hybrid package.
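
The area ratios quoted above can be checked with a few lines of Python.  This is only an arithmetic check on the stated dimensions; the inner object area is not given directly in the text and is derived here from the 8.2 ratio, and the function name is hypothetical.

```python
def area_comparison(pcb_dims=(4.7, 9.8), hybrid_dims=(2.6, 3.2), inner_ratio=8.2):
    """Return (pcb_area, hybrid_area, package_ratio, implied_inner_area) in inches."""
    pcb_area = pcb_dims[0] * pcb_dims[1]           # about 46.1 square inches
    hybrid_area = hybrid_dims[0] * hybrid_dims[1]  # about 8.3 square inches
    package_ratio = pcb_area / hybrid_area         # PCB area over whole hybrid package
    implied_inner_area = pcb_area / inner_ratio    # inner object area implied by 8.2x
    return pcb_area, hybrid_area, package_ratio, implied_inner_area
```

The package ratio rounds to 5.5, in agreement with the text, and the 8.2 ratio implies an inner object of roughly 5.6 square inches.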
Note to reviewers: The final version of the paper will present other hybrid packages with higher packaging efficiencies.
6.0  Summary
In high speed computer systems, performance and density limits are being set more by interconnect and packaging constraints than by 
transistor switching speeds.  Multi-chip, silicon-on-silicon hybrid packages can significantly improve performance by eliminating 
single-chip packaging.
This paper describes the hybrid technology, the design automation system foundation, and a hybrid layout system capable of 
automatically laying out circuits on hybrid packages.  The layout system is implemented as three layout generation procedures: die 
mount pattern generation, a general cell-like placement and routing procedure, and a pad frame generator.
System operation is illustrated through a comprehensive example: a high performance workstation IO subsystem containing 8 VLSI 
circuits implemented on a 2.6" by 3.2" hybrid package.  This method produces layouts that are 5 to 8 times more dense than the same 
circuit implemented with single-chip packages on a printed circuit board.  Simulations show that clock speeds can be increased by a 
factor of two.
References
[Bal85]        Balderes, D., and M. L. White, "Package effects on CPU performance of a large commercial processor," Proc. of 
Electronics Components Conference, 1985.
[Bar88a]        Barth, R., and B. Serlet, "A structural representation for VLSI design," Proc. of the 25th Design Automation 
Conference, June 1988.
[Bar88b]        Barth, R., B. Serlet, and P. Sindhu, "Parameterized schematics," Proc. of the 25th Design Automation Conference, 
June 1988.
[Bar88c]        Barth, R., L. Monier, and B. Serlet, "Patchwork: layout from schematic annotations," Proc. of the 25th Design 
Automation Conference, June 1988.
[Bog87]        Bogatin, E., "Beyond printed wiring board densities: a new commercial multichip packaging technology," Raychem 
Corporation Internal Report, May 1987.
[Cho85]        Chow, C. S., Phoenix: Interactive Hierarchical Topological Floorplanning Placer, Masters Thesis, Department of 
Electrical Engineering and Computer Science, Massachusetts Institute of Technology, 1985.
[Deo74]        Deo, N., Graph Theory with Applications to Engineering and Computer Science, Englewood Cliffs, New Jersey: 
Prentice-Hall, Inc., 1974.
[Deu76]        Deutsch, D. N., "A `dogleg' channel router," Proc. 13th Design Automation Conference, pp. 425-433, June, 1976.
[Joh86]        Johnson, R. W., J. L. Davidson, R. C. Jaeger, and D. V. Kerns, "Silicon hybrid wafer-scale package technology," IEEE 
Journal of Solid-State Circuits, vol. SC-21, no. 5, pp. 845-851, October 1986.
[Lor88]        Lorenzetti, M. J., and D. S. Baeder, in Physical Design Automation of VLSI Systems, Preas, B. T., and M. J. 
Lorenzetti, editors, Menlo Park, California, Benjamin-Cummings Publishing Company, Inc., 1988.
[McG87]    McGehee, R. K., "A practical moat router," Proc. of 24th Design Automation Conference, pp. 216-221, June 28-July 1, 1987.
[Pre79]        Preas, B. T., and W. M. vanCleemput, "Routing Algorithms for Hierarchical IC Layout," Proc. of the Intl. Symposium 
on Circuits and Systems, pp. 482-485, July, 1979.
[Pre85]        Preas, B. T., and C. S. Chow, "Placement and routing algorithms for topological integrated circuit layout," Proc. of 
the Intl. Symposium on Circuits and Systems, pp. 17-20, 1985.
[Ser86]        Serlet, B., "Object-Oriented Programming in Cedar," Actes des Journées Langages Orientés Objet, Bigre Globule, pp. 
64-68, January 1986.


Figure 1.  This hybrid layout was automatically generated from the schematics shown in Figures 2, 3, and 4.  This circuit is the IO 
subsystem of a high performance workstation and contains 8 VLSI circuits.


Figure 2.  This top level schematic defines the hybrid pad ring and the pad ring routing requirements.  The layout (shown in Figure 
1) corresponding to the blue and yellow portions of Figure 5 is generated from this schematic.


Figure 3.  This second level schematic defines the main placement and routing problem (called the inner cell) for the hybrid.


Figure 4.  This example shows the icon (interface summary), the schematic and the layout for one of the integrated circuit die 
mounts.  The die mount layout was generated from the schematic and the layout of the integrated circuit.


Figure 5.  This shows the areas of the layout of Figure 1 that were generated by the three layout generation procedures described 
in the text.  The blue and yellow portions represent the top level layout.  The green die mounts were generated by the lowest level 
layout procedure.  The white area represents the routing of the inner cell.


Figure 6.  This shows the detailed construction of the die mount patterns.  A typical value of w is 0.1 mm.  Taps to the Vdd and 
Ground planes are automatically inserted as required.  Signal pins are brought to the periphery of the pattern for routing.


Figure 7.  This Channel Intersection Graph corresponds to the layout of Figure 1.  The red lines represent the routing channels.


Figure 8.  This horizontal channel position graph corresponds to the layout of Figure 1 and is used to compute horizontal 
dimensions of the layout.