Introduction
The Dragon computer system being developed in CSL assumes the existence of a 2µm processing capability and the design aids to map logic into masks. A test vehicle is needed to ensure these assumptions are true rather than waiting for all of the design work to be completed on some major Dragon chip.
The process which is assumed has the following untested technology features:
* 2µm CMOS
* Double level metal
* Silicided poly
* Buried contacts
Each of these has been tested separately; CrossRAM will test them together. In addition it will have a large number of features in order to test the repeatability and yield of the process.
The design aids have not had to deal with a design that has such a large number of features. CrossRAM undoubtedly will stress them, and indeed already has stressed them beyond capacity in some areas.
This document has a number of purposes. First, it is a mechanism for recording the design process so that it is easy to have a design review that can check that all the proper design procedures were followed correctly. Second, since the design is logically simple, it provides a tutorial for new designers who know how to design logic but need to learn the procedures for implementing logic in a custom MOS circuit. Last, it is hoped that the technology people will be using this design for technology evaluation, and it would be useful for them to have some understanding of the design process, especially where tradeoffs were made due to technology restrictions.
General Description
The RAM has a 33 bit data bus, intended to be used for a 32 bit data word plus parity. It has an 11 bit address bus; the 3 low order bits select a word within a row of the RAM and the 8 high order bits select which row of the RAM is accessed. There are 4 control pins: Read, Write, Access and Precharge. Read and Write should be exclusive, i.e. never asserted at the same time, as should Access and Precharge. The Write pin controls the drivers which swing the bit lines within the RAM. Precharge forces the bit lines high. Since there is no interlock within the chip to prevent Precharge and Write from going high at the same time, care should be taken externally to ensure that they don't.
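A minimal behavioral sketch of this interface in Python makes the addressing and the exclusivity constraints concrete. The class and names here are illustrative assumptions, not part of the design:

  # Behavioral sketch only; names and structure are assumptions.
  class CrossRAMModel:
      def __init__(self):
          # 256 rows of 8 words, each word 33 bits wide
          self.rows = [[0] * 8 for _ in range(256)]

      def cycle(self, addr, data, read, write, access, precharge):
          # The chip has no internal interlock, so the external controller
          # must guarantee both exclusivity constraints.
          assert not (read and write), "Read and Write must be exclusive"
          assert not (access and precharge), "Access and Precharge must be exclusive"
          word = addr & 0x7          # 3 low order address bits: word within a row
          row = (addr >> 3) & 0xFF   # 8 high order address bits: row select
          if access and write:
              self.rows[row][word] = data & ((1 << 33) - 1)
          if access and read:
              return self.rows[row][word]
          return None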
Cell Design
The following descriptions assume that you have the logic diagrams, Thyme input files, output plots and CrossRAM.dale available. There are a number of cryptic statements within each section that describe the logic page numbers, simulation file names and layout cell names of interest at that point in the description. It is unfortunately the case that the logic hierarchy does not match the layout hierarchy. The descriptions will refer to the logic hierarchy since it is somewhat cleaner and easier to follow than the other.
RAM
Logic: CrossRAM - RAM, p. 5
The RAM cell contains the bit array and the precharge circuit.
Cell: PRECHARGESLICE
The precharge circuit is just a set of transistors that restores the bit lines to the power supply between each cycle.
Cell: DUALRAMBIT
The RAM bit cell is a 2147H static RAM cell (more or less) with the pullups removed. It would be nice if the RAM cell could be made static, but with the technology we currently have that would cost about a factor of 2 in density. A high resistivity poly layer, or buried contacts on both sides of the well, would lower the price of a static memory.
The layout of the RAM cell determines the layout and circuit design of the rest of the circuit. There are several factors to consider while designing this cell:
* The width of the cell must allow for at least three metal lines since bit, nbit and ground must be routed through the cell.
* The speed of the RAM will be determined by the ratio of drive capability provided by the access transistors to the capacitance of the bit lines.
* The storage capability of the cell, in terms of the time allowed between refresh cycles and its soft error rate due to alpha particle hits, is related to the capacitance of the gates of the flip flop.
* The amount of intercell noise coupling will be most directly related to the quality of the ground line.
First let's consider the access transistor size. Most of the bit line capacitance comes from the diffusion capacitance associated with the bit line metal to diffusion contact and the access transistor channel extension. To minimize bit line capacitance requires making the access transistor as small as possible. To maximize drive capability requires making it as large as possible. Since each word on a bit line requires the access transistor, making the transistor n times larger only increases the drive capability by n while increasing the loading by the number of words times n, in this case 256n. This argues for making the transistor as small as possible. However there is another limitation, and that is the size of the metal to diffusion contact to which the access transistor is connected. Making the access transistor smaller than such a contact does not improve the capacitance situation much but does decrease the drive capability significantly. Thus the size of the access transistors is matched to the size of a metal-diffusion contact.
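The argument can be put into a toy model. All constants below are illustrative assumptions, not process data; n = 1 represents a transistor that just matches the metal-diffusion contact:

  # Toy model of the access transistor sizing tradeoff; constants arbitrary.
  C_CONTACT = 1.0   # fixed contact capacitance per cell, arbitrary units
  C_PER_N = 1.0     # diffusion capacitance added per unit of transistor size
  I_PER_N = 1.0     # drive current per unit of transistor size
  WORDS = 256       # cells loading each bit line

  for n in (0.25, 0.5, 1.0, 2.0, 4.0):
      c_bit = WORDS * (C_CONTACT + n * C_PER_N)  # total bit line capacitance
      delay = c_bit / (n * I_PER_N)              # one cell discharges the line
      print(f"n={n:5.2f}  delay={delay:6.0f}  capacitance={c_bit:6.0f}")
  # Below n = 1 the delay blows up with little capacitance savings; above
  # it the delay improves slowly while the capacitance (precharge current,
  # power) keeps growing. Hence the transistor is matched to the contact.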
Next consider the flip flop transistor sizes. This is easy since the cell gets more drive capability and storage capability as the size of these transistors grows. Thus they should be made as big as possible while still achieving whatever density goals have been set for the RAM. The best information available suggests that an alpha particle hit generates about 1 million electron-hole pairs. The capacitor in the current cell only stores about half that number, so the soft error rate will depend on how efficiently the cell collects the charge generated by an alpha hit. The error rate will be a measured parameter after building the chip since I do not know any way to model the collection efficiency.
Discussion of providing a good ground plane is deferred to the section on power bus sizing.
The cell layout gives the following capacitances:
total bit line capacitance in picofarads ← number of cell rows*((diffusion area in square microns*diffusion capacitance in picofarads per square micron) + (diffusion side wall in microns*diffusion capacitance in picofarads per micron) + (metal over poly or diffusion area in square microns*metal capacitance in picofarads per square micron))
2.44 pF ← 128.0*((24*3.0E-4)+ (20*1.9E-4) + (168*0.48E-4))
total word line capacitance in picofarads ← number of cell columns*((poly over field area in square microns*poly over field capacitance in picofarads per square micron) + (gate poly area in square microns *gate poly capacitance in picofarads per square micron) + (metal 2 over metal area in square microns*metal 2 over metal capacitance in picofarads per square micron) + (metal 2 over poly area in square microns*metal 2 over poly capacitance in picofarads per square micron) + (metal 2 over field area in square microns*metal 2 over field capacitance in picofarads per square micron))
6.71 pF ← 264.0*((26*0.77E-4) + (16*12.4E-4) + (16*3*0.48E-4) + (32*0.25E-4) + (36*0.14E-4))
These capacitances will determine the driver sizes in the decoder and data buffer as well as the overall speed of the RAM.
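For reference, the two figures can be reproduced with a few lines of Python; the constants are exactly those in the expressions above:

  # Recomputing the two capacitance budgets from the layout figures above.
  ROWS = 128.0     # cells on a bit line
  COLUMNS = 264.0  # cells on a word line

  bit_line_pF = ROWS * (24 * 3.0e-4        # diffusion area * area capacitance
                        + 20 * 1.9e-4      # diffusion sidewall * edge capacitance
                        + 168 * 0.48e-4)   # metal over poly or diffusion
  word_line_pF = COLUMNS * (26 * 0.77e-4        # poly over field
                            + 16 * 12.4e-4      # gate poly
                            + 16 * 3 * 0.48e-4  # metal 2 over metal
                            + 32 * 0.25e-4      # metal 2 over poly
                            + 36 * 0.14e-4)     # metal 2 over field
  print(f"{bit_line_pF:.2f} pF")   # 2.44
  print(f"{word_line_pF:.2f} pF")  # 6.72; truncated to 6.71 in the text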
Thyme: CrossRAMBitLine.thy
From this simulation we can see that the precharge transistors are quite adequate. The two currents indicate what the peak currents will be; they will be useful later for determining the power bus widths. The simulations for worst case power supply current should really be run at zero degrees Celsius and with fast transistor parameters, but since no such data exist for the process yet we shall not bother with these additional simulations.
If we use midpoint values to calculate delays then the access time of the array itself, measured from midpoint of the word line to midpoint of the bit line, is about 20ns. Also we can see that the state nodes in the RAM cell bounce about a bit during the access but they end up at the same point that they started at. The initial condition used for the high state bit was selected by successive approximation. Alternatively one could simply run the simulation for a number of cycles and determine the steady state value from that.
Decoder
Logic: CrossRAM - Decoder, p. 6
Cells: DECODERZERO, DECLOUIS, DECODERDRIVER, DECODERONE, DECODERLOGIC, DECODEROW
The decoder consists of:
* a static NAnd gate that decodes the high order 7 bits of address
* a single level of tree decoding that decodes bit 7 of the address
* a "domino" gate that combines the decoder output and the access clock to produce the word line.
The tree decoding was used to make it easy to pitch match the decoder to the RAM.
Thyme: CrossRAMPairDecoder.thy
This is a simulation of the NAnd gate. The simulation says that the output pretty much tracks the address inputs when the address has a 10ns edge. One amusing thing to worry about is the intermediate nodes in the stack of N channel transistors. Since there are generally no restrictions on how the address lines may flap prior to stabilizing at the setup time before the access clock, it is possible to capacitively pump these nodes to voltages that inject charge into the substrate. There are guard bands around the NAnd gate to protect against latchup due to this.
Thyme: CrossRAMWordLine.thy
This simulation determines the size of the word line driver transistors. I merely attempted to keep the slope of the output waveform equal to the slope of the input waveform.
Data Buffer
Logic: CrossRAM - Data Buffer, p. 7
Cells: TRISTATEBUFFER, RAMDECODERSLICE, RAMDECODERCHUNK
The data buffer consists of a multiplexor and three tristate buffers. Two of the buffers are used to drive true and complement data onto the bit lines during a write cycle. The other is used to drive data back to the pads during a read cycle. The multiplexor selects one word out of the eight words in a RAM row. Since the access transistors in the RAM cell are only N channel, the multiplexor need only have N channel transistors rather than both N and P channel. However there really should be a precharge transistor on the nodes between the multiplexor and the write buffers so that during a read cycle the buffer which sends data to the pads sees a good logic one rather than a voltage which is one threshold below the power supply.
Thyme: CrossRAMWrite.thy
The size of the tristate buffer transistors and the multiplexor transistor are determined from this simulation. The inverter size was simply chosen to be the minimum. This is also where the worst case ground current occurs.
Pads
Logic: CrossRAM - Pads, p. 4
Cells: BUFFERPAD, TRISTATEBUFFER, INPUTPAD, BIDIRECTIONALPAD
There are three different pads that handle logic signals. All of them have the protection devices and level shifter shown in Basic Pad. Input Pad merely buffers the signal one more stage. Differential Input Pad supplies both true and complement forms of the input signal. The Bidirectional Pad has two levels of output amplification and a single stage of amplification from the level shifter to the internal data bus.
Thyme: CrossRAMInput.thy
The input simulation is used to determine what ratio is adequate to shift TTL levels to CMOS levels. It also determines the size of the buffer needed to drive the internal data bus. The capacitance of the internal data bus is:
0.915 pF ← 3*(4352+6048+128)*0.29E-4
Thyme: CrossRAMOutput.thy
This simulation is used to decide how large the output transistors should be. It is also used for determining how large the power bus that runs around the pad ring needs to be.
Address Drivers
Logic: CrossRAM - Address Drivers, p. 3
Cells: DEMUXINVERTER, AND3, DECODERDRIVER
The low order 3 bits are inverted and then demultiplexed into eight separate select lines that drive the word select multiplexor in the data buffer. The other 8 bits are buffered and sent in differential form into the decoder.
Thyme: CrossRAMAddress.thy
The capacitance of an address line is:
total address line capacitance in picofarads ← number of cell rows*(each address line is used in only half of the cells*((poly over field area in square microns*poly over field capacitance in picofarads per square micron) + (gate poly area in square microns*gate poly capacitance in picofarads per square micron)) + (metal over poly or diffusion area in square microns*metal capacitance in picofarads per square micron))
2.59 pF ← 128.0*(0.5*((44*0.77E-4)+ (20*12.4E-4)) + (3*43*0.48E-4))
Thyme: CrossRAMAnd3.thy
The capacitance of a select line is:
total select line capacitance in picofarads ← number of data bits*((poly over field area in square microns*poly over field capacitance in picofarads per square micron) + (gate poly area in square microns*gate poly capacitance in picofarads per square micron) + (metal over field area in square microns*metal capacitance in picofarads per square micron))
3.13 pF ← 33.0*(((2*(47+9+2)+16)*0.77E-4)+ (56*12.4E-4) + (3*176*0.29E-4))
The capacitance of an address inverter line is:
total address inverter line capacitance in picofarads ← (number of And gates*((poly over field area in square microns*poly over field capacitance in picofarads per square micron) + (gate poly area in square microns*gate poly capacitance in picofarads per square micron))) + (metal over field area in square microns*metal capacitance in picofarads per square micron)
0.387 pF ← (4*((138*0.77E-4) + (64*12.4E-4))) + (3*315*0.29E-4)
Power Bus Design
Electromigration and voltage drop due to peak current flow through resistors and inductors must be considered when designing the power buses. The only inductance that is considered here is the package lead inductance.
Electromigration limits are computed from power dissipated by the chip at its nominal frequency of operation. This is given by 0.5CV²f. Dividing by V gives the average current as 0.5CVf. The design rule calls for no more than 1.5mA per micron of metal line width.
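A worked example in Python: the capacitance is the RAM array figure from below, the frequency is an assumed placeholder, and only the 1.5mA per micron rule comes from the design rules:

  # Electromigration width check. C and f are assumed example values.
  C = 644e-12   # switched capacitance, farads (the RAM array figure below)
  V = 5.0       # supply voltage, volts
  f = 4e6       # assumed nominal operating frequency, hertz

  avg_current = 0.5 * C * V * f        # 0.5*C*V^2*f dissipation divided by V
  min_width = avg_current / 1.5e-3     # 1.5 mA per micron of line width
  print(f"{avg_current * 1e3:.2f} mA -> {min_width:.1f} micron minimum width")
  # 6.44 mA -> 4.3 microns: far narrower than the widths the IR drop
  # analysis below calls for, which is why electromigration drops out.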
To compute voltage drops in the power bus due to peak current flow the power bus is modeled as a tree that is rooted at the pad and branches at each major block shown on the floor plan. Each node in the tree represents a junction of lower level power buses. Each link in the tree represents the metal required to join one junction to the next. The voltage drop in each link of the tree is the sum of the currents required by each of the blocks that are leaves of the tree beneath the link multiplied by the resistance of the link. The maximum voltage allowed from any point in a bus to "real" ground, i.e. the voltage of the pin where it connects to the board, is arbitrarily chosen to be 0.75 volt. The trees for this chip are very simple.
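The tree bookkeeping is simple enough to express directly. This sketch uses the ground-side leaf currents from the block sections that follow; the helper name is mine, the method is the one just described:

  # Ground tree branch currents: each link carries the sum of the leaf
  # currents beneath it. Leaf values come from the block sections below.
  IG_leaf = {
      "RAM": 0.105,
      "Decoder": 4.5e-3,
      "Data Buffer": 0.0495,
      "Address Drivers": 0.044,
  }
  IG3 = IG_leaf["Decoder"] + IG_leaf["RAM"]   # 0.109 A
  IG2 = IG3 + IG_leaf["Data Buffer"]          # 0.158 A
  IG1 = IG2 + IG_leaf["Address Drivers"]      # 0.202 A

  def link_drop(current_A, length_um, width_um, r_per_square=0.06):
      # IR drop across one link: current * sheet resistance * squares.
      return current_A * r_per_square * (length_um / width_um)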
Logic: CrossRAM - Floor & Power Plan, p. 8
The capacitance figures given in the following sections are the equivalent amount of capacitance that swings every cycle.
RAM
The size of a ground line through the array was fixed by the cell layout. Only major problems should be allowed to perturb the size of this line since it impacts the overall size of the chip so significantly.
Peak ground current: 0.4mA
Resistance: 82.5 ohms ← 5.5E-3/4E-6 * .06
VArray: 0.033 ← 82.5 * 0.4E-3
This voltage must be deducted from the 0.75 volt budget. The figures required for the normal bus sizing algorithm are:
IG RAM: 0.105 ← 0.4E-3*264
IV Precharge: 0.158 ← (1.2E-3*264)/2.0
Capacitance: 644 pF ← 264.0 * 2.44
There are 2 bit lines per bit but only half of the bit lines swing on each cycle. The Vdd current is for the precharge block.
Decoder
IG Decoder: 4.5E-3
IV Decoder: 4.25E-3
Capacitance: 6.71pF
These figures deal with the word line only. It is assumed that the decoder logic does not consume a substantial amount of current.
Data Buffer
IG Data Buffer: 0.0495 ← 1.5E-3*33
IV Data Buffer: 0.0693 ← 2.1E-3*33
Capacitance: 80.5 pF ← 2.44 * 33.0
Address Drivers
IG Address Drivers: 0.044 A ← (3.0E-3*8) + (2.5E-3*8)
IV Address Drivers: 0.044 A ← (3.0E-3*8) + (2.5E-3*8)
Capacitance: 59.9 pF ← (4.40 * 8.0) + (3.09 * 8.0)
Internal Ground Bus Width Computation
The lengths of the ground buses, in microns, are:
LG1: 200
LG2: 400
LG3: 100
LG Address Drivers: 350
LG Data Buffer: 5800
LG Decoder: 5700
LG RAM: 6000
The currents through the branches of the ground network are:
IG3 ← IG Decoder + IG RAM
0.109 ← 4.5E-3 + 0.105
IG2 ← IG3 + IG Data Buffer
0.158 ← 0.109 + 0.0495
IG1 ← IG2 + IG Address Drivers
0.202 ← 0.158 + 0.044
The voltage change from the pc board to the bonding pad due to inductance is:
Inductance of pad is about 5nH.
di/dt in a single array ground line is 4E4 A/s ← 0.4E-3 / 10E-9
di/dt in whole array is then 1.056e+7 A/s ← 264 * 4E4
L di/dt is then 52.8E-3 V ← 5E-9 * 1.056e+7
Given the currents in the segments and the segment lengths we can pick segment widths such that the maximum voltage change is less than 0.75 volt. Assume that the voltage drop per unit length of wire should be constant. Then
DVsegment ← aLsegment.
Vpad ← Vinductor +IG1Rpackage+bonding wire
VG1 ← Vpad +aLG1
VG2 ← VG1 + aLG2
VG3 ← VG2 + aLG3
VG Address Drivers ← VG1 + aLG Address Drivers
VG Data Buffer ← VG2 + aLG Data Buffer
VG Decoder ← VG3 + aLG Decoder
VG RAM ← VG3 + aLG RAM+ VArray
We want the maximum of [VG Address Drivers, VG Data Buffer, VG Decoder, VG RAM] to be equal to 0.75 volts. This is easy if we just assume that one of these, e.g. VG RAM, is the worst case, use it to compute a and then just check the others.
VG RAM ← Vinductor + IG1Rpackage+bonding wire +aLG1 + aLG2 + aLG3 + aLG RAM + VArray
a ← (VG RAM - Vinductor - IG1Rpackage+bonding wire - VArray) / ( LG1 + LG2 + LG3 + LG RAM )
a ← (0.75 - 52.8E-3 - (0.202*0.8) - 0.033) / ( 200 + 400 + 100 + 6000 )
a ← 7.501493e-5
DVG RAM ← (IG RAM LG RAM Rmetal per square) / WG RAM
aLG RAM ← (IG RAM LG RAM Rmetal per square) / WG RAM
WG RAM ← (IG RAM Rmetal per square) / a
83.9 ← (0.105 * 0.06) / 7.501493e-5
WG Decoder ← (IG Decoder Rmetal per square) / a
3.59 ← (4.5E-3 * 0.06) / 7.501493e-5
WG Data Buffer ← (IG Data Buffer Rmetal per square) / a
39.5 ← (0.0495 * 0.06) / 7.501493e-5
WG Address Drivers ← (IG Address Drivers Rmetal per square) / a
35.1 ← (0.044 * 0.06) / 7.501493e-5
WG3 ← (IG3 Rmetal per square) / a
87.1 ← (0.109 * 0.06) / 7.501493e-5
WG2 ← (IG2 Rmetal per square) / a
126 ← (0.158 * 0.06) / 7.501493e-5
WG1 ← (IG1 Rmetal per square) / a
161 ← (0.202 * 0.06) / 7.501493e-5
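The whole constant-drop-per-unit-length computation fits in a short script. The names and numbers are those used above; the printed widths differ from the text only in rounding:

  # Constant voltage drop per unit length sizing of the ground bus.
  R_SQ = 0.06       # metal sheet resistance, ohms per square
  V_IND = 52.8e-3   # inductive bounce computed above, volts
  R_PKG = 0.8       # package plus bonding wire resistance, ohms
  V_ARRAY = 0.033   # drop inside the RAM array ground line, volts
  BUDGET = 0.75     # allowed bounce at the worst leaf, volts

  IG1, IG2, IG3, IG_RAM = 0.202, 0.158, 0.109, 0.105
  # Segments along the worst case (RAM) path: (name, current A, length um)
  path = [("G1", IG1, 200), ("G2", IG2, 400), ("G3", IG3, 100),
          ("G RAM", IG_RAM, 6000)]

  a = (BUDGET - V_IND - IG1 * R_PKG - V_ARRAY) / sum(l for _, _, l in path)
  print(f"a = {a:.6e} volts per micron")       # 7.5015e-05

  for name, i, _ in path:
      print(f"W{name} = {i * R_SQ / a:.1f} um")
  # The stubs use the same rule, e.g. the data buffer:
  print(f"WG Data Buffer = {0.0495 * R_SQ / a:.1f} um")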
The layout would be better if WG1-3 were all 96. If they were, what would WG RAM have to be?
VG RAM ← Vinductor +IG1Rpackage+bonding wire +Rmetal per square (IG1LG1/WG1-3 + IG2LG2/WG1-3 + IG3LG3/WG1-3 + IG RAMLG RAM/WG RAM) + VArray
WG RAM ← IG RAMLG RAM/(((VG RAM - Vinductor - IG1Rpackage+bonding wire - VArray) / Rmetal per square) - (IG1LG1/WG1-3 + IG2LG2/WG1-3 + IG3LG3/WG1-3))
87.6 ← 0.105*6000.0/(((0.75 - 52.8E-3 - (0.202*0.8) - 0.033) / 0.06) - ((0.202*200.0/96.0) + (0.158*400.0/96.0) + (0.109*100.0/96.0)))
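The same rearrangement in executable form, with the numbers as above; the result differs from 87.6 only in rounding:

  # Solving for WG RAM after fixing the trunk widths WG1-3 at 96 microns.
  R_SQ = 0.06
  budget_squares = (0.75 - 52.8e-3 - 0.202 * 0.8 - 0.033) / R_SQ
  trunk_squares = (0.202 * 200 + 0.158 * 400 + 0.109 * 100) / 96.0
  w_ram = 0.105 * 6000 / (budget_squares - trunk_squares)
  print(f"WG RAM = {w_ram:.1f} microns")  # 87.7 (87.6 in the text)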
Now what should the width of the other ground lines be?
VG1 ← Vinductor +IG1Rpackage+bonding wire +Rmetal per squareIG1LG1/WG1-3
0.239 ← 52.8E-3 +(0.202*0.8) + 0.06*0.202*200.0/96.0
VG Address Drivers ← VG1 + Rmetal per square IG Address Drivers LG Address Drivers / WG Address Drivers
WG Address Drivers← Rmetal per square IG Address Drivers LG Address Drivers / (VG Address Drivers - VG1)
1.80 ← 0.06 * 0.044 * 350 / (0.75 - 0.239)
VG2 ← VG1 + Rmetal per squareIG2LG2/WG1-3
0.278 ← 0.239 + 0.06 * 0.158 * 400.0 / 96.0
WG Data Buffer← Rmetal per square IG Data Buffer LG Data Buffer / (VG Data Buffer - VG2)
36.4 ← 0.06 * 0.0495 * 5800 / (0.75 - 0.278)
VG3 ← VG2 + Rmetal per squareIG3LG3/WG1-3
2.848125e-1 ← 0.278 + 0.06 * 0.109 * 100.0 / 96.0
WG Decoder← Rmetal per square IG Decoder LG Decoder / (VG Decoder - VG3)
3.30 ← 0.06 * 4.5E-3 * 5700 / (0.75 - 2.848125e-1)
From the big change in the Address Driver width we can see that applying the assumption of uniform voltage drop per unit length to different paths of the tree leads to buses that are much too wide for the shorter stubs.
Internal Vdd Bus Width Computation
The lengths of the Vdd buses, in microns, are:
LV1: 25
LV2: 6600
LV3: 5800
LV Address Drivers: 350
LV Data Buffer: 5800
LV Decoder: 5500
LV RAM: 5800
The currents through the branches of the Vdd network are:
IV3 ← IV Data Buffer
0.0693 ← 0.0693
IV RAM ← IV3 + IV Precharge
0.227 ← 0.0693 + 0.158
IV2 ← IV RAM + IV Decoder
0.231 ← 0.227 + 4.25E-3
IV1 ← IV2 + IV Address Drivers
0.275 ← 0.231 + 0.044
Assume that the voltage rise due to inductance is the same as for the ground plane, i.e. 52.8E-3.
VV Data Buffer ← Vinductor + IV1Rpackage+bonding wire + aLV1 + aLV2 + aLV RAM + aLV3 + aLV Data Buffer
a ← (VV Data Buffer - Vinductor - IV1Rpackage+bonding wire) / (LV1 + LV2 + LV RAM + LV3 + LV Data Buffer)
1.986264e-5 ← (0.75 - 52.8E-3 - (0.275*0.8)) / (25 + 6600 + 5800 + 5800 + 5800)
WV3 ← WV Data Buffer ← (IV Data Buffer Rmetal per square) / a
209 ← (0.0693 * 0.06) / 1.986264e-5
WV RAM ← (IV RAM Rmetal per square) / a
685 ← (0.227 * 0.06) / 1.986264e-5
WV2 ← (IV2 Rmetal per square) / a
697 ← (0.231 * 0.06) / 1.986264e-5
WV1 ← (IV1 Rmetal per square) / a
830 ← (0.275 * 0.06) / 1.986264e-5
These widths are unacceptably large; closer modeling is called for. First, it is known that the precharge transistors are never on at the same time as the data buffer write drivers or the decoder word line transistors. Also, the Vdd bus need not be held to as tight a bounce budget as the ground bus; let's change it to 1.0 volt.
IV RAM ← IV Precharge
0.158 ← 0.158
IV2 ← IV RAM
0.158 ← 0.158
IV1 ← IV2 + IV Address Drivers
0.202 ← 0.158 + 0.044
VV Precharge ← Vinductor + IV1Rpackage+bonding wire + aLV1 + aLV2 + aLV RAM
a ← (VV Precharge - Vinductor - IV1Rpackage+bonding wire) / (LV1 + LV2 + LV RAM)
6.322737e-5 ← (1.0 - 52.8E-3 - (0.202*0.8)) / (25 + 6600 + 5800)
WV2 ← WV Precharge ← (IV Precharge Rmetal per square) / a
149 ← (0.158 * 0.06) / 6.322737e-5
WV1 ← (IV1 Rmetal per square) / a
191 ← (0.202 * 0.06) / 6.322737e-5
From the layout it is also known that the width of V1 and V2 should be the same. Suppose it is 200. What should the width of the precharge line be?
VV Precharge ← Vinductor +IV1Rpackage+bonding wire +Rmetal per square (IV1LV1/WV1-2 + IV2LV2/WV1-3 + IV Precharge LV Precharge/WV Precharge)
WV Precharge ← (IV Precharge LV Precharge) / (((VV Precharge - Vinductor - IV1Rpackage+bonding wire) / Rmetal per square) - (IV1LV1/WV1-2 + IV2LV2/WV1-2))
116 ← (0.158 * 5800) / (((1.0 - 52.8E-3 - (0.202*0.8)) / 0.06) - (0.202 * 25/200 + 0.158 * 6600/200))
Assume that WV Precharge is actually 100. Now what should the width of the other Vdd lines be?
IV3 ← IV Data Buffer
0.0693 ← 0.0693
IV RAM ← IV3
0.0693 ← 0.0693
IV2 ← IV RAM + IV Decoder
0.0735 ← 0.0693 + 4.25E-3
IV1 ← IV2
0.0735 ← 0.0735
VV2 ← Vinductor +IV1Rpackage+bonding wire +Rmetal per square (IV1LV1/WV1-2 + IV2LV2/WV1-2)
2.576813e-1 ← 52.8E-3 + (0.0735*0.8) +0.06 * (0.0735 * 25/200 + 0.0735 * 6600/200)
WV Decoder← Rmetal per square IV Decoder LV Decoder / (VV Decoder - VV2)
1.88 ← 0.06 * 4.25E-3 * 5500 / (1.0 - 2.576813e-1)
VV RAM ← VV2 +Rmetal per square (IV RAMLV RAM/WV RAM)
4.988453e-1 ← 2.576813e-1 +0.06 * (0.0693 * 5800/100)
WV3 ← WV Data Buffer ← Rmetal per square IV Data Buffer (LV Data Buffer + LV3) / (VV Data Buffer - VV RAM)
63.2 ← 0.06 * 0.0693 * (5500 + 5800) / (1.0 - 2.576813e-1)
Pad Power Bus Width Computation
Here we don't much care how far up the ground plane bounces nor how far down the Vdd plane bounces. We do care if they ring more than a diode threshold above the power supply or below ground since that can inject charge into the substrate and cause latchup.
The output simulation indicates that each output pad requires a peak of 40mA through the Vdd plane and 45mA through the ground plane. The di/dt figures are 40mA/5ns, or 8.0e+6 A/s, and 45mA/5ns, or 9.0e+6 A/s, respectively. Assume 5nH of inductance in a single pin and 7 power pins per supply. Then the ground plane will spike by .212 volts ← (5e-9/7) * (33 * 9.0e+6). This seems just fine.
The maximum distance from any output pad to a Vdd or ground pad is 3.6mm. The width of these buses is 60µm. This gives a resistance of 3.6 ohms ← 0.06*3.6e-3/60e-6, and at 45mA, a voltage drop of 1.29 volts ← 8*45e-3*3.6, assuming that there are 8 output pads along the way.
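These two estimates in executable form, with the figures as given above:

  # Pad ring inductive spike and worst case IR drop.
  L_PIN = 5e-9            # inductance per package pin, henries
  PINS = 7                # power pins per supply
  PADS = 33               # output pads switching together
  DI_DT = 45e-3 / 5e-9    # ground-side di/dt per pad, amps per second

  spike = (L_PIN / PINS) * (PADS * DI_DT)
  print(f"ground spike: {spike:.3f} V")    # 0.212

  r_bus = 0.06 * 3.6e-3 / 60e-6            # 3.6 mm of 60 micron bus, ohms
  drop = 8 * 45e-3 * r_bus                 # 8 output pads along the way
  print(f"worst IR drop: {drop:.2f} V")    # 1.30 (1.29 in the text)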
Final Comments
The preceding analysis totally ignores electromigration because I am assuming that the huge buses that the IR drop calculations call for will exceed the bus widths called for by electromigration. The tedious but straightforward nature of the analysis cries out for automation. The assumed noise margins of 0.75 volts in the internal ground plane and 1 volt in the internal Vdd plane were pulled more or less out of thin air, and no analysis of the impact on performance has been done. I believe it is true that almost any amount of bounce in the Vdd plane, or up to 2 volts of bounce in the ground plane, can be tolerated without losing data, although the performance will suffer. More than 2 volts of bounce in the ground plane will cause charge to be dumped into unselected words of the RAM and cause their state to be lost.
It is probably true that there should be multiple internal power pads for a design such as this. Splitting the external power pads into several sections should not harm anything, and that would allow the internal pads to be spaced around the chip, thereby reducing the ridiculously large, although possibly still inadequate, power buses that were used in this design. Since the inductive rise is so small, fewer pads for the output drivers would be ok anyway.
Testing Notes
2/1/85
The RAM is due out of the processing line 2/7. The tester is up to the point that building scope loops is easy. It appears that most of the code and hardware needed to send long sequences of vectors is in place. Still have to build a refresh circuit and a trigger circuit. The trigger circuit should have go conditioned either by global hold or global hold bar. Should put some known working RAMs in the socket to perform a final test of the test setup prior to testing real circuits. Should also get rid of the 1V undershoot on the channel chip outputs.
An interesting set of numbers for taps are: 0, 60, 25, 7, 35, 2, 53, 52, 3, 1, 56.
2/19/85
Silicided single level parts were delivered Friday, 15 February.
The tester was brought up to the point of testing a Hitachi 6116 2Kx8 static RAM a week or two ago. One problem still remains with the testing of that part. If the default address that is supplied when no read or write cycles occur is set to zero then spurious errors occur in some of the bits. If the default address is all ones then no errors occur.
They are having trouble bonding CrossRAM parts because the balls at the end of the bonds are going beyond the design rule distance to unrelated circuitry. That might not be a problem with passivated parts.
Optical inspection reveals the following:
Part  Fault
A     40 bonded to bus; appears that first bond wiped out bonding pad
B     11, 13, 16, 20 level shifter
      36 drive low
C     16 level shifter
      27 ground-Vdd metal 1 bridge
      30 Vdd
      3, 4 drive low
      bit 0 mux has metal 1 blob
D     44, 47, 48, 57, 58, 63 drive low
      6 drive low, gnd, drive low
E     47 drive low
A quality order is then A, E, B, D, C.
2/21/85
M. Overton rebuilt the power supply cable to the RAM. It had 3 volts of noise on both the Vdd and Gnd supplies. Unifying the ground plane with the wire wrap ground plane and bypassing the Vdd lines directly to that ground plane cleaned things up substantially.
Alternating address bits have their true and complement values reversed. The leftmost bit has the order (true, complement). The next has the order (complement, true). This requires a change to the data pattern and error bit map procedures.
The node which drives the word line buffers sinks rather rapidly when the access clock goes high and the word is not selected. Turning off the light had no effect. I suspect that the silicide leakage problem has struck again.
The precharge pad is noninverting. It directly feeds the p channel transistors which pull up the bit lines. This means precharge is asserted low rather than high so I need to flip the test program about.
2/22/85
Silicide single level metal chips.
Chip  Failure
A     column bits 12 and 22, all of bit 17
B     all of bit 17, plus 4 columns and one row
E     two column failures, a bit failure (17?) and a partial word line failure
The mystery of the day is why a gray pattern appears over several passes. There are random single bit failures in several, if not all, chips.
During testing of the poly chips something shorted and started melting the socket. C did the melting; D felt hot, so I removed it quickly. Then I inserted A; it appeared to work the same as when I first tested it, then, just as I was about to remove it, it became very hot and started getting lots of errors. Is this latchup in action?
2/25/85
Should put the device under test on a separate power supply with a current limit just above what the chip nominally takes. Need to slow down the tester: replace the crystal with a pulse generator and/or make each RAM cycle take multiple tester cycles by changing the CycleRAM code. Need to confirm the address mapping from silicon to bit map.
3/5/85
DLM/Polycide A, B and C appear to have shorts; the data line gets clamped and the inputs are degraded. DLM/Poly A shows no signs of life but is not shorted. B gets the scope loop right, but the error map appears to have working bits only where the DLM word lines strap to the poly. C almost gets the scope loop right, but not quite.