CSL Notebook Topic

To: Cutters and Bakers
Date: June 28, 1985
From: Rick Barth
Location: PARC
Subject: CrossRAM Design
Organization: CSL

XEROX

Release as: [Indigo]Documentation>CrossRAM.tioga, .press
Came from: [Indigo]Documentation>CrossRAM.tioga
Last edited by: Rick Barth, June 28, 1985 1:43:59 pm PDT

Abstract

CrossRAM is a RAM designed as a technology evaluation vehicle for both the processing line and the design aids.

Introduction

The Dragon computer system being developed in CSL assumes the existence of a 2µm processing capability and the design aids to map logic into masks. A test vehicle is needed to ensure these assumptions are true rather than waiting for all of the design work to be completed on some major Dragon chip. The process which is assumed has the following untested technology features:

* 2µm CMOS
* Double level metal
* Silicided poly
* Buried contacts

Each of these has been tested separately; CrossRAM will test them together. In addition it will have a large number of features in order to test repeatability and yield of the process.

The design aids have not had to deal with a design that has such a large number of features. CrossRAM undoubtedly will stress them beyond capacity in some areas, and indeed already has.

This document has a number of purposes. First, it is a mechanism for recording the design process so that a design review can easily check that all the proper design procedures were followed correctly. Second, since the design is logically simple, it provides a tutorial for new designers who know how to design logic but need to learn the procedures for implementing logic in a custom MOS circuit. Last, it is hoped that the technology people will be using this design for technology evaluation, and it would be useful for them to have some understanding of the design process, especially where tradeoffs were made due to technology restrictions.

General Description

The RAM has a 33 bit data bus, intended to be used for a 32 bit data word plus parity. It has an 11 bit address bus; the 3 low order bits select a word within a row of the RAM and the 8 high order bits select which row of the RAM is accessed. There are 4 control pins: Read, Write, Access and Precharge. Read and Write should be exclusive, i.e. never asserted at the same time, as should Access and Precharge. The Write pin controls the drivers which swing the bit lines within the RAM. Precharge forces the bit lines high. Since there is no interlock within the chip to prevent Precharge and Write from going high at the same time, care should be taken externally to ensure that they don't.

Cell Design

The following descriptions assume that you have the logic diagrams, Thyme input files, output plots and CrossRAM.dale available. There are a number of cryptic statements within each section that describe the logic page numbers, simulation file names and layout cell names of interest at that point in the description. Unfortunately the logic hierarchy does not match the layout hierarchy. The descriptions refer to the logic hierarchy since it is somewhat cleaner and easier to follow than the layout hierarchy.

RAM

Logic: CrossRAM - RAM, p. 5

The RAM cell contains the bit array and the precharge circuit.

Cell: PRECHARGESLICE

The precharge circuit is just a set of transistors that restores the bit lines to the power supply between each cycle.

Cell: DUALRAMBIT

The RAM bit cell is a 2147H static RAM cell (more or less) with the pullups removed. It would be nice if the RAM cell could be made static, but with the technology we currently have that would cost about a factor of 2 in density. A high resistivity poly layer or buried contacts on both sides of the well would make the price of static memory less.

The layout of the RAM cell determines the layout and circuit design of the rest of the circuit.
There are several factors to consider while designing this cell:

* The width of the cell must allow for at least three metal lines since bit, nbit and ground must be routed through the cell.
* The speed of the RAM will be determined by the ratio of drive capability provided by the access transistors to the capacitance of the bit lines.
* The storage capability of the cell, in terms of the time allowed between refresh cycles and its soft error rate due to alpha particle hits, is related to the capacitance of the gates of the flip flop.
* The amount of intercell noise coupling will be most directly related to the quality of the ground line.

First let's consider the access transistor size. Most of the bit line capacitance comes from the diffusion capacitance associated with the bit line metal to diffusion contact and the access transistor channel extension. Minimizing bit line capacitance requires making the access transistor as small as possible. Maximizing drive capability requires making it as large as possible. Since each word on a bit line requires an access transistor, making the transistor n times larger only increases the drive capability by n while increasing the loading by the number of words times n, in this case 256n. This argues for making the transistor as small as possible. However there is another limitation, and that is the size of the metal to diffusion contact to which the access transistor is connected. Making the access transistor smaller than such a contact does not improve the capacitance situation much but does decrease the drive capability significantly. Thus the size of the access transistors is matched to the size of a metal-diffusion contact.

Next consider the flip flop transistor sizes. This is easy since the cell gets more drive capability and storage capability as the size of these transistors grows. Thus they should be made as big as possible while still achieving whatever density goals have been set for the RAM.
Best information available suggests that an alpha particle hit generates about 1 million electron hole pairs. The current capacitor in the cell only stores about half that number, so the soft error rate will depend on how efficiently the cell collects the charge generated by an alpha hit. The error rate will be a measured parameter after building the chip since I do not know any way to model the collection efficiency.

Discussion of providing a good ground plane is deferred to the section on power bus sizing.

The cell layout gives the following capacitances:

total bit line capacitance in picofarads = number of cell rows*((diffusion area in square microns*diffusion capacitance in picofarads per square micron) + (diffusion side wall in microns*diffusion capacitance in picofarads per micron) + (metal over poly or diffusion*metal capacitance in picofarads per square micron))

2.44 pF = 128.0*((24*3.0E-4) + (20*1.9E-4) + (168*0.48E-4))

total word line capacitance in picofarads = number of cell columns*((poly over field area in square microns*poly over field capacitance in picofarads per square micron) + (gate poly area in square microns*gate poly capacitance in picofarads per square micron) + (metal 2 over metal area in square microns*metal 2 over metal capacitance in picofarads per square micron) + (metal 2 over poly area in square microns*metal 2 over poly capacitance in picofarads per square micron) + (metal 2 over field area in square microns*metal 2 over field capacitance in picofarads per square micron))

6.71 pF = 264.0*((26*0.77E-4) + (16*12.4E-4) + (16*3*0.48E-4) + (32*0.25E-4) + (36*0.14E-4))

These capacitances will determine the driver sizes in the decoder and data buffer as well as the overall speed of the RAM.

Thyme: CrossRAMBitLine.thy

From this simulation we can see that the precharge transistors are quite adequate. The two currents indicate what the peak currents will be. They will be useful later for determining the power bus widths.
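The two capacitance sums above follow one pattern: cell count times the sum of each geometry term multiplied by its unit capacitance. A minimal Python sketch of that arithmetic follows, using only the figures quoted in the memo. The constant and function names are illustrative and do not come from the design files.

```python
# Unit capacitances quoted in the memo (pF per square micron,
# except the diffusion side wall, which is pF per micron).
DIFF_AREA = 3.0e-4
DIFF_SIDEWALL = 1.9e-4
METAL_OVER_POLY_OR_DIFF = 0.48e-4
POLY_OVER_FIELD = 0.77e-4
GATE_POLY = 12.4e-4
METAL2_OVER_METAL = 0.48e-4
METAL2_OVER_POLY = 0.25e-4
METAL2_OVER_FIELD = 0.14e-4


def line_capacitance_pf(cells, terms):
    """Return cells * sum(geometry * unit capacitance), in picofarads."""
    return cells * sum(size * unit_cap for size, unit_cap in terms)


# Bit line: 128 cell rows.
bit_line_pf = line_capacitance_pf(128, [
    (24, DIFF_AREA),
    (20, DIFF_SIDEWALL),
    (168, METAL_OVER_POLY_OR_DIFF),
])

# Word line: 264 cell columns.
word_line_pf = line_capacitance_pf(264, [
    (26, POLY_OVER_FIELD),
    (16, GATE_POLY),
    (16 * 3, METAL2_OVER_METAL),
    (32, METAL2_OVER_POLY),
    (36, METAL2_OVER_FIELD),
])

print(f"bit line:  {bit_line_pf:.3f} pF")
print(f"word line: {word_line_pf:.3f} pF")
```

The unrounded word line result is slightly above the memo's 6.71 pF, which appears to be truncated rather than rounded.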
The simulations for worst case power supply current should really be run at zero degrees Celsius and with fast transistor parameters, but since no such data exist for the process yet we shall not bother with these additional simulations. If we use midpoint values to calculate delays then the access time of the array itself, measured from midpoint of the word line to midpoint of the bit line, is about 20ns. Also we can see that the state nodes in the RAM cell bounce about a bit during the access but they end up at the same point that they started at. The initial condition used for the high state bit was selected by successive approximation. Alternatively one could simply run the simulation for a number of cycles and determine the steady state value from that.

Decoder

Logic: CrossRAM - Decoder, p. 6

Cells: DECODERZERO, DECLOUIS, DECODERDRIVER, DECODERONE, DECODERLOGIC, DECODEROW

The decoder consists of:

* a static NAnd gate that decodes the high order 7 bits of address
* a single level of tree decoding that decodes bit 7 of the address
* a "domino" gate that combines the decoder output and the access clock to produce the word line.

The tree decoding was used to make it easy to pitch match the decoder to the RAM.

Thyme: CrossRAMPairDecoder.thy

This is a simulation of the NAnd gate. The simulation says that the output pretty much tracks the address inputs when the address has a 10ns edge. One amusing thing to worry about is the intermediate nodes in the stack of N channel transistors. Since there are generally no restrictions on how the address lines may flap prior to stabilizing at the setup time before the access clock, it is possible to capacitively pump these nodes to voltages that inject charge into the substrate. There are guard bands around the NAnd gate to protect against latchup due to this.

Thyme: CrossRAMWordLine.thy

This simulation determines the size of the word line driver transistors.
I merely attempted to keep the slope of the output waveform equal to the slope of the input waveform.

Data Buffer

Logic: CrossRAM - Data Buffer, p. 7

Cells: TRISTATEBUFFER, RAMDECODERSLICE, RAMDECODERCHUNK

The data buffer consists of a multiplexor and three tristate buffers. Two of the buffers are used to drive true and complement data onto the bit lines during a write cycle. The other is used to drive data back to the pads during a read cycle. The multiplexor selects one word out of the eight words in a RAM row. Since the access transistors in the RAM cell are only N channel, the multiplexor need only have N channel transistors rather than both N and P channel. However there really should be a precharge transistor on the nodes between the multiplexor and the write buffers so that during a read cycle the buffer which sends data to the pads sees a good one rather than a voltage which is one threshold below the power supply.

Thyme: CrossRAMWrite.thy

The sizes of the tristate buffer transistors and the multiplexor transistor are determined from this simulation. The inverter size was simply chosen to be the minimum. This is also where the worst case ground current occurs.

Pads

Logic: CrossRAM - Pads, p. 4

Cells: BUFFERPAD, TRISTATEBUFFER, INPUTPAD, BIDIRECTIONALPAD

There are three different pads that handle logic signals. All of them have the protection devices and level shifter shown in Basic Pad. Input Pad merely buffers the signal one more stage. Differential Input Pad supplies both true and complement forms of the input signal. The Bidirectional Pad has two levels of output amplification and a single stage of amplification from the level shifter to the internal data bus.

Thyme: CrossRAMInput.thy

The input simulation is used to determine what ratio is adequate to shift TTL levels to CMOS levels. It also determines the size of the buffer needed to drive the internal data bus.
The capacitance of the internal data bus is:

0.915 pF = 3*(4352+6048+128)*0.29E-4

Thyme: CrossRAMOutput.thy

This simulation is used to decide how large the output transistors should be. It is also used for determining how large the power bus that runs around the pad ring needs to be.

Address Drivers

Logic: CrossRAM - Address Drivers, p. 3

Cells: DEMUXINVERTER, AND3, DECODERDRIVER

The low order 3 bits are inverted and then demultiplexed into eight separate select lines that run the word select multiplexor in the data buffer. The other 8 bits are buffered and sent in differential form into the decoder.

Thyme: CrossRAMAddress.thy

The capacitance of an address line is:

total address line capacitance in picofarads = number of cell rows*(each address line is used in only half of the cells*((poly over field area in square microns*poly over field capacitance in picofarads per square micron) + (gate poly area in square microns*gate poly capacitance in picofarads per square micron)) + (metal over poly or diffusion*metal capacitance in picofarads per square micron))

2.59 pF = 128.0*(0.5*((44*0.77E-4) + (20*12.4E-4)) + (3*43*0.48E-4))

Thyme: CrossRAMAnd3.thy

The capacitance of a select line is:

total select line capacitance in picofarads = number of data bits*((poly over field area in square microns*poly over field capacitance in picofarads per square micron) + (gate poly area in square microns*gate poly capacitance in picofarads per square micron) + (metal over field*metal capacitance in picofarads per square micron))

3.13 pF = 33.0*(((2*(47+9+2)+16)*0.77E-4) + (56*12.4E-4) + (3*176*0.29E-4))

The capacitance of an address inverter line is:

total address inverter line capacitance in picofarads = (number of And gates*((poly over field area in square microns*poly over field capacitance in picofarads per square micron) + (gate poly area in square microns*gate poly capacitance in picofarads per square micron))) + (metal over field*metal capacitance in picofarads per square micron)

0.387 pF = (4*((138*0.77E-4) + (64*12.4E-4))) + (3*315*0.29E-4)

Power Bus Design

Electromigration and voltage drop due to peak current flow through resistors and inductors must be considered when designing the power buses. The only inductance that is considered here is the package lead inductance.

Electromigration limits are computed from power dissipated by the chip at its nominal frequency of operation. This is given by 0.5*C*V^2*f. Dividing by V gives the average current as 0.5*C*V*f. The design rule calls for no more than 1.5mA per micron of metal line width.

To compute voltage drops in the power bus due to peak current flow, the power bus is modeled as a tree that is rooted at the pad and branches at each major block shown on the floor plan. Each node in the tree represents a junction of lower level power buses. Each link in the tree represents the metal required to join one junction to the next. The voltage drop in each link of the tree is the sum of the currents required by each of the blocks that are leaves of the tree beneath the link, multiplied by the resistance of the link. The maximum voltage allowed from any point in a bus to "real" ground, i.e. the voltage of the pin where it connects to the board, is arbitrarily chosen to be 0.75 volt. The trees for this chip are very simple.

Logic: CrossRAM - Floor & Power Plan, p. 8

The capacitance figures given in the following sections are the equivalent amount of capacitance that swings every cycle.

RAM

The size of a ground line through the array was fixed by the cell layout. Only major problems should be allowed to perturb the size of this line since it impacts the overall size of the chip so significantly.

Peak ground current: 0.4mA
Resistance: 82.5 ohms = 5.5E-3/4E-6 * .06
VArray: 0.033 = 82.5 * 0.4E-3

This voltage must be deducted from the 0.75 volt budget.
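The array ground line figures and the electromigration rule above reduce to a few lines of arithmetic. The Python sketch below reproduces the 82.5 ohm and 0.033 volt values from the memo. The electromigration example at the end uses a 5 V supply and a 10 MHz operating frequency purely as placeholders, since the memo does not state either number; the function names are also illustrative.

```python
R_METAL_PER_SQUARE = 0.06        # ohms per square, from the memo
MAX_CURRENT_PER_MICRON = 1.5e-3  # amperes per micron of width, design rule


def line_resistance(length_m, width_m):
    """Resistance of a metal line: number of squares times sheet resistance."""
    return (length_m / width_m) * R_METAL_PER_SQUARE


def average_current(c_farads, v_volts, f_hertz):
    """Average supply current, 0.5*C*V*f, used for the electromigration limit."""
    return 0.5 * c_farads * v_volts * f_hertz


def min_width_microns(i_average):
    """Smallest line width that satisfies the 1.5 mA per micron rule."""
    return i_average / MAX_CURRENT_PER_MICRON


# Array ground line: 5.5 mm long, 4 microns wide, 0.4 mA peak current.
r_array = line_resistance(5.5e-3, 4e-6)
v_array = r_array * 0.4e-3
print(f"R = {r_array:.1f} ohms, VArray = {v_array:.3f} V")

# Hypothetical electromigration check: 644 pF swinging every cycle,
# ASSUMED 5 V supply and ASSUMED 10 MHz cycle rate (not memo values).
i_avg = average_current(644e-12, 5.0, 10e6)
print(f"average current {i_avg * 1e3:.1f} mA -> "
      f"minimum width {min_width_microns(i_avg):.1f} microns")
```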
The figures required for the normal bus sizing algorithm are:

IG RAM: 0.105 = 0.4E-3*264
IV Precharge: 0.158 = (1.2E-3*264)/2.0
Capacitance: 644 pF = 264.0 * 2.44

There are 2 bit lines per bit but only half of the bit lines swing on each cycle. The Vdd current is for the precharge block.

Decoder

IG Decoder: 4.5E-3
IV Decoder: 4.25E-3
Capacitance: 6.71 pF

These figures deal with the word line only. It is assumed that the decoder logic does not consume a substantial amount of current.

Data Buffer

IG Data Buffer: 0.0495 = 1.5E-3*33
IV Data Buffer: 0.0693 = 2.1E-3*33
Capacitance: 80.5 pF = 2.44 * 33.0

Address Drivers

IG Address Drivers: 0.044 A = (3.0E-3*8) + (2.5E-3*8)
IV Address Drivers: 0.044 A = (3.0E-3*8) + (2.5E-3*8)
Capacitance: 59.9 pF = (4.40 * 8.0) + (3.09 * 8.0)

Internal Ground Bus Width Computation

The lengths of the ground buses, in microns, are:

LG1: 200
LG Address Drivers: 350
LG2: 400
LG Data Buffer: 5800
LG3: 100
LG Decoder: 5700
LG RAM: 6000

The currents through the branches of the ground network are:

IG3 = IG Decoder + IG RAM
0.109 = 4.5E-3 + 0.105
IG2 = IG3 + IG Data Buffer
0.158 = 0.109 + 0.0495
IG1 = IG2 + IG Address Drivers
0.202 = 0.158 + 0.044

The voltage change from the pc board to the bonding pad due to inductance is:

Inductance of pad is about 5nH.
di/dt in a single array ground line is 4E4 A/s = 0.4E-3 / 10E-9
di/dt in whole array is then 1.056e+7 A/s = 264 * 4E4
L di/dt is then 52.8E-3 = 5E-9 * 1.056e+7

Given the currents in the segments and the segment lengths, we can pick segment widths such that the maximum voltage change is less than 0.75 volt. Assume that the voltage drop per unit length of wire should be constant. Then ΔVsegment = a * Lsegment.
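The branch currents and the inductive drop tabulated above can be checked with a short Python sketch. It sums each trunk segment's current from the blocks downstream of it and multiplies the pad inductance by the array's di/dt. All input figures come from the memo; the variable names are illustrative. Because the sketch carries unrounded values, the trunk currents come out a few parts per thousand above the memo's truncated figures.

```python
# Peak ground current drawn by each block, in amperes (memo figures).
block_current = {
    "RAM": 0.4e-3 * 264,
    "Decoder": 4.5e-3,
    "Data Buffer": 1.5e-3 * 33,
    "Address Drivers": (3.0e-3 * 8) + (2.5e-3 * 8),
}

# Each trunk segment carries the sum of all blocks downstream of it.
i_g3 = block_current["Decoder"] + block_current["RAM"]
i_g2 = i_g3 + block_current["Data Buffer"]
i_g1 = i_g2 + block_current["Address Drivers"]

# Inductive drop between the pc board and the bonding pad.
PAD_INDUCTANCE = 5e-9   # henries
EDGE_TIME = 10e-9       # seconds, from the 0.4 mA / 10 ns figure
di_dt_line = 0.4e-3 / EDGE_TIME
di_dt_array = 264 * di_dt_line
v_inductor = PAD_INDUCTANCE * di_dt_array

print(f"IG3 = {i_g3:.4f} A, IG2 = {i_g2:.4f} A, IG1 = {i_g1:.4f} A")
print(f"L di/dt = {v_inductor * 1e3:.1f} mV")
```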

Vpad = Vinductor + IG1*R(package+bonding wire)
VG1 = Vpad + a*LG1
VG2 = VG1 + a*LG2
VG3 = VG2 + a*LG3
VG Address Drivers = VG1 + a*LG Address Drivers
VG Data Buffer = VG2 + a*LG Data Buffer
VG Decoder = VG3 + a*LG Decoder
VG RAM = VG3 + a*LG RAM + VArray

We want the maximum of [VAddress Drivers, VData Buffer, VDecoder, VRAM] to be equal to 0.75 volts. This is easy if we just assume that one of these, e.g. VRAM, is the worst case, use it to compute a and then just check the others.

VG RAM = Vinductor + IG1*R(package+bonding wire) + a*LG1 + a*LG2 + a*LG3 + a*LG RAM + VArray
a = (VG RAM - Vinductor - IG1*R(package+bonding wire) - VArray) / (LG1 + LG2 + LG3 + LG RAM)
a = (0.75 - 52.8E-3 - (0.202*0.8) - 0.033) / (200 + 400 + 100 + 6000)
a = 7.501493e-5

ΔVG RAM = (IG RAM * LG RAM * R(metal per square)) / WG RAM
a*LG RAM = (IG RAM * LG RAM * R(metal per square)) / WG RAM

WG RAM = (IG RAM * R(metal per square)) / a
83.9 = (0.105 * 0.06) / 7.501493e-5
WG Decoder = (IG Decoder * R(metal per square)) / a
3.59 = (4.5E-3 * 0.06) / 7.501493e-5
WG Data Buffer = (IG Data Buffer * R(metal per square)) / a
39.5 = (0.0495 * 0.06) / 7.501493e-5
WG Address Drivers = (IG Address Drivers * R(metal per square)) / a
35.1 = (0.044 * 0.06) / 7.501493e-5
WG3 = (IG3 * R(metal per square)) / a
87.1 = (0.109 * 0.06) / 7.501493e-5
WG2 = (IG2 * R(metal per square)) / a
126 = (0.158 * 0.06) / 7.501493e-5
WG1 = (IG1 * R(metal per square)) / a
161 = (0.202 * 0.06) / 7.501493e-5

The layout would be better if WG1-3 were all 96. If they were, what would WG RAM have to be?

VG RAM = Vinductor + IG1*R(package+bonding wire) + R(metal per square) * (IG1*LG1/WG1-3 + IG2*LG2/WG1-3 + IG3*LG3/WG1-3 + IG RAM*LG RAM/WG RAM) + VArray
WG RAM = IG RAM*LG RAM / (((VG RAM - Vinductor - IG1*R(package+bonding wire) - VArray) / R(metal per square)) - (IG1*LG1/WG1-3 + IG2*LG2/WG1-3 + IG3*LG3/WG1-3))
87.6 = 0.105*6000.0/(((0.75 - 52.8E-3 - (0.202*0.8) - 0.033) / 0.06) - ((0.202*200.0/96.0) + (0.158*400.0/96.0) + (0.109*100.0/96.0)))

Now what should the width of the other ground lines be?
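The sizing done so far can be reproduced with a brief Python sketch. It takes the memo's rounded currents as inputs, computes the uniform drop per micron (the memo's a) along the pad-to-RAM path, derives each width as current times sheet resistance divided by a, and then recomputes the RAM line width for a fixed 96 micron trunk. The 0.8 ohm package plus bonding wire resistance is inferred from the 0.202*0.8 term in the memo's arithmetic; the names are illustrative.

```python
R_SQ = 0.06            # ohms per square
V_BUDGET = 0.75        # volts allowed from any bus point to board ground
V_INDUCTOR = 52.8e-3   # volts, L di/dt at the pad
R_PACKAGE = 0.8        # ohms, inferred from the memo's (0.202*0.8) term
V_ARRAY = 0.033        # volts dropped in the array ground line itself

# Memo's rounded peak currents (A) and segment lengths (microns).
current = {"G1": 0.202, "G2": 0.158, "G3": 0.109, "RAM": 0.105,
           "Decoder": 4.5e-3, "Data Buffer": 0.0495,
           "Address Drivers": 0.044}
length = {"G1": 200.0, "G2": 400.0, "G3": 100.0, "RAM": 6000.0}

# Voltage left for metal IR drop on the worst-case (RAM) path.
usable = V_BUDGET - V_INDUCTOR - current["G1"] * R_PACKAGE - V_ARRAY

# Uniform drop per micron along pad -> G1 -> G2 -> G3 -> RAM.
a = usable / sum(length[s] for s in ("G1", "G2", "G3", "RAM"))

# With constant drop per unit length, W = I * R_SQ / a for every segment.
width = {name: i * R_SQ / a for name, i in current.items()}


def ram_width_with_fixed_trunk(trunk_width):
    """RAM ground line width when G1..G3 all share one fixed width."""
    trunk_squares = sum(current[s] * length[s] / trunk_width
                        for s in ("G1", "G2", "G3"))
    return current["RAM"] * length["RAM"] / (usable / R_SQ - trunk_squares)


print(f"a = {a:.6e} V per micron")
for name, w in width.items():
    print(f"W {name}: {w:.1f} microns")
print(f"W RAM with 96 micron trunk: {ram_width_with_fixed_trunk(96.0):.1f}")
```

The printed widths agree with the memo to within its truncation of the last digit.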

VG1 = Vinductor + IG1*R(package+bonding wire) + R(metal per square)*IG1*LG1/WG1-3
0.239 = 52.8E-3 + (0.202*0.8) + 0.06*0.202*200.0/96.0
VG Address Drivers = VG1 + R(metal per square) * IG Address Drivers * LG Address Drivers / WG Address Drivers
WG Address Drivers = R(metal per square) * IG Address Drivers * LG Address Drivers / (VG Address Drivers - VG1)
1.80 = 0.06 * 0.044 * 350 / (0.75 - 0.239)

VG2 = VG1 + R(metal per square)*IG2*LG2/WG1-3
0.278 = 0.239 + 0.06 * 0.158 * 400.0 / 96.0
WG Data Buffer = R(metal per square) * IG Data Buffer * LG Data Buffer / (VG Data Buffer - VG2)
36.4 = 0.06 * 0.0495 * 5800 / (0.75 - 0.278)

VG3 = VG2 + R(metal per square)*IG3*LG3/WG1-3
2.848125e-1 = 0.278 + 0.06 * 0.109 * 100.0 / 96.0
WG Decoder = R(metal per square) * IG Decoder * LG Decoder / (VG Decoder - VG3)
3.30 = 0.06 * 4.5E-3 * 5700 / (0.75 - 2.848125e-1)

From the big change in the Address Driver width we can see that applying the assumption of uniform voltage drop per unit length to different paths of the tree leads to buses that are much too wide for the shorter stubs.

Internal Vdd Bus Width Computation

The lengths of the Vdd buses, in microns, are:

LV1: 25
LV Address Drivers: 350
LV2: 6600
LV Decoder: 5500
LV RAM: 5800
LV3: 5800
LV Data Buffer: 5800

The currents through the branches of the Vdd network are:

IV3 = IV Data Buffer
0.0693 = 0.0693
IV RAM = IV3 + IV Precharge
0.227 = 0.0693 + 0.158
IV2 = IV RAM + IV Decoder
0.231 = 0.227 + 4.25E-3
IV1 = IV2 + IV Address Drivers
0.275 = 0.231 + 0.044

Assume that the voltage rise due to inductance is the same as for the ground plane, i.e. 52.8E-3.
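For reference, the ground stub widths derived earlier in this section, with the trunk fixed at 96 microns, can be re-derived by walking down the trunk one node at a time and giving each stub whatever voltage budget remains at its node. The Python sketch below does this with the memo's currents and lengths. It carries unrounded node voltages, so its results differ from the memo's 1.80, 36.4 and 3.30 in the last digit or so. The function names are illustrative.

```python
R_SQ = 0.06          # ohms per square
TRUNK_WIDTH = 96.0   # microns, shared by G1, G2 and G3
V_BUDGET = 0.75      # volts


def node_voltage(v_upstream, current, length, width=TRUNK_WIDTH):
    """Voltage at the far end of a trunk segment: upstream voltage plus IR drop."""
    return v_upstream + R_SQ * current * length / width


def stub_width(current, length, v_node):
    """Width that lets a stub use all of the budget left at its node."""
    return R_SQ * current * length / (V_BUDGET - v_node)


# Pad voltage: inductive drop plus IG1 through 0.8 ohm of package and bond wire.
v_pad = 52.8e-3 + 0.202 * 0.8

v_g1 = node_voltage(v_pad, 0.202, 200.0)
v_g2 = node_voltage(v_g1, 0.158, 400.0)
v_g3 = node_voltage(v_g2, 0.109, 100.0)

w_address = stub_width(0.044, 350.0, v_g1)
w_data = stub_width(0.0495, 5800.0, v_g2)
w_decoder = stub_width(4.5e-3, 5700.0, v_g3)

print(f"VG1 = {v_g1:.3f}, VG2 = {v_g2:.3f}, VG3 = {v_g3:.3f}")
print(f"W Address Drivers = {w_address:.2f}")
print(f"W Data Buffer     = {w_data:.1f}")
print(f"W Decoder         = {w_decoder:.2f}")
```

Sizing each stub against its own remaining budget, rather than against a single drop per unit length, is what shrinks the short Address Driver stub from about 35 microns to under 2.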

VV Data Buffer = Vinductor + IV1*R(package+bonding wire) + a*LV1 + a*LV2 + a*LV RAM + a*LV3 + a*LV Data Buffer
a = (VV Data Buffer - Vinductor - IV1*R(package+bonding wire)) / (LV1 + LV2 + LV RAM + LV3 + LV Data Buffer)
1.986264e-5 = (0.75 - 52.8E-3 - (0.275*0.8)) / (25 + 6600 + 5800 + 5800 + 5800)

WV3 = WV Data Buffer = (IV Data Buffer * R(metal per square)) / a
209 = (0.0693 * 0.06) / 1.986264e-5
WV RAM = (IV RAM * R(metal per square)) / a
685 = (0.227 * 0.06) / 1.986264e-5
WV2 = (IV2 * R(metal per square)) / a
697 = (0.231 * 0.06) / 1.986264e-5
WV1 = (IV1 * R(metal per square)) / a
830 = (0.275 * 0.06) / 1.986264e-5

These widths are unacceptably large; closer modeling is called for. First, it is known that the precharge transistors are not on at the same time as the data buffer write transistors or the decoder word line transistors. Also, the Vdd bus need not be held to as small a bounce as the ground bus; let's change its limit to 1.0 volt.

IV RAM = IV Precharge
0.158 = 0.158
IV2 = IV RAM
0.158 = 0.158
IV1 = IV2 + IV Address Drivers
0.202 = 0.158 + 0.044

VV Precharge = Vinductor + IV1*R(package+bonding wire) + a*LV1 + a*LV2 + a*LV RAM
a = (VV Precharge - Vinductor - IV1*R(package+bonding wire)) / (LV1 + LV2 + LV RAM)
6.322737e-5 = (1.0 - 52.8E-3 - (0.202*0.8)) / (25 + 6600 + 5800)

WV2 = WV Precharge = (IV Precharge * R(metal per square)) / a
149 = (0.158 * 0.06) / 6.322737e-5
WV1 = (IV1 * R(metal per square)) / a
191 = (0.202 * 0.06) / 6.322737e-5

From the layout it is also known that the width of V1 and V2 should be the same. Suppose it is 200. What should the width of the precharge line be?

VV Precharge = Vinductor + IV1*R(package+bonding wire) + R(metal per square) * (IV1*LV1/WV1-2 + IV2*LV2/WV1-2 + IV Precharge*LV Precharge/WV Precharge)
WV Precharge = (IV Precharge * LV Precharge) / (((VV Precharge - Vinductor - IV1*R(package+bonding wire)) / R(metal per square)) - (IV1*LV1/WV1-2 + IV2*LV2/WV1-2))
116 = (0.158 * 5800) / (((1.0 - 52.8E-3 - (0.202*0.8)) / 0.06) - (0.202 * 25/200 + 0.158 * 6600/200))

Assume that WV Precharge is actually 100. Now what should the width of the other Vdd lines be?

IV3 = IV Data Buffer
0.0693 = 0.0693
IV RAM = IV3
0.0693 = 0.0693
IV2 = IV RAM + IV Decoder
0.0735 = 0.0693 + 4.25E-3
IV1 = IV2
0.0735 = 0.0735

VV2 = Vinductor + IV1*R(package+bonding wire) + R(metal per square) * (IV1*LV1/WV1-2 + IV2*LV2/WV1-2)
2.576813e-1 = 52.8E-3 + (0.0735*0.8) + 0.06 * (0.0735 * 25/200 + 0.0735 * 6600/200)
WV Decoder = R(metal per square) * IV Decoder * LV Decoder / (VV Decoder - VV2)
1.88 = 0.06 * 4.25E-3 * 5500 / (1.0 - 2.576813e-1)

VV RAM = VV2 + R(metal per square) * (IV RAM*LV RAM/WV RAM)
4.988453e-1 = 2.576813e-1 + 0.06 * (0.0693 * 5800/100)
WV3 = WV Data Buffer = R(metal per square) * IV Data Buffer * (LV Data Buffer + LV3) / (VV Data Buffer - VV RAM)
63.2 = 0.06 * 0.0693 * (5500 + 5800) / (1.0 - 2.576813e-1)

Pad Power Bus Width Computation

Here we don't much care how far up the ground plane bounces nor how far down the Vdd plane bounces. We do care if they ring more than a diode threshold above the power supply or below ground, since that can inject charge into the substrate and cause latchup.

The output simulation indicates that each output pad requires a peak of 40mA through the Vdd plane and 45mA through the ground plane. The di/dt values are 40mA/5ns, or 8.0e+6 A/s, and 45mA/5ns, or 9.0e+6 A/s, respectively. Assume 5nH of inductance in a single pin and 7 power pins per supply. Then the ground plane will spike by 0.212 volts = (5e-9/7) * (33 * 9.0e+6). This seems just fine.

The maximum distance from any output pad to a Vdd or ground pad is 3.6mm. The width of these buses is 60µm.
This gives a resistance of 3.6 ohms = 0.06*3.6e-3/60e-6 and, at 45mA, a voltage drop of 1.29 volts = 8*45e-3*3.6, assuming that there are 8 output pads along the way.

Final Comments

The preceding analysis totally ignores electromigration because I am assuming that the huge buses that the IR drop calculations call for will exceed the bus widths called for because of electromigration.

The tedious but straightforward nature of the analysis cries out for automation.

The assumed noise margins of 0.75 volts in the internal ground plane and 1 volt in the internal Vdd plane were pulled more or less out of thin air, and no analysis of the impact on performance has been done. I believe it is true that almost any amount of bounce in the Vdd plane, or up to 2 volts of bounce in the ground plane, can be tolerated without losing data, although the performance will suffer. More than 2 volts of bounce in the ground plane will cause charge to be dumped into unselected words of the RAM and cause their state to be lost.

It is probably true that there should be multiple internal power pads for a design such as this. Splitting the external power pads into several sections should not harm anything, and that would allow the internal pads to be spaced around the chip, thereby reducing the ridiculously large, although possibly still inadequate, power buses that were used in this design. Since the inductive rise is so small, fewer pads for the output drivers would be ok anyway.

Design Aid Notes

10/17/84

Since I last updated this document I have run the chip through the Chipmonk design rule check and circuit extract program. Unfortunately it blows up after it has started extracting the main memory array. I ended up rotating the chip three times and checking it from each of the four edges. I hope that no errors were missed because of dangling nodes that might become connected.
Also, since the program could not extract the circuit, there is no way to run Crystal or to run the layout through a switch level simulator.

Testing Notes

2/1/85

The RAM is due out of the processing line 2/7. The tester is up to the point that building scope loops is easy. It appears that most of the code and hardware needed to send long sequences of vectors is in place. Still have to build a refresh circuit and a trigger circuit. The trigger circuit should have go conditioned either by global hold or global hold bar. Should put some known working RAMs in the socket to perform a final test of the test setup prior to testing real circuits. Should also get rid of the 1V undershoot on the channel chip outputs.

An interesting set of numbers for taps are: 0, 60, 25, 7, 35, 2, 53, 52, 3, 1, 56.

2/19/85

Silicided single level parts were delivered Friday, 15 February. The tester was brought up to the point of testing a Hitachi 6116 2Kx8 static RAM a week or two ago. One problem still remains with the testing of that part. If the default address that is supplied when no read or write cycles occur is set to zero then spurious errors occur in some of the bits. If the default address is all ones then no errors occur.

They are having trouble bonding CrossRAM parts because the balls at the end of the bonds are going beyond the design rule distance to unrelated circuitry. That might not be a problem with passivated parts.

Optical inspection reveals the following:

Part  Fault
A     40 bonded to bus, appears that first bond wiped out bonding pad
B     11, 13, 16, 20 level shifter
      36 drive low
C     16 level shifter
      27 ground-vdd metal 1 bridge
      30 vdd
      3, 4 drive low
      bit 0 mux has metal 1 blob
D     44, 47, 48, 57, 58, 63 drive low
      6 drive low, gnd, drive low
E     47 drive low

A quality order is then A, E, B, D, C.

2/21/85

M. Overton rebuilt the power supply cable to the RAM. It had 3 volts of noise on both the Vdd and Gnd supplies.
Unifying the ground plane with the wire wrap ground plane and bypassing the Vdd lines directly to that ground plane cleaned things up substantially.

Alternating address bits have their true and complement values reversed. The leftmost bit has the order (true, complement). The next has order (complement, true). This requires a change to the data pattern and error bit map procedures.

The node which drives the word line buffers sinks rather rapidly when the access clock goes high and the word is not selected. Turning off the light had no effect. I suspect that the silicide leakage problem has struck again.

The precharge pad is noninverting. It directly feeds the P channel transistors which pull up the bit lines. This means precharge is asserted low rather than high, so I need to flip the test program about.

2/22/85

Silicide single level metal chips.

Chip  Failure
A     column bits 12 and 22, all of bit 17
B     all of bit 17, plus 4 columns and one row
E     two column failures, a bit failure (17?) and a partial word line failure

The mystery of the day is why a gray pattern appears over several passes. There are random single bit failures in several, if not all, chips.

During testing of the poly chips something shorted and started melting the socket. C did the melting, and D felt hot so I removed it quickly. Then I inserted A. It appeared to work the same as when I first tested it; then, just as I was about to remove it, it became very hot and started getting lots of errors. Is this latchup in action?

2/25/85

Should put the device under test on a separate power supply with a current limit just above what the chip nominally takes.

Need to slow down the tester. Replace the crystal with a pulse generator and/or make each RAM cycle take multiple tester cycles by changing the CycleRAM code.

Need to confirm the address mapping from silicon to bit map.

3/5/85

DLM/Polycide A, B and C appear to have shorts. The data line gets clamped and the inputs are degraded.
DLM/Poly A shows no signs of life but is not shorted. B gets the scope loop right, but the error map appears to have working bits only where the DLM word lines strap to the poly. C almost gets the scope loop right, but not quite.

Version 2

General Description

It has been decided that the risks associated with a dynamic RAM are too large. Use of a static RAM makes buried contacts unnecessary. The second version of the RAM eliminates buried contacts and is half static. The dynamic portion has been kept to check for leakage problems in processing.

Cell Design

RAM

The RAM cell has been changed to the static cell developed by myself, Jeff Hoel and Pradeep Sindhu. It does not run ground vertically through the array, and the pullup transistors in the cell can cause additional ground current during an access cycle.

Pads

The output pad drivers in the original design do not ensure that the P and N channel output transistors do not come on at the same time. This can cause a nasty current spike. Jim Gasbarro changed the bidirectional pad so that the nand gate which drives the pullup pulls up faster than the nor gate driving the N channel can pull up, and the nor gate pulls down faster than the nand gate can.

Power Bus Design

RAM

The change in the RAM cell caused changes in the peak ground current and in the ground bus construction. The peak current is now .35mA instead of .3mA during a write and .45mA instead of .4mA during a read. However there is only one vertical ground line per 8 bit cells instead of one for every bit cell. The large ground bus width is clearly adequate since the number of bit columns has been more than cut in half and so the size of the capacitor which must be discharged has also been cut in half.

The bounce in the vertical ground line is now 0.297 = 82.5 * 8 * 0.45E-3, since the ground current of each cell has gone up and there are 8 cells per ground line.

The currents through the branches of the ground network are:

IG3 = IG Decoder + IG RAM
0.0585 = 4.5E-3 + 0.054
IG2 = IG3 + IG Data Buffer
0.108 = 0.0585 + 0.0495
IG1 = IG2 + IG Address Drivers
0.152 = 0.108 + 0.044

The voltage change from the pc board to the bonding pad due to inductance is:

Inductance of pad is about 5nH.
di/dt in a single array ground line is 45E3 A/s = 0.45E-3 / 10E-9
di/dt in whole array is then 5.4e+6 A/s = 120 * 45E3
L di/dt is then 0.027 = 5E-9 * 5.4e+6

WG RAM = IG RAM*LG RAM / (((VG RAM - Vinductor - IG1*R(package+bonding wire) - VArray) / R(metal per square)) - (IG1*LG1/WG1-3 + IG2*LG2/WG1-3 + IG3*LG3/WG1-3))
76.31197 = 0.054*6000.0/(((0.75 - 0.027 - (0.152*0.8) - 0.297) / 0.06) - ((0.152*200.0/96.0) + (0.108*400.0/96.0) + (0.0585*100.0/96.0)))

So things are wonderful: the ground plane should bounce even less than it did in the previous design.

To Be Done

Design

The tree decoder does not pass good highs to the word driver. Either add another P channel in parallel with the N or get rid of the tree decoding.

Testing

* Add capability to save and restore named patterns for scope loops.
* Add button that sorts error rows and columns by number of errors on them. Have number to print parameter. Print number of errors and row or column number.
* Walk down rows checking that bit map reflects geometry correctly.
* Build write/read error trigger circuit.
* Get a digital scope and track down where the gray pattern is coming from.
* Put error bit map on color display. Make read one as an error a color, read zero as an error another color.

Documentation

* Run Crystal.
* Post layout verification of simulation results: make a monster simulation showing the critical path through the whole chip, do a read and a write cycle, include the ground and Vdd plane resistors.
* Write the Rosemary structural description OR write a program to analyze the Sil drawings and produce the appropriate calls to RoseCreate.
* Write the Rosemary test program.