[_CD4_]<dragon7.0>SmallCache>OldStuff>OldSCTiming.tioga!1

SCTiming.tioga

Written by: Pradeep Sindhu, March 24, 1986 1:42:03 pm PST

Last Edited by:

Pradeep Sindhu, August 18, 1986 5:25:26 pm PDT

DRAGON SMALL CACHE TIMING MODEL AND ARRAY THYME SIMULATIONS

DRAGON PROJECT — FOR INTERNAL XEROX USE ONLY

Dragon Small Cache

Timing Model and Thyme Simulations

Release as [Indigo]<Dragon>SmallCache>Documentation>SCTiming.tioga, .press

Abstract: This document describes the timing model used in the small cache's design. It also contains the calculations, assumptions, and Thyme simulations needed to support this model.

XEROX Xerox Corporation
Palo Alto Research Center
3333 Coyote Hill Road
Palo Alto, California 94304

Contents

Introduction

Array Timing

Appendix A: CAM Bit Line Capacitance

Appendix B: CAM Match Line Capacitance

Appendix C: RAM Bit Line Capacitance

Introduction

This document derives the timing model used in the design of the small cache. It also contains the assumptions, calculations, and simulations on which the model is based.

Array Timing

We assume that the array CAM is two-ported to avoid P/M conflicts, while the RAM is single-ported. Single-porting the RAM drastically reduces the area without cutting down on performance, because M accesses to the RAM are relatively rare. Given this structure, there are two major alternatives for the timing of the array:

A. VCAM and RCAM both match on PhA, RAM read/write on PhB.

B. VCAM match on PhA, RCAM match on PhB, RAM read/write on PhB.

Alternative A suffers from two problems. First, since the RCAM and VCAM match lines will be discharged on the same phase, the peak currents will be twice those for alternative B. Second, giving priority to the M side for RAM accesses is more complicated: to prevent the P side from accessing the RAM when the M side wants it, we would need to gate the connection of VMatch to RAMSelect with nArrayRMatch. This signal is the OR of all the RMatch lines, and takes longer to compute than any of the match lines. Thus we end up putting extra computation in the critical path (from address available to RAMSelect), which is undesirable. It's possible to avoid this critical path by slowing down the M side accesses by one more cycle, as follows:

PhA RMatch, VMatch

PhB PSide gets RAM

PhA

PhB MSide gets RAM (one cycle later)

This is fine for a slow M-Bus, but unacceptable for a faster one.

We therefore choose alternative B, and the timing for the array will be as follows:

PhA VMatch, precharge RAM bit lines, precharge RCAM bit lines

PhB RMatch, read/write RAM, precharge VCAM bit lines
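
The peak-current argument against alternative A can be illustrated with a small sketch. It only counts how many CAMs discharge their match lines in each phase. The names `ALTERNATIVES` and `peak_match_discharges` are illustrative and not part of the design, and the sketch assumes that the RCAM and VCAM match lines draw comparable currents.

```python
# Hypothetical model: which CAMs discharge their match lines in each phase.
ALTERNATIVES = {
    "A": {"PhA": ["VCAM", "RCAM"], "PhB": []},
    "B": {"PhA": ["VCAM"], "PhB": ["RCAM"]},
}


def peak_match_discharges(schedule):
    """Return the largest number of CAMs matching in any single phase."""
    return max(len(cams) for cams in schedule.values())


if __name__ == "__main__":
    for name, schedule in ALTERNATIVES.items():
        print(name, peak_match_discharges(schedule))
```

Under the equal-current assumption, alternative A has two CAMs discharging in PhA while alternative B never has more than one per phase, which is the factor of two mentioned above.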

Appendix A: CAM Bit Line Capacitance

There are two contributions to the CAM bit line capacitance: a capacitance attached directly to the bit line, and a capacitance connected to the bit line via an on transistor. The directly attached capacitance is fixed, while the connected capacitance depends on the number of transistors in the array that happen to be on.

Directly Connected Capacitance (Per Cell) Cd

70μm x 4μm metal2 wire (0.01)*70/100=0.007pF

55.5μm² diffusion area (1.5 E-4)*55.5=0.008pF

38μm diffusion sidewall (3 E-4)*38=0.01pF

Total capacitance 0.025pF

Capacitance Connected Via 4/2 N-type Transistor (9K Ohms) Cc

50μm² diffusion area (1.5 E-4)*50=0.008pF

44μm diffusion sidewall (3 E-4)*50=0.015pF

9μm x 4μm metal1 wire (0.01)*9/100=0.001pF

54μm² poly area (0.78 E-4)*54=0.004pF

16μm² gate capacitance (11.8 E-4)*16=0.018pF

9μm² gate capacitance (on transistor) (11.8 E-4)*9=0.011pF

Total capacitance 0.057pF

Total Impedance for L Cache Lines

Let L be the number of cache lines and N be the number of 4/2 N-type transistors that are on. Then the total direct capacitance is L/2*Cd, the total connecting resistance is 9000/N, and the total connected capacitance is N*Cc. The number of on transistors N satisfies 0 ≤ N ≤ L/2. For the total direct capacitance, we're assuming a CAM-RAM-CAM layout, where the height of the array is L/2 times the height of a CAM cell (or L times the height of a RAM cell).

For 150 lines the figures are:

direct capacitance 1.9pF

connected capacitance 0 to 4.3pF

connection resistance ∞ to 120 Ohms
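
The formulas above can be checked with a short sketch. The function name `cam_bit_line_load` and its parameter names are hypothetical; the default values are the per-cell totals Cd and Cc and the 9K Ohm transistor resistance quoted in this appendix.

```python
def cam_bit_line_load(lines, on_transistors,
                      cd_pf=0.025, cc_pf=0.057, r_on_ohms=9000.0):
    """Return (direct pF, connected pF, connecting resistance in Ohms).

    Assumes the CAM-RAM-CAM layout described above, so a bit line
    spans lines/2 CAM cells and N is limited to 0 <= N <= lines/2.
    """
    if not 0 <= on_transistors <= lines / 2:
        raise ValueError("N must satisfy 0 <= N <= L/2")
    direct_pf = (lines / 2) * cd_pf
    connected_pf = on_transistors * cc_pf
    # With no transistor on, nothing is connected: infinite resistance.
    if on_transistors == 0:
        resistance_ohms = float("inf")
    else:
        resistance_ohms = r_on_ohms / on_transistors
    return direct_pf, connected_pf, resistance_ohms


if __name__ == "__main__":
    for n in (0, 75):
        print(n, cam_bit_line_load(150, n))
```

For 150 lines with all 75 transistors on, this gives roughly 1.9pF direct, 4.3pF connected, and 120 Ohms, in line with the figures above.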

Appendix B: CAM Match Line Capacitance

The match line runs through 32 cells to which it is connected and flies over 32 other cells.

Capacitance for Connected Cells

28μm x 3μm metal1 wire (0.01)*28/100=0.0028pF

64μm² diffusion area (1.5 E-4)*64=0.01pF

52μm diffusion sidewall (3 E-4)*52=0.016pF

Total capacitance 0.028pF

Capacitance for FlyOver Cells

28μm x 3μm metal1 wire (0.01)*28/100=0.0028pF

Total capacitance 0.0028pF

Total Capacitance for Match Line

Total capacitance is 32*0.028+32*0.0028pF=0.98pF.
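
As a quick check of this sum, here is a minimal sketch. The function name `match_line_capacitance_pf` and its parameters are hypothetical; the defaults are the cell counts and per-cell totals given in this appendix.

```python
def match_line_capacitance_pf(connected_cells=32, flyover_cells=32,
                              c_connected_pf=0.028, c_flyover_pf=0.0028):
    """Total match line capacitance in pF: connected plus flyover cells."""
    return connected_cells * c_connected_pf + flyover_cells * c_flyover_pf


if __name__ == "__main__":
    print(f"{match_line_capacitance_pf():.4f} pF")
```

With the default values the unrounded sum is 0.9856pF, which the text gives as 0.98pF.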

Appendix C: RAM Bit Line Capacitance

The computation below assumes that the piece of diffusion covering the via is removed to make the capacitance as small as possible. This increases the height of the cell by about a micron but decreases the capacitance by 30%.

Bit Line Capacitance (per cell) C

36.5μm x 4μm metal2 wire (0.01)*70/100=0.007pF

20μm² N-diffusion area (1.5 E-4)*20=0.003pF

18μm N-diffusion sidewall (3 E-4)*18=0.005pF

Total capacitance 0.015pF

Total Capacitance for L Cache Lines

The total capacitance is L*C.

For 150 lines the figure is:

bit line capacitance 2.3pF
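
The same scaling can be sketched in a few lines. The function name `ram_bit_line_capacitance_pf` is hypothetical; the default per-cell value is the total C from this appendix.

```python
def ram_bit_line_capacitance_pf(lines, c_cell_pf=0.015):
    """Total RAM bit line capacitance in pF for the given number of lines.

    Each cache line contributes one RAM cell to the bit line, so the
    total is simply L*C.
    """
    return lines * c_cell_pf


if __name__ == "__main__":
    print(f"{ram_bit_line_capacitance_pf(150):.2f} pF")
```

For 150 lines the unrounded product is 2.25pF, which the text quotes as 2.3pF.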