SCTiming.tioga
Written by: Pradeep Sindhu, March 24, 1986 1:42:03 pm PST
Last Edited by:
Pradeep Sindhu, August 18, 1986 5:25:26 pm PDT
DRAGON SMALL CACHE TIMING MODEL AND ARRAY THYME SIMULATIONS
DRAGON PROJECT — FOR INTERNAL XEROX USE ONLY
Dragon Small Cache
Timing Model and Thyme Simulations
Release as [Indigo]<Dragon>SmallCache>Documentation>SCTiming.tioga, .press
© Copyright 1985 Xerox Corporation. All rights reserved.
Abstract: This document describes the timing model used in the small cache's design. It also contains the calculations, assumptions, and Thyme simulations needed to support this model.
Xerox Corporation
Palo Alto Research Center
3333 Coyote Hill Road
Palo Alto, California 94304
For Internal Xerox Use Only
Contents
Introduction
Array Timing
Appendix A: CAM Bit Line Capacitance
Appendix B: CAM Match Line Capacitance
Appendix C: RAM Bit Line Capacitance
Introduction
This document derives the timing model used in the design of the small cache, together with the assumptions, calculations, and simulations behind it.
Array Timing
We assume that the array CAM is two-ported to avoid P/M conflicts, while the RAM is single-ported. Single-porting the RAM drastically reduces the area without cutting down on performance, because M accesses to the RAM are relatively rare. Given this structure, there are two major alternatives for the timing of the array:
A. VCAM and RCAM both match on PhA, RAM read/write on PhB.
B. VCAM match on PhA, RCAM match on PhB, RAM read/write on PhB.
Alternative A suffers from two problems. First, since the RCAM and VCAM match lines would be discharged on the same phase, the peak currents would be twice those of alternative B. Second, giving priority to the M side for RAM accesses is more complicated: to prevent the P side from accessing the RAM when the M side wants it, we would need to gate the connection of VMatch to RAMSelect with nArrayRMatch. This signal is the OR of all the RMatch lines and takes longer to compute than any single match line. We would thus put extra computation in the critical path (from address available to RAMSelect), which is undesirable. It's possible to avoid this critical path by slowing down the M side accesses by one more cycle, as follows:
PhA RMatch, VMatch
PhB PSide gets RAM
PhA
PhB MSide gets RAM (one cycle later)
This is fine for a slow M-Bus, but unacceptable for a faster one.
Thus the timing for the array will be as follows:
PhA VMatch, prech RAM bit lines, prech RCAM bit lines
PhB RMatch, read/write RAM, prech VCAM bit lines
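The peak-current argument against alternative A can be illustrated with a small sketch. It assumes, purely for illustration, that each CAM's match-line discharge contributes one equal unit of peak current; the dictionary layout and the function name are hypothetical and not part of the design.

```python
# Hypothetical encoding of the two timing alternatives (illustrative only).
ALTERNATIVES = {
    "A": {"PhA": ["VCAM match", "RCAM match"], "PhB": ["RAM read/write"]},
    "B": {"PhA": ["VCAM match"], "PhB": ["RCAM match", "RAM read/write"]},
}


def max_simultaneous_matches(schedule):
    """Return the largest number of CAM match operations sharing one phase."""
    return max(
        sum(1 for activity in activities if activity.endswith("match"))
        for activities in schedule.values()
    )


# Assumption: peak current scales with the number of CAMs matching at once.
peak = {name: max_simultaneous_matches(s) for name, s in ALTERNATIVES.items()}
print(peak)  # {'A': 2, 'B': 1}
```

Under that assumption alternative A discharges two sets of match lines in PhA, while the chosen timing never discharges more than one set per phase.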
Appendix A: CAM Bit Line Capacitance
There are two contributions to the CAM bit line capacitance: a capacitance attached directly to the bit line, and a capacitance connected to the bit line via an on transistor. The directly attached capacitance is fixed, while the connected capacitance depends on the number of transistors in the array that happen to be on.
Directly Connected Capacitance (Per Cell) Cd
70µm x 4µm metal2 wire        (0.01)*70/100     = 0.007pF
55.5µm² diffusion area        (1.5 E-4)*55.5    = 0.008pF
38µm diffusion sidewall       (3 E-4)*38        = 0.01pF
Total capacitance                                 0.025pF
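The per-cell entries above all follow the same pattern: a unit capacitance multiplied by a length or an area. A minimal sketch of that arithmetic follows. The coefficient values are those appearing in the formulas; the units given in the comments and all names are assumptions made for illustration.

```python
# Unit capacitances as they appear in the table's formulas.
# The units in these comments are assumed, not stated in the text.
METAL_PF_PER_100UM = 0.01       # pF per 100 µm of wire length
DIFF_AREA_PF_PER_UM2 = 1.5e-4   # pF per µm² of diffusion area
DIFF_SIDE_PF_PER_UM = 3e-4      # pF per µm of diffusion sidewall


def direct_cell_terms(wire_um=70.0, area_um2=55.5, sidewall_um=38.0):
    """Return the three per-cell contributions in pF, unrounded."""
    return {
        "metal2 wire": METAL_PF_PER_100UM * wire_um / 100,
        "diffusion area": DIFF_AREA_PF_PER_UM2 * area_um2,
        "diffusion sidewall": DIFF_SIDE_PF_PER_UM * sidewall_um,
    }


terms = direct_cell_terms()
for name, value in terms.items():
    print(f"{name:20s} {value:.4f} pF")
print(f"{'total':20s} {sum(terms.values()):.4f} pF")
```

The unrounded terms sum to slightly more than the 0.025pF total quoted above, which adds the rounded entries.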
Capacitance Connected Via 4/2 N-type Transistor (9K Ohms) Cc
50µm² diffusion area                   (1.5 E-4)*50    = 0.008pF
44µm diffusion sidewall                (3 E-4)*50      = 0.015pF
9µm x 4µm metal1 wire                  (0.01)*9/100    = 0.001pF
54µm² poly area                        (0.78 E-4)*54   = 0.004pF
16µm² gate capacitance                 (11.8 E-4)*16   = 0.018pF
9µm² gate capacitance (on transistor)  (11.8 E-4)*9    = 0.011pF
Total capacitance                                        0.057pF
Total Impedance for L Cache Lines
Let L be the number of cache lines and N be the number of 4/2 N-type transistors that are on. Then the total direct capacitance is (L/2)*Cd, the total connecting resistance is 9000/N Ohms, and the total connected capacitance is N*Cc. The number of on transistors N satisfies 0 ≤ N ≤ L/2. For the total direct capacitance, we're assuming a CAM-RAM-CAM layout, where the height of the array is L/2 times the height of a CAM cell (or L times the height of a RAM cell).
For 150 lines the figures are:
direct capacitance       1.9pF
connected capacitance    0 to 4.3pF
connection resistance    ∞ to 120 Ohms
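The 150-line figures can be reproduced with a short sketch of the lumped model described above. The per-cell values and the 9000 Ohm on-resistance come from this appendix; the function and constant names are hypothetical.

```python
CD_PF = 0.025        # direct capacitance per CAM cell (Cd)
CC_PF = 0.057        # connected capacitance per on transistor (Cc)
R_ON_OHMS = 9000.0   # on-resistance of one 4/2 N-type transistor


def cam_bit_line(lines, n_on):
    """Return (direct pF, connected pF, connecting Ohms) for a CAM bit line."""
    cells = lines / 2  # CAM-RAM-CAM layout: L/2 CAM cells per bit line
    if not 0 <= n_on <= cells:
        raise ValueError("n_on must satisfy 0 <= N <= L/2")
    direct = cells * CD_PF
    connected = n_on * CC_PF
    # With no transistor on, nothing is connected: infinite resistance.
    resistance = float("inf") if n_on == 0 else R_ON_OHMS / n_on
    return direct, connected, resistance


for n in (0, 75):
    d, c, r = cam_bit_line(150, n)
    print(f"N={n:2d}: direct={d:.3f}pF connected={c:.3f}pF resistance={r} Ohms")
```

For L = 150 the two extremes N = 0 and N = 75 give 1.875pF direct, 0 to 4.275pF connected, and a connecting resistance from infinity down to 120 Ohms, which round to the figures quoted above.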
Appendix B: CAM Match Line Capacitance
The match line runs through 32 cells where it is connected to something and flies over 32 other cells.
Capacitance for Connected Cells
28µm x 3µm metal1 wire       (0.01)*28/100   = 0.0028pF
64µm² diffusion area         (1.5 E-4)*64    = 0.01pF
52µm diffusion sidewall      (3 E-4)*52      = 0.016pF
Total capacitance                              0.028pF
Capacitance for FlyOver Cells
28µm x 3µm metal1 wire       (0.01)*28/100   = 0.0028pF
Total capacitance                              0.0028pF
Total Capacitance for Match Line
Total capacitance is 32*0.028pF + 32*0.0028pF = 0.98pF.
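As a quick check of the sum above, a minimal sketch using the two per-cell totals from this appendix; the names are hypothetical.

```python
CONNECTED_PF = 0.028   # per cell where the match line is connected
FLYOVER_PF = 0.0028    # per cell the match line only flies over


def match_line_capacitance(connected_cells=32, flyover_cells=32):
    """Return the total match line capacitance in pF."""
    return connected_cells * CONNECTED_PF + flyover_cells * FLYOVER_PF


print(f"{match_line_capacitance():.4f} pF")  # 0.9856 pF
```

The unrounded sum is 0.9856pF, which the text quotes as 0.98pF.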
Appendix C: RAM Bit Line Capacitance
The computation below assumes that the piece of diffusion covering the via is removed to make the capacitance as small as possible. This increases the height of the cell by about a micron but decreases the capacitance by 30%.
Bit Line Capacitance (per cell) C
36.5µm x 4µm metal2 wire     (0.01)*70/100   = 0.007pF
20µm² N-diffusion area       (1.5 E-4)*20    = 0.003pF
18µm N-diffusion sidewall    (3 E-4)*18      = 0.005pF
Total capacitance                              0.015pF
Total Capacitance for L Cache Lines
The total capacitance is L*C.
For 150 lines the figure is:
bit line capacitance 2.3pF
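The RAM bit line total can be checked the same way. This sketch takes the per-cell entries exactly as printed in the table; in particular the metal2 term is used as printed (0.007pF), although the stated 36.5µm length with the same coefficient would give roughly 0.004pF. The names are hypothetical.

```python
# Per-cell contributions in pF, taken as printed in the table above.
PER_CELL_PF = {
    "metal2 wire": 0.007,
    "N-diffusion area": 0.003,
    "N-diffusion sidewall": 0.005,
}


def ram_bit_line_capacitance(lines):
    """Return L*C in pF, where C is the per-cell bit line capacitance."""
    per_cell = sum(PER_CELL_PF.values())
    return lines * per_cell


print(f"{ram_bit_line_capacitance(150):.2f} pF")
```

For 150 lines this gives 150 * 0.015pF = 2.25pF, quoted above as 2.3pF.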