July 18, 1979

PLA Speed Calculations
Jim Cherry

Filed on: [ivy]<Cherry>PLAspeed.press, .bravo

Introduction

Ever wonder if your PLA can break the sound barrier? Here is a rule of thumb to see if it will.

    t_total = τ[5M + 4.3N + 4.6I + 4]

where
    t_total  is the PLA propagation delay
    M        is the number of minterms
    N        is the number of outputs
    I        is the number of inputs
    τ        is the process transit time, e.g. 0.5 ns

This expression is independent of the value of λ. The basic attitude taken in its derivation is one of extreme conservatism. This is partly because incorporating the statistics of how one programs the PLA would greatly complicate the expression. The PLA is assumed to be built from the standard library cells in A Guide to LSI Implementation [Hon and Sequin].

Derivation

The basic formula used is the propagation delay for an inverter driving a capacitive node [Mead and Conway].

    t_r = τ k C_l / C_g

where
    t_r  is the rise time for the driven node
    τ    is the process transit time
    k    is the inverter's pull-up/pull-down ratio
    C_l  is the node capacitance that the inverter is driving,
         including both gates and strays from interconnect
    C_g  is the gate capacitance of the inverter

Note that this is the worst-case transition, i.e. from low to high, since it includes the factor k. In general this derivation assumes all transitions to be worst case, to simplify the expressions.

C_l and C_g are calculated by counting squares in the PLA cells, each square being 1λ by 1λ, and multiplying by the appropriate capacitance per λ².
The capacitance constants used are enumerated in the following table.

    C_g   gate-channel
    C_d   diffusion
    C_p   poly
    C_md  metal over diffusion
    C_mp  metal over poly

The basic algorithm for calculating the total delay through the PLA is to calculate the time constant for each section and sum them to give a final propagation time constant.

An input to the PLA is first buffered by a pair of series-connected inverters, which provide both the true and false versions of the input to the AND plane. Each of these input buffers has the same loading (the AND plane input gates), ignoring the loading of the second inverter on the first. Thus, a falling and a rising input have approximately the same amount of delay. Assuming that each buffer output drives all of the minterm gates, this delay is given by

    t_AND = τ(k+1)[(8C_g + 6C_p)M] / (16 C_g)

where M is the number of minterms in the PLA. The factor k+1 is used because one inverter output will be rising and the other falling. With a normal input, k=4. If the input is driven by a pass transistor, the first inverter will have k=8, but since its pull-down doubles in width, nothing changes. This expression includes the loading from the AND plane gates and the poly line itself.

Each AND plane output must drive the strays associated with the metal line which crosses all of its inputs, in addition to the gates of the OR plane. The time constant is thus

    t_OR = τk[(8C_g + 6C_p)N + 2{8C_md + 8C_mp + 12C_d}I] / (8 C_g)

where N is the number of PLA outputs, and I is the number of PLA inputs. Again, k=4.
This formula accounts for the gates driven in the OR plane (N) and the strays of the corresponding poly line. Note that in the library PLA cells a diffusion flash and contact to the output metal wire appear even if there is no programming flash to connect it (a bit of a bug). The bracketed quantity {} is the capacitance per poly line that the metal output bus crosses, including these diffusion flashes. Since each AND plane output crosses two of these poly lines for each PLA input, a factor of two appears.

An OR plane gate drives both the bracketed quantity above for each minterm, and an output buffer. Thus the delay through this stage is

    t_out = τk[8C_g + {8C_md + 8C_mp + 12C_d}M] / (8 C_g)

with k=4.

Collecting all of the propagation delays, the final expression becomes

    t_total = t_AND + t_OR + t_out

    t_total = τ[(2.5C_g + 1.88C_p + 4C_md + 4C_mp + 6C_d)M +
                (4C_g + 3C_p)N + (8C_md + 8C_mp + 12C_d)I + 4C_g] / C_g

Reasonable process values for the capacitances above are

    C_g   4 × 10⁻⁴ pF/μm²
    C_d   1 × 10⁻⁴ pF/μm²
    C_p   0.4 × 10⁻⁴ pF/μm²
    C_md  0.4 × 10⁻⁴ pF/μm²
    C_mp  0.4 × 10⁻⁴ pF/μm²

Plugging these into the t_total expression, and noting that the units of all capacitances cancel, yields the desired result.

    t_total = τ[5M + 4.3N + 4.6I + 4]

where
    M  is the number of minterms
    N  is the number of outputs
    I  is the number of inputs
    τ  is the process transit time

Notice that the constant term of 4 may be ignored for a reasonably sized PLA. A typical value for τ is about 0.5 ns, and with a good HMOS process it may be as low as 0.2 ns. Note that if the geometries are scaled this expression is still valid, since it is the ratio of load capacitance to driver gate capacitance that determines all of the constants involved.
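
As a check on the arithmetic, the three stage delays can be evaluated numerically and compared with the rule of thumb. The following is only a sketch in Python: the names Capacitances, stage_delays, pla_delay and rule_of_thumb are hypothetical, and the example PLA size (20 minterms, 8 outputs, 10 inputs) is an assumption chosen for illustration. The capacitance values, k=4 and τ = 0.5 ns are the ones quoted in this memo.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Capacitances:
    """Capacitance per unit area in pF/um^2 (defaults are the memo's table)."""
    c_g: float = 4e-4     # gate-channel
    c_d: float = 1e-4     # diffusion
    c_p: float = 0.4e-4   # poly
    c_md: float = 0.4e-4  # metal over diffusion
    c_mp: float = 0.4e-4  # metal over poly


def stage_delays(m, n, i, c=Capacitances(), k=4):
    """Return (t_AND, t_OR, t_out) in units of tau.

    m, n, i are the numbers of minterms, outputs and inputs;
    k is the inverter pull-up/pull-down ratio.
    """
    # Bracketed {} quantity: strays per poly line crossed by a metal line.
    stray = 8 * c.c_md + 8 * c.c_mp + 12 * c.c_d
    t_and = (k + 1) * (8 * c.c_g + 6 * c.c_p) * m / (16 * c.c_g)
    t_or = k * ((8 * c.c_g + 6 * c.c_p) * n + 2 * stray * i) / (8 * c.c_g)
    t_out = k * (8 * c.c_g + stray * m) / (8 * c.c_g)
    return t_and, t_or, t_out


def pla_delay(m, n, i, tau_ns=0.5, c=Capacitances(), k=4):
    """Sum of the three stage delays, in nanoseconds."""
    return tau_ns * sum(stage_delays(m, n, i, c, k))


def rule_of_thumb(m, n, i, tau_ns=0.5):
    """The rounded expression t_total = tau[5M + 4.3N + 4.6I + 4], in ns."""
    return tau_ns * (5 * m + 4.3 * n + 4.6 * i + 4)


if __name__ == "__main__":
    # Hypothetical PLA size, chosen only for illustration.
    m, n, i = 20, 8, 10
    print(f"stage sum:     {pla_delay(m, n, i):.2f} ns")
    print(f"rule of thumb: {rule_of_thumb(m, n, i):.2f} ns")
```

For the assumed PLA size both figures come out near 92 ns. The small difference arises because the unrounded coefficient of M is 4.9875 rather than 5.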