1. Memory Control
Page Mode vs. Nibble Mode Access
When accessing a bit in a DRAM, the row and column addresses are strobed into the chip sequentially using the Row Address Strobe (RAS) and Column Address Strobe (CAS) lines. Once both RAS and CAS have been asserted, the user has the option of deasserting and reasserting CAS to initiate another RAM cycle. The details of what happens during this second cycle depend on whether the RAM is of the page mode or nibble mode variety. If the RAM uses page mode access, the value on the address lines is used to access a new column of the RAM. Thus, any bit in the "page" (i.e. same row address) can be accessed without going through the RAS portion of the cycle. In nibble mode, the address during subsequent CAS cycles is ignored and the next data in sequence is delivered. The data sequence is determined by a wrap-around count of the low order bits of the initial CAS address. Most nibble mode RAMs use the low two bits for the nibble mode address, providing fast nibble mode access to four bits.
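The wrap-around ordering can be illustrated with a short sketch (Python purely for illustration; the RAM implements this counting in hardware, and the two-bit count is the common case described above):

```python
def nibble_sequence(start_col, nibble_bits=2):
    """Column addresses delivered by a nibble mode DRAM.

    The low-order `nibble_bits` bits wrap around modulo 2**nibble_bits
    while the high-order bits stay fixed; subsequent CAS cycles ignore
    the address pins and simply step through this sequence.
    """
    size = 1 << nibble_bits          # four bits for the usual two-bit count
    base = start_col & ~(size - 1)   # high-order bits, held constant
    low = start_col & (size - 1)     # initial wrap-around count
    return [base | ((low + i) % size) for i in range(size)]

# Starting at column 6 (binary ...110), the four bits come out as:
print(nibble_sequence(6))  # [6, 7, 4, 5]
```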
The question of whether to use page mode or nibble mode access for the RAMs has been a hotly contested issue. The key difference between the two schemes is the number of RAMs that must be cycled in order to obtain the required memory bandwidth. In the case of page mode access, the cycle time is sufficiently long (230ns min.) that a full block (256 bits) must be cycled in parallel to obtain the required bandwidth. This leads to the problem of how to multiplex the 256 bits down to a width that can be fed to the error correction chip(s) without consuming large amounts of board area and power. The advantage of using a nibble mode cycle is that one obtains the 256 bits time multiplexed in a 4-bit serial fashion from the RAMs. This comes at the expense of a somewhat longer cycle time, but no external components are needed to reduce the pin count. A cycle-by-cycle analysis of the two schemes is given below:
Page Mode Read Access (40ns cycles)

Cycle   Bus Operation    Memory          ECC
  0     Grant
  1     Read Block Cmd
  2     Victim Address
  3     Request          Start Access
  4
  5     Grant            Data 0
  6     Address          Data 1          Data 0
  7     Data 0           Data 2          Data 1
  8     Data 1           Data 3          Data 2
  9     Data 2                           Data 3
 10     Data 3
Nibble Mode Read Access (40ns cycles)

Cycle   Bus Operation    Memory          ECC
  0     Grant
  1     Read Block Cmd
  2     Victim Address
  3                      Start Access
  4
  5                      Data 0
  6     Request
  7                      Data 1
  8     Grant
  9     Address          Data 2          Data 0
 10     Data 0                           Data 1
 11     Data 1           Data 3          Data 2
 12     Data 2                           Data 3
 13     Data 3
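The latency difference between the two schemes can be read directly off the timing tables above; the arithmetic is trivial but worth writing down (cycle numbers are taken from the tables, nothing else is assumed):

```python
CYCLE_NS = 40  # bus cycle time used in both tables

# Bus cycle on which the last word (Data 3) appears, from the tables above
PAGE_LAST_CYCLE = 10
NIBBLE_LAST_CYCLE = 13

extra_ns = (NIBBLE_LAST_CYCLE - PAGE_LAST_CYCLE) * CYCLE_NS
print(extra_ns)  # 120 ns of additional read latency for nibble mode
```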
Tradeoffs
Bank Size - for page mode the minimum memory size and increment is 32 MBytes. This is quite large. For nibble mode the minimum size and increment is 8 MBytes.
Area - for page mode a large bank size implies large area increments for additional memory. Also the multiplexers to route the data to the ECC chips require substantial area.
For a fee of about $6K plus $6 per module, Texas Instruments will mount the muxes on the same substrate as the RAMs if we use their SIP DRAM packaging technique. However, this makes using memories from other manufacturers difficult.
Latency - using nibble mode adds approximately 120ns to the latency of the memory over page mode. Current estimates are that this will cause approximately a 5% loss in uniprocessor performance.
Bandwidth - under the assumption that the majority of bus commands will be Read Blocks with a one command FIFO, the page mode controller should be able to pipeline the operation of the RAMs in such a way as to completely saturate the bus. The nibble mode controller, due to its increased latency and reduced RAM bandwidth, at best approaches 50% bus saturation. Thus a minimum of two controllers is needed to maximize the number of processors that can be serviced.
Power - the biggest advantage of nibble mode over page mode is in the area of power dissipation. In a DRAM the highest power consuming operation is the cycling of the sense amplifiers. Nibble mode accesses, which do not cycle the sense amps, use considerably less power. When using page mode, 288 chips are cycled at a maximum operating current per chip of 100mA (Fujitsu spec). For 100% duty cycle the power dissipation is 144 watts, or 3/4 of the 6085's +5V current capability. This is indeed a frightening number. To compare the two access schemes on an equal footing, look at the energy required to read a single block:
Page mode:    120ns * 100mA * 5V * 288 chips
              = 17.3uW-Sec/Read Block
Nibble mode:  (120ns * 100mA + 3 * 70ns * 25mA) * 5V * 72 chips
              = 6.21uW-Sec/Read Block
This does not include the power necessary for the multiplexer chips needed for page mode operation. An example mux chip would be a 74ALS648; these add an additional 4.3uW-Sec/Read Block. Thus page mode accesses consume roughly three and a half times the energy of nibble mode accesses.
Page Address Caching - if page mode is used and the controller is smart enough, the memory latency can be reduced by one or two cycles when a page address hit occurs. There is currently no simulation data available to indicate how useful this feature is.
Bus Loading - in order to achieve full bus bandwidth from the memory system when using nibble mode, it is necessary to have at least two memory controllers on the bus. This increases bus length and capacitive loading somewhat, but with the propagation mode bus scheme it should pose no great problem.
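As a sanity check on the Power item above, the quoted figures can be recomputed in a few lines (the 120ns/70ns cycle times and 100mA/25mA currents are the figures given above; nothing else is assumed):

```python
V = 5.0  # supply voltage

# Sustained page mode power: 288 chips at 100mA each, 100% duty cycle
print(0.100 * V * 288)   # 144.0 watts, matching the figure above

# Energy per Read Block, page mode: all 288 chips cycle for 120ns
page_energy = 120e-9 * 0.100 * V * 288
print(round(page_energy * 1e6, 2))    # 17.28 uW-Sec

# Energy per Read Block, nibble mode: 72 chips, one full 120ns cycle
# plus three 70ns nibble cycles at only 25mA (sense amps not re-cycled)
nibble_energy = (120e-9 * 0.100 + 3 * 70e-9 * 0.025) * V * 72
print(round(nibble_energy * 1e6, 2))  # 6.21 uW-Sec
```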
Packaging
The most attractive form of packaging for the DRAMs is in SIP (Single Inline Package) or SIMM (Single Inline Memory Module) form. The SIP package has leads, while the SIMM has a card edge style connector and requires a socket. In both of these package styles one can buy modules with eight or nine RAMs, mounted on one or both sides, with the die arranged with their long edge horizontal or vertical. According to one rep, 90% of the sales go to single-sided 8 or 9 RAM modules with the die vertically mounted. Using this style of module will cause board spacing problems; it would be preferable not to use two board slots to accommodate the RAMs. Information will soon be available on angled sockets for reducing RAM height.
Drive Considerations
The RAM modules have a capacitance of 80pF on some of their input lines. Eight such modules in parallel will total nearly 700pF. It seems clear that some sort of external buffer will be necessary to drive this load. There are two main problems to worry about when driving such loads. The first is ground bounce. The currents one encounters when driving ten 700pF address lines can be enormous, resulting in large voltage differences between ground on the driving chip and ground on the RAM. This can cause all sorts of problems which result in loss of data. To solve this problem, one need only ensure that half of the RAMs are driven with the address while the other half are driven with the complement of the address. In such a system the currents in the ground plane sum to zero, and there is very little potential difference between ground on the driving and receiving chips.
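The scale of the transient can be estimated from I = C * dV/dt. A rough sketch, assuming a 3V swing in 5ns (these two numbers are illustrative assumptions; only the 700pF per line and the ten lines come from the text):

```python
C_LINE = 700e-12   # load per address line (eight 80pF modules plus trace)
N_LINES = 10       # address lines switching simultaneously
DV = 3.0           # assumed TTL logic swing, volts (illustrative)
DT = 5e-9          # assumed transition time, seconds (illustrative)

i_line = C_LINE * DV / DT   # I = C * dV/dt for one line
i_total = N_LINES * i_line  # worst case: all lines switching the same way
print(round(i_line, 2), round(i_total, 1))  # 0.42 A per line, 4.2 A total

# Driving half the RAMs with the true address and the other half with
# its complement makes the two transients flow in opposite directions,
# so the net current in the ground plane is near zero.
```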
The second potential problem associated with driving DRAMs is overshoot (or undershoot). The inductance of the PC board trace and the capacitance of the RAM inputs combine to form a transmission line. If the output impedance of the driver is not matched to the impedance of the line, reflections from the end of the line result in overshoot on the driven line. With SIP technology the board spacing will be much closer than with conventional DIPs. It may be the case that termination is not needed because the line is very short, but this is not clear. Parts have been purchased to build a test jig to measure the magnitude of this problem.
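What the test jig is meant to quantify can be sketched with the standard loaded-line formulas: capacitive loading lowers the trace's effective impedance to Z0/sqrt(1 + Cd/C0), and the mismatch between that impedance and the driver's output impedance sets the reflection. Every number below is an illustrative assumption, not a measurement:

```python
import math

Z0 = 70.0   # unloaded trace impedance, ohms (typical microstrip; assumed)
C0 = 3.0    # intrinsic trace capacitance, pF per inch (assumed)
CD = 80.0   # distributed DRAM input load, pF per inch (assumed heavy SIP loading)

# Capacitive loading lowers the effective line impedance
z_loaded = Z0 / math.sqrt(1 + CD / C0)
print(round(z_loaded, 1))  # about 13.3 ohms for these numbers

# Reflection coefficient seen at the driver for a given output impedance;
# a series resistor sized so r_out equals z_loaded kills the reflection
def reflection(r_out, z_line):
    return (r_out - z_line) / (r_out + z_line)

print(round(reflection(30.0, z_loaded), 2))      # unmatched driver reflects
print(round(reflection(z_loaded, z_loaded), 2))  # matched driver: 0.0
```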