Dorado Hardware Manual    Display Controller    14 September 1981
Display Controller
The Dorado Display Controller (DDC) uses the fast io system to obtain representations of video images from storage; it then transforms these representations into control signals for monitors. Its three design objectives are:
(1) To handle a variety of color, grey-level, and binary (black-and-white) monitors;
(2) To utilize the full power of the fast io system in producing high-bandwidth computer graphics;
(3) To allow various compromises in color and spatio-temporal resolution for experimental purposes. Clock rates, video signals, and other monitor waveforms should be controllable by firmware.
There are two independent video channels capable of running in a variety of modes. Two channels allow text to be displayed on one channel, graphics on another, or the main picture on one, cursor on the other.
The DDC must readily handle Alto-style and LF (large format) monitors which we expect to be standard for most systems. Bit maps, display control blocks, and monitor control blocks, similar to those used on the Alto, provide the software interface to the DDC. The "seven-wire" video interface makes provision for one or more low bandwidth input devices (keyboard, pointing device, etc.); our current provisions for keyboard and mouse input are also discussed in this chapter.
Keep Figure 14 in view while reading this chapter.
Operational Overview
Video scan lines are encoded in bitmaps, which are contiguous blocks of virtual memory; the two channels, A and B, have independent bitmaps and data paths in the DDC. The high-priority DWT (Display Word Task) runs on behalf of either A or B using the subtask mechanism; it transmits each bitmap to a FIFO consisting of 16 munches/channel. The bitmap stream emerging from the FIFO is then sorted into items (1, 2, 4, or 8 bits wide) for each channel which are combined, mapped, and transformed into pixels (picture cells) on the screen.
In addition to the two channels, the DDC supports a programmable cursor that is 16 pixels x 1 bit/pixel wide.
A lower priority DHT (Display Horizontal Task) handles horizontal and vertical retrace and sets up starting addresses and munch counts, cursor data, and formatting information in the NLCB (Next Line Control Block) for the DDC. The NLCB is then copied into the CLCB (Current Line Control Block) during horizontal retrace prior to the next scan line.
The rate-of-flow of items is governed by the resolution and pixel clock period. Resolution may be independently programmed for each channel so that items flow at 1/4, 1/2, or 1 times the pixel clock rate. If the DispM board is present, then the pixel clock period is also programmable; otherwise, it is determined by a crystal oscillator on the DispY board, which must have a frequency appropriate for the monitor being driven.
Items can be treated in one of three ways: First, an Alto monitor can be driven. Second, items can be mapped through the 256-word x 4-bit MiniMixer into video data for a black-and-white or grey-level monitor.
Three separate interfaces are provided on the DispY board. An Alto monitor interface ORs one-bit items from the A and B channels with the cursor, and then XORs by polarity to produce one-bit pixels for an Alto display. A seven-wire interface outputs 1 bit/pixel for a binary monitor. And an 8-bit digital-to-analog converter (DAC) produces grey-level video.
Third, items may be mapped by the Mixer (or A color map), a 1024-word x 24-bit RAM, into signals for a color or grey-level monitor. A variety of modes determine which bits from the A and B items address the mixer. Mixer output consisting of 8 bits for each of the red, green, and blue guns is then digital-to-analog converted for color monitors. Additionally, there is a 24-bit/pixel mode in which the Dorado supplies 8 bits for each of the three colors; the colors are independently mapped through the Mixer and two additional 256-word x 8-bit RAMs called the BMap and the CMap.
The DDC is implemented on two Dorado main logic boards, called DispY and DispM. DispY contains all the logic necessary for vertical and horizontal sweep control, channel data paths, and video data for binary and grey-level monitors running at a fixed pixel clock rate. DispM contains the color maps, the programmable pixel clock, and the three DACs for driving a color monitor. Additionally, DispM contains an independent terminal controller that is structurally similar to a one-channel, one bit/pixel DispY but is specialized to driving a 7-wire terminal.
Thus there are two principal DDC configurations. On a Dorado with only a 7-wire terminal and no color monitor, only the DispY board is present; it is programmed for Alto terminal emulation, and only a small subset of its capabilities are used. However, on a Dorado with both a 7-wire terminal and a color monitor, the DispM board is also present; all of DispY and the color hardware on DispM are used to drive the color monitor, and the independent controller on DispM is used to drive the 7-wire terminal.
Video Data Path
Fast IO Interface and FIFO
The fast io system delivers data to the DDC at a rate of 16 bits/clock; words are received alternately in the REven (t1) and ROdd (t2) registers shown in Figure 14, then written into the FIFO, a 256-word x 32-bit RAM, during the first half of the next Dorado cycle (t2 to t3), leaving the second half of the cycle free for read access by the video channels. In other words, the REven and ROdd registers widen the data path from 16 to 32 bits to allow sufficient time to both write and read the FIFO in one cycle.
The 256 double-words in the FIFO are divided evenly among the two channels, so each has buffer storage for 16 munches. Each channel has write and read pointers that address the FIFO when appropriate.
Write pointers are initialized once during vertical retrace and then sequence through addresses for the entire display field; a write pointer is incremented after each double-word write for its channel, so that the next word to be written is addressed at all times. Since the fast io system delivers only one munch at a time, there is never any problem in deciding which of the two write pointers should address the FIFO.
Read pointers, however, are initialized during each horizontal retrace, so that the correct first double-word is read at the start of every scan line. This is required because the fast io system always delivers complete munches, but unused double words may appear at the end of the last munch for the previous scan line, or at the beginning of the first munch for the current scan line; the read pointer has to be reinitialized to skip over these. FIFO reads alternate between channels A and B, so the data rate for one channel is limited to 32 bits/2 cycles (=16 bits/cycle).
Note that bitmaps are required to start at even addresses because the FIFO is 32 bits wide.
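The buffer arithmetic in this section can be checked with a short Python sketch. It assumes the usual Dorado munch of 16 words of 16 bits each; the constant and function names are illustrative and do not come from the hardware documentation.

```python
FIFO_DOUBLEWORDS = 256   # FIFO depth in 32-bit doublewords
CHANNELS = 2             # channels A and B share the FIFO evenly
WORDS_PER_MUNCH = 16     # assumption: one munch = 16 words of 16 bits


def munches_per_channel():
    """Buffer storage available to one channel, in munches."""
    doublewords = FIFO_DOUBLEWORDS // CHANNELS
    words = doublewords * 2
    return words // WORDS_PER_MUNCH


def channel_read_rate_bits_per_cycle():
    """One 32-bit read every second cycle, because reads alternate A/B."""
    return 32 / 2
```

Under these assumptions each channel buffers 16 munches and can drain at most 16 bits per Dorado cycle.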
Item Formation
At the output end of the FIFO there is a multiplexor shared by both channels and, for each channel, two intermediate buffers (FIB and SIB), and a shift register SR. The multiplexor permutes the 32-bit quantity emerging from the FIFO so that when the double-word has marched through FIB and SIB and is finally loaded into SR, successive shifts will produce successive items of the selected size (8, 4, 2, or 1 bits).
The SR is tapped as follows:
SR.0                            Item[0] for item sizes 1, 2, 4, or 8;
SR.16                           Item[1] for sizes 2, 4, or 8, gated to 0 for size 1;
SR.8, SR.24                     Item[2:3] for sizes 4 or 8, gated to 0 for sizes 1 or 2;
SR.4, SR.12, SR.20, SR.28       Item[4:7] for size 8, gated to 0 for sizes 1, 2, or 4.
All eight Item bits are gated to 0 if the channel is off. It is useful to think at this point that, regardless of a channel’s item size, an 8-bit wide item is produced, whose bits contain non-zero data only in those positions dictated by the item size; i.e., for size 1 only the most significant bit may be non-zero; size 2 allows data in the topmost two bits, etc.
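The tapping and gating rules can be modelled with a small Python sketch. It assumes the Dorado convention that bit 0 is the most significant bit of both SR and Item; it models only the taps and the gating, not the multiplexor permutation or the shifting. The names `TAPS`, `sr_bit`, and `tap_item` are hypothetical.

```python
# SR bit positions that feed Item[0], Item[1], ..., Item[7], in that order.
TAPS = [0, 16, 8, 24, 4, 12, 20, 28]


def sr_bit(sr, n):
    """Return bit n of a 32-bit value, where bit 0 is the MSB (assumed)."""
    return (sr >> (31 - n)) & 1


def tap_item(sr, item_size, channel_on=True):
    """Form the 8-bit item presented by SR for the given item size."""
    if item_size not in (1, 2, 4, 8):
        raise ValueError("item size must be 1, 2, 4, or 8")
    item = 0
    for i, tap in enumerate(TAPS):
        # Item[i] carries data only when i < item_size and the channel is on.
        bit = sr_bit(sr, tap) if (channel_on and i < item_size) else 0
        item = (item << 1) | bit
    return item
```

For example, with only SR.16 set, a size-1 channel produces a zero item, while a size-2 channel produces an item whose second most significant bit is 1.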
The SR loads on the item clock after its last item has been used; the item clock rate is the pixel clock rate divided by the resolution (1, 2, or 4 for full, half, or quarter, respectively). Hence, for 8, 4, 2, or 1-bit items, SR will be shifted 3, 7, 15, or 31 times, respectively, and be reloaded from SIB on the following item clock.
Synchronization of SR, which uses the item clock, with FIB and SIB, which use the Dorado system clock, is a little tricky. SIB←FIB will occur no later than (4.6 ns)+C+(1.1 ns)+C+C = 3*C+5.7 ns after SR←SIB, where C is the period of the Dorado system clock and 4.6 ns and 1.1 ns are the worst case propagation delay and setup time of the components in the synchronizer; FIB←FIFO will occur at this time or on one of the next three Dorado clocks, depending upon which of these four clocks corresponds to t2 of the cycle in which this channel can read the FIFO. Allowing for propagation delay through SIB (5.0 ns) and setup time for SR (1.7 ns), the worst case minimum spacing between loads of SR is 3*C+(5.7 ns)+(6.7 ns) = 3*C+12.4 ns. This must be less than the time for emptying SR which is I*(32/ItemSize), where I is the period of the item clock. Hence, I > (3*C+12.4)/4 for ItemSize=8, or I > 25.6 ns for a Dorado clock period of C = 30 ns.
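The bound derived above can be reproduced numerically. The following Python sketch simply restates the arithmetic of this paragraph; the function name and parameter names are illustrative.

```python
def min_item_clock_ns(c_ns, item_size):
    """Lower bound on the item clock period I, in ns.

    c_ns is the Dorado system clock period C. The worst-case spacing
    between SR loads is 3*C + 5.7 ns (synchronizer) + 6.7 ns (SIB
    propagation plus SR setup), and SR holds 32/ItemSize items.
    """
    if item_size not in (1, 2, 4, 8):
        raise ValueError("item size must be 1, 2, 4, or 8")
    reload_ns = 3 * c_ns + 5.7 + 6.7
    items_per_load = 32 // item_size
    return reload_ns / items_per_load
```

With C = 30 ns and 8-bit items this gives approximately 25.6 ns; smaller item sizes relax the bound because SR takes more item clocks to empty.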
The 8-bit items from the two channels are then presented to either the Mixer section on the DispM board or the MiniMixer or Alto video interface on the DispY board.
Mixer
The Mixer is controlled by the A8B2, BBypass, and 24Bit mode controls. It is a 1024-word x 24-bit RAM for which the 10 bits of address required may be obtained from two possible source distributions, depending upon the A8B2 mode. When A8B2 is true, the address consists of AItem[0:7] and BItem[0:1]; when false (called A6B4), the address is AItem[0:5] and BItem[0:3].
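As an illustration of the two address distributions, the Python sketch below builds a 10-bit Mixer address from two 8-bit items. The text does not say how the A and B bits are ordered within the address; placing the A bits in the high-order positions is an assumption made here, as is the function name.

```python
def mixer_address(a_item, b_item, a8b2):
    """Form a 10-bit Mixer address from 8-bit items (bit 0 = MSB).

    Assumption: the A bits occupy the high-order address bits.
    """
    if not (0 <= a_item <= 0xFF and 0 <= b_item <= 0xFF):
        raise ValueError("items must be 8-bit values")
    if a8b2:
        # AItem[0:7] followed by BItem[0:1]
        return (a_item << 2) | (b_item >> 6)
    # A6B4: AItem[0:5] followed by BItem[0:3]
    return ((a_item >> 2) << 4) | (b_item >> 4)
```

Either mode yields an address in the range 0 to 1023, matching the 1024-word RAM.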
Another mode, the BBypass mode, can be enabled independently for the B channel. If B is bypassed, none of its bits contribute to the Mixer address. Instead, they bypass the mixer and address a 256 x 8 RAM, the BMap, whose outputs are ORed with the mixer outputs for the blue DAC. For example, with ASize=8, BSize=4, BBypass true, and A8B2 true, and with appropriate values in the Mixer RAM, the controller may be thought of as three 4-bit/pixel channels driving three color guns. One channel is bypassed data from B, while the other two are mapped through the Mixer.
24Bit mode, used in conjunction with BBypass mode, is used to run a three-channel color display directly from memory. In this mode, items from the A channel alternately address the Mixer (called the AMap in this mode) and another 256 x 8 RAM called the CMap. Meanwhile, the B channel runs at half the A channel rate and addresses the BMap as described above. (That is, the B channel must be set to one-half the resolution of the A channel.) With suitable values in the color maps, the AMap, BMap, and CMap independently generate outputs for the red, blue, and green DACs respectively.
Note: when the A channel is turned on, the first AItem addresses the AMap and the second AItem addresses the CMap. For the A and B pixels to align properly on the display in 24Bit mode, the left margin counts must be set to start the B channel one pixel clock earlier than the A channel. The blue and green portions of the AMap must be entirely zeroed, since the blue and green outputs are ORed with the BMap and CMap.
After routing as dictated by the mixer modes, chosen items are loaded into the map address registers, causing the color maps to produce a new video value every pixel clock (every two pixel clocks in 24Bit mode), and these values are latched in the three 8-bit mixer output registers. Three very fast DAC modules then produce a Red-Green-Blue triple of analog signals for a color monitor, or up to three grey-level video signals. In conjunction with the sync, blank, and composite waveforms produced by the monitor control circuitry, these signals can drive a wide variety of monitors attached to the Dorado.
Alto Video Interface
A small circuit on the DispY board produces video for an Alto monitor. This circuit ORs CursorData, AItem[0], and BItem[0], then XORs by the polarity, and finally ORs with the vertical and horizontal blanking signals. This interface is obsolete and is no longer in active use.
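The combinational function described above is small enough to state directly. This Python sketch treats every signal as 0 or 1 and takes Item[0] to be the most significant bit of an 8-bit item; the function and argument names are illustrative.

```python
def alto_pixel(cursor, a_item, b_item, polarity, hblank, vblank):
    """One-bit Alto video: (cursor OR A[0] OR B[0]) XOR polarity, OR blanking."""
    a0 = (a_item >> 7) & 1   # AItem[0], bit 0 = MSB (assumed)
    b0 = (b_item >> 7) & 1   # BItem[0]
    video = (cursor | a0 | b0) ^ polarity
    return video | hblank | vblank
```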
MiniMixer
A small video mixer on the DispY board, not to be confused with the large Mixer on the DispM board, can drive either a DAC or the seven-wire interface discussed later. The MiniMixer is a 256 word x 4-bit RAM addressed by a combination of AItem, BItem, and state bits, as shown in Figure 14. On every pixel clock, dDAC[0:3] are loaded from MiniMixer output, while dDAC[4:7] are loaded directly from AItem[4:7]. The MiniMixer aims at experiments with mixing channels and driving grey level monitors.
Horizontal and Vertical Control
Every monitor requires horizontal synchronizing and blanking waveforms. Interlaced monitors must be able to distinguish fractions of a scan line to implement interlacing. In general, the duration and phasing of sync/blank waveforms is unique to a given monitor. The DDC uses the 1024-word x 3-bit HRam (Horizontal RAM) to control horizontal sync/blank.
The DDC has a set of registers called the CLCB (Current Line Control Block) which controls video generation for the current scan line. The DHT sets up parameters for the next scan line in NLCB (Next Line Control Block), a 16-word x 12-bit RAM. The first 32 pixel clocks of horizontal blanking are called the HWindow; during HWindow parameters for the next line are copied from NLCB into CLCB. Vertical control is also handled through the NLCB.
The interpretation of fields in NLCB and HRam is shown in Figure 15, and loading will be discussed in the "Slow IO Interface" section; the use of the different information is discussed here. The top part of Figure 14 shows how horizontal timing is controlled.
Line Control Blocks
The fields in NLCB/CLCB are interpreted as follows, where a denotes that the item is channel-specific (i.e., copies exist for both A and B channels):
aPolarity. A single bit, used only for binary monitors, that inverts black and white (APolarity and BPolarity are ORed by the hardware).
aResolution. A 2-bit field that controls item clock generation; values of 0, 2, and 3 cause quarter, half, and full resolution, respectively.
aItemSize. A 4-bit field unary encoded as aSize1, aSize2, aSize4, or aSize8, denoting bits/pixel for the channel; setting multiple bits is illegal.
aLeftMargin. A 12-bit field in units of pixel clocks specifying 31 less than the number of pixel clocks to wait after HWindow completes before turning the channel on. This value is not a straightforward constant, but depends upon monitor-specific horizontal blanking time. If the horizontal blanking time is B pixel clocks and the desired beginning of data is L pixel clocks after the end of horizontal blanking, then aLeftMargin should be loaded with B+L−32−31 = B+L−63, independent of resolution. Since L may be 0, this implies that the horizontal blanking time for the monitor must be greater than 63 pixel clocks. Since high-speed monitors typically have greater than 4 µs horizontal blanking times, and are this fast only with high speed pixel clocks, this restriction is not expected to be significant.
Note: For a monitor connected via the 7-wire interface, aLeftMargin must be B+L−68, rather than B+L−63, because video signals are delayed from horizontal control waveforms by 5 pixel clocks.
Note: The value loaded into aLeftMargin must actually be the negative of the left margin count computed above.
aWidth. A 12-bit counter that counts at the pixel clock rate as soon as the channel turns on; when the counter runs out (or when horizontal retrace starts, whichever is earliest), the channel is turned off. Precisely, if the channel is to run for W pixel clocks, the width counter must be loaded with −(W+255).
aFifoAddr. An 8-bit quantity pointing to the munch and word within the munch for the first FIFO read for the next scan line; this must be an even number because doublewords are fetched from the FIFO. Firmware must keep track of the number of used munches for any given line and advance aFifoAddr by exactly the right amount, adjusting for munch boundaries, interlacing, and data breakage. The CLCB register for aFifoAddr is the channel read pointer itself.
MixerModes. A set of bits that control the mixer; these are not channel-specific. These will normally be changed infrequently, maybe at the field rate or during display initialization. However, they are in the NLCB to allow modes to change on the fly.
Vertical Control Word (VCW). A word controlling the vertical retrace operation of the monitor; it contains the vertical blank bit, vertical sync bit, and interlace field bit discussed in the "Vertical Waveform Generator" section below.
Cursor and CursorX. The 12-bit CursorX value is loaded into a counter which starts counting at the end of HWindow. When the counter runs out, the 16-bit Cursor value is shifted out onto the CursorVideo line. This is used by the Alto video interface and in the MiniMixer address. Precisely, if horizontal blanking is B pixels in duration, and the leftmost bit of the cursor is to appear X pixels beyond the end of horizontal blanking, then the CursorX register must be loaded with −(B+X+226), or −(B+X+221) when using the 7-wire interface.
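The counter values given for aLeftMargin, aWidth, and CursorX can be collected into a short Python sketch. It assumes that the negative values are stored as 12-bit two's complement numbers, which the text implies but does not state; all function names are hypothetical.

```python
MASK12 = 0xFFF  # the fields are 12 bits wide


def neg12(n):
    """12-bit two's complement of n (assumed encoding of negative counts)."""
    return (-n) & MASK12


def left_margin(b, l, seven_wire=False):
    """aLeftMargin for blanking time b and offset l, both in pixel clocks."""
    count = b + l - (68 if seven_wire else 63)
    if count < 0:
        raise ValueError("horizontal blanking time too short")
    return neg12(count)


def width(w):
    """aWidth for a channel that is to run for w pixel clocks."""
    return neg12(w + 255)


def cursor_x(b, x, seven_wire=False):
    """CursorX for a cursor starting x pixels after the end of blanking."""
    return neg12(b + x + (221 if seven_wire else 226))
```

For a monitor with B = 100 pixel clocks of horizontal blanking and data starting at the end of blanking, `left_margin(100, 0)` is the 12-bit encoding of −37.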
Horizontal Waveform Generator
The 1024-word x 3-bit HRam contains control information for these waveforms. Under normal operation, HRam is addressed by a 12-bit counter (HRamAddr[0:11]) which is reset at the leading edge of horizontal sync and then increments every pixel clock until the next leading edge of horizontal sync; HRamAddr[1:10] address the RAM, and the output is loaded into the HRamOut register every other pixel clock. The three bits in HRamOut control horizontal sync, horizontal blank, and half-line; these three bits are combined and level shifted by a logic network appropriate for the monitor being driven.
The 1024-word HRam imposes the uninteresting restriction that there be fewer than 2048 pixels/scan line.
As shown in the diagram at the top of Figure 14, horizontal blanking (HBlank) is true from the end of one scan line to the beginning of the next. During horizontal blanking, HSync is turned on to initiate the horizontal retrace and turned off again when horizontal retrace is finished. HBlank then continues for a monitor-specific interval. Note that if a channel’s visible left margin is non-zero, then the horizontal scan will begin before that channel is producing any data; in this case, the video channel outputs zero items to the mixing stages until the channel is turned on.
Due to an implementation error, when the 7-wire interface is being driven from DispY, the value of HBlank[i] may differ from HBlank[i−1] only when i is even, where i is HRamAddr[1:10].
Vertical Waveform Generator
Only 2:1 interlaced monitors are supported in this design, but more complicated vertical control could be provided, if desired. To support 2:1 interlace, HRam contains a waveform called HalfLine, which is a pulse at the horizontal line frequency, 180° out of phase with HSync.
Vertical control is handled by DHT through the NVCW word in the NLCB, which specifies whether or not vertical blank or retrace should begin or end during the next scan line. The DHT microcode must keep track of scan lines to enable vertical signals at the appropriate times.
The three VCW bits are called VBlank, VSync, and OddField. VSync enables vertical sync to begin on the next line, and the OddField bit chooses either HSync or HalfLine on which to do vertical syncing (OddField=1 implies HalfLine phasing for vertical sync). This phase will alternate from the start of the line to the middle of the line and back for successive fields. The blanking signal for the monitor is VBlank ORed with HBlank.
Pixel Clock System
The programmable pixel clock on the DispM board, if present, determines the fundamental video data rate for a given monitor. The pixel clock is controlled by loading the PixelClk register via the slow io system. The pixel clock frequency is (312.5*(241−M))/(16−D) kHz, where M is PixelClk[4:11] and D is PixelClk[12:15]. Note that the pixel clock will not stabilize until about 1/2 second after the PixelClk register is loaded.
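The frequency formula can be evaluated with a short Python sketch. It takes bit 0 to be the most significant bit of the 16-bit PixelClk register, so that M occupies the middle eight bits and D the low four; the function names are illustrative.

```python
def pixel_clock_khz(m, d):
    """Pixel clock frequency in kHz for multiplier field m and divisor field d."""
    if not (0 <= m <= 0xFF and 0 <= d <= 0xF):
        raise ValueError("m is an 8-bit field and d a 4-bit field")
    return 312.5 * (241 - m) / (16 - d)


def decode_pixel_clk(register):
    """Split a 16-bit PixelClk value into (M, D), assuming bit 0 is the MSB."""
    m = (register >> 4) & 0xFF   # PixelClk[4:11]
    d = register & 0xF           # PixelClk[12:15]
    return m, d
```

For instance, M = 81 and D = 15 give 50000 kHz (50 MHz), and M = 81 with D = 14 gives 25 MHz.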
The parts of the DDC synchronized to the rest of Dorado do, of course, use the Dorado system clock. As discussed earlier, the synchronization logic for refilling SIB after SR←SIB puts a lower bound on the pixel clock period of (3*C+12.4)/4 ns (= 25.6 ns for a Dorado clock period of C = 30 ns), for an item size of 8 on either channel. We anticipate that pixel clock rates in the range 10 to 50 MHz (100 to 20 ns/pixel) will be required, so the lower bound is approximately consistent with this.
Seven-Wire Video Interface
So that a number of different controller and terminal types may be freely interconnected in Dolphin and Dorado-based systems, a common interface between terminals and controllers has been defined. This interface assumes that a terminal contains a raster-scanned bitmap display and one or more low bandwidth input devices (keyboard, pointing device, etc.). The DDC transmits digital video and sync to the terminal over six pairs of a seven-pair cable. The input data is encoded by a microcomputer in the terminal and sent back serially over the seventh pair (the "back channel"). Video and control (sync) are time-multiplexed, and four bits are transmitted in parallel to reduce the cable bandwidth required.
While the description in the following sections assumes a display having one bit/pixel, the basic signalling mechanism may be extended to support gray-level or color displays.
Video Output
The four output lines are interpreted as either a 4-bit nibble of video or four control signals according to the phases of the two clock signals; the DDC places data on the data lines at the falling edge of ClkA, and the terminal samples this data on the rising edge of ClkA. If ClkB is 1 at this time, the nibble is interpreted as four bits of video, else as sync and control information. ClkA and ClkB are transmitted in quadrature so that the terminal can reconstitute a clock at the video bit rate.
When a nibble is interpreted as control information, bit 2 is reserved for horizontal sync and bit 3 for vertical sync, while 0:1 are undefined; different types of terminals may use 0:1 for any purpose.
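From the terminal's side, the interpretation of a sampled nibble might look like the Python sketch below. It assumes that bit 0 is the most significant bit of the nibble and that the nibble and ClkB have already been sampled on the rising edge of ClkA; the function name and return format are hypothetical.

```python
def decode_nibble(nibble, clk_b):
    """Interpret a 4-bit value sampled on the rising edge of ClkA."""
    if not 0 <= nibble <= 0xF:
        raise ValueError("nibble must be a 4-bit value")
    if clk_b:
        # Four bits of video, listed in bit order 0..3 (bit 0 = MSB, assumed).
        return ("video", [(nibble >> (3 - i)) & 1 for i in range(4)])
    # Control nibble: bit 2 is horizontal sync, bit 3 is vertical sync;
    # bits 0:1 are terminal-specific and ignored here.
    return ("control", {"hsync": (nibble >> 1) & 1, "vsync": nibble & 1})
```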
A circuit on the DispY board drives the seven-wire interface from the MiniMixer. MiniMixer[0] is serial-to-parallel converted into four-bit nibbles, which are held in a register for transmission. Sync, blank, and clock phases are generated in accordance with the seven-wire interface specification.
Back Channel
Data from low bandwidth input devices at the terminal are transmitted serially over the back channel. Data are clocked by the terminal on the rising edge of the horizontal blank pulse and are sampled by DHT during the subsequent scan line after HWindow.
By convention the terminal microcomputer encodes 32-bit messages (delivered in 32 scan lines); each message begins with a 1, and after the 32nd bit of the message the DHT ignores the backchannel until the start of another message is indicated by another 1. The message consists of a start bit, 3 unused bits, a 4-bit message type, a 16-bit message body, and finally an 8-bit trailer which must be 200B.
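A decoder for this message layout can be sketched in Python. It assumes that the fields arrive in the order listed, most significant bit first, and that the trailer value is octal 200; it is not the DHT microcode, and the names are hypothetical.

```python
TRAILER = 0o200  # assumed: the required trailer value, in octal


def parse_message(bits):
    """Parse a list of 32 back channel bits, first received bit first.

    Returns (message_type, body), or None if the framing is invalid.
    """
    if len(bits) != 32 or bits[0] != 1:
        return None                      # every message begins with a 1
    value = 0
    for b in bits:
        value = (value << 1) | (b & 1)
    message_type = (value >> 24) & 0xF   # after the start bit and 3 unused bits
    body = (value >> 8) & 0xFFFF         # 16-bit message body
    if (value & 0xFF) != TRAILER:
        return None
    return message_type, body
```

A caller would collect one bit per scan line after seeing a 1, then hand the 32 bits to `parse_message` and dispatch on the returned type.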
The terminal microcomputer perpetually cycles through all possible keys on the keyboard (as well as mouse buttons and keyset paddles), detecting changes in state of the keys; the state of the keyboard then exists in seven 16-bit words, and a back channel message is defined for each. Whenever one of these words changes value, it is sent to the Dorado in a message. Additionally, changes in mouse x,y coordinates are reported once per field (i.e., twice/frame or typically 60 times/sec). If the mouse has not changed position during a field, then one keyboard word is reported instead of the mouse position change; thus, the correct state of the keyboard is eventually reported even if transitions are missed.
Table 24: Terminal Microcomputer Messages

Message Type    Comments
00B             Illegal; ignored
01B             Keyboard word 0 (corresponds to Alto memory location 177034B)
02B             Keyboard word 1 (Alto 177035B)
03B             Keyboard word 2 (Alto 177036B)
04B             Keyboard word 3 (Alto 177037B)
05B             Mouse buttons and keyset (Alto 177033B)
06B             8-bit changes in X-coordinate (0:7 of the message body) and Y-coordinate (8:15 of the message body), represented in excess-200B notation
07B             Illegal; ignored
10B             Keyboard word 4 (Star keyboards only; no Alto analogue)
11B             Keyboard word 5 (Star)
12B–16B         Illegal; ignored
17B             Boot message. Actually, depressing the boot button jams the data to one continuously, rather than generating a valid terminal message. Furthermore, when the boot button is let up, there may be as many as 8 bits of garbage following the last consecutive one bit; these must be ignored by the firmware. The firmware should also ignore boot button pushes less than 10 ms in duration, as these may be caused by noise or contact bounce.
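The excess-200B encoding used by message type 06B can be decoded as in the following Python sketch, which assumes that bits 0:7 of the message body are its high-order byte; the function name is illustrative.

```python
def mouse_deltas(body):
    """Decode the 16-bit body of a type 06B message into signed (dx, dy)."""
    dx = ((body >> 8) & 0xFF) - 0o200   # body[0:7], excess-200B
    dy = (body & 0xFF) - 0o200          # body[8:15], excess-200B
    return dx, dy
```

A body of 100200B (both bytes equal to 200B) therefore reports no movement, and each byte can express a change from −128 to +127.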
Processor Task Management
This section outlines the implementation requirements of DHT and DWT and discusses the hardware associated with task wakeups and DWT subtask arbitration between the two channels.
Since DHT must do a lot of processing, it runs at low priority and is awakened once/scan line at the end of HWindow. When it runs, it must calculate all parameters for the next scan line (i.e., the one after the scan line that is just starting), load the NLCB appropriately for each channel, and set up the munch address and count for each channel in the RM registers aNextAddr and aNextCount referred to in the DWT sample code below; then it sets the aNextWCBFlag flags discussed below. The DHT wakeup will remain active until any NLCB output command is executed, so the DHT must execute at least one NLCB output command every time it wakes up, and this must occur at least three instructions prior to blocking.
DWT is a very high priority task which may run on behalf of either channel: channel A is subtask 0; channel B, subtask 2. Since it uses the subtask mechanism, DWT must always block at the same instruction each iteration. DWT does not explicitly know the channel for which it is executing at any given time; its two parameters, a start address and munch count, are received from DHT in RM registers specific to the subtask. In the normal case, DWT initiates an IOFetch and blocks. The following is the main-line DWT microcode presently in use:
%RM registers for channel A, indicated by names beginning with "A" below, are used in the program, but the corresponding set of registers for channel B, in a different RM region, will be referenced when SubTask is 2.
Note that TIOA selects the DWTFlag register and T contains 20 at the beginning of the loop, so the second instruction is used both to increment the munch address and to signal the hardware that an IOFetch is commencing.
%
DWTStart:
ACount←(ACount)−T, Branch[DWTCheck, R<0];
AAddress←(IOFetch←AAddress)+(Output←T), Block, Branch[DWTStart];

%AAddress will be even if we just exhausted a scan line. AAddress will be odd if we have just been awakened to start a new scan line. In either case, isolate flag in AAddress[15] for use in adjusting the WCB flags.
%
DWTCheck:
AAddress← (AAddress) AND (1C), Branch[DWTAdjustWCBFlags, R even];
%Note that the change-RSTK-for write function used below is ok, but the change-RBase-for-write functions are illegal because of subtasking.
%
DWTRefill:
ACount←ANextCount;      *from DHT, # munches to fetch -1 in 0:11
BrLo←ANextAddrLo;       *first munch address
BrHi←ANextAddrHi;

%Now adjust WCB flags, as follows: If we just exhausted a scan line, AAddress=0 now; execute Output←0 to clear the CurWCB flag, and set AAddress to −1 for the next wakeup. If we are starting a new scan line, AAddress=1 now; execute Output←1 to set the CurWCB flag and clear the NextWCB flag, and set AAddress to 0 for the first IOFetch.
%
DWTAdjustWCBFlags:
AAddress← (AAddress)−1, Output←AAddress, Block, Branch[DWTStart];
DWT lowers its wakeup request at the onset of the DWTStart instruction, and the DDC remembers that DWT is in progress. No further DWT wakeups will be generated while the task is running or is preempted by a higher priority task. Whenever DWT blocks, a counter is initialized to a constant value N and counts once per Dorado cycle; when the counter runs out, DWT wakeups are allowed again. This counter has two purposes. First, within a munch loop it spaces out IOFetch references to the memory system by 8 or more cycles (depending upon N, which is adjustable through a hardware SIP component), so as not to clog the memory pipeline. Second, the decision to generate subsequent DWT wakeups is based upon the state of flags that may be altered by output commands; these commands take time to get from the processor to the DDC and alter the state. Other tasks may have the processor while these state changes take effect.
After N cycles have elapsed, DWT will be woken whenever aWantsDWT is true for one of the channels. Two channel-specific flags are involved in DWT wakeup control: aCurrentWCBFlag is true when a is actively moving words into the FIFO; aNextWCBFlag is set true by DHT after it has loaded the munch address and munch count into DWTnextaddr and DWTnextcount for a. After fetching the last munch for a scan line, DWT clears aCurrentWCBFlag and blocks unless aNextWCBFlag is true. In other words, aWantsDWT when
(aNextWCBFlag & not aCurrentWCBFlag) or (aCurrentWCBFlag & aFifoAvailable).
If only AWantsDWT or only BWantsDWT, no conflict arises and the requesting channel gets DWT. However, if both channels want DWT, the channel that ran least recently will run next.
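The wakeup condition and the arbitration rule can be summarized in a Python sketch. It reads the wakeup condition as the logical OR of its two parenthesized terms, which is an interpretation of the text, and it models only the decision taken once the N-cycle counter has run out; the function names are hypothetical.

```python
def wants_dwt(next_wcb, current_wcb, fifo_available):
    """aWantsDWT for one channel, read as an OR of the two terms (assumed)."""
    return (next_wcb and not current_wcb) or (current_wcb and fifo_available)


def choose_channel(a_wants, b_wants, last_ran):
    """Pick the channel that gets DWT; last_ran is "A" or "B"."""
    if a_wants and b_wants:
        # Both want service: the channel that ran least recently goes next.
        return "B" if last_ran == "A" else "A"
    if a_wants:
        return "A"
    if b_wants:
        return "B"
    return None   # no wakeup is generated
```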
Two observations must be made about the DWT microcode. First, because the final instruction is normally an IOFetch←, the next instruction executed (by another task) will be held one cycle if it initiates any memory reference. Secondly, the two instruction loop above requires that the hardware cope with the NextLies condition discussed in the "Slow IO" chapter; a pathological lockout problem could occur if a high demand task of higher priority is coded so that it always creates NextLies (say, by doing Block and immediate ←Md in the instruction after a fetch). This would result in the DWT wakeup being frequently delayed by 2 cycles.
Note: Neither DWT nor DHT drives the IOAtten branch condition.
Slow IO Interface
DDC manages all control functions via the slow io system. At this point you should study Figure 15, which shows the format of the various output and input commands; there are six output devices and one input device on the DispY board, and eight output devices and one input device on the DispM board (if present). Output commands are handled uniformly: TIOA is clocked into a register at t1; the register output is decoded and identified as one of the DDC commands; if the processor is doing an Output←B, then at t3 IOB data from the processor is clocked into a register and one of the "TIOA command" pulses occurs from t3 to t5, at which point the desired action is complete.
The IOB data received at t3 of an Output←B will remain in the DDC buffer register (RIOB) until the next output command. This is useful for debugging and for muffler readout of the NLCB (because an NLCB address can be loaded into RIOB for multiple cycles).
The HRam, MiniMixer, Mixer, BMap, and CMap are RAMs that will generally be loaded during system initialization and not often changed while pictures are being displayed. The programmable pixel clock will also be loaded during initialization, if it is being used instead of the fixed crystal oscillator.
The HRam, Mixer (AMap), BMap, and CMap addresses each have two independent sources: the Dorado slow io system and the video system. Video system addressing is disabled during loading from the Dorado. The output commands to each of these RAMs are interpreted as follows: The Keep’ bit is saved in a flipflop loaded by every RAM output command; as long as Keep’ is true (i.e., low), video system addressing is off. If LoadAddr is true, then IOB[4:15] are loaded into the RAM address register. If Write’ is true (i.e., low), the currently-addressed word of the RAM is written from the data field; additionally, the RAM address register increments after writing, so the RAM can be loaded sequentially at high speed. A RAM output command with Keep’ false (i.e., high) releases the RAM from Dorado control and returns it to the video system.
Note: the LoadAddr and Write’ bits of a RAM output command take effect only if the Keep’ flipflop is already true (i.e., set to zero by a previous RAM output command).
Note: in the case of the Mixer, the RAM address is loaded from IOB[4:14] and a Hi/Lo Select bit is loaded from IOB[15]. The latter bit determines which 12 bits of the 24-bit wide mixer word will be loaded by the next Write’. The Hi/Lo Select bit behaves as a low-order extension of the Mixer address counter, so successive Write’ commands will alternate between the halves of one mixer word before advancing to the next.
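The loading protocol for the HRam, BMap, and CMap can be illustrated with a small software model. This is only a sketch of the rules stated above, not a description of the hardware: the class name RamPort and its fields are hypothetical, the control bits are modeled as active-high booleans for readability, a command is assumed to either load the address or write (not both), and wraparound of the address register at the end of the RAM is an assumption. The Mixer's Hi/Lo Select bit is not modeled.

```python
class RamPort:
    """Toy model of one DDC RAM as seen from the slow io system (hypothetical)."""

    def __init__(self, size):
        self.mem = [0] * size
        self.addr = 0        # RAM address register
        self.keep = False    # Keep' flipflop; True here means asserted (low in hardware)

    @property
    def video_addressing(self):
        # Video system addressing is off while the Keep' flipflop is asserted.
        return not self.keep

    def output(self, keep, load_addr=False, write=False, field=0):
        """Apply one RAM output command."""
        # LoadAddr and Write' take effect only if the flipflop was
        # already asserted by a previous RAM output command.
        if self.keep:
            if load_addr:
                self.addr = field % len(self.mem)
            elif write:
                self.mem[self.addr] = field
                # Auto-increment allows fast sequential loading
                # (wraparound is an assumption of this sketch).
                self.addr = (self.addr + 1) % len(self.mem)
        # Every RAM output command reloads the Keep' flipflop.
        self.keep = keep


ram = RamPort(16)
ram.output(keep=True)                            # take the RAM away from the video system
ram.output(keep=True, load_addr=True, field=3)   # set the starting address
ram.output(keep=True, write=True, field=0o123)   # write word 3, address advances
ram.output(keep=True, write=True, field=0o456)   # write word 4, address advances
ram.output(keep=False)                           # return the RAM to the video system
```

In this model the first command only sets the flipflop; the address load and the writes are honored on the commands that follow it.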
The MiniMixer is loaded by a single output instruction that specifies both the address and data to be loaded. During the command pulse from t3 to t5 of the Output←B instruction, the video channel address to the MiniMixer is replaced by the address being loaded, so if the video channel is active, garbage may appear at the output during this cycle.
The 16-word x 12-bit NLCB is also loaded by single output instructions that specify both the address and data. For the NLCB, output instructions are only effective when HWindow is not occurring; during HWindow the RAM address is supplied by a counter that successively copies the NLCB words into CLCB. The format of each of the words in NLCB is shown in Figure 15. Note that any NLCB output operation will dismiss the wakeup request for DHT, and DHT must not block any sooner than the fourth instruction after the first NLCB output operation is issued.
The Statics output command is used for debugging and initialization. Two bits in the Statics register called DHTShutUp and DWTShutUp are discussed in the "DDC Initialization Requirements" section below. Three other fields called FakePClk, UseFakePClk, and MufAddr are used for debugging. When UseFakePClk is true, the regular pixel clock is degated; if FakePClk is true, then a pixel clock will occur at t5 of the Statics output command; otherwise no clock occurs. Every Statics command also loads the hardware signal addressed by MufAddr into a flipflop (at t5) which can be read by the Status input command discussed below. In combination, the fake pixel clock and muffler readout features allow diagnostic firmware to check out most of the internal data paths in the DDC: by simulating a very slow pixel clock and "stepping" the DDC through various states, the diagnostic can check nearly all of the data paths between fake pixel clocks. The hardware signals selected by MufAddr[5:11] are given in the table below.
Table 25: DDC Muffler Signals
MufAddr   Signal              MufAddr   Signal
0         ACurrentWCBFlag     70        AFifoFull
01:07     AReaderPtr[1:7]     71        BFifoFull
10        ANextWCBFlag        72        ASize8
11:17     AWriterPtr[1:7]     73        ASize8-4
20        BCurrentWCBFlag     74        ASize8-4-2
21:27     BReaderPtr[1:7]     75        BSize8
30        BNextWCBFlag        76        BSize8-4
31:37     BWriterPtr[1:7]     77        BSize8-4-2
40:47     AItem[0:7]          100       AOn
50:57     BItem[0:7]          101       BOn
60:63     AServicePtr[1:4]    102:103   ARes[0:1]
64:67     BServicePtr[1:4]    104:105   BRes[0:1]
                              106       MonitorType
Muffler 106 (MonitorType) is the only one of interest during normal operation. It identifies the type of monitor connected via the 7-wire interface: zero denotes an Alto-style monitor; one denotes an LF (large format) monitor.
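Diagnostic software that steps through the mufflers may find it convenient to turn a MufAddr value back into a signal name. The following sketch encodes Table 25 as a list of ranges; it reads the addresses as octal, which the 01:07, 10, 11:17 progression suggests. The names MUFFLER_TABLE and muffler_signal are hypothetical and exist only for this illustration.

```python
# (first address, last address, signal name, index of first bit or None).
# Addresses are written in octal, as Table 25 appears to use.
MUFFLER_TABLE = [
    (0o00, 0o00, "ACurrentWCBFlag", None),
    (0o01, 0o07, "AReaderPtr", 1),
    (0o10, 0o10, "ANextWCBFlag", None),
    (0o11, 0o17, "AWriterPtr", 1),
    (0o20, 0o20, "BCurrentWCBFlag", None),
    (0o21, 0o27, "BReaderPtr", 1),
    (0o30, 0o30, "BNextWCBFlag", None),
    (0o31, 0o37, "BWriterPtr", 1),
    (0o40, 0o47, "AItem", 0),
    (0o50, 0o57, "BItem", 0),
    (0o60, 0o63, "AServicePtr", 1),
    (0o64, 0o67, "BServicePtr", 1),
    (0o70, 0o70, "AFifoFull", None),
    (0o71, 0o71, "BFifoFull", None),
    (0o72, 0o72, "ASize8", None),
    (0o73, 0o73, "ASize8-4", None),
    (0o74, 0o74, "ASize8-4-2", None),
    (0o75, 0o75, "BSize8", None),
    (0o76, 0o76, "BSize8-4", None),
    (0o77, 0o77, "BSize8-4-2", None),
    (0o100, 0o100, "AOn", None),
    (0o101, 0o101, "BOn", None),
    (0o102, 0o103, "ARes", 0),
    (0o104, 0o105, "BRes", 0),
    (0o106, 0o106, "MonitorType", None),
]


def muffler_signal(muf_addr):
    """Return the signal name selected by a muffler address."""
    for first, last, name, base in MUFFLER_TABLE:
        if first <= muf_addr <= last:
            if base is None:
                return name
            # Multi-bit signals occupy consecutive muffler addresses.
            return f"{name}[{base + muf_addr - first}]"
    raise ValueError(f"no muffler signal at address {muf_addr:o} (octal)")


monitor_type_name = muffler_signal(0o106)
```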
A single input device called Status is implemented. It is used to return the currently selected muffler bit and the seven-wire interface received data bit.
The MapInLo and MapInHi input devices read the current values output from the color maps (Mixer, BMap, and CMap, whichever are active). When the color maps are controlled by the video system, these outputs change too rapidly for reading them to be useful (unless the DDC is being single-stepped by means of UseFakePClk). However, when the color maps are controlled by the Dorado, these input devices can be used to read out the color map entries addressed by their respective RAM address registers.
MapInHi[0] is the 7-wire terminal input bit for the independent terminal interface on DispM; its position corresponds to Status[0] on DispY (see below). MapInHi[1] is a constant 1 if a DispM board is installed; if DispM is not installed, an Input from the nonexistent register yields a zero value. This enables firmware to detect the presence or absence of a DispM board. MapInLo[0:3] are a 4-bit color monitor type jumpered on the Dorado backpanel.
Note: the MapInLo and MapInHi input devices do not generate IOB parity, so they must be read by the Pd←InputNoPE function to disable parity checking.
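The bit positions above can be illustrated with a short decoding sketch. It assumes 16-bit words in which bit 0 is the most significant bit; that numbering is an assumption of this example, as are the helper names iob_field, dispm_installed, terminal_input_bit, and color_monitor_type. How the words are actually obtained (for instance via Pd←InputNoPE) is outside the sketch.

```python
def iob_field(word, first, last):
    """Extract bits first..last of a 16-bit word, counting bit 0 as the MSB (assumed)."""
    width = last - first + 1
    return (word >> (15 - last)) & ((1 << width) - 1)


def terminal_input_bit(map_in_hi):
    # MapInHi[0]: 7-wire terminal input bit for the DispM terminal interface.
    return iob_field(map_in_hi, 0, 0)


def dispm_installed(map_in_hi):
    # MapInHi[1] reads as 1 when DispM is present; a missing board reads as zero.
    return iob_field(map_in_hi, 1, 1) == 1


def color_monitor_type(map_in_lo):
    # MapInLo[0:3]: 4-bit color monitor type jumpered on the backpanel.
    return iob_field(map_in_lo, 0, 3)


present = dispm_installed(0x4000)   # word with only bit 1 set
absent = dispm_installed(0x0000)    # what a nonexistent board would return
```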
DispM Terminal Interface
The independent terminal interface on the DispM board functions much the same as a single-channel DispY board, but is specialized to driving a binary monitor via a 7-wire interface. The data path is one bit/pixel; the resolution is full; there is no MiniMixer; and the horizontal waveforms are fixed by a PRom (which must be changed when a different type of 7-wire terminal is installed).
Aside from these limitations, the DispM terminal interface operates almost identically to the A channel of DispY. In particular, the io addresses are grouped parallel to the ones on DispY, and the data formats are identical; so a microprogram can initialize TIOA to the correct group and subsequently use the function that changes only TIOA[5:7] to select registers within that group. This enables practically all the microcode for driving a 7-wire terminal to be shared between DispY and DispM.
In Figure 15, the DispY io operations that are also defined for DispM are marked with an asterisk. Note that outputs to unused NLCB addresses are ignored.
Due to hardware differences between DispY and DispM, the ALeftMargin and CursorX values must be computed slightly differently. For DispM driving the 7-wire interface, ALeftMargin = −(B+L−130) and CursorX = −(B+X+190).
Note: DispM does not have a muffler system. In particular, the MonitorType muffler value is always read from DispY. By convention, this refers to the type of 7-wire terminal attached to the Dorado, whether that terminal is connected to DispY or to DispM. Also by convention, the 7-wire terminal is always connected to DispM if DispM is installed.
DDC Initialization Requirements
The two low-order bits in the Statics register are called DWTShutUp and DHTShutUp. They are forced true by IOReset and prevent the respective task wakeups from happening. They are individually set or cleared by the Statics output command. In addition, IOReset sets the signal DoradoHasHRam; this will prevent horizontal sync from being sent to monitors until the HRam has been loaded and released by firmware. Blanking is sent to monitors as long as DHTShutUp remains true. It is anticipated that DHTShutUp will be left true until all DDC initialization has been completed by the emulator (or by the DHT running in response to a Notify).
Some other initialization requirements are as follows: aLeftMargin should be loaded with a large negative value in case one of the channels remains unused forever; the Cursor in NLCB should be zeroed in case the cursor is completely off-screen forever; HRam must be loaded with monitor-specific waveforms; the pixel clock rate must be set; mixer modes must be set; the MiniMixer must be loaded. In addition, the DHT must explicitly set the aAddress registers to zero on behalf of the DWT, which cannot initialize itself completely for each subtask.
Speed and Resolution Limits
High performance color monitors are typified by the following performance limits:
22 μs     horizontal scan time
5 μs      horizontal blanking time
800 μs    vertical blanking time
Parameters for a particular monitor can be modified slightly through hardware adjustments, but cannot be controlled by the DDC, which must provide control signals with timing appropriate for the monitor. Consequently, a monitor must be chosen that conforms to the speed limitations of the DDC.
One important speed limitation is how fast bits can be moved from storage through the DDC. This limit is derived using the following parameters:
F     Frame update rate. High speed phosphors require a minimum update rate of 30 frames/sec with interlaced operation for reasonable visual effects; this is marginal and faster update is desirable.
S     Scan lines/frame.
VR    Vertical retrace time; with interlaced operation, there will be two vertical retraces/frame.
HB    Horizontal blanking time.
HS    Horizontal scan time. The FIFO must not go empty during the horizontal scan or garbage will be displayed.
T     Time/munch or the rate at which storage can deliver data for IOFetches; this is 1 munch/8 cycles = 1 munch/0.4 μs.
M     Munches/scan line that the fast io system can deliver.
The time required to fill the FIFO for both channels is a little longer than 30*8+20 cycles (= 276 cycles) or about 13.8 μs at a Dorado clock period of 50 ns; this follows from the fact that there are 15 munches/channel or a total of 30 munches of FIFO storage, and the fast io system can deliver one munch per 8 cycles with the first munch arriving 20 cycles after the first IOFetch←. 13.8 μs is much smaller than the vertical blanking time and longer than the horizontal blanking time, so the FIFO will start out full at the beginning of a field and will be actively refilling itself during HS+HB of each scan line. If the memory system keeps up with the demands of the video channels, then the FIFO will tend to refill itself after momentary transients in which it empties out a little.
Consequently, since each frame consists of S scan lines plus two vertical retraces, we know that HS+HB = (1/F − 2*VR)/S, and that M = (HS+HB)/T less corrections for refresh references, storage references by other tasks, hold, and delays for tasks of higher priority than DWT. At F = 30 frames/sec, VR = 800 μs, and S = 1000 scan lines, we get HS+HB = 31.7 μs and M = 31.7/0.4 = 79 munches less corrections. There will be an average of two refresh references/scan line, so we get an upper bound of 77 munches = 19,712 bits/scan line from storage.
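The arithmetic in this paragraph can be reproduced with a few lines of Python. This is only a sketch of the calculation: it treats a frame as S scan lines plus two vertical retraces, takes all times in microseconds, and uses 256 bits/munch, which is what 77 munches = 19,712 bits implies. The function names are hypothetical.

```python
US_PER_SECOND = 1_000_000.0
BITS_PER_MUNCH = 256  # implied by 77 munches = 19,712 bits


def line_time_us(frames_per_sec, lines_per_frame, vr_us):
    """HS+HB in microseconds: frame time less two vertical retraces, per scan line."""
    frame_us = US_PER_SECOND / frames_per_sec
    return (frame_us - 2 * vr_us) / lines_per_frame


def munches_per_line(line_us, munch_us=0.4, refresh_refs=2):
    """Whole munches deliverable in one scan line, less refresh references."""
    return int(line_us / munch_us) - refresh_refs


line_us = line_time_us(frames_per_sec=30, lines_per_frame=1000, vr_us=800.0)
munches = munches_per_line(line_us)
upper_bound_bits = munches * BITS_PER_MUNCH
```

With the example values, line_us comes out near 31.7, munches is 77, and upper_bound_bits is 19,712. The other corrections mentioned in the text (other tasks, hold, higher-priority tasks) are not modeled.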
However, the DWT will not get all storage bandwidth. The DWT wakeup spacing is controlled by a SIP; the smallest reasonable spacing would result in one IOFetch every 8 cycles; closer spacing would result in hold while a preceding IOFetch completed, so more processor cycles would be consumed without improving data rate. At this tightest spacing, DWT runs for 2 cycles out of every 8. Conceivably, worst case memory activity discussed in the "Fast IO" chapter could occur during these 6 cycles (a clean miss 3 cycles before the IOFetch, followed by a dirty miss 2 cycles before the IOFetch, each by a different task). However, the large amount of storage in the FIFO allows us to rely upon statistics to average out memory competition, so it is probably reasonable to allow DWT at least 80% of storage bandwidth or about 16,000 bits/scan line in the above example, which would accommodate 1000 lines x 1000 pixels/line x 16 bits/pixel. For HB = 5 μs this is equivalent to a pixel clock period of 26.7 ns.
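The two figures at the end of this paragraph follow from simple arithmetic, sketched below. The sketch takes the 19,712 bits/scan line upper bound and the 31.7 line time from the preceding example, treats the line time and the blanking time as microseconds, and uses hypothetical function names.

```python
def dwt_bits_per_line(upper_bound_bits, share=0.8):
    """Bits per scan line available to DWT at the given share of storage bandwidth."""
    return share * upper_bound_bits


def pixel_period_ns(line_us, hb_us, pixels_per_line):
    """Pixel clock period in ns when the visible scan (HS) carries all the pixels."""
    hs_us = line_us - hb_us
    return hs_us * 1000.0 / pixels_per_line


dwt_bits = dwt_bits_per_line(19712)
period_ns = pixel_period_ns(line_us=31.7, hb_us=5.0, pixels_per_line=1000)
```

Here dwt_bits is a little under 15,800, which the text rounds to about 16,000, and period_ns is 26.7 to within floating-point rounding.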
This is only one speed limitation. Since the 32-bit wide FIFO is accessed once/cycle alternately by the A and B channels (i.e., 16 bits/cycle/channel), and since exactly three doublewords are fetched before the horizontal scan begins for each channel, the maximum bits/scan line for each channel is about (3*32 bits)+[(26.7 ns/pixel)*(16 bits/50 ns)*(1000 pixels/line)] = 8640 bits/scan line. This means that unless both channels are running at the same data rate, the data rate will be significantly below the upper bound determined above. For example, in 24Bit mode, if the A channel runs at full resolution and gets 8640 bits/scan line, the B channel will run at half resolution and get only 4320 bits/scan line, so the maximum data rate would be about 1000 lines x 538 pixels/line x 24 bits/pixel.
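The per-channel limit can be checked the same way. The sketch below simply evaluates the expression given in the paragraph, with three 32-bit doublewords fetched ahead of the scan and 16 bits per 50 ns per channel out of the FIFO; the function name and parameter names are hypothetical.

```python
def channel_bits_per_line(pixel_ns, pixels_per_line,
                          prefetch_doublewords=3,
                          bits_per_access=16, access_ns=50.0):
    """Maximum bits one channel can take from the FIFO for one scan line."""
    prefetch_bits = prefetch_doublewords * 32
    scan_ns = pixel_ns * pixels_per_line
    # During the scan the channel receives bits_per_access bits every access_ns.
    return prefetch_bits + scan_ns * bits_per_access / access_ns


full_res_bits = channel_bits_per_line(pixel_ns=26.7, pixels_per_line=1000)
half_res_bits = full_res_bits / 2
```

full_res_bits evaluates to 8640 and half_res_bits to 4320, up to floating-point rounding, matching the figures quoted for the A and B channels.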