DisplayControllerStrawMan.tioga
Hoel, June 9, 1986 6:31:07 pm PDT
Hoel, June 7, 1986 11:38:54 am PDT
Display Controller Straw Man
for June87 Dragon
Introduction
For many years, the Dragon project in CSL has been working on defining and implementing the next generation of personal workstation for use within PARC. The workstation incorporates multiple Dragon processors, implemented with custom VLSI circuits, interconnected in a shared memory architecture. It is intended to be much more powerful than the Dorado workstation it will replace, and will serve as a vehicle for research in multiprocessing software for years to come.
The Dragon project had been attempting to become and remain bus-compatible with the Dragonfly project in Xerox's Electronics Division, and had been relying on the Dragonfly project to provide many components of a Dragon workstation (chassis, backplane, memory subsystem, I/O subsystem, etc.), so that CSL could concentrate on the components unique to Dragon and/or of particular research value.
But recently, the Dragonfly project was cancelled. This caused some Dragon people to ask whether the Dragon project would benefit by revisiting the choices of packaging, bus protocol, etc. to see if better alternatives could be found.
On May 23, 1986, a subcommittee of people working on the Dragon project was selected to propose a "straw man" Dragon computer which could be operational by June of 1987 and would meet our requirements for a first generation Dragon.
Purpose
The subcommittee decided that various individuals would go off and write proposals for various subsystems of Dragon, to serve as a starting point for further discussion. I was assigned to write about the display controller.
The purpose of this document is to propose a feasible plan for providing the June87 Dragon (we really need a better name) with a satisfactory display controller, and to discuss and analyze the alternatives so as to motivate the one proposed.
This document will be discussed by the aforementioned Dragon subcommittee, along with other proposals for other aspects of Dragon, and then the straw man proposal we were chartered to produce will be written.
References
[FCJH] Frank Crow and Jeff Hoel, The Dragon Display, CSL Notebook Entry 85CSLN-0002, May 14, 1985,
/Indigo/CSL-Notebook/Entries/85CSLN-0002.tioga
[JHRE] Jeff Hoel, Display Controller Real Estate, June 6, 1986,
/Nimitz/Cedar/Users/Hoel.PA/Dragon/June87/DisplayControllerRealEstate.tioga
[PS] Pradeep Sindhu, The DynaBus Logical Specifications, June 2, 1986,
/Indigo/Dragon/Documentation/DynaBusLSpecs.tioga
Context
Before considering specific proposals for display controllers, perhaps it would be good to consider the goals of the Dragon project, both short-term and long-term. Then we can evaluate the various proposals in terms of how well they meet the goals of the project.
(By the way, I don't mean to suggest that the personal goals of the project members shouldn't be considered. But let's start by considering the goals we think we share or ought to share project-wide.)
By considering project goals, perhaps we can better understand the following:
o how the June87 Dragon straw man fits within the overall Dragon plan;
o how the June87 display controller straw man fits within the overall Dragon display plan; and
o how the June87 display controller straw man fits within the June87 Dragon straw man.
In the following sections, I've proposed some goals I think we probably share. Please have a look, to see if you agree. Please feel free to propose modifications, additions, and/or deletions.
Ultimate Dragon Goals
In the long run, here's what we hope the Dragon project will do:
o Provide a vehicle for software research into multiprocessing using a shared memory architecture.
o Provide a state-of-the-art computing environment for researchers within PARC. (Lots of CSL folks are depending on this.)
o Test and demonstrate the validity of the hardware architecture concepts. These concepts include the idea of snoopy caches; the DragOps instruction set; and the notion of multi-level busses, to name but a few.
o Provide the flexibility to integrate new hardware subsystems easily. We want to build both workstations and servers. We want to plug in things we haven't even thought of yet.
o Provide a forcing function for research into DA tools.
o Gain experience in VLSI design.
o Gain experience in other technologies appropriate for making state-of-the-art hardware systems.
o Transfer Dragon technology to the wider Xerox community.
June87 Dragon Goals
For the June87 Dragon, here's what we think we want to do:
o Minimize the risk of not getting done by June 1987. Actually, I'm not sure why it's June 1987 -- perhaps because it's close enough to instill the right kind of immediacy and urgency. What should we worry about if we don't make it? (It's been said that the difference between research and engineering is the level of acceptable risk. What are we doing here?)
o Provide a vehicle for Dragon software development. The Cedar port will be a big deal, and we won't have a state-of-the-art computing environment until it's done. So it's best to get started as soon as possible. This will require a substantial number of copies of the Dragon hardware. But how many? I've heard we might make as many as 100 June87 Dragons, but that sounds high if Cedar porters are the only customers. What other customers do we have in mind, if any, and what are their requirements? How soon after June 1987 do we intend to provide the next generation of Dragons? What then happens to the old ones? What kind of budget do we need to do that?
o Maximize the opportunity for subsequent generations of Dragons. We don't want to paint ourselves into a corner, by designing a dead-end June87 Dragon. For example, we're designing the bus protocol to work for multiple bus levels, although we'll propose implementing a single bus level for June87.
o Minimize "throw-away" engineering. In a way, this is a continuation of the previous point. After June87, what will have to be completely redesigned for the next generation? Functions within chips? Chips? Chip carriers? Boards? Backplane? Chassis? Or do we want to gain the experience and throw away all the artifacts? I'll bet there's no consensus yet.
o Maximize "spiffiness," while still minimizing risk.
o Demonstrate viability of the Dragon concept. I'm not sure this is really a goal, per se, but I thought I'd mention it. Should the June87 Dragon compare favorably with its competition? What will the competition be? What will the basis of comparison be?
o Live within budget. It seems prudent to assume that headcount is frozen indefinitely. How much can we spend for non-recurring engineering? How much can we spend per June87 Dragon? Can we spend money to reduce risk? To accelerate the schedule? To make something spiffy?
o Have fun. Actually, I'm not sure what to say here. There's a time to ****
<Must be fun?? -- some of us have been having fun longer than others....>
<community service or self-expression?>
Ultimate Display Controller Goals
In the long run, here's what we hope to do with display controllers for Dragons. Let me limit the list to features I think have system implications.
o Provide the flexibility to integrate new display controllers easily.
o Provide sufficient packaging real estate to do interesting things.
o Support multiple displays.
o Support highest quality displays. (CRT technology is likely to predominate for many years, although other technologies should not be precluded.)
- Both monochromatic and color displays.
- Ultra-high resolution, e.g., 2048 x 2048 pixels. Resolution should not be limited by the bandwidth of the main bus, which implies that the display ought not to have to depend on the main bus for refresh.
- Flicker-free image. This requires a refresh rate of at least 60 Hz, non-interlaced.
o Support pixel depths up to at least 24 bits per pixel.
o Have a memory mapped frame buffer. This architecture provides convenient, high-bandwidth access to the frame buffer for Dragon and other processors. In a research environment, it seems appropriate to rely on the general-purpose Dragon processors to provide the horsepower required for most graphics applications; but special-purpose processors should not be precluded. Within CSL the idea of supporting video over high-speed local area networks is popular, so perhaps a special-purpose local area network processor for video would make sense. In any case, a mechanism is required for assuring data consistency of the frame buffer.
June87 Display Controller Goals
For the June87 Dragon, here's what we think we need to do:
o Minimize the risk of not getting done by June 1987. This applies both to the display controller hardware and to the display head software.
o Provide a display adequate for software development.
June87 Pre Straw Men
At the Dragon Meeting of May 23, 1986, Jean Gastinel presented his ideas of what the 1987 Dragon straw man might turn out to be. (These ideas were also discussed at subsequent subcommittee meetings. I may be a little confused about what I heard where, but it shouldn't really matter.) The Dragon subcommittee was not required to accept any of what was presented as cast in stone. But Ed McCreight recommended that we consider it as a place to start. Let me call these ideas "pre straw men."
June87 Dragon Pre Straw Man
Here's a sketch of the features of the June87 Dragon pre straw man:
o High-speed bus: The main bus proposed has a raw bandwidth of 200 MB/s, five times as much as its predecessor, the M bus. It is 64 bits wide (not 32) and runs at 25 MHz (not 10 MHz). Due to bus protocol overhead, the maximum effective data bandwidth, assuming 100% bus utilization, is 114 MB/s. (The estimate of 25 MHz is said to be "conservative;" more optimistic estimates have ranged as high as 100 MHz.)
o One-board implementation: To achieve the proposed bus speed, it is necessary for the physical extent of the bus to be rather small. Putting the whole Dragon on one 10.9 x 16 inch board addresses that requirement. (An optional two-board system was discussed, but the main bus was confined to one of the two boards, and an extended memory system was proposed for the other.) A case can be made that a one-board system is simpler to make than a multi-board system, since the backplane need only supply power.
o Chip carrier packaging: To fit everything on one board seems to require more exotic packaging technology than we had previously been considering. Rather than package chips individually in pin grid arrays, it was proposed that Dragon clusters (consisting of EU, IFU, and two caches -- no floating point) be mounted on 3 x 6 cm chip carrier modules. Also, it was proposed that memory be packaged as nine-chip hybrid SIPs, using 1M DRAMs.
o Processors: It was proposed to put four processor modules on the system board, since they will fit and won't swamp the bus.
o Chassis: It was proposed to use a 6085 (Daybreak) chassis, because it's done and is reasonably clever. The system board would be allowed to occupy two slots, since some of its components are rather tall.
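As a quick check of the bus figures in the first bullet above, here is a small Python sketch of the arithmetic. The widths, clock rates, and the 114 MB/s effective figure are taken from the text; the function name is mine, and the "efficiency" it prints is simply the ratio of the two quoted numbers, not something derived from the bus protocol.

```python
def raw_bandwidth_mb_per_s(width_bits, clock_mhz):
    """Raw bandwidth in MB/s: bytes per cycle times millions of cycles per second."""
    return width_bits / 8 * clock_mhz

main_bus_raw = raw_bandwidth_mb_per_s(64, 25)  # proposed main bus: 64 bits at 25 MHz
m_bus_raw = raw_bandwidth_mb_per_s(32, 10)     # M bus: 32 bits at 10 MHz
speedup = main_bus_raw / m_bus_raw

# Ratio of the quoted effective bandwidth (114 MB/s) to the raw bandwidth.
efficiency = 114 / main_bus_raw

print(main_bus_raw, m_bus_raw, speedup, efficiency)
```

The raw figures come out to 200 MB/s and 40 MB/s, which is the factor of five claimed, and the quoted effective bandwidth corresponds to using about 57% of the raw cycles for data.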
June87 Dragon Display Controller Pre Straw Man
Here's a sketch of the features of the display controller proposed for the June87 Dragon pre straw man:
o Display: There need be only one display. Color would be awfully nice. Refreshed at 60 Hz, non-interlaced. Reasonably high resolution, by today's commercial standards: 1000 x 1000 pixels. Pixels can be 8, 4, 2, or 1 bits. There's a colormap.
o Screen Refresh: The frame buffer is kept in main memory, and accessed over the main bus, even for refreshing the screen. This solves the data consistency problem. But the display requires 105 MB/s of raw bus bandwidth -- more than half of the total available bus bandwidth. A case was made that this bandwidth can be had almost for free, provided the display controller is designed so that it normally uses bus cycles not usable by other processors. To keep the average bus latency for processors small, the processor utilization of the available bus bandwidth should not exceed, say, 50%, so the other 50% is available for use by the display.
o Simple: The whole display controller consists of one custom chip, to inhale the pixels from the main bus, and an off the shelf colormap-and-DAC chip, plus incidentals. This is consistent with putting the whole Dragon system on one board.
Discussion of Pre Straw Men
June87 Dragon Pre Straw Man
Packaging
(Please excuse my intruding on the domain of the packaging subsubcommittee. I expect their report to be much more thorough than this is. I just wanted to be sure the following points were raised somewhere, because I think they affect the display controller.)
The packaging technology proposed for the June87 Dragon pre straw man is substantially more aggressive than what we had previously been planning to use in the Dragonfly-compatible days. It's neat stuff, of course. But do we need exotic packaging?
One result of using exotic packaging is that the main bus bandwidth is vastly increased. Let's compare exotic packaging with conventional packaging, with the main bus on a backplane and, say, four boards plugged into it. Let's assume the number of data wires and the bus protocol efficiency are fixed, so the performance differences are due only to bus cycle time. Suppose the conventional packaging's cycle time is between 80 ns and 100 ns. (I think that's conservative, but I don't have the evidence.) Suppose the exotic packaging's cycle time is between 40 ns and 20 ns. So the exotic packaging is between twice and five times as fast as conventional packaging. That's neat. But do we need it for the June87 Dragon?
The one real-time requirement placed on the main bus is the requirement of refreshing the display. (Let's assume the disk and ethernet are sufficiently buffered on the slow I/O bus side so that they don't present a real-time requirement to the main bus. Let's further assume that tasks running on Dragon processors don't have real-time requirements.) But it's possible to reduce the display's required bus bandwidth by a factor of eight by demoting it to a monochromatic binary pixel display. (For Dragons after June87, displays will probably rely on frame buffer caching, so they won't require extreme bus bandwidths; but even if they did, the backplane bus, now a second-level bus, could have a bandwidth as high as a first-level bus with exotic packaging.)
In terms of the goals for the June87 Dragon, sacrificing a color display would not be "spiffy" and wouldn't be as much fun (perhaps). But it would be adequate for software development and it might reduce the risk a great deal.
The same is true of Dragon processors. They wouldn't be as "spiffy" if they were slower, but they would be adequate for software development.
Another reason to prefer exotic packaging technology is that it's more compact. It gives us the opportunity to make a one-board (or perhaps two-board) system. But do we need a one-board system for June87 Dragon?
The June87 Dragon pre straw man proposes to have four Dragon processors. Suppose, as a worst case, that conventional packaging could support only two processors. Would that be adequate for software development? How about three processors?
Exotic packaging is likely to be expensive. When we visited Raychem, we got a very rough estimate that each design iteration of a Dragon processor chip carrier might cost about $100,000. In terms of risk, the Raychem chip carrier program is still in the research phase. They won't have a production facility until late this year at the earliest, and Richard Bruce guessed they might not be in routine production for at least a year. To meet June87 schedules, we'd be sharing their research risk. Jean points out that conventional (thick-film ceramic) hybrid technology has been around for years, used mostly by the military. True, but I'm not sure how expensive it is.
Perhaps the goal of demonstrating the viability of the Dragon concept is a bigger deal than I had realized. For June87, who will the competition be and how will we have to compare with it? In particular, will we have to compare favorably in terms of speed and size? And will the competition be using exotic packaging? (It is rumored that DEC has just bought Trilogy.)
I guess there are no easy answers.
<Do we need "exotic" packaging? Why?>
<Do we need more bus bandwidth than conventional packaging can provide? Why?>
<(A factor of between 2x and 4x?)>
<Support display refresh -- but requirement could be more modest>
<both for June87 (fewer bits per pixel) and beyond (frame buffer caching)>
<Keeping Up With The Joneses -- who's the competition? (DEC bought Trilogy)>
<Do we need a one-board (or two-board) design?>
<Propose four-board design -- here or elsewhere?>
<Do we need a 6085 chassis?>
DA Tools
In a recent discussion, Jean and I considered one reasonable development schedule for getting a design done in a year:
Schematics:  3.5 months
Layout:   3.5 months
Fab & Test 1: 2.5 months
Fab & Test 2: 2.5 months
To meet such a schedule, it's clear that the DA tools to be used must be selected and provided to the environment very soon. A workable approach must be recommended for each of the following:
Custom ICs
Standard Cell ICs
Chip Carriers
Printed Circuit Boards
June87 Dragon Display Controller Pre Straw Man
The June87 Dragon Display Controller pre straw man is not at all "spiffy," which
Display
A single display is proposed. That's fine, given that the goal is software development. But is a color display needed or even desirable? Many people think that for reading text, a monochromatic display causes less eye fatigue. I'm not sure whether that's because a monochromatic display assures that garish colors are not used or because the color tube's shadowmask limits effective screen resolution. Some folks who saw Versatec's 1280 x 1024-pixel color display, refreshed at 60 Hz, had the impression that eye fatigue wouldn't be a problem. I guess it's no big deal to design for color and then plug in monochromatic at the last minute. But it might entail somewhat less risk to design for monochromatic from the beginning.
The proposed display resolution is 1000 x 1000 pixels. (How about 1280 x 960 pixels instead? Some people like the 4:3 aspect ratio, which is said to conform to available monitor shapes.) The current LF display is 1024 x 800 pixels; this, I feel, should be regarded as an absolute minimum for June87. (Since we know how to do resolution-independent software, I guess "spiffiness" is the main reason. Of course, lower resolution would be more painful to use for software development -- or anything else.)
The proposed refresh rate, 60 Hz non-interlaced, is fine. To do less than this would be an embarrassment.
Supporting 8, 4, 2, and 1-bit pixels seems fine. Beyond June87, extension to 24 bits is an open question. Perhaps the extension should be to 32 bits. (Can we say memory is cheap enough to waste 8 bits out of every 32? Can we find a use of them?) Alternatively, perhaps 24 bits should be divided into three groups of 8 bits, with each group stored separately. Anyhow, I guess the scheme proposed for June87 doesn't preclude different schemes for subsequent generations.
Screen Refresh
Refreshing the display over the main bus is fine, if the bus can support it. It obviates having to manage a local frame buffer memory, so it makes the display controller simpler and reduces risk. I'm not convinced it points the way to future generations, but perhaps it doesn't have to.
If the display is 1280 x 960 8-bit pixels, refreshed at 60 Hz non-interlaced, the required raw bus bandwidth is 130 MB/s (not the proposed 105 MB/s).
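Here is a rough Python sketch of how the refresh bandwidth figures can be reproduced. It rests on two assumptions of mine that should be checked: that retrace time is ignored (the pixel data rate is just pixels per frame times frames per second), and that display transfers pay the same protocol overhead as everything else, so raw bandwidth is data bandwidth scaled by 200/114. The function names are hypothetical.

```python
BUS_RAW_MB_S = 200.0        # raw main bus bandwidth quoted for the pre straw man
BUS_EFFECTIVE_MB_S = 114.0  # effective data bandwidth quoted for the pre straw man

def pixel_data_mb_per_s(width, height, bits_per_pixel, refresh_hz):
    """Pixel data rate in MB/s, ignoring retrace time (assumption)."""
    return width * height * bits_per_pixel / 8 * refresh_hz / 1e6

def raw_bus_mb_per_s(data_mb_s):
    """Raw bus bandwidth consumed, assuming the quoted overhead ratio applies."""
    return data_mb_s * BUS_RAW_MB_S / BUS_EFFECTIVE_MB_S

raw_1000 = raw_bus_mb_per_s(pixel_data_mb_per_s(1000, 1000, 8, 60))
raw_1280 = raw_bus_mb_per_s(pixel_data_mb_per_s(1280, 960, 8, 60))
raw_1280_mono = raw_bus_mb_per_s(pixel_data_mb_per_s(1280, 960, 1, 60))

print(round(raw_1000, 1), round(raw_1280, 1), round(raw_1280_mono, 1))
```

Under these assumptions the 1000 x 1000 case comes out at about 105 MB/s and the 1280 x 960 case at about 129 MB/s, close to the figures quoted. A one-bit-per-pixel display needs one eighth as much.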
The proposal hopes that the display's required bus bandwidth can be had almost for free, because the display controller will normally use bus cycles no other processor can use. To meet this requirement, the display controller must be able to request bus transfers at a priority lower than that of any other processor. But to guarantee that the screen is refreshed in real time, the display must also be able to request bus transfers at a priority higher than that of any other processor. The display processor must keep a queue of pixels to be displayed. Whenever the queue is fuller than a certain critical amount, it makes low priority requests, but whenever the queue is less full than the critical amount, it makes high priority requests. (This priority scheme must be implemented by the arbiter. It is different from the one proposed in Pradeep's DynaBus spec [PS], which I understand was going to be revised anyhow. Multi-priority requests can be implemented with multiple request wires or with a single wire, presenting a sequence of bits to the arbiter. Details later.)
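The two-priority rule can be sketched in a few lines of Python. This is only an illustration of the decision the display controller makes, not a description of the arbiter. The function name, the units (queue fill measured in pixels), the behavior at exactly the critical amount, and the choice to make no request when the queue is full are all assumptions of mine.

```python
LOW = "low"
HIGH = "high"

def request_priority(fill, critical, capacity):
    """Choose the priority of the next bus request from the pixel queue fill level.

    Returns None when the queue is full (assumption: nothing to request then),
    LOW when the queue is fuller than the critical amount, and HIGH otherwise.
    Treating fill == critical as HIGH is the conservative choice (assumption).
    """
    if fill >= capacity:
        return None
    if fill > critical:
        return LOW
    return HIGH

# Hypothetical numbers: a 512-pixel queue with a critical amount of 100 pixels.
for fill in (512, 300, 101, 100, 20):
    print(fill, request_priority(fill, critical=100, capacity=512))
```

The point of the sketch is that the controller behaves as a good citizen most of the time and becomes insistent only when the queue is about to run dry.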
The critical amount is directly related to the maximum number of bus cycles which can elapse between the time the display controller asks for a transfer at high priority and the time the data is received. One estimate of the critical amount follows. (You may not agree with the numbers, but it doesn't matter; the idea is the main point. I'm assuming 40-ns cycles.)
Completion of current request:   360 ns    9 cycles
Main memory refresh:             200 ns    5 cycles
Request:                                   2 cycles
Memory latency:                  360 ns    9 cycles
Reply:                                     6 cycles
Misc pipelining:                           2 cycles
                                          ---------
TOTAL                                     33 cycles
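To turn the latency estimate into a queue depth, one can sum the per-row cycle counts and ask how many pixels the screen consumes in that time. The Python sketch below does this. Summing the rows as listed gives 33 cycles, or 1320 ns at 40 ns per cycle. The pixel rate is my assumption: 1000 x 1000 pixels at 60 Hz, ignoring retrace, so the true peak rate during a scan line would be somewhat higher.

```python
CYCLE_NS = 40  # bus cycle time assumed in the estimate

# Cycle counts for each row of the estimate.
latency_cycles = {
    "completion of current request": 9,
    "main memory refresh": 5,
    "request": 2,
    "memory latency": 9,
    "reply": 6,
    "misc pipelining": 2,
}

total_cycles = sum(latency_cycles.values())
total_ns = total_cycles * CYCLE_NS

# Assumption: average pixel rate for 1000 x 1000 at 60 Hz, retrace ignored.
PIXELS_PER_SECOND = 1000 * 1000 * 60

# Pixels displayed while waiting, rounded up (integer ceiling division).
critical_pixels = -(-total_ns * PIXELS_PER_SECOND // 1_000_000_000)

print(total_cycles, total_ns, critical_pixels)
```

Under these assumptions the critical amount is on the order of 80 pixels, which is small compared with the queue sizes needed to ride out bursty traffic.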
Whenever the pixel queue is fuller than the critical amount, the display can be a good citizen and make low priority requests. Clearly, the larger the queue the better, but we really don't have any statistics to say how much better. The question is how badly "bursty" bus traffic from other processors deviates from average traffic. For example, when a Dragon processor switches tasks, it's likely that the caches contain nothing of relevance to the new task, so the processor will present a heavy load to the bus until its working sets are established in caches. (A single Dragon processor, running at 5% miss rate, is said to require 16 MB/s, so at a 100% miss rate, it can swamp the bus.) How long after such a task switch must the display controller wait to find a free bus cycle? We don't know. If we assume just one Dragon processor on the bus, filling its caches at top speed until full, then the display controller might like to have a queue with as much capacity as two Dragon caches. That seems to me an impractically large size. More realistically, there are multiple processors on the bus, which makes things even worse, but the peak cache activity profiles are likely to be less severe. Pradeep has volunteered to do some simulations. I suspect that we'll have to take what we get in any case. That is, if the display controller turns out not to be such a good citizen on average, then so be it.
By the way, implementing a huge pixel queue in custom silicon seems to me out of balance. It costs a lot, and all you get is a reduction in bus latency for Dragon processors.
Also by the way, there has been some talk of "improving" the bus protocol for future generations of Dragons by allowing a higher priority request to truncate the current packet to gain access to the bus. I suppose that this would complicate life for the sender and receiver of the truncated packet. The purpose seems to be to reduce latency of high priority things. Why are we so concerned about doing this?
Area
The pre straw man proposes to put the display controller on a small corner of the one printed circuit board. I have the feeling that this approach is risky.
Last week I prepared an estimate of area required for the pre straw man display controller [JHRE]. Although conceptually the design is just a custom chip plus a DAC chip plus incidentals, I counted upwards of 30 components altogether! The footprint area alone was estimated at about 13.5 square inches, which I doubled, to 27 square inches, to account for the fact that the footprints can't be densely packed into a given rectangle. Printed circuit routing requirements were not considered. The display controller has to connect to the main bus and the slow I/O bus, and perhaps also a remote video oscillator. The coax video connectors must be on the edge of the board. These constraints affect whether or not the display controller can effectively use any given shape of printed circuit board real estate. The component list is absolutely guaranteed to change. (For example, should I be using Alfred's bus interface chips? Probably so, but I assumed otherwise for the estimate.) What we really need, to verify that the design will fit in a given area, is to do a detailed layout, not only of the display controller but of the whole board. Do we really want to commit to a one-board design before doing this? I have the further concern that renegotiating space among subsystems could turn out to consume a lot of time and energy and might result in compromises nobody really wants.
There has been some talk of perhaps putting the display controller on a chip carrier module too, perhaps with some other subsystem. I think mixing in another subsystem at that level is asking for trouble. What's our non-recurring engineering budget for chip carrier modules? How many different types can we afford?
Miscellaneous
The idea of limiting the design to using at most one custom chip is prudent under the circumstances.
It has been suggested that a hardware cursor would be nice. Perhaps so, but I wouldn't feel too bad about leaving it out, given that the goal is software development. The current display software, dealing with the Dorado color display, already knows how to do software cursors. (Sure it's a pain, but so what?) Display controllers for generations beyond June87 may have entirely different ideas about how to implement things like cursors, so it's not clear a June87 implementation of hardware cursors would be of lasting value.
Discussion of Other Alternatives
Future Dragon Display Controller
Here's a list of some of the features a future Dragon display controller might have. The list is limited to features that affect system integration.
Packaged on its own printed circuit board: This will be desirable if we want to include a local frame buffer with the display controller, which I think is inevitable. The area could be used for all sorts of other things too. This plan provides the flexibility required to plug in new display controllers as they are developed.
Attaches to second-level main bus: This bus, because it is short (say five connections?) is said to be potentially as fast as first-level busses.
Has its own frame buffer: Ultra-high resolution displays require more bandwidth for screen refresh than is reasonable to expect from the fastest main busses imaginable. But we want to retain a memory mapped architecture, so that Dragon processors can treat the frame buffer as main memory. What is needed is a mechanism for assuring data consistency. We imagine that the display controller works something like a secondary cache, in that it maintains a valid bit for each oct of frame buffer memory. For example, for a 32 MB frame buffer, 1 Mb of valid bits would be needed. Unlike the secondary cache, the valid bits must be stored externally in static RAM chips. This raises a concern that the display controller be as fast as the secondary cache in responding to bus packets that affect valid bits; we assume that a satisfactory solution can be found. According to the present bus protocol, after a Dragon processor reads an oct from memory, it is obliged to issue a write single when it first writes any word of the oct, so that a secondary cache (or frame buffer) can mark the oct invalid. When an oct is written to the frame buffer, its valid bit can be set. When the display processor does screen refresh from the local frame buffer, it requests of the main bus only the octs which are marked invalid. This scheme offers a viable alternative to the previous plan, flushing and laundering caches, which has fallen into disfavor. By the way, unlike a secondary cache, the frame buffer memory needn't be backed up by an equal amount of main memory.
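The valid-bit bookkeeping can be illustrated with a small Python model. It is a sketch under assumptions, not a design: I assume an oct is eight 32-bit words (32 bytes), which is consistent with 1 Mb of valid bits for a 32 MB frame buffer, and I assume all octs start out invalid. The class and method names are hypothetical and do not correspond to actual bus packet names.

```python
OCT_BYTES = 32  # assumption: an oct is eight 32-bit words

class FrameBufferValidBits:
    """Toy model of one valid flag per oct of local frame buffer memory."""

    def __init__(self, frame_buffer_bytes):
        self.n_octs = frame_buffer_bytes // OCT_BYTES
        self.valid = bytearray(self.n_octs)  # assumption: all octs start invalid

    def on_write_single(self, byte_addr):
        """A processor has modified a word of this oct: the local copy is stale."""
        self.valid[byte_addr // OCT_BYTES] = 0

    def on_oct_written(self, byte_addr):
        """A whole oct has been written to the frame buffer: the local copy is good."""
        self.valid[byte_addr // OCT_BYTES] = 1

    def octs_to_fetch(self, start_byte, n_bytes):
        """Indices of the octs that refresh must request over the main bus."""
        first = start_byte // OCT_BYTES
        last = (start_byte + n_bytes - 1) // OCT_BYTES
        return [i for i in range(first, last + 1) if not self.valid[i]]

fb = FrameBufferValidBits(32 * 2**20)        # 32 MB frame buffer
for addr in range(0, 1280, OCT_BYTES):       # one 1280-byte scan line arrives
    fb.on_oct_written(addr)
fb.on_write_single(100)                      # a processor dirties byte 100
print(fb.n_octs, fb.octs_to_fetch(0, 1280))
```

In this example only the oct containing byte 100 has to be fetched over the main bus during refresh; the rest of the scan line comes from the local frame buffer.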
Previous Display Controller Proposal
Perhaps it would be interesting to compare the current display processor proposals with a proposal made last year, described in CSL Notebook Entry 85CSLN-0002 [FCJH]. That proposal was definitely more ambitious than now seems appropriate for June87.
The central concept of this proposal was that it could do "thing composition on the fly." That is, it could compose each scan line, in real time, from a set of rectangular pixelmaps stored in the frame buffer, as directed by a display list. The composition was performed by painting pixel spans into the scan line in back-to-front order, with pixels in front overwriting pixels behind; however, pixels were permitted to have a "transparent" value. This paradigm served as an implementation of cursors, overlays, and multiple windows.
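The back-to-front painting rule is easy to show in Python. The sketch below composes one scan line from a list of pixel spans, with later spans overwriting earlier ones except where a pixel is transparent. The data layout (a span as a starting x position plus a list of pixel values, with None standing for the transparent value) is my own simplification and is not the display list format of [FCJH].

```python
TRANSPARENT = None  # hypothetical stand-in for the "transparent" pixel value

def compose_scan_line(width, background, spans):
    """Paint spans into a scan line in back-to-front order.

    spans is a list of (x_start, pixels) pairs, rearmost first. Opaque pixels
    overwrite whatever is behind them; transparent pixels leave it showing.
    Pixels falling outside the scan line are clipped.
    """
    line = [background] * width
    for x_start, pixels in spans:
        for offset, pixel in enumerate(pixels):
            x = x_start + offset
            if 0 <= x < width and pixel is not TRANSPARENT:
                line[x] = pixel
    return line

# A window (value 1) with a cursor (value 2) on top; the cursor has a hole in it.
result = compose_scan_line(8, 0, [(1, [1, 1, 1, 1]), (3, [2, TRANSPARENT, 2])])
print(result)
```

Because every span is painted whether or not it ends up visible, the painting rate can greatly exceed the refresh rate, which is why that proposal wanted so much internal bandwidth.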
The proposed packaging was an entire printed circuit board. (Actually, the proposed 24-bit implementation required three such boards. Details of inter-board synchronization were not entirely worked out.)
There was a local frame buffer, implemented with dual port dynamic RAMs. An internal bus, 256 bits wide and operating at 20 MHz, connected the DRAMs' second port to the pixel composer. This large bandwidth was thought to be necessary for "thing composition on the fly," because the pixel painting rate can be many times the screen refresh rate. The system relied on flushing and laundering caches to assure data consistency.
A double-buffered scan line (of, say, 1280 pixels each -- the details were never really frozen) served not only as a place to compose scan lines but as a pixel output queue. The pixel painter painted one scan line while the pixel reader asynchronously output the other scan line to the screen. Once per scan line, during horizontal retrace, resynchronization occurred so that the painter and the reader could swap scan lines. In this way, the memory for scan lines could be single port, not dual port, as is common for FIFOs. To handle the large number of pins of the internal bus (256) and the silicon complexity of the double-buffered scan line, the composition function was split among four identical chips, each processing two bits of each eight-bit pixel.
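Two ideas in this paragraph, the swapping scan line buffers and the division of each pixel among four chips, can be sketched in Python. This is only an illustration. The class and function names are mine, and the assignment of bit pairs to chips (chip 0 getting the two least significant bits) is an assumption, since the text does not say which chip handled which bits.

```python
class DoubleBufferedScanLine:
    """Two scan line buffers: one being painted while the other is read out."""

    def __init__(self, width):
        self.buffers = [[0] * width, [0] * width]
        self.paint_index = 0

    @property
    def paint_buffer(self):
        return self.buffers[self.paint_index]

    @property
    def read_buffer(self):
        return self.buffers[1 - self.paint_index]

    def swap(self):
        """Exchange roles; in hardware this happens during horizontal retrace."""
        self.paint_index = 1 - self.paint_index

def split_into_slices(pixel):
    """Split an 8-bit pixel into four 2-bit slices, least significant first."""
    return [(pixel >> (2 * k)) & 0b11 for k in range(4)]

def join_slices(slices):
    """Reassemble an 8-bit pixel from its four 2-bit slices."""
    return sum(s << (2 * k) for k, s in enumerate(slices))

line = DoubleBufferedScanLine(1280)
line.paint_buffer[0] = 0xB7   # painter writes one pixel
line.swap()                   # retrace: the painted line becomes the line read out
slices = split_into_slices(line.read_buffer[0])
print(slices, hex(join_slices(slices)))
```

Since each buffer is touched by only one party between swaps, neither needs to be dual ported.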
June87 Dragon Alternative
I'd like the Dragon subcommittee to consider the possibility of proposing for the June87 Dragon a four-board system (processor, memory, I/O, display) that uses conventional printed circuit board technology and pin grid arrays of the kind already developed by Bill Gunning.
I'm not saying I'm convinced it's the best alternative. But I think it has the advantage of having lower risk than the June87 Dragon pre straw man. (If I'm wrong, convince me.) We might decide that such a system wouldn't meet the goals for a June87 Dragon. But then I'd like to understand better what those goals are.
As discussed in a previous section (Discussion of Pre Straw Men), such a system would be much slower than the pre straw man.
June87 Dragon Display Controller Alternative A
Following from the previous section, I'd like the Dragon subcommittee to consider putting the display subsystem on its own printed circuit board, interfacing to the main bus on the backplane via a conventional two-piece edge connector. Ideally, I'd like to see the slow I/O bus on the backplane too, since that's the proposed vehicle for loading the colormap.
For a June87 display controller, I'd propose to implement only the functionality proposed by the pre straw man. That would mean that most of the printed circuit board would be wasted. A 10.9 x 16 inch printed circuit board has about 175 square inches of gross area, before subtracting forbidden areas around the edges. By one aggressive estimate, a display controller might need only 27 square inches. But so what? Printed circuit board real estate can be relatively inexpensive.
Anyhow, if the real estate is there, we might find a use for it.
June87 Dragon Display Controller Alternative B
One of the things that worries me about putting the bus interface and the DAC interface on the same custom chip is that the two interfaces are asynchronous. Testing such a part could be a real problem. Discussions with Bill Gunning have reminded me of how much respect asynchronous interfaces should be given. He points out, for example, that the theoretical formula for estimating the probability of metastability (inability to get a digital result when sampling an asynchronous signal) fails to take into account physical factors. For example, an event on the bus side, say, could cause the power distribution wires to bounce, which could affect the timing of the DAC side in such a way that a metastable event is much more likely than one might suppose.
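The memo does not say which theoretical formula is meant. A commonly used textbook model for synchronizer mean time between failures is MTBF = exp(t_r / tau) / (T0 * f_clock * f_data), and the sketch below assumes that model. The function name and every numeric value used with it are hypothetical, chosen only to show how sensitive the estimate is to its inputs. As noted above, the model says nothing about physical coupling such as supply bounce, so a real part may do worse than the number it produces.

```python
import math


def synchronizer_mtbf_seconds(resolve_time_s, tau_s, t0_s, f_clock_hz, f_data_hz):
    """Textbook MTBF estimate for a synchronizer (an assumed model).

    resolve_time_s: time allowed for the flip-flop to settle
    tau_s:          regeneration time constant of the flip-flop
    t0_s:           width of the vulnerable sampling window
    f_clock_hz:     sampling clock frequency
    f_data_hz:      rate of asynchronous input transitions
    """
    return math.exp(resolve_time_s / tau_s) / (t0_s * f_clock_hz * f_data_hz)


# Hypothetical device parameters, for illustration only.
TAU = 1e-9
T0 = 1e-9
short_wait = synchronizer_mtbf_seconds(8e-9, TAU, T0, 100e6, 25e6)
long_wait = synchronizer_mtbf_seconds(30e-9, TAU, T0, 100e6, 25e6)
```

The exponential term dominates: under these assumed values, lengthening the settling time from 8 ns to 30 ns multiplies the estimate by exp(22).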
Given that printed circuit board real estate is cheap, I had considered the possibility of putting the pixel queue into off-the-shelf FIFO parts. Given that state-of-the-art CMOS FIFOs run at 35 MHz maximum, a width of 64 is apparently needed to keep up with the bus. On the DAC side, this must be multiplexed down to 32 wires. (The scheme had somewhat more appeal when I mistakenly thought a 32-bit FIFO would be wide enough.) Anyhow, since such a FIFO would have a capacity of only, say, 512 8-bit pixels, its ability to outlast the "bursty" bus traffic due to Dragon processors and hence to avoid making high priority bus requests might be rather limited. Before the simulations are in, it's hard to be sure. (There's also a detail of knowing when the FIFO is critically full to be worked out.) In the end, I'm not sure I like the idea of actually using commercially available FIFOs, but I think I like the idea of having the printed circuit board real estate available to think about it.
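As a rough check on how long such a FIFO could ride out bus traffic, the arithmetic can be sketched as follows. It assumes the 40 ns bus clock and the 10 ns pixel clock discussed under Alternative C, and it ignores blanking intervals, during which the FIFO is not being drained. The function names are illustrative.

```python
BUS_CLOCK_NS = 40.0    # proposed bus clock
PIXEL_CLOCK_NS = 10.0  # assumption: the 10 ns pixel of Alternative C
BITS_PER_PIXEL = 8


def fifo_depth_words(capacity_pixels, width_bits):
    """Number of FIFO words needed to hold the given number of pixels."""
    return capacity_pixels * BITS_PER_PIXEL // width_bits


def drain_time_ns(capacity_pixels, pixel_ns=PIXEL_CLOCK_NS):
    """Time for the DAC side to empty a full FIFO with no refills."""
    return capacity_pixels * pixel_ns


def bus_cycles_covered(capacity_pixels, pixel_ns=PIXEL_CLOCK_NS, bus_ns=BUS_CLOCK_NS):
    """How many bus clocks a full FIFO can survive without new data."""
    return drain_time_ns(capacity_pixels, pixel_ns) / bus_ns


def word_read_rate_mhz(width_bits, pixel_ns=PIXEL_CLOCK_NS):
    """Rate at which the DAC side must read FIFO words."""
    pixels_per_word = width_bits // BITS_PER_PIXEL
    return 1000.0 / (pixels_per_word * pixel_ns)
```

Under these assumptions a 512-pixel FIFO is 64 words of 64 bits, drains in 5120 ns (128 bus clocks), and is read at 12.5 MHz on the DAC side. Whether 128 bus clocks is enough slack is exactly the question the simulations would have to answer.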
Similarly, if there were space available, it might make sense to implement the display timing generator outside the custom chip. One approach that seems interesting is to use a Xilinx Logic Cell Array, sort of a programmable logic device with downloadable static RAM "fuses." There's a 33 MHz version, on paper at least, which is adequate. I don't yet know whether it's even feasible, let alone desirable.
June87 Dragon Display Controller Alternative C
If building a custom chip with an asynchronous interface seems risky and using off-chip standard FIFOs is undesirable, then what about defining the bus clock and pixel clock to be synchronously related? To match the proposed 40 ns bus clock, we might choose a 10 ns pixel clock. (For a 1280 x 960 pixel display, refreshed at 60 Hz, a 10.9 ns pixel would be better, but a 10 ns pixel would work.)
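The trade-off between the two pixel times can be checked with a little arithmetic. The sketch below assumes that whatever part of the 60 Hz frame time is not spent on the 1280 x 960 visible pixels goes to horizontal and vertical blanking. The function names are illustrative.

```python
def pixel_times_per_frame(pixel_ns, refresh_hz=60):
    """Total pixel clock periods available in one frame time."""
    return 1e9 / (refresh_hz * pixel_ns)


def blanking_fraction(pixel_ns, width=1280, height=960, refresh_hz=60):
    """Fraction of each frame left over for blanking (assumed model)."""
    visible = width * height
    return 1.0 - visible / pixel_times_per_frame(pixel_ns, refresh_hz)


def pixels_per_bus_clock(bus_ns, pixel_ns):
    """Ratio of bus clock period to pixel clock period."""
    return bus_ns / pixel_ns
```

With a 10.9 ns pixel about 20% of the frame is left for blanking; with a 10 ns pixel about 26% is. The 10 ns pixel therefore works but spends more time blanked, and in exchange it gives exactly four pixel clocks per 40 ns bus clock, whereas 10.9 ns gives a non-integer ratio of about 3.67.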
The difficulty is that over time we may wish to change the speed of one of these clocks without changing the other.
June87 Dragon Display Controller Alternative D
For the June87 Dragon, the possibility of purchasing a commercially available display controller, with a standard bus interface (e.g., VME, Multibus, etc.), deserves to be mentioned. This is the approach we always used to say we would take. (The latest such candidate was the Dragonfly's monochromatic display.) This approach has the advantage that the "real" display can be developed on its own schedule without impacting the schedule of the rest of the machine.
As I understand it, the June87 Dragon will have a slow I/O bus of some kind implemented, but we don't yet know what kind. Does this mean that it will have a standard bus interface such as VME or Multibus?
One problem with this alternative is that it will not be a memory-mapped architecture. Rather, it will have its own local frame buffer, which must be written as if it were I/O. So it will probably be more work to convert existing software to use such a display. This software work is not to be taken lightly, as it also has an impact on the overall Dragon schedule.
The packaging of such a display controller within the June87 Dragon is another problem.
Final Remarks
What display controller to propose for the June87 Dragon depends on lots of factors, which the subcommittee should further consider. It is important to understand the goals for June87 Dragon. And having understood them, I think it's important to write them down.
If we decide to implement a one-board solution, using exotic packaging and allocating the display controller about 30 square inches of real estate, then it's hard to see an option other than the one proposed by the pre straw man, unless we were willing to settle for a commercially available display (Alternative D above).
A minimalist approach ought to suffice. I put a hardware cursor at low priority. It might turn out that we want a monochromatic display, but for now I like the idea of a color DAC chip, with its entourage of support components (as an area placeholder if nothing else).
The asynchronous nature of the proposed custom chip continues to make me nervous.
I'd definitely like some simulation data to support a choice of FIFO size.
If we decide to implement a four-board solution, using more or less conventional packaging, then I think the risk of making the display controller will be reduced substantially. Again, I don't know if that makes sense in terms of the goals for the June87 Dragon.
I'd be very interested in seeing if we can agree on a plan for interfacing future display controllers.