DYNABUS IMPLEMENTATION GUIDELINES
DRAGON PROJECT — FOR INTERNAL XEROX USE ONLY
DynaBus Implementation Guidelines
DynaBus for the June87 hardware designer
Jean-Marc Frailong
Dragon-86-xx Written October 15, 1986 Revised October 21, 1986
© Copyright 1986 Xerox Corporation. All rights reserved.
Abstract: This document is intended as an explanation of some of the obscure points of the DynaBus from the point of view of the implementor of a new VLSI connected to it. It does not in any way replace the truth contained in the specification documents. It also contains some information specific to the implementation of the June87 machine.
Keywords: VLSI, DynaBus, DBus
FileName: DynaBusGuidelines.tioga, .interpress
XEROX  Xerox Corporation
   Palo Alto Research Center
   3333 Coyote Hill Road
   Palo Alto, California 94304



Contents
1. DynaBus subsets
2. Interrupts on DynaBus
3. DynaBus and protection
4. Error management on the DynaBus
5. DynaBus IO support
6. Interaction with DBus
References
ChangeLog
1. DynaBus subsets
The DynaBus [1] offers a rich set of transactions, resulting in a complex implementation for devices that need to use its full capabilities. Fortunately, it is quite easy to divide DynaBus usage into subsets, each particular device using only some of them.
1.1. Subset definition
Each subset is headed with a list describing which transactions it may initiate, which packets (in addition to replies to its requests) it should receive, and what replies it may emit as a result of reception or snooping. Real devices use a combination of subsets.
- Cache (consistency protocol):
Sends: ReadBlock, WriteBlock, WriteSingle, ConditionalWriteSingle, Map
Receives: RBRqst, WSRply, CSRply, RBRplyCorr
Replies: RBRply, RBRplyCorr
This is the most complex subset, as caches are concerned with the consistency part of the protocol. The complexity originates in having multiple consistent copies of the same information. The only planned chip that will support this protocol is the Small Cache. The SloBridge will also support it, but does so easily by embedding a copy of the Small Cache (albeit with a reduced number of entries). This approach is strongly recommended if other chips need to behave as caches, since a new, correct implementation of this protocol would be costly and difficult. Note that the Relay subset contains the Cache subset.
- Memory (information storage):
Receives: RBRqst, WBRqst
Replies: RBRply, WBRply
This subset is fairly simple. It is concerned only with WriteBlock and ReadBlock. It behaves as a pure slave. The only complexity is to support the shared/owner lines to some extent.
- Consumer:
Sends: ReadBlock
Receives: WSRply, CSRply, RBRplyCorr (all optional)
This subset is for devices that need to read information from memory, but never write. It is very simple, as it is only concerned with ReadBlock. If the device cares about getting completely consistent data, it must also make use of WSRply, CSRply and RBRplyCorr. This might be useful if the consumer implements an internal read-only cache tracking DynaBus transactions (this is the way an advanced display controller would work).
- Producer:
Sends: ReadBlock, WriteBlock, WriteSingle, ConditionalWriteSingle (optional)
Receives: RBRqst, WSRply, CSRply, RBRplyCorr
Replies: RBRply, RBRplyCorr
This subset is for devices that need to write information into memory. The current specification of the DynaBus implies that such devices must behave exactly as caches (even if they have only one entry), except for Map transactions, because it is impossible to use WriteBlock to write fresh data and still respect the consistency scheme. Implementing a special flavor of WriteBlock that would update cache entries is complex, because the output FIFOs in caches would need an invalidation mechanism. Future updates to the DynaBus might nevertheless add such a transaction to improve the efficiency of producers.
- Relay:
Sends: to be defined
Receives: to be defined
Replies: to be defined
This subset is for devices that connect between two levels of DynaBus in the hierarchy. The subset is fairly complex (such devices must behave as memory and terminator for the lower level, as caches for the upper level), but the DynaBus specification was designed so that it would be supported. There is no plan to have such a device in the June87 machine, which uses a one-level DynaBus.
- Terminator:
Receives: WSRqst, CSRqst, BIOWRqst
Replies: WSRply, CSRply, BIOWRply
This subset is used by a single device sitting at the highest level of the DynaBus to reflect system-wide requests back synchronously as equivalent replies (logically speaking) on all DynaBus devices and to provide a dummy reply for broadcast transactions. In the current (June87) implementation, this is made part of the task of the memory controller, but it really constitutes a separate function.
- I/O slave:
Sends: IOWrite, BIOWrite (both optional, to send interrupts)
Receives: IORRqst, IOWRqst, BIOWRqst
Replies: IORRply, IOWRply
This subset is used by devices supporting DynaBus IO requests (IORead, IOWrite, BIOWrite). It is quite straightforward. Most devices need to support this subset, for example to send interrupts to notify of error conditions or to provide some status information.
- I/O master:
Sends: IORead, IOWrite, BIOWrite
This subset is used by devices emitting DynaBus IO requests (IORead, IOWrite, BIOWrite). It is quite straightforward.
- Mapping:
Receives: MapRqst
Replies: MapRply
This subset is used by the map cache. It is conceptually similar to the I/O slave subset (for IORRqst), but uses a special transaction to help the bus relay function.
1.2. Subset usage
The following table indicates the use of subsets by the chips designed for the June87 machine.
              Small   Memory  Display  Slo     Map
Subset        Cache   Ctlr    Ctlr     Bridge  Cache
Cache         YES     NO      NO       YES     NO
Memory        NO      YES     NO       NO      NO
Consumer      NO      NO      YES      NO      NO
Producer      NO      NO      NO       NO      NO
Relay         NO      NO      NO       NO      NO
Terminator    NO      YES     NO       NO      NO
I/O slave     YES     YES     YES      YES     YES
I/O master    YES     NO      NO       YES     NO
Mapping       NO      YES     NO       NO      YES
The following table indicates the use of subsets by possible future chips.
              Big     Fast
Subset        Cache   Network
Cache         YES     NO
Memory        NO      NO
Consumer      NO      YES
Producer      NO      YES
Relay         YES     NO
Terminator    NO      NO
I/O slave     YES     YES
I/O master    YES     NO
Mapping       NO      NO
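As an illustration of how subsets combine, the following minimal Python sketch transcribes the packet lists of section 1.1 and the June87 table above, then computes what a given chip has to handle. The names SUBSETS, JUNE87_CHIPS and chip_packets are hypothetical and chosen for this example only; Relay is omitted because its lists are still to be defined.

```python
# Packet lists transcribed from section 1.1. Relay is left out because its
# lists are "to be defined"; Producer is left out because no June87 chip uses it.
SUBSETS = {
    "Cache": {
        "sends": {"ReadBlock", "WriteBlock", "WriteSingle",
                  "ConditionalWriteSingle", "Map"},
        "receives": {"RBRqst", "WSRply", "CSRply", "RBRplyCorr"},
        "replies": {"RBRply", "RBRplyCorr"},
    },
    "Memory": {
        "sends": set(),
        "receives": {"RBRqst", "WBRqst"},
        "replies": {"RBRply", "WBRply"},
    },
    "Consumer": {
        "sends": {"ReadBlock"},
        "receives": {"WSRply", "CSRply", "RBRplyCorr"},  # all optional
        "replies": set(),
    },
    "Terminator": {
        "sends": set(),
        "receives": {"WSRqst", "CSRqst", "BIOWRqst"},
        "replies": {"WSRply", "CSRply", "BIOWRply"},
    },
    "I/O slave": {
        "sends": {"IOWrite", "BIOWrite"},  # optional, used to send interrupts
        "receives": {"IORRqst", "IOWRqst", "BIOWRqst"},
        "replies": {"IORRply", "IOWRply"},
    },
    "I/O master": {
        "sends": {"IORead", "IOWrite", "BIOWrite"},
        "receives": set(),
        "replies": set(),
    },
    "Mapping": {
        "sends": set(),
        "receives": {"MapRqst"},
        "replies": {"MapRply"},
    },
}

# Rows marked YES in the June87 table, one entry per chip.
JUNE87_CHIPS = {
    "Small Cache": ["Cache", "I/O slave", "I/O master"],
    "Memory Ctlr": ["Memory", "Terminator", "I/O slave", "Mapping"],
    "Display Ctlr": ["Consumer", "I/O slave"],
    "SloBridge": ["Cache", "I/O slave", "I/O master"],
    "Map Cache": ["I/O slave", "Mapping"],
}


def chip_packets(chip, kind):
    """Union of the 'sends', 'receives' or 'replies' lists over a chip's subsets."""
    result = set()
    for subset in JUNE87_CHIPS[chip]:
        result |= SUBSETS[subset][kind]
    return result
```

For example, chip_packets("Memory Ctlr", "receives") lists every request the memory controller must recognize once its Memory, Terminator, I/O slave and Mapping roles are combined.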
2. Interrupts on DynaBus
Devices may need to send an interrupt to a processor sitting on the DynaBus. Although there is no specific Interrupt transaction on the DynaBus, interrupts may be sent over it. The technique used is to send an IOWrite transaction to a processor cache. This technique has the disadvantage that knowledge of some cache IO registers appears in the design of DynaBus devices. The advantage is that it permits both point-to-point and broadcast interrupts with no additional protocol complexity. Future versions of the DynaBus may introduce an Interrupt transaction to remove these dependencies.
2.1. Interrupt model
Interrupt processing is supported through I/O locations in the Small Cache. A Small Cache contains a 32-bit register InterruptStatus containing the current interrupt status. DynaBus devices may OR any value into this register. The corresponding processor will receive an interrupt if this word is non-zero. Each bit is assumed to represent a different "source" of interrupt (i.e. device or group of devices). The processor will read this register and treat each bit set as representing a device requiring some form of service. The reader is referred to [2] for more details.
The support provided by the Small Cache is thus minimal: it only collects information supplied by DynaBus devices. There is no support on the DynaBus to acknowledge that an interrupt has been processed by the processor. All the processor can do is remove bits from the cache's InterruptStatus register (by ANDing into it).
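The interrupt model above can be sketched in a few lines of Python. This is only a behavioral illustration: the class and method names are hypothetical, and numbering bits from the least significant end is an assumption (the register layout is defined in [2], not here).

```python
class InterruptStatus:
    """Behavioral model of the 32-bit InterruptStatus register of a Small Cache."""

    MASK = 0xFFFFFFFF

    def __init__(self):
        self.value = 0

    def post(self, bits):
        # A DynaBus device ORs its bit(s) into the register.
        self.value |= bits & self.MASK

    def clear(self, bits):
        # The processor removes bits by ANDing with the complement.
        self.value &= ~bits & self.MASK

    def pending(self):
        # The processor is interrupted whenever the word is non-zero.
        return self.value != 0

    def sources(self):
        # Bit positions currently set (assumption: bit 0 is the LSB).
        return [i for i in range(32) if (self.value >> i) & 1]
```

Note that nothing in this model tells the device that its interrupt has been serviced, which matches the absence of an acknowledge mechanism on the DynaBus.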
2.2. How a DynaBus device should generate interrupts
Interrupts have two parameters:
- what processor(s) must receive the interrupt
- what bit(s) should be set in the interrupt status register
Devices sending interrupts must have internal register(s) specifying these parameters. The preferred implementation is to have a 16-bit register InterruptAction, writable from the DynaBus, containing (MSB first) a 5-bit reason code (encoding one bit position among the 32 of the InterruptStatus register), a 1-bit Broadcast flag (1 if all processors must receive the interrupt, 0 if a single processor must receive it), and the 10-bit DynaBus DeviceID of the processor that must receive the interrupt (must be all 1's when broadcast). Devices may choose to have a second, 32-bit register holding the value to be ORed into the InterruptStatus register instead of the encoded 5-bit reason code. Devices should try to use only one bit in InterruptStatus if possible. If multiple bits need to be used (to represent very different interrupt sources), all of them should be independently programmable.
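The preferred InterruptAction layout can be illustrated with a minimal Python sketch. The function names are hypothetical; the field positions follow the MSB-first order given above (reason code in bits 15..11, Broadcast in bit 10, DeviceID in bits 9..0).

```python
def pack_interrupt_action(reason, broadcast, device_id):
    """Build the 16-bit InterruptAction word: 5-bit reason, 1-bit Broadcast, 10-bit DeviceID."""
    if not 0 <= reason <= 31:
        raise ValueError("reason code must fit in 5 bits")
    if not 0 <= device_id <= 0x3FF:
        raise ValueError("DeviceID must fit in 10 bits")
    if broadcast and device_id != 0x3FF:
        raise ValueError("DeviceID must be all 1's when Broadcast is set")
    return (reason << 11) | (int(bool(broadcast)) << 10) | device_id


def unpack_interrupt_action(word):
    """Return (reason, broadcast, device_id) from a 16-bit InterruptAction word."""
    return (word >> 11) & 0x1F, bool((word >> 10) & 1), word & 0x3FF
```

For instance, pack_interrupt_action(5, False, 0x12) gives 0x2812, and unpacking that word returns the three original fields.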
To send an interrupt, a device must generate an IOWrite (BIOWrite if the Broadcast bit is set). The address for the IOWrite must be of the form (32 LSB, other bits must be 0):
0000 0000 0001 PPPP PPPP PP00 XXXX XXXX
where PPPPPPPPPP is the 10-bit DeviceID in InterruptAction and XXXXXXXX the number of the InterruptStatus register inside the Small Cache (the exact value of XXXXXXXX is not yet determined). This address might be stored by the device in a programmable register for additional flexibility. The top 12 bits of this address (001H) denote the Small Cache DevType on the DynaBus.
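The address format above can be assembled as in the following Python sketch. The function names are hypothetical. Since the number of the InterruptStatus register is not yet determined, it is passed as a parameter; the assumption that the data word of the write is the mask to be ORed into InterruptStatus (one bit, selected by the reason code, counted from the LSB) is also made for illustration only.

```python
SMALL_CACHE_DEVTYPE = 0x001  # top 12 bits of the address, as stated in the text


def interrupt_address(device_id, status_reg_number):
    """32-bit IO address: 12-bit DevType, 10-bit DeviceID, 2 zero bits, 8-bit register number."""
    if not 0 <= device_id <= 0x3FF:
        raise ValueError("DeviceID must fit in 10 bits")
    if not 0 <= status_reg_number <= 0xFF:
        raise ValueError("register number must fit in 8 bits")
    return (SMALL_CACHE_DEVTYPE << 20) | (device_id << 10) | status_reg_number


def interrupt_request(reason, broadcast, device_id, status_reg_number):
    """Return (transaction, address, data) for one interrupt.

    status_reg_number is a placeholder: its real value is not yet defined.
    """
    transaction = "BIOWrite" if broadcast else "IOWrite"
    data = 1 << reason  # assumption: data word is the mask ORed into InterruptStatus
    return transaction, interrupt_address(device_id, status_reg_number), data
```

Keeping the register number as a parameter mirrors the suggestion that the address be held in a programmable register.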
2.3. Other interrupt related issues
Each device needs to have an InterruptSource register containing information on the source of the interrupt. The preferred implementation is to use one bit per possible source. When a processor receives an interrupt, it will search for posted interrupt reasons in the InterruptStatus register, and will read the InterruptSource registers of the interrupting devices (the mapping between interrupt reason number and devices is under software control, thanks to the InterruptAction registers). It will then try to do whatever is needed to remove the source of interrupt (for example reading some data that has been received by the device). The processor must be able to clear individual bits in the device's InterruptSource register either explicitly or by performing some other action on the device.
It is recommended that a device send a new interrupt transaction each time a new interrupt source requires service. Sending one interrupt only for multiple sources is also possible, but some care must be exercised so that no interrupts are lost.
The reader is once again referred to [2] for more details on how the software manages interrupts and what impact this may have on the interrupting device.
3. DynaBus and protection
Protection refers to the ability to prevent user code (as opposed to kernel code) from using certain system resources. This applies to memory access and to IO access. Memory access is protected by the Small Cache through protection bits included in the page descriptors. It does not directly involve the DynaBus, and hence will not be described here.
3.1. IO slave protection
IO requests are not checked by the cache for protection. The solution that has been chosen is to defer protection to the device that will reply to the IO request. DynaBus requests carry a Mode/Error bit in the header. This bit will be set to 0 if the request originated from a non-privileged requester (typically a processor running in user mode), to 1 if the request originated from a privileged requester (typically a processor running in kernel mode or a non-processor device).
Each device must decide what IO locations need to be protected against non-privileged requesters. The protection may further be different for IORead and IOWrite (it is expected, but not required, that IOWrite permitted implies IORead permitted for the same address). If the IO access is protected, the device must set the Mode/Error bit to 1 in the reply packet (cf. section 4) to deny access to the requester, and should not perform any action on the addressed IO location. The second word of the reply packet must indicate the error reason as explained in section 4.3.
Note that BIOWrites will never result in a protection error, since the reply is sent by the DynaBus terminator instead of the addressed device(s). Such requests will just be ineffective for all addressed devices. This peculiar behavior of BIOWrites is not specific to protection, but is a side effect of the fact that devices may never reply directly to broadcast requests.
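The protection rules of this section can be summarized by the following behavioral sketch in Python. The class, its methods and the dictionary-based reply are hypothetical; the error word uses the layout suggested in section 4.3 with all device-dependent bits left at 0, and FaultCode 011 (IO Protection fault), whose encoding is itself not yet definitive.

```python
IO_PROTECTION_FAULT = 0b011  # suggested FaultCode from section 4.3


class IOSlave:
    """Behavioral model of an IO slave that defers protection to itself."""

    def __init__(self, device_id, protected_read=(), protected_write=()):
        self.device_id = device_id
        self.regs = {}
        self.protected_read = set(protected_read)
        self.protected_write = set(protected_write)

    def _error_reply(self):
        # DeviceID in the top 10 bits, FaultCode in the low 3 bits.
        word = (self.device_id << 22) | IO_PROTECTION_FAULT
        return {"mode_error": 1, "data": word}

    def io_read(self, offset, mode):
        # mode is the Mode/Error bit of the request: 0 = user, 1 = privileged.
        if mode == 0 and offset in self.protected_read:
            return self._error_reply()
        return {"mode_error": 0, "data": self.regs.get(offset, 0)}

    def io_write(self, offset, value, mode):
        if mode == 0 and offset in self.protected_write:
            return self._error_reply()  # the register is left untouched
        self.regs[offset] = value
        return {"mode_error": 0, "data": value}

    def bio_write(self, offset, value, mode):
        # No reply from the device: the terminator replies to broadcasts.
        # A protected broadcast write is silently ineffective.
        if not (mode == 0 and offset in self.protected_write):
            self.regs[offset] = value
        return None
```

A non-privileged IOWrite to a protected offset returns a reply with the Mode/Error bit set and changes nothing, whereas the same access through BIOWrite is simply dropped.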
3.2. Protection and non-processor requesters
When a non-processor makes a DynaBus request, it should always set the Mode/Error bit to 1 to acquire kernel privilege, unless it is working as a result of some action initiated by a processor running in user mode (there is no such case in the current June87 design). It is safe for non-processor devices to just hardwire the Mode/Error bit to 1 for all request packets on the DynaBus.
The action to be taken by non-processors when they receive a reply packet with the Mode/Error bit set is further detailed in section 4.
4. Error management on the DynaBus
The DynaBus offers multiple ways to signal errors. Usage depends upon the error level and upon the circumstances of the error.
4.1. Error signaling
The DynaBus offers three different types of error signals:
- Raising the ErrorOut line on the DynaBus. This is reserved to errors unrecoverable through software or originating from a deadly hardware malfunction. It will stop the whole system and wake up the maintenance processor for a post-mortem analysis of the machine state.
- Setting the Mode/Error bit to 1 in a reply packet. This is applicable only when the error occurs because of invalid information in a DynaBus request, such as a protection error (cf. section 3) or a map cache fault. It is the requester's responsibility to take appropriate action (cf. section 4.3).
- Sending an interrupt to adequate processor(s). This should be used for all errors that do not fall in the previous two categories.
DynaBus devices should use the signaling method most appropriate to the error.
4.2. Error causes
A device may want to signal an error for various causes, such as:
- it is in an unexpected state: ErrorOut if fatal, interrupt if recoverable
- it is required to do something illegal or meaningless: reply in error or interrupt
- some other device signals an error to it: ErrorOut or send an interrupt to a processor
In any case, the device should provide some information on the error, either in some internal register for ErrorOut or interrupt, or in the reply packet for reply in error as outlined below.
All devices sending DynaBus requests should have a time-out mechanism for the reply. If the timeout expires before the reply arrives, the device should signal an error. The method used depends on the device. Using ErrorOut is not recommended, unless the lack of reply endangers system survival (such as no reply when sending an interrupt...). Most often, an interrupt should be posted. The timeout mechanism does not need to be precise as to the time the error is caught, but all requests that do not get their reply within some time must result in an error.
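A coarse reply time-out of the kind described above might be modeled as follows. This is a minimal Python sketch: the class name, the use of a cycle counter, the request tags and the callback are all assumptions made for illustration. The check is deliberately imprecise about when the error is caught, but every overdue request is eventually reported exactly once.

```python
class ReplyTimeout:
    """Tracks outstanding requests and reports those whose reply is overdue."""

    def __init__(self, limit_cycles, on_timeout):
        self.limit = limit_cycles
        self.on_timeout = on_timeout  # e.g. post an interrupt, or raise ErrorOut
        self.outstanding = {}         # request tag -> cycle at which it was sent

    def request_sent(self, tag, now):
        self.outstanding[tag] = now

    def reply_received(self, tag):
        # A reply cancels the pending time-out for that request.
        self.outstanding.pop(tag, None)

    def tick(self, now):
        # May be called only occasionally: precision is not required.
        for tag, sent in list(self.outstanding.items()):
            if now - sent >= self.limit:
                del self.outstanding[tag]
                self.on_timeout(tag)
```

The callback is where the device chooses its signaling method; as noted above, posting an interrupt is the usual choice.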
4.3. Managing the Mode/Error bit in reply headers
To signal an error directly linked to a request, a device should set the Mode/Error bit to 1 in the reply header (the most usual case is protection as explained in section 3). The following word of the reply packet contains information on the error. It should be encoded in the following way (upper 32 bits of data are 0):
IIII IIII IIDD DDDD DDDD DDDD DDDD DFFF
where IIIIIIIIII is the offended 10-bit DeviceID, FFF a 3-bit FaultCode and D designates device-dependent bits of the designer's choice, providing enough information so that software may recover from the error. A suggested encoding of the 3-bit FaultCode field is (this encoding is not yet definitive):
000 Reserved
001 MapCache fault (issued by MapCache only)
010 Memory Protection fault (from small cache to its processor only)
011 IO Protection fault (cf. section 3.1)
100 Device Dependent fault
101 Reserved
110 Reserved
111 DynaBus TimeOut fault (from small cache to its processor only)
This table shows that normal DynaBus devices should use FaultCodes 3 or 4 only.
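The error word layout and the suggested FaultCode table can be expressed as the following Python sketch. The names are hypothetical and the FaultCode values are the suggested, not yet definitive, encoding; the 19 device-dependent bits are treated here as a single opaque field called detail.

```python
from enum import IntEnum


class FaultCode(IntEnum):
    """Suggested (not yet definitive) FaultCode values; 0, 5 and 6 are reserved."""
    MAP_CACHE = 0b001
    MEMORY_PROTECTION = 0b010
    IO_PROTECTION = 0b011
    DEVICE_DEPENDENT = 0b100
    DYNABUS_TIMEOUT = 0b111


def pack_error_word(device_id, fault, detail=0):
    """10-bit DeviceID, 19 device-dependent bits, 3-bit FaultCode (MSB first)."""
    if not 0 <= device_id <= 0x3FF:
        raise ValueError("DeviceID must fit in 10 bits")
    if not 0 <= detail < (1 << 19):
        raise ValueError("device-dependent field must fit in 19 bits")
    return (device_id << 22) | (detail << 3) | int(FaultCode(fault))


def unpack_error_word(word):
    """Return (device_id, detail, FaultCode); a reserved code raises ValueError."""
    return (word >> 22) & 0x3FF, (word >> 3) & 0x7FFFF, FaultCode(word & 0x7)
```

A normal DynaBus device would only ever pass FaultCode.IO_PROTECTION or FaultCode.DEVICE_DEPENDENT to pack_error_word.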
The action to be taken by a device receiving a reply with Mode/Error=1 depends on the device. Processors (in fact small caches) will store the 32-bit explanation, take a vectored trap indexed by the FaultCode, and provide the full 32-bit explanation word to the software trap handler that will process the error. Non-processor devices may either raise an interrupt if receiving an error is a reasonable possibility (for example, a sophisticated IO device might use the map cache to work with virtual addresses and subcontract mapping faults to a processor), or raise ErrorOut if such an error is unexpected.
As mentioned in 3.2, non-processor requesters should probably hardwire the Mode/Error bit to 1 in request headers to get kernel privileges.
5. DynaBus IO support
Most DynaBus devices will have to support DynaBus IO transactions, as noted in section 1.2. A few guidelines apply.
5.1. IO address decoding
As mentioned in the DynaBus specification, IO addresses consist of DevType, DevNum and DevOffset fields. Each device should be selected by full decoding of the (DevType, DevNum) fields for IORead and IOWrite, and by decoding of DevType only for BIOWrite. Devices may require that BIOWrite also carry a specific DevNum (such as all 1's), but should avoid doing so. The DevNum should be derived from the DynaBus DeviceID in an unambiguous manner: devices using only a small address space should use exactly the full DynaBus DeviceID. Devices using medium and/or large address spaces should use the low-order bits of their DeviceID. It will be the responsibility of the maintenance processor to assign DeviceIDs at bootstrap time in such a way that the DevNums are unambiguous within a device type.
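The selection rule can be illustrated with the Python sketch below. The field widths are an assumption: they are taken from the Small Cache address format of section 2.2 (12-bit DevType, 10-bit DevNum, 10-bit DevOffset), which presumably corresponds to a small address space; devices with medium or large address spaces would use a narrower DevNum. The function names are hypothetical.

```python
def split_io_address(addr):
    """Split a 32-bit IO address into (DevType, DevNum, DevOffset).

    Assumed widths: 12 / 10 / 10 bits, as in the address format of section 2.2.
    """
    return (addr >> 20) & 0xFFF, (addr >> 10) & 0x3FF, addr & 0x3FF


def is_selected(kind, addr, my_dev_type, my_dev_num):
    """True if a device with the given DevType/DevNum must react to the request."""
    dev_type, dev_num, _offset = split_io_address(addr)
    if dev_type != my_dev_type:
        return False
    if kind == "BIOWrite":
        return True  # broadcast: DevType only, DevNum is ignored
    return dev_num == my_dev_num  # IORead / IOWrite: full decoding
```

Because BIOWrite ignores DevNum, the same DevOffset decoding serves both IOWrite and BIOWrite.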
5.2. Reading and writing
It is strongly recommended that IO registers that may be written from the DynaBus may also be read from the DynaBus. Although it is possible to maintain shadow copies in software, it usually makes debugging and fault analysis more difficult.
Registers accessible through DynaBus IOWrite should also be accessible through DynaBus BIOWrite. If the decoding strategy outlined in the previous section is used, this is no problem.
5.3. Support of multiple simultaneous IOs
Since there are multiple processors, multiple IO requests may be sent simultaneously (i.e. in between request and reply) to a given device. It is recommended that devices have an internal IO request queue to support such cases. An alternate possibility is to trust a software lock and just drop on the floor any IO request that comes in the middle of another one (thus causing the requester to end up in timeout). This solution should only be used sparingly for devices that expect a very low rate of IO, such as the display controller.
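A bounded internal IO request queue could behave as in this minimal Python sketch. The class name, the depth parameter and the policy of dropping requests when the queue is full are assumptions for illustration; a dropped request leaves its requester to end up in time-out, as in the alternate approach described above.

```python
from collections import deque


class IORequestQueue:
    """Bounded FIFO of pending IO requests inside a device."""

    def __init__(self, depth):
        self.depth = depth
        self.pending = deque()

    def accept(self, request):
        # Returns False when the request is dropped (queue full):
        # no reply will be sent, so the requester will time out.
        if len(self.pending) >= self.depth:
            return False
        self.pending.append(request)
        return True

    def next_request(self):
        # Requests are served in arrival order; None when idle.
        return self.pending.popleft() if self.pending else None
```

With depth set to 1 this degenerates into the "trust a software lock" scheme, where any request arriving during another one is dropped.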
6. Interaction with DBus
The DBus interacts with devices on two occasions: at bootstrap time, to set up some internal registers, and when the system fails, to analyze the origin of the failure. The hardware interface with the DBus is described in detail in [3].
6.1. DBus and the standard cell methodology
Most of the DBus functionality for standard cell designs will be provided through pre-defined blocks of logic in the DBusLogic design file [5]. This will include the DBus interface and control logic, chip identification and DynaBus DeviceID. Standard flip-flops and memory arrays of the Logic library [4] will support LSSD automatically, and special flip-flops will be available to support special logic that needs to be kept out of the scan path.
A design may be part standard cell, part custom logic. In that case, the designer should use the signals generated by the DBus interface to provide some LSSD facility for the custom logic blocks.
6.2. DBus and initialization
At initialization, the DBus is used to read the chip identification of each device in the system, from which it derives the system configuration and/or checks it against a presumed configuration. Chips using the standard-cell approach will contain a predefined block of logic implementing this function (details are yet to be defined). The chip identification is specific to each implementation (set of VLSI masks). Refer to [3] for more detailed information.
The DBus is also used to set up the DynaBus DeviceID of all DynaBus devices. In order to simplify the bootstrap process, the DeviceID register should be at the same location in the scan path for all devices (e.g. the first 10 bits, excluding chip identification). This should also be supported more or less automatically for standard-cell designs. This constraint minimizes the amount of information that the debug processor must know at bootstrap time (only a table mapping chip identification to scan-path length).
Normally, devices should start full operation as soon as the DBus nDFreeze signal is rescinded. Some devices may find it necessary to keep certain parts of the chip inoperative until some software action has been performed (delayed reset), such as caches to prevent processors from starting too early in the bootstrap process, or the display controller to prevent display generation before proper initialization. In that case, this delayed reset should be obtained by an internal flip-flop.
6.3. DBus and debugging
The DBus implementation suggested in [3] provides all the functionality necessary to debug standard-cell designs. The design should be such that storage elements that are not part of the scan path, such as RAM and FIFO, are nevertheless accessible indirectly through the DBus, using the Execute function to single-step the chip. It may be necessary to remove certain flip-flops from the scan path if modifying them would result in undesirable side-effects, but this should be exceptional.
References
[1] Sindhu: "DynaBus Logical Specifications", in [Indigo]<Dragon>DynaBus>DynaBusLogicalSpecifications.tioga, .interpress
[2] Frailong, Sindhu, Atkinson: "Dragon Interrupt Management", in [Indigo]<Dragon>Interrupts>InterruptsDoc.tioga, .interpress
[3] Frailong: "The Dragon DBus", in [Indigo]<Dragon>DBus>DBusDoc.tioga, .interpress
[4] Monier: "Logic Library", in [DATools]<DATools6.1>CellLibraries>LogicDoc.tioga, Logic.dale
[5] ??? : "DBus macro cells", in [Indigo]<Dragon>DBus>DBusLogic.tioga, .interpress, .dale
ChangeLog
Jean-Marc Frailong October 21, 1986 5:55:56 pm PDT
Editorial modifications after proofreading (thanks to PS & JH)
changes to: most of the document, added References section