[_CD4_]<dragon7.0>DynaBus>DynaBusGuidelines.tioga!1

DYNABUS IMPLEMENTATION GUIDELINES

DRAGON PROJECT — FOR INTERNAL XEROX USE ONLY

DynaBus for the June87 hardware designer

Jean-Marc Frailong

Dragon-86-xx Written October 15, 1986 Revised October 21, 1986

Abstract: This document is intended as an explanation of some of the obscure points of the DynaBus from the point of view of the implementor of a new VLSI connected to it. It does not in any way replace the truth contained in the specification documents. It also contains some information specific to the implementation of the June87 machine.

Keywords: VLSI, DynaBus, DBus

FileName: DynaBusGuidelines.tioga, .interpress

Xerox Corporation
Palo Alto Research Center
3333 Coyote Hill Road
Palo Alto, California 94304

Contents

1. DynaBus subsets

2. Interrupts on DynaBus

3. DynaBus and protection

4. Error management on the DynaBus

5. DynaBus IO support

6. Interaction with DBus

References

ChangeLog

1. DynaBus subsets

The DynaBus [1] offers a rich set of transactions, resulting in a complex implementation for devices that need to use its full capabilities. Fortunately, it is quite easy to divide DynaBus usage into subsets, each particular device using only some of them.

1.1. Subset definition

Each subset is headed with a list describing which transactions it may initiate, which packets (in addition to replies to its requests) it should receive, and what replies it may emit as a result of reception or snooping. Real devices use a combination of subsets.

- Cache (consistency protocol):

Sends: ReadBlock, WriteBlock, WriteSingle, ConditionalWriteSingle, Map

Receives: RBRqst, WSRply, CSRply, RBRplyCorr

Replies: RBRply, RBRplyCorr

This is the most complex subset, as caches are concerned with the consistency part of the protocol. The complexity originates in having multiple consistent copies of the same information. The only chip planned that will support this protocol is the Small Cache. The SloBridge will also support it, but makes it easy by having a copy of the Small Cache (albeit with a reduced number of entries) inside it. This approach is strongly recommended if other chips ever need to behave as caches, since a new, correct implementation of this protocol would be costly and difficult. Note that the Relay subset contains the Cache subset.

- Memory (information storage):

Receives: RBRqst, WBRqst

Replies: RBRply, WBRply

This subset is fairly simple. It is concerned only with WriteBlock and ReadBlock. It behaves as a pure slave. The only complexity is to support the shared/owner lines to some extent.

- Consumer:

Sends: ReadBlock

Receives: WSRply, CSRply, RBRplyCorr (all optional)

This subset is for devices that need to read information from memory, but never write. It is very simple, as it is only concerned with ReadBlock. If the device cares about getting completely consistent data, it must also make use of WSRply, CSRply, RBRplyCorr. This might be useful if the consumer implements an internal read-only cache tracking DynaBus transactions (this is the way an advanced display controller would work).

- Producer:

Sends: ReadBlock, WriteBlock, WriteSingle, ConditionalWriteSingle (optional)

Receives: RBRqst, WSRply, CSRply, RBRplyCorr

Replies: RBRply, RBRplyCorr

This subset is for devices that need to write information into memory. The current specification of the DynaBus implies that such devices must behave exactly as caches (even if they have only one entry), except for Map transactions, because it is impossible to use WriteBlock to write fresh data and still respect the consistency scheme. Implementing a special flavor of WriteBlock that would update cache entries is complex, because the output FIFOs in caches would need an invalidation mechanism. Future updates to the DynaBus might nevertheless add such a transaction to improve the efficiency of producers.

- Relay:

Sends: to be defined

Receives: to be defined

Replies: to be defined

This subset is for devices that connect between two levels of DynaBus in the hierarchy. The subset is fairly complex (such devices must behave as memory and terminator for the lower level, as caches for the upper level), but the DynaBus specification was designed so that it would be supported. There is no plan to have such a device in the June87 machine, which uses a one-level DynaBus.

- Terminator:

Receives: WSRqst, CSRqst, BIOWRqst

Replies: WSRply, CSRply, BIOWRply

This subset is used by a single device sitting at the highest level of the DynaBus to reflect system-wide requests back synchronously as equivalent replies (logically speaking) on all DynaBus devices and to provide a dummy reply for broadcast transactions. In the current (June87) implementation, this is made part of the task of the memory controller, but it really constitutes a separate function.

- I/O slave:

Sends: IOWrite, BIOWrite (both optional, to send interrupts)

Receives: IORRqst, IOWRqst, BIOWRqst

Replies: IORRply, IOWRply

This subset is used by devices supporting DynaBus IO requests (IORead, IOWrite, BIOWrite). It is quite straightforward. Most devices need to support this subset, for example to send interrupts to notify of error conditions or to provide some status information.

- I/O master:

Sends: IORead, IOWrite, BIOWrite

This subset is used by devices emitting DynaBus IO requests (IORead, IOWrite, BIOWrite). It is quite straightforward.

- Mapping:

Receives: MapRqst

Replies: MapRply

This subset is used by the map cache. It is conceptually similar to the I/O slave subset (for IORRqst), but uses a special transaction to help the bus relay function.

1.2 Subset usage

The following table indicates the use of subsets by the chips designed for the June87 machine.

Subset        Small Cache   Memory Ctlr   Display Ctlr   SloBridge   Map Cache

Cache         YES           NO            NO             YES         NO
Memory        NO            YES           NO             NO          NO
Consumer      NO            NO            YES            NO          NO
Producer      NO            NO            NO             NO          NO
Relay         NO            NO            NO             NO          NO
Terminator    NO            YES           NO             NO          NO
I/O slave     YES           YES           YES            YES         YES
I/O master    YES           NO            NO             YES         NO
Mapping       NO            YES           NO             NO          YES

The following table indicates the use of subsets by possible future chips.

Subset        Big Cache   Fast Network

Cache         YES         NO
Memory        NO          NO
Consumer      NO          YES
Producer      NO          YES
Relay         YES         NO
Terminator    NO          NO
I/O slave     YES         YES
I/O master    YES         NO
Mapping       NO          NO

2. Interrupts on DynaBus

Devices may need to send an interrupt to a processor sitting on the DynaBus. Although the DynaBus has no specific Interrupt transaction, interrupts may be sent over it: the technique is to send an IOWrite transaction to a processor cache. The disadvantage of this technique is that knowledge of some cache IO registers appears in the design of DynaBus devices. The advantage is that it allows both point-to-point and broadcast interrupts with no additional protocol complexity. Future versions of the DynaBus may introduce an Interrupt transaction to remove this dependency.

2.1 Interrupt model

Interrupt processing is supported through I/O locations in the Small Cache. A Small Cache contains a 32-bit register InterruptStatus containing the current interrupt status. DynaBus devices may OR any value into this register. The corresponding processor will receive an interrupt if this word is non-zero. Each bit is assumed to represent a different "source" of interrupt (i.e. device or group of devices). The processor will read this register and treat each bit set as representing a device requiring some form of service. The reader is referred to [2] for more details.

The support provided by the Small Cache is thus minimal: it only collects information supplied by DynaBus devices. There is no support on the DynaBus to acknowledge that an interrupt has been processed by the processor. All the processor can do is remove bits from the cache's InterruptStatus register (by ANDing a mask into it).
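The register behavior described in this section can be summarized by a minimal sketch. The Python class and method names are illustrative assumptions and do not come from the Small Cache design; only the OR-in, AND-out and "non-zero means interrupt" behavior is taken from the text.

```python
class InterruptStatus:
    """Toy model of the 32-bit InterruptStatus register (names are hypothetical)."""

    WORD_MASK = 0xFFFFFFFF  # the register is 32 bits wide

    def __init__(self) -> None:
        self.value = 0

    def or_in(self, bits: int) -> None:
        """A DynaBus device posts interrupt reasons by ORing bits in."""
        self.value |= bits & self.WORD_MASK

    def and_in(self, mask: int) -> None:
        """The processor removes reasons by ANDing a mask in."""
        self.value &= mask & self.WORD_MASK

    @property
    def interrupt_pending(self) -> bool:
        """The processor is interrupted whenever the word is non-zero."""
        return self.value != 0
```

Note that nothing in this model acknowledges an interrupt back to the device; that has to be handled through the device's own registers, as discussed in section 2.3.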

2.2 How a DynaBus device should generate interrupts

Interrupts have two parameters:

- what processor(s) must receive the interrupt

- what bit(s) should be set in the interrupt status register

Devices sending interrupts must have internal register(s) specifying these parameters. The preferred implementation is to have a 16-bit register InterruptAction, writable from the DynaBus, containing (MSB first) a 5-bit reason code (encoding one bit position among the 32 of the InterruptStatus register), a 1-bit Broadcast flag (1 if all processors must receive the interrupt, 0 if a single processor must receive it), and the 10-bit DynaBus DeviceID of the processor that must receive the interrupt (must be all 1's when broadcast). Devices may choose to have a second 32-bit register holding the value to be ORed into the InterruptStatus register instead of the encoded 5-bit reason code. Devices should try to use only one bit in InterruptStatus if possible. If multiple bits need to be used (to represent very different interrupt sources), all of them should be independently programmable.
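As an illustration of the preferred InterruptAction layout (5-bit reason, 1-bit Broadcast flag, 10-bit DeviceID, MSB first), the following sketch packs and unpacks the 16-bit word. The class and method names are hypothetical. The sketch also assumes that reason code n designates bit n of InterruptStatus counted from the least significant bit; the text does not fix this numbering.

```python
from dataclasses import dataclass

DEVICE_ID_MASK = 0x3FF  # 10-bit DynaBus DeviceID


@dataclass(frozen=True)
class InterruptAction:
    reason: int      # 5-bit reason code, 0..31
    broadcast: bool  # True if all processors must receive the interrupt
    device_id: int   # 10-bit DeviceID of the target processor

    def encode(self) -> int:
        """Pack the fields into 16 bits: RRRRR B DDDDDDDDDD (MSB first)."""
        if not 0 <= self.reason < 32:
            raise ValueError("reason code must fit in 5 bits")
        if not 0 <= self.device_id <= DEVICE_ID_MASK:
            raise ValueError("DeviceID must fit in 10 bits")
        if self.broadcast and self.device_id != DEVICE_ID_MASK:
            raise ValueError("DeviceID must be all 1's when broadcast")
        return (self.reason << 11) | (int(self.broadcast) << 10) | self.device_id

    @classmethod
    def decode(cls, word: int) -> "InterruptAction":
        """Unpack a 16-bit InterruptAction word."""
        return cls(reason=(word >> 11) & 0x1F,
                   broadcast=bool((word >> 10) & 1),
                   device_id=word & DEVICE_ID_MASK)

    def status_mask(self) -> int:
        """Value to OR into InterruptStatus (assumed LSB-0 bit numbering)."""
        return 1 << self.reason
```

For example, a point-to-point interrupt with reason 3 to DeviceID 2AH packs to 182AH under these assumptions.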

To send an interrupt, a device must generate an IOWrite (BIOWrite if the Broadcast bit is set). The address for the IOWrite must be of the form (32 LSB, other bits must be 0):

0000 0000 0001 PPPP PPPP PP00 XXXX XXXX

where PPPPPPPPPP is the 10-bit DeviceID in InterruptAction and XXXXXXXX the number of the InterruptStatus register inside the Small Cache (the exact value of XXXXXXXX is not yet determined). This address might be stored by the device in a programmable register for additional flexibility. The top 12 bits of this address (001H) denote the small cache DevType on the DynaBus.
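The address layout above can be sketched as follows. The function names are hypothetical, and since the value of XXXXXXXX is not yet determined, the register number is left as a parameter rather than a constant.

```python
SMALL_CACHE_DEV_TYPE = 0x001  # top 12 bits of the address, per the text


def interrupt_write_address(device_id: int, status_reg_number: int) -> int:
    """Build the 32-bit IO address 0000 0000 0001 PPPP PPPP PP00 XXXX XXXX."""
    if not 0 <= device_id <= 0x3FF:
        raise ValueError("DeviceID must fit in 10 bits")
    if not 0 <= status_reg_number <= 0xFF:
        raise ValueError("register number must fit in 8 bits")
    # Bits 31..20: DevType, bits 19..10: DeviceID, bits 9..8: zero, bits 7..0: register.
    return (SMALL_CACHE_DEV_TYPE << 20) | (device_id << 10) | status_reg_number


def interrupt_transaction(broadcast: bool) -> str:
    """IOWrite for a single processor, BIOWrite when the Broadcast bit is set."""
    return "BIOWrite" if broadcast else "IOWrite"
```

With a placeholder register number of 12H, DeviceID 1 gives address 00100412H; a broadcast (DeviceID all 1's) with register number 0 gives 001FFC00H.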

2.3 Other interrupt related issues

Each device needs to have an InterruptSource register containing information on the source of the interrupt. The preferred implementation is to use one bit per possible source. When a processor receives an interrupt, it will search for posted interrupt reasons in the InterruptStatus register, and will read the InterruptSource registers of the interrupting devices (the mapping between interrupt reason number and devices is under software control, thanks to the InterruptAction registers). It will then try to do whatever is needed to remove the source of interrupt (for example reading some data that has been received by the device). The processor must be able to clear individual bits in the device's InterruptSource register either explicitly or by performing some other action on the device.

It is recommended that a device send a new interrupt transaction each time a new interrupt source requires service. Sending one interrupt only for multiple sources is also possible, but some care must be exercised so that no interrupts are lost.

The reader is once again referred to [2] for more details on how the software manages interrupts and what impact this may have on the interrupting device.

3. DynaBus and protection

Protection refers to the ability to prevent user code (as opposed to kernel code) from using certain system resources. This applies to memory access and to IO access. Memory access is protected by the Small Cache through protection bits included in the page descriptors. It does not directly involve the DynaBus, and hence will not be described here.

3.1. IO slave protection

IO requests are not checked by the cache for protection. The solution that has been chosen is to defer protection to the device that will reply to the IO request. DynaBus requests carry a Mode/Error bit in the header. This bit will be set to 0 if the request originated from a non-privileged requester (typically a processor running in user mode), to 1 if the request originated from a privileged requester (typically a processor running in kernel mode or a non-processor device).

Each device must decide what IO locations need to be protected against non-privileged requesters. The protection may further be different for IORead and IOWrite (it is expected, but not required, that IOWrite permitted implies IORead permitted for the same address). If the IO access is protected, the device must set the Mode/Error bit to 1 in the reply packet (c.f. section 4) to deny access to the requester, and should not perform any action on the addressed IO location. The second word of the reply packet must indicate the error reason as explained in section 4.3.

Note that BIOWrites will never result in a protection error, since the reply is sent by the DynaBus terminator instead of the addressed device(s). Such requests will just be ineffective for all addressed devices. This peculiar behavior of BIOWrites is not specific to protection, but is a side effect of the fact that devices may never reply directly to broadcast requests.
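The decision an IO slave has to make can be sketched as a small function. The names and the three-way outcome are illustrative assumptions; the `protected` argument stands for the device's own (per-location, per-direction) protection choice, which this document leaves to the designer.

```python
from enum import Enum


class Outcome(Enum):
    PERFORM = "perform the access and reply normally"
    ERROR_REPLY = "do nothing and reply with Mode/Error = 1"
    IGNORE = "do nothing (the terminator replies)"


def io_slave_outcome(transaction: str, mode_bit: int, protected: bool) -> Outcome:
    """Decide what an IO slave does with a request.

    mode_bit is the Mode/Error bit of the request header:
    0 = non-privileged requester, 1 = privileged requester.
    """
    if transaction not in ("IORead", "IOWrite", "BIOWrite"):
        raise ValueError("not a DynaBus IO transaction")
    if mode_bit == 1 or not protected:
        return Outcome.PERFORM
    if transaction == "BIOWrite":
        # Devices never reply to broadcast requests, so the denied
        # write is simply ineffective.
        return Outcome.IGNORE
    return Outcome.ERROR_REPLY
```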

3.2. Protection and non-processor requesters

When a non-processor makes a DynaBus request, it should always set the Mode/Error bit to 1 to acquire kernel privilege, unless it is working as a result of some action initiated by a processor running in user mode (there is no such case in the current June87 design). It is safe for non-processor devices to just hardwire the Mode/Error bit to 1 for all request packets on the DynaBus.

The action to be taken by non-processors when they receive a reply packet with the Mode/Error bit set is further detailed in section 4.

4. Error management on the DynaBus

The DynaBus offers multiple ways to signal errors. Usage depends upon the error level and upon the circumstances of the error.

4.1. Error signaling

The DynaBus offers three different types of error signals:

- Raising the ErrorOut line on the DynaBus. This is reserved for errors that are unrecoverable through software or that originate from a deadly hardware malfunction. It will stop the whole system and wake up the maintenance processor for a post-mortem analysis of the machine state.

- Setting the Mode/Error bit to 1 in a reply packet. This is applicable only when the error occurs because of invalid information in a DynaBus request, such as protection error (c.f. section 3) or map cache fault. It is the requester's responsibility to take appropriate action (c.f. section 4.2).

- Sending an interrupt to the appropriate processor(s). This should be used for all errors that do not fall in the previous two categories.

DynaBus devices should use the signaling method most appropriate to the error.

4.2. Error causes

A device may want to signal an error for various causes, such as:

- it is in an unexpected state: ErrorOut if fatal, interrupt if recoverable

- it is required to do something illegal or meaningless: reply in error or interrupt

- some other device signals an error to it: ErrorOut or send an interrupt to a processor

In any case, the device should provide some information on the error, either in some internal register for ErrorOut or interrupt, or in the reply packet for reply in error as outlined below.

All devices sending DynaBus requests should have a time-out mechanism for the reply. If the timeout expires before the reply arrives, the device should signal an error. The method used depends on the device. Using ErrorOut is not recommended, unless the lack of reply endangers system survival (such as no reply when sending an interrupt...). Most often, an interrupt should be posted. The timeout mechanism does not need to be precise as to the time the error is caught, but all requests that do not get their reply within some time must result in an error.
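One inexpensive way to obtain such an imprecise but reliable timeout is a two-phase aging scheme driven by a slow periodic tick: a request that is still pending after it has already survived one tick is declared timed out, so every lost reply is caught between one and two tick periods after the request. The sketch below is an assumption about how a device might do this, not a mechanism prescribed by the DynaBus specification; the class name and the request identifiers are hypothetical.

```python
class ReplyTimeout:
    """Coarse reply timeout: expiry occurs between one and two ticks after sending."""

    def __init__(self) -> None:
        self._pending = {}  # request id -> True once it has survived a tick

    def sent(self, request_id) -> None:
        """Record that a request was sent and is awaiting its reply."""
        self._pending[request_id] = False

    def replied(self, request_id) -> None:
        """Record that the reply arrived; the request no longer ages."""
        self._pending.pop(request_id, None)

    def tick(self) -> list:
        """Called periodically; returns the requests that have timed out."""
        expired = [rid for rid, aged in self._pending.items() if aged]
        for rid in expired:
            del self._pending[rid]
        for rid in self._pending:
            self._pending[rid] = True  # survivors are marked as aged
        return expired
```

Each identifier returned by `tick` would then be reported by the method appropriate to the device, most often an interrupt.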

4.3. Managing the Mode/Error bit in reply headers

To signal an error directly linked to a request, a device should set the Mode/Error bit to 1 in the reply header (the most usual case is protection as explained in section 3). The following word of the reply packet contains information on the error. It should be encoded in the following way (upper 32 bits of data are 0):

IIII IIII IIDD DDDD DDDD DDDD DDDD DFFF

where IIIIIIIIII is the offended 10-bit DeviceID, FFF a 3-bit FaultCode and D designates device-dependent bits of the designer's choice providing enough information so that software may recover from the error. A suggested encoding of the 3 bit FaultCode field is (this encoding is not yet definitive):

000 Reserved

001 MapCache fault (issued by MapCache only)

010 Memory Protection fault (from small cache to its processor only)

011 IO Protection fault (c.f. 3.1)

100 Device Dependent fault

101 Reserved

110 Reserved

111 DynaBus TimeOut fault (from small cache to its processor only)

This table shows that normal DynaBus devices should use FaultCodes 3 or 4 only.
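The layout of the error word can be sketched as a pair of pack and unpack helpers. The function and constant names are hypothetical, and the FaultCode values simply mirror the suggested (not yet definitive) encoding above.

```python
from typing import Tuple

# Suggested FaultCode values (not yet definitive, per the text).
FAULT_MAP_CACHE = 0b001
FAULT_MEMORY_PROTECTION = 0b010
FAULT_IO_PROTECTION = 0b011
FAULT_DEVICE_DEPENDENT = 0b100
FAULT_DYNABUS_TIMEOUT = 0b111


def encode_fault_word(device_id: int, device_bits: int, fault_code: int) -> int:
    """Pack IIII IIII IIDD DDDD DDDD DDDD DDDD DFFF into a 32-bit word."""
    if not 0 <= device_id <= 0x3FF:
        raise ValueError("DeviceID must fit in 10 bits")
    if not 0 <= device_bits <= 0x7FFFF:
        raise ValueError("device-dependent field must fit in 19 bits")
    if not 0 <= fault_code <= 0b111:
        raise ValueError("FaultCode must fit in 3 bits")
    return (device_id << 22) | (device_bits << 3) | fault_code


def decode_fault_word(word: int) -> Tuple[int, int, int]:
    """Return (device_id, device_bits, fault_code) from a 32-bit error word."""
    return (word >> 22) & 0x3FF, (word >> 3) & 0x7FFFF, word & 0b111
```

For instance, an IO protection fault from DeviceID 3FFH with no device-dependent bits set packs to FFC00003H.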

The action to be taken by a device receiving a reply with Mode/Error=1 depends on the device. Processors (in fact small caches) will store the 32-bit explanation, take a vectored trap indexed by the FaultCode, and provide the full 32-bit explanation word to the software trap handler that will process the error. Non-processor devices may either raise an interrupt if receiving an error is a reasonable possibility (for example, a sophisticated IO device might use the map cache to work with virtual addresses and subcontract mapping faults to a processor), or raise ErrorOut if such an error is unexpected.

As mentioned in 3.2, non-processor requesters should probably hardwire the Mode/Error bit to 1 in request headers to get kernel privileges.

5. DynaBus IO support

Most DynaBus devices will have to support DynaBus IO transactions, as noted in section 1.2. A few guidelines apply.

5.1. IO address decoding

As mentioned in the DynaBus specification, IO addresses consist of a DevType, a DevNum and a DevOffset field. Each device should be selected by full decoding of the (DevType, DevNum) fields for IORead and IOWrite, and by decoding of DevType only for BIOWrite. Devices may require that BIOWrite also carry a specific DevNum (such as all 1's), but should avoid doing so. The DevNum should be derived from the DynaBus DeviceID in a non-ambiguous manner: devices using only a small address space should use exactly the full DynaBus DeviceID. Devices using medium and/or large address spaces should use the low-order bits of their DeviceID. It will be the responsibility of the maintenance processor to assign DeviceIDs at bootstrap time in such a way that the DevNums are non-ambiguous within a device type.
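The selection rule can be sketched as follows for a device with a small address space. The field widths (12-bit DevType, 10-bit DevNum, 10-bit DevOffset) are an assumption taken from the Small Cache address layout shown in section 2.2; medium and large address spaces use different widths, which are defined in the DynaBus specification and not reproduced here. The function names are hypothetical.

```python
from typing import Tuple


def split_io_address(addr: int) -> Tuple[int, int, int]:
    """Split a 32-bit IO address into (DevType, DevNum, DevOffset).

    Assumes the small address space layout: 12 / 10 / 10 bits.
    """
    return (addr >> 20) & 0xFFF, (addr >> 10) & 0x3FF, addr & 0x3FF


def is_selected(transaction: str, addr: int, my_type: int, my_num: int) -> bool:
    """Full (DevType, DevNum) decode for IORead/IOWrite, DevType only for BIOWrite."""
    dev_type, dev_num, _ = split_io_address(addr)
    if transaction == "BIOWrite":
        return dev_type == my_type
    return dev_type == my_type and dev_num == my_num
```

Because BIOWrite ignores DevNum here, any register reachable through IOWrite is automatically reachable through BIOWrite at the same DevOffset.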

5.2. Reading and writing

It is strongly recommended that IO registers that may be written from the DynaBus may also be read from the DynaBus. Although it is possible to maintain shadow copies in software, it usually makes debugging and fault analysis more difficult.

Registers accessible through DynaBus IOWrite should also be accessible through DynaBus BIOWrite. If the decoding strategy outlined in the previous section is used, this is no problem.

5.3. Support of multiple simultaneous IOs

Since there are multiple processors, multiple IO requests may be sent simultaneously (i.e. in between request and reply) to a given device. It is recommended that devices have an internal IO request queue to support such cases. An alternate possibility is to trust a software lock and just drop on the floor any IO request that comes in the middle of another one (thus causing the requester to end up in timeout). This solution should only be used sparingly for devices that expect a very low rate of IO, such as the display controller.

6. Interaction with DBus

The DBus interacts with devices on two occasions: at bootstrap time, to set up some internal registers, and when the system fails, to analyze the origin of the failure. The hardware interface with the DBus is described in detail in [3].

6.1 DBus and the standard cell methodology

Most of the DBus functionality for standard cell designs will be provided through pre-defined blocks of logic in the DBusLogic design file [5]. This will include the DBus interface and control logic, chip identification and DynaBus DeviceID. Standard flip-flops and memory arrays of the Logic library [4] will support LSSD automatically, and special flip-flops will be available to support special logic that needs to be kept out of the scan path.

A design may be part standard cell, part custom logic. In that case, the designer should use the signals generated by the DBus interface to provide some LSSD facility for the custom logic blocks.

6.2. DBus and initialization

At initialization, the DBus is used to read the chip identification of each device in the system, from which it derives the system configuration and/or checks it against a presumed configuration. Chips using the standard-cell approach will contain a predefined block of logic implementing this function (details are yet to be defined). The chip identification is specific to each implementation (set of VLSI masks). Refer to [3] for more detailed information.

The DBus is also used to set up the DynaBus DeviceID of all DynaBus devices. In order to simplify the bootstrap process, the DeviceID register should be at the same location in the scan path for all devices (e.g. the first 10 bits, excluding chip identification). This should also be supported more or less automatically for standard-cell designs. This constraint minimizes the amount of information that must be known at bootstrap time by the debug processor (only a table mapping chip identification to scan-path length).

Normally, devices should start full operation as soon as the DBus nDFreeze signal is rescinded. Some devices may find it necessary to keep certain parts of the chip inoperative until some software action has been performed (delayed reset), such as caches to prevent processors from starting too early in the bootstrap process, or the display controller to prevent display generation before proper initialization. In that case, this delayed reset should be obtained by an internal flip-flop.

6.3 DBus and debugging

The DBus implementation suggested in [3] provides all the functionality necessary to debug standard-cell designs. The design should be such that storage elements that are not part of the scan path, such as RAM and FIFO, are nevertheless accessible indirectly through the DBus using the Execute function to single-step the chip. It may be necessary to remove certain flip-flops from the scan path if modifying them would result in undesirable side-effects, but this should be exceptional.

References

[1] Sindhu: "DynaBus Logical Specifications", in [Indigo]<Dragon>DynaBus>DynaBusLogicalSpecifications.tioga, .interpress

[2] Frailong, Sindhu, Atkinson: "Dragon Interrupt Management", in [Indigo]<Dragon>Interrupts>InterruptsDoc.tioga, .interpress

[3] Frailong: "The Dragon DBus", in [Indigo]<Dragon>DBus>DBusDoc.tioga, .interpress

[4] Monier: "Logic Library", in [DATools]<DATools6.1>CellLibraries>LogicDoc.tioga, Logic.dale

[5] ??? : "DBus macro cells", in [Indigo]<Dragon>DBus>DBusLogic.tioga, .interpress, .dale

ChangeLog

Jean-Marc Frailong October 21, 1986 5:55:56 pm PDT

Editorial modifications after proofreading (thanks to PS & JH)

changes to: most of the document, added References section