*start* 12061 00024 US
Date: 7 Feb. 1982 4:35 pm PST (Sunday)
From: Stewart.PA
Subject: Lark redesign notes
To: VoiceProject^.pa

Notes on 8088 memory references.

The 8088 has a multiplexed address and data bus. Bits A8-A15, the high order bits, appear on dedicated 8088 pins and are stable throughout a cycle. Bits A0-A7 appear on the data bus during the early part of a cycle and are latched by a signal called ALE ("Address Latch Enable"). Since the 8088 data bus is only 8 bits wide, 16 bit references are done by two 8-bit references to adjacent addresses.

Notes on memories involved.

The main processor memory system consists of 64K bytes of read/write memory plus 8K bytes of read only memory. The ROM is necessarily selected by the highest addresses because that is where control is passed upon a system reset. The 8088 has a 20 bit address, and control is passed to location FFFF0 on system reset. We could, for example, decode the address so that the low 64K was RAM and the rest was multiple copies of the ROM. The C compiler only knows about 16 bit addresses, and in fact depends on all the 8088 segment registers addressing the same 64K chunk of memory. If we desire to run C generated code in the ROM (yes), then the ROM should occupy addresses from E000 to FFFF, restricting us to 56K of RAM. If we do it that way, we need not even decode the high 4 address bits. One additional possibility is to allow the DMA controller, SLC, and Slave processor to address all of RAM. The "hidden" 8K of RAM could be used for audio buffers, which are never directly examined by the CPU anyway.

The slave processor memory system consists of 8K bytes of read only memory, plus access to some portion of the main memory RAM. Normally, the slave processor executes a single group of instructions 8000 times per second - the voice sample rate.

What the slave CPU does

The processor waits at the top of the loop until a 125 microsecond "tick" occurs.
This pause is accomplished by the 8088 WAIT instruction. This instruction pauses until a signal is activated on the 8088 TEST pin.

During a single execution of the loop, the slave processor:

1. Reads an 8-bit input voice sample from the input shift register ("microphone").
2. Accumulates a 16-bit table lookup function of the sample for silence detection. This takes a 256 word table.
3. Translates the voice sample through a 256 byte table selected by a table pointer read from shared memory and stores the result in shared memory in the appropriate spot in the input ring buffer. The translation is for echo suppression - the table changes from millisecond to millisecond according to whether a forwarded call is in progress and according to whether "output" voice is present.
4. For each of three participants in a conference call, reads the appropriate voice sample out of the output ring buffer in shared memory and stores a zero in its place.
5. The three u-law samples are translated to 16 bit linear form via table (possibly with some gain reduction) and added together.
6. The sum is tested for overflow and, if necessary, clipped.
7. The sum is translated back to 8-bit u-law format and stored into the output shift register (the "speaker").
8. The ring buffer pointer (a common pointer is used for all four buffers) is incremented and tested for overflow.
9. On packet boundaries the silence detection value is saved in shared memory and the accumulator zeroed.

Hardware assistance

In order to reduce the number of instructions in the slave processor loop, a number of the above steps have hardware assistance.

Reads from a region of the slave processor address space associated with the input shift register cause 8 bits of the processor generated memory address to be replaced by the shift register contents. In effect, the IO device is hung on the address bus. The shift register replaces bits A1-A8 of the address.
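The address substitution just described can be sketched in a few lines. This is only an illustration of the bit arithmetic, not a description of the actual gating; the function name and the 16-bit address width are assumptions made for the example.

```python
ISR_FIELD = 0x01FE  # address bits A1-A8, the field the shift register replaces


def isr_substituted_address(cpu_address: int, isr_sample: int) -> int:
    """Return the address the slave ROM sees on an input shift register read.

    Bits A1-A8 of the CPU generated address are replaced by the 8-bit
    shift register contents; A0 and the bits above A8 pass through.
    (Hypothetical helper, for illustration only.)
    """
    if not 0 <= isr_sample <= 0xFF:
        raise ValueError("shift register holds an 8-bit sample")
    passthrough = cpu_address & 0xFFFF & ~ISR_FIELD
    return passthrough | (isr_sample << 1)


# A0 is untouched, so the two byte cycles of a word read land on an
# even/odd pair of ROM bytes belonging to the same sample.
even = isr_substituted_address(0x2000, 0x80)
odd = isr_substituted_address(0x2001, 0x80)
```

Because the sample lands in A1-A8 rather than A0-A7, each sample value owns two adjacent ROM bytes, one at an even address and one at the odd address after it.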
The ISR region of the address space contains a number of 256 byte tables. The entries of several tables contain various gain reduced versions of their address. Suppose that there are 4 tables, containing gain reductions of 0, 5, 10, and 15 dB. These tables are used for echo suppression. By reading from a certain address, one can obtain the input voice sample multiplied by a certain factor. Two tables, at "adjacent" even and odd addresses, contain the low and high bytes, respectively, of the 16 bit silence values associated with the shift register contents. By doing a 16 bit read from the even address, one obtains the 16 bit silence value.

Reads from a region of the slave processor address space associated with the output shift register cause the 8-bit result of the read to be written into the output shift register. One does an 8-bit read using the output shift register base address indexed by the linear encoded voice sample. The table accessed by such addresses produces the corresponding 8-bit u-law sample, which is loaded into the OSR. There is an additional complication. In order to halve the size of the table, negative values are handled by the same table as positive values, but a different base address is used. The different base address causes the sign bit of the output shift register to be loaded with a one or zero as appropriate. (u-law samples are sign-magnitude encoded.)

Word reads from a region of the slave processor address space associated with shared memory ring buffers cause a kind of read-modify-write cycle to take place. The slave processor provides the address of the appropriate byte in the ring buffer as the first fetch of the word read. This first address is latched by the address drivers that provide the address to the shared memory. The result of the shared memory read is latched and then replaces bits A1-A8 of the slave memory address. Bit A0 is provided by a flipflop, since the ring buffer address might not be even.
The slave memory provides the 8088 with the low byte of the 16 bit translated result. The 8088 then produces an incremented address for the second byte of the word read. This address is ignored by the shared memory, which again uses the old address. For this second cycle, the read of the shared memory is turned into a write and a zero (supplied by a zero-supplying chip) is written into the shared memory at the address just read from. Meanwhile, the latched result of the read from the shared memory, together with the toggled flipflop, provide the new address to the slave memory. The second slave memory read produces the high byte of the translated 16 bit value.

The problem with all this is providing the high bits of the slave memory address. The high address bits from the slave processor need to do three jobs: select the shared memory in read,write-zero mode, select the correct translation tables in the slave memory, and select the correct ring buffer in the shared memory. Since the 8088 will increment the address between the two cycles, either the first address must be latched, or the address must be arranged so that a carry will not propagate too far into the upper bits. The shared memory address is no problem, since it is latched by the first cycle. Selection of read,write-zero mode is provided by the highest bits of the slave processor address, which are high enough in the address to be out of the way. (Some of the intermediate bits will be a problem long before the high bits are a problem, so we ignore them for now.)

We also need to be able to make "normal," i.e. untranslated, reads and writes to shared memory. These are accomplished still by translating through the slave memory, but using a "straight through" table. The shared memory address is latched on all cycles in these cases, not just alternate cycles.
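The read,write-zero word read is easier to follow as a small simulation. The sketch below is a behavioral model only: the table layout (low byte at the even entry, high byte at the odd entry) follows the description above, but the helper names and the toy translation (sample times 16 rather than a real u-law expansion) are assumptions made for the example.

```python
def make_toy_table() -> bytes:
    """Build a 512-byte translation table: entry 2*s is the low byte and
    entry 2*s+1 the high byte of a 16-bit value for sample s.
    The value s*16 is a placeholder, not a real u-law to linear mapping."""
    table = bytearray(512)
    for s in range(256):
        value = s * 16
        table[2 * s] = value & 0xFF
        table[2 * s + 1] = value >> 8
    return bytes(table)


def read_write_zero(shared: bytearray, ring_addr: int, table: bytes) -> int:
    """Model one translated word read from a shared memory ring buffer."""
    # First cycle: the ring buffer address is latched, the shared memory
    # is read, and the sample is latched to form slave address bits A1-A8.
    latched_addr = ring_addr
    sample = shared[latched_addr]
    flop = 0                       # flipflop supplies A0
    low = table[(sample << 1) | flop]

    # Second cycle: the CPU's incremented address is ignored by the shared
    # memory, which reuses the latched address and writes a zero there.
    shared[latched_addr] = 0
    flop ^= 1                      # toggled flipflop selects the high byte
    high = table[(sample << 1) | flop]
    return low | (high << 8)


ring = bytearray([0x00, 0x12, 0x34])
linear = read_write_zero(ring, 1, make_toy_table())
```

After the call, `linear` holds the translated 16-bit value for the sample that was at ring address 1, and that ring entry has been zeroed, all in what the CPU sees as a single word read.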
Types of slave processor memory cycles:

Untranslated byte and word reads from ROM (instruction fetches).
Translated byte and word reads from input shift register.
Translated "read" to output shift register.
Untranslated writes and reads to shared memory.
Translated word reads from shared memory, with the second cycle turned into a write-zero in the shared memory.

Allocation of slave processor address space:

The ROM must occupy the highest addresses in order to catch system reset.
A copy of the ROM might also need to occupy low addresses in order to catch interrupts (Non-Maskable-Interrupt especially).
The Output shift register needs 4K bytes of address space, possibly 8K bytes if 13 bit linear values are used for extra precision.
The shared memory ring buffers need at least 8K bytes of space for translated reads/non-translated writes.
The shared memory needs a few words of space for non-translated reads.

Allocation of slave processor address bits

Suppose we break up the slave processor address space into 8K byte regions 0-7.

Regions 0 and 7 will address the ROM directly.
Region 1 will address the input shift register.
Region 6 will address the output shift register.
Regions 2 and 3 will address a 16K byte window of the shared memory for non-translated reads and writes.
Regions 4 and 5 will address a 16K byte window of the shared memory for read,write-zero cycles.

Slave CPU bits A15-A13 will select the region.

Both translated (regions 4 and 5) and non-translated (regions 2 and 3) shared memory references will use the same window. The high order two bits of the shared memory address will be hardwired to select the shared memory 16K byte window. (Possibly a small amount of logic will be used to set the window to regions 5 and 6 of the shared memory, just below the main CPU ROM.)

In regions 0, 6, and 7, slave CPU bits A0-A12 will be directed to the slave ROM. In region 6, the output shift register will be enabled to grab bus data on reads.
In region 1, slave CPU bits A0 and A9-A12 will be directed to the slave ROM, and bits A1-A8 will be generated by the input shift register. In this region, a CPU address carry will propagate, possibly into other regions, so care must be taken to use addresses with intermediate 0's.

There is a flipflop, called the mode flipflop, which is sometimes used to provide the low address bit to the slave ROM. When the mode flop is cleared, slave CPU ALE will load the shared memory address latches. When the mode flop is set, the zero-source will be enabled onto the shared memory data bus and slave CPU reads will be turned into shared memory writes. The mode flipflop has a control line enabled by certain regions. When the line is asserted, the mode flop will toggle at the end of a read; otherwise it will be cleared at the end of a read. The mode flipflop will be cleared by the same signal that is attached to the TEST pin of the slave CPU, so that the flop is in a known state at the beginning of the loop. (This may not be necessary if a "clearing" reference is done early in the loop.)

In regions 2 and 3, slave ROM addresses A1-A8 will be provided by shared memory data. Address A0 will be provided by the mode flipflop, but the flipflop will not toggle between cycles. In regions 2 and 3, slave ROM addresses A9-A12 will be forced to a particular value which will select the straight through translation table.

In regions 4 and 5, slave ROM addresses A1-A8 will be provided by shared memory data. Address A0 will be provided by the mode flipflop. In regions 4 and 5, the mode flop will toggle between cycles. In regions 4 and 5, slave ROM addresses A9-A12 will be forced to one of two values, selected by a parallel port bit from the main CPU. The two tables thus selected will provide translation from 8-bit u-law to linear with 0dB and XdB of gain reduction. The choice is needed for conference calls I think.
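The region scheme amounts to a small decoder plus a multiplexer on the 13 slave ROM address bits. The sketch below summarizes it, taking the low ROM address bit A0 from the mode flipflop in the shared memory regions. The function names are invented for the example, and the value forced onto A9-A12 in regions 2 through 5 is passed in as a parameter because the notes do not fix it.

```python
def region(cpu_addr: int) -> int:
    """Slave CPU bits A15-A13 select one of eight 8K byte regions."""
    return (cpu_addr >> 13) & 0x7


def rom_address(cpu_addr: int, isr: int = 0, share_data: int = 0,
                mode: int = 0, table_select: int = 0) -> int:
    """Form the 13-bit slave ROM address for a given CPU address.

    isr          - 8-bit input shift register contents (region 1)
    share_data   - 8-bit latched shared memory data (regions 2-5)
    mode         - mode flipflop, supplies A0 in regions 2-5
    table_select - 4-bit value forced onto A9-A12 in regions 2-5
                   (hypothetical parameter; actual values not specified)
    """
    r = region(cpu_addr)
    if r in (0, 6, 7):
        # ROM and output shift register: CPU bits A0-A12 go straight through.
        return cpu_addr & 0x1FFF
    if r == 1:
        # Input shift register: CPU supplies A0 and A9-A12, ISR supplies A1-A8.
        return (cpu_addr & 0x1E01) | ((isr & 0xFF) << 1)
    # Regions 2-5: shared memory data on A1-A8, mode flop on A0,
    # A9-A12 forced to select a translation table.
    return ((table_select & 0xF) << 9) | ((share_data & 0xFF) << 1) | (mode & 1)
```

For example, `rom_address(0x8000, share_data=0x10, mode=1, table_select=3)` models the second cycle of a region 4 word read through table 3.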
Additional notes:

I wonder whether it ever happens that the 8088 does an instruction fetch between the bytes of a word read. If so, would this foul up the read,write-zero cycle?

If, for some reason, we wanted to add a chip of RAM to the slave CPU, it should be addressed by region 0. (For example, this would make it possible to have the slave CPU do all sorts of things not considered before...)

*start* 00628 00024 US
Date: 8 Feb. 1982 11:58 am PST (Monday)
From: Swinehart.PA
Subject: Re: Voice on Ethernet, etc.
In-reply-to: Your message of 27 JAN 1982 1352-PST
To: GONSALVES
cc: Ornstein, Stewart, Swinehart

Sorry to be so long answering. Forest just gave me a draft of your paper. It looks very good: both as a paper and in terms of its conclusions. We are at a point where it would be pretty embarrassing if the thing didn't work.

Things have been crazy, but are easing up. When would be a good time for you to come see our toys (assuming you haven't been doing that all along) and talk with us?

Dan Swinehart

*start* 18122 00024 US
Date: 13 Feb. 1982 9:28 pm PST (Saturday)
From: Stewart.PA
Subject: SLC answers
To: VoiceProject^.pa

Replies from Jim Okazaki 29 Jan. 1982 12:49 pm PST (Friday)
Replies from Frank Nelson 1 Feb. 1982 2:04 pm PST (Monday)

---------------------------
Date: 29 Dec. 1981 5:07 pm PST (Tuesday)
From: Stewart.PA
Subject: SLC notes
To: Swinehart, Ornstein
cc: Stewart

Old notes:

How many consecutive cycles does the SLC take in DMA mode? We are cascading the SLC through an 8237 which we think does not enforce single transfer mode when cascading.
Frank Nelson 8-823-7311 Will get back.

Suppose the SLC ran faster (e.g. 3 Mb). What would that do to the timing requirements on the processor side? Does it make the Read and Write pulses shorter in an always READY system?
Frank Nelson Probably won't get any anyway. Yes. If SLC runs faster than 6 MHz. Everything contracted. See table called Output timing req. Pg.
45 in SLC FS

Page 45 of (some version) of the SLC spec remarks that Holda may be dropped to hint to the SLC to give up control of the processor bus. What synchronization requirements are there? Page 57 gives some marks on the timing diagram, but it is in terms of SSB timing and the internal clock, not the external world.
Holda does not have to be synchronized. Other device drops holda, slc gives up one cycle. Frank Nelson See Sentry Test data. When chip is doing DMA, will keep it until fifo full/empty. You can stop dma after every third dma count ale and drop holda after nth ale. From leading edge of NALE going low, drive holda low, then that ref will be last. (must drop within 2.5 T - 100 ns) T = 333 ns for 6 MHz. Can wait until Rd or Wr, then only have 1.5T-200. Should drop at ale or strobe time in order to satisfy setup time. That will be last byte of dma transfer. Hold will drop for 1 clock then come back up if the slc still wants bus.

Nowhere have we found any information on the internal register addresses of the SLC. What registers are there and at what addresses? What requirements are there for which piece of processor address space is occupied by the SLC?
Mask programmed. Decodes low 8 bits. Bus processor puts out address internally, but no one responds hence never becomes READY.

There is a remark that when chaining, a ccb must specify at least a 7 byte data transfer. Does this refer to the ccb before the chain or to the one after the chain? Can the final ccb of a chain specify less than 7 bytes?
The CCB with the chain bit is the one which must also specify 7 byte data transfer. Fifo can cover other up to 7 byte. Time is used in microcode to set up iop for next ccb. SLC will fill fifo before pursuing chain bit. In SLT, command picked up, address picked up, then byte count, stored internal, then fill up fifo. Only then does transmission start. So if the first ccb of a multi ccb packet is less than 7 bytes, the fifo will run dry.
After iop has picked up last data byte, then keeps bus to pick up next ccb. Would get data overrun.

What about inserting an indirect ccb in the middle of a chain? Can a transmit chain include an indirect ccb "in the middle" of a single outgoing packet?
"It should" But can't right now. Will work on receive.

Does the SLC set the status field of a ccb to 0 as it begins the processing of a ccb and then set it again when the ccb is finished?
No!

Does the processor have to clear the status byte before submitting the ccb to the SLC? Does this mean that in a ring buffer set up for receiving that the processor must "chase" the SLC handling packets and clearing the status bytes?
YES (Otherwise the processor couldn't tell which ccb the SLC was working on.) When you give an SIO to rx or tx, DMA request made for cmd and flag and for memory address. Only step through ccb ONCE, must wait at count field until end of packet.

New Notes:

Documents:
Integrated Circuit, Shared Line Controller (SLC) 733W02237
Shared Line Controller (SLC) Functional Specification 156P11875
Input/Output Processor Macro Functional Specification 156P11194
Shared Line Receiver and Transmitter Functional Specification 156P11197

Reading from the addresses 4n+1 and 4n+2 in the SLC return CBAH and CBAL respectively. Do they return the current address, the original address, or the "next" address? Can the processor read these registers in order to find out what the SLC is doing? What happens when there is a carry across a 256 byte boundary of the 16 bit CBA? It is a 16 bit register. Could the processor read one byte before the carry and the other after the carry? Is it atomic?
Yes I can get in trouble.
(Operational experience is that after the SLR is started, but before a packet comes in, the IOPR registers point to the count field of the ccb. After a packet has come in, the CBA points to the byte after the status byte.)
Does the SLT depend on getting a collision indication during the first 1 or 2 bytes in order to backoff and retry? Or does a collision near the end of a packet transmission cause a retransmission from the beginning?
Only first byte! If later, you get collision error indication. Late collisions are message aborts. Early repeated collisions that overflow retry counter cause Collision abort.

--- For Frank Nelson

The pin IO/NM is quoted as sometimes being a mask option. Is it or isn't it in the chips we've got? P. 34 and p. 37 of the SLC FS seem to say that IO/NM is ignored on input and always low on output (DMA), while p. 11 says there is an option.
On input, assumed always 1. On output, always driven to 0 (regardless of option). Masked off right now. Pin could be put onto internal SSB if the option was the other way.

Page 37 of the SLC FS says NCS ignored during DMA cycles.
That's right. If doing DMA and address overlapped internal space of space, chip might address itself!

Page 43 of the SLC FS says that DOut goes to HI-Z during system reset. This has the effect of jamming the Ether until either the transceiver cuts off or system reset goes away. DOut may STAY HI-Z up to 3T+100ns after NRESET goes high, so it would be hard to gate off DOut during RESET.
That's right. Too bad. Board test considerations. Might check with circuit people, pulldown on data out?

Page 40 of the SLC FS says Hold goes Hi-Z while NRESET low. Does this cause us any trouble?
Same problem. In our case, reset also resets DMA controller, which will ignore Req input.

--- above Frank Nelson

Page 18 of the IOP FS says that a 7 byte transfer is needed to avoid data lates on chaining. Doesn't this depend on the device data rate? What will happen with the Ether running at 3 Mbit/s? Would be worse if external memory needs wait states.
No, only a function of microcode time and slc clock.

When using the IO interrupt, must the IO bit be turned on in all ccbs?
What will happen if an unusual end happens in a ccb with IO turned off?
Interrupt only armed during ccb with bit on. Interrupt is lost if bit is off. No distinction made between normal and unusual end.

In the last ccb of an auto chain, does the 08H bit (Auto/skip) have to be turned on? (Operational experience is that if not, then the status byte is stored as though it was a 5 byte ccb rather than a 3 byte ccb. I don't know what happened to the data address field, whether it was used or skipped.)
Must NOT be turned on. In a single ccb packet, can use either transmit message or transmit segmented message (we think).

When an SLR ccb starts with a count field of zero, the interpretation is that there are 256 bytes of space available. When the count increments to zero or when a packet ends, the SLR stores the count back into the ccb. By starting at 0, the size of a packet can be found by
count _ IF ccb.count=0 THEN 256 ELSE ccb.count;
What happens if the initial value of the SLR ccb count field is not zero? Does this indeed tell the SLR that less space is available? What value will be stored back a) on end of packet, b) after the allowed number of bytes are used up? Is the correct calculation
count _ IF ccb.count=0 THEN -initialcount ELSE ccb.count-initialcount; ?
Yes that's right.

Does the store-back of the count fields in the input and output-unusual-end cases mean that the processor must re-initialize those ccb fields before reusing the ccb?
In transmit, never update count field, must clear status field. In receiver, must reinitialize count field.

I don't understand the interaction of Auto/Skip and Indirect given on p. 19 of the IOP FS.
Both auto and indirect then indirect bit wins. ccb after one with both indirect and auto skip is treated as 3 byte ccb.

IOP FS p. 25 (top) Does the IOC set the ccb status byte to zero as it "enters" a ccb?
NO

(We have noticed that HIO doesn't work on an SLR which has been started but which has not gotten a packet.)

IOP FS p.
27 What does the "(MMIE = 1)" note in the HIO box mean? Does this mean that message mode interrupts must be enabled before HIO works?
Means MM interrupt will be triggered if enabled in this case.

Is it possible that HIO is not effective until a transfer is in progress?
THAT'S RIGHT. Issues micro interrupt. Must use RIO. RIO doesn't clear modes. Hardware reset clears id register and mode register. RIO only clears state register and CBA, also forces available state.

Does it work to try and receive a packet exactly 256 bytes long (or exactly the length of an input ccb)?
If this happens, status will be stored in first ccb.

IOP FS p. 32 "One OR MORE (emphasis mine) messages may be contained in a single ccb (CHAIN)." What does this mean? When we use the SLR and a packet completes in the "middle" of a ccb, the count is stored and the SLR starts up again on the next ccb, not the same one.
Can only have one message in a single ccb.

There are a couple of comments that IO and MM interrupts should not be used together, yet there seems to be no hardware reason why not -- there are separate bits for the two kinds of interrupts. Is this advice? One "should not" use them together because it doesn't make sense?
Chip gets too confused if you try it (both RX and TX). MM is probably more useful. Transmit MM only interrupts on errors. Receive MM interrupts on every end of packet good or not. IO interrupts on every end of ccb.

IOP FS p. 48 There is a flowchart which implies that after an IO interrupt without a "valid end" bit, one must execute an HIO before checking. What is the correct interrupt handling sequence for AIO, HIO, RIO, etc?
HIO forces available state. Could be that IOP is hung in working state, HIO will clear it.

The code on page 49 that supposedly implements the flowchart does not match the flowchart.
Flowchart is right.

IOP FS p. 58.
The ring buffer example flowchart may hold clues to the combination of indirect and auto/skip ccb bits, but there is at least one typo in the code.
01100 should read 0100

SLRT FS page 22 says that "Message End" has to do with ccbs and that ME=0 means message aborted. Is this true?
YES. Usually assoc with transmitted. Rcvr would get ME=1, CRC error. Tx would get ME=0 because late collision so message didn't end.

SLRT FS page 25 flow chart for the SLT seems to say that collisions after the first 2 bytes have been transmitted are fatal (no retry attempt). This makes sense since there are two bytes of buffering in the SLT in addition to the FIFO. Perhaps the FIFO cannot be backed up!
After FIRST byte are fatal.

Since the SLR has some buffering before data enters the FIFO, the SLR could reject packets shorter than a certain length (runts). Does it?
Collision fragments get through receiver. Will get ME=1, bad crc probably.

SLRT FS p. 38. There are three signals between the SLR and the SLT: Carrier, Collision, and TestData (SLRTD). What is SLRTD for? Is there some kind of loopback test?
Chip debugging. Ignore it.

SLRT FS p. 39 The description of Message End status says that ME is set at the end of a single ccb or chain of ccbs. Does this mean that one cannot queue multiple packets in the same ccb chain? Why doesn't ME mean "Packet end" rather than ccb end?
It does mean Packet end. You can have ccb chain with multiple messages.

Figure 5-4 says that 00x0 turns off the SLR/T, but also says that 10x0 turns off the SLR/T "at the end of chain". Isn't 10x0 always the end of a chain?
"at the end of chain" not relevant.

SLRT FS p. 41. ID address is the source for SLT packets. Does this mean that the SLT substitutes this value for the 2nd byte of outbound packets? Or does it use the value in memory, whatever it is?
Uses memory byte.

SLRT p.
42 recommends initializing the SLC by
  Load ID
  Load SLR mode
  Load SLT mode
  Start SLR
  Start SLT (when packet ready to go)
Is this a required initialization sequence? What can be done differently?
Yes required. Order is important. If start TX without Rx, then start rx, tx gets clobbered. Another good reason for ring buffer Rx. Load SLT mode is optional.

SLRT FS p. 43 describes the IOCT/R as using 24 bit ccb addresses and the Pinout of the SLC is all wrong.
Left over from previous chip, to be deleted.

All the code examples use 24 bit addresses.
Eliminate M.S. byte.

SLRT FS p. 48 advocates use of a count field value of 01 for SLR mode setup. Shouldn't this value be FFH since the IOCR increments the count? What should the count field be for LOAD ID?
Load mode, doesn't matter what byte count field is. True for both SLR and SLT. Load ID is same.

SLRT FS p. 49 example of multi-ccb message transmit uses Transmit Message in the first ccb and Transmit Segmented Message in the second ccb, while on page 51, it is the first ccb that uses Transmit Segmented Message and the second which uses Transmit Message.
Works in the sensible way. Page 49 is wrong.

Other comments and questions:

HIO doesn't seem to do anything to the SLR. I wanted to receive a potentially 554 byte Pup, so I set up a 3 ccb chain for the SLR. After receipt of a short Pup, the SLR went on and started working on the later ccbs. HIO didn't stop it. I used RIO and completely reinitialized the SLR after every packet.
See earlier.

Does RIO work the same way as system reset? Can I safely use RIO without reloading the ID and SLR mode? (I checked briefly and it seemed that RIO did not clear the ID register, but I didn't trust it.)
See earlier.

What should the count ccb fields be set to when loading the SLR ID and Mode?
Anything works.

Is it ok to build a ccb chain to load both the SLR ID and Mode? Is it OK to have these initialization ccbs in ROM? If in RAM, will the SLR change any of the fields?
Yes, works fine, if you don't need the status field.

Initializing the SLR while the SLT is transmitting a packet causes Message Aborts (ccb status E4H). It shouldn't do that should it? How much coupling is there between the SLR and SLT?
It does, too bad.

The SLT status register seems to go to condition code available well BEFORE the status field of the last ccb has been filled in. It shouldn't do that should it? How does one tell when the SLR and SLT are really idle?
Try it again, should not behave that way.

Areas about which I'm confused:

Does it take less time to handle a 3 byte ccb than a 5 byte ccb? Does this change the 7 byte minimum transfer?
Would be less.

I think I want to use a ring buffer for the receiver and message mode interrupts. Is a MM interrupt really for "packet complete"?
YES

How does the processor tell where in a ccb chain the SLR is?
Can look at CBA.

There is a comment that the processor must field MM interrupts as fast as they are generated, because if the processor doesn't issue a CMI before the next one happens it will be missed. This may happen if short 'runt' packets are not filtered out by the SLR. If it does happen, the processor could fall farther and farther behind the SLR in running around a ring buffer. How are high performance receiver routines written?
Highest user so far is 3%.

Is the MM only a hint then? Should I use IO interrupt?
IO would be even worse. Might not need interrupts at all. If you can't keep up one way, prob. can't at all.

On each interrupt, the SLR interrupt handler will have to follow around the ring of ccbs until it gets to the last ccb with status filled in. It will have to clear the status as it goes. Will this scheme work?
YES. Others use this.

In the ring buffer, what happens if a packet being received falls across the region of the ring with the indirect ccb? Will it work correctly?
YES.

What does MM interrupt mean in the SLT?
Unusual end only.

What is HIO for?
Not implemented that way.
Use in interrupt unusual end.

Can the SLR receive back to back packets? When the ccbs are 5 byte? 3 byte? Separated by an indirect ccb? How close can packets be to one another?
Back to back OK. SLT waits for minimum time after end of prev. received. Time on shared line between packets IN [23..32666] SLC osc times. Two bits of jam transmitted. (8 clocks) Backup runs in 128 T osc.

Documentation doesn't say anywhere that addressing nonexistent registers in the SLC hangs the system forever.
Okazaki: Note will be added to documentation. Frank Nelson: It hangs system, yes.

We could use examples of operational SLC drivers.
Can get names. Roch. might be more useful.
Functional test program, runs in 8085. George Root 8-823-7796.
Manager: Steve Wilczek 8-227-1620
Better (programmer): Ron Iptolito 8-227-1652

Testing: several testers.
Bench tester
Lotus system tester, specific for 10 MHz 8085A external clock environment. Might affect rest of tester to use 12 MHz slc clock (affect TTL elsewhere).
Fairchild Century: Tester could run at 3Mb shared line rate. Might not give indication of all system trouble because century uses same clock as processor and slc. Some parts pass century but not bench test. Century has rather fixed signals, but on bench test, duty cycles and voltages can be more easily varied.

-Larry

*start* 12679 00024 US
Date: 14 Feb. 1982 4:36 pm PST (Sunday)
From: Stewart.PA
Subject: Slave 8088 design notes
To: VoiceProject^.pa
cc: Stewart

This is a bit of a ramble. As usual, all sil files are on /Indigo/Voice/ETP/

Refer to drawing ETPArch05.sil for the Slave system data paths.

8088 notes:

8088 bus cycles take 4 clock periods, or T states: T1, T2, T3, and T4. T cycles switch when the clock goes low. The clock is low for the first 2/3 of a clock cycle and high for the rest. ALE occurs during the first 2/3 of T1. Rd' and Wr' are active during T2 and T3. DEN' is active from the middle of T1 to the middle of T4.
DT/R' has the same timing as address bits and indicates whether a cycle is a read or write.

8088 inputs:

SysClock - same as main CPU clock. The two systems share a single 8284A clock generator.

SlaveReady - generated by shared memory arbitration logic

SlaveReset - generated by system reset, possibly OR'd with a main CPU parallel port bit

SlaveTest' - The slave CPU executes the WAIT instruction once per loop. WAIT waits until the TEST' pin is activated. Are there synchronization requirements? Use something like SYNTSN to set a flop to generate SlaveTest'. The flop would be cleared by something like SelISR' that occurs early in the loop. Probably it would work if SYNTSN' were used directly because the WAIT instruction tests the pin fairly often, but why take chances of missing it?

SlaveNMI - Non-maskable interrupt. Generated by a main CPU parallel port bit. This signal is used to jar the slave CPU out of the audio loop and have it do something else. I have no idea what. The capability to do this is why Region 0 exists - interrupt vectors are stored in low memory.

8088 Outputs:

SlaveRd' SlaveWr' - 8088 read and write signals

Slave ALE - address latch enable. Used to latch the CPU generated address from the multiplexed address/data bus. ALE goes high for about half a SysClock. Typically transparent latches are enabled while it is high in order to make the latch outputs valid slightly earlier than if non-transparent latches were used.

SlaveIO - Not needed, since the slave CPU never references IO space.

SlaveDEN' - Not thought to be needed. This one can be used to enable things like data bus drivers. For reads it is active from the middle of T2 to the beginning of T4. For writes, it is active from the middle of T1 to the middle of T4.

SlaveDT/R' - early warning of reads and writes. Same timing as addresses.

Control signals:

SelSlaveROM' - chip select for slave EPROM

SlaveRd' - output enable for slave EPROM.
(Note: This would have to change if Region 2 and 3 reads came directly from the shared memory to the slave data bus without going through the EPROM.) SelISR' - gates input shift register onto bits SLA01 - SLA08 of the slave EPROM address. LdOSR' - puts the output shift register into load mode (rather than shift mode) OSRClk' - the OR of AudShClk (used for shifts) and the AND of SlaveRd' and SelOSR' (used for loads). The output shift register loads data from the slave data bus during Region 6 reads. ShareALE - latch enable for addresses going from the slave CPU to the shared memory. For untranslated reads and writes of shared memory (reads and writes with addresses in regions 2 and 3), this signal is identical to SlaveALE. For translated read, write-zero cycles (Regions 4 and 5), only the "first" ALE of a pair is passed to ShareALE. When the Mode FF is high, there is no ShareALE and the shared memory gets the previous latched address. EnShareAddr' - Gates the latched shared memory address onto the main memory address bus. This signal can be just SlaveDAck', since all shared memory cycles need an address... EnShareZero' - Gate a 0 onto the shared memory data bus. Used during the "write-zero" portion of a region 4 or 5 cycle. It is just the AND of SlaveDAck' and Mode. EnShareWrite' - Gate the contents of the slave data bus onto the shared memory data bus. Generated by the AND of SlaveDT and SlaveDAck. (n.b. write-zero cycles are NOT accompanied by SlaveDT since the slave CPU thinks they are reads.) EnLocalAddr' - Gates bits 1 to 8 of the slave CPU generated address to the slave EPROM. This happens in regions 0, 6, and 7. EnShareRd' - Gates the latched data from the shared memory to bits 1 to 8 of the slave EPROM address. This happens in regions 2, 3, 4, and 5. EnShareRd' may turn out to be identical to SelShare', stay tuned. ShareDLE - Data Latch Enable. Activates the transparent latch grabbing data from the shared memory.
In regions 2 and 3, this signal is identical to SlaveRd'. In regions 4 and 5, it is SlaveRd' gated with Mode' since one wants to latch the data from the shared memory read, then remember it for the second cycle of a region 4 or 5 read, during which the write-zero cycle is taking place on the shared memory side. Now refer to page ETPArch06.sil for how these signals are generated. There is a flip flop, called Mode, which is always low except during the write-zero part of a shared memory "read" in region 4 or 5. This flop is cleared to zero by SYNTSN' or some other convenient signal which occurs at the top of the audio loop. (This assures positive reset occasionally...) During the course of execution of the slave loop, Mode can potentially change state at the end of every SlaveRd'. The new value is NewMode, which is generated by a PROM from the old Mode, the Region, and SlaveDT. The logic in the PROM toggles Mode after every read in regions 4 and 5. It is the programmers responsibility to do only WORD reads in regions 4 and 5. If this is done, then Mode will be high only for the second cycle of such word reads. These second cycles are used to do the writing back of zeros into the audio output (loudspeaker) ring buffers in shared memory. Mode high disables the shared memory address latch ShareALE, forces the low slave EPROM address bit to 1, enables the Zero (EnShareZero') and turns SlaveRd' into MWr'. There is a side question here about whether WORD references are always atomic. Suppose not, then the slave CPU might try an instruction fetch between the cycles of a region 4 or 5 word read. The instruction fetch would find Mode high, which would force SLA00 to 1, which might not be the right address for the instruction... We should ask Intel this one. The slave memory EPROM address is 13 bits (8 K byte EPROM) in three groups bit 0, bits 1-8, and bits 9-12. SLA00, the lsb, comes from the CPU in regions 0, 1, 6, and 7. SLA00 comes from Mode in regions 4 and 5. 
SLA00 is forced to 0 in regions 2 and 3. Note: SelShare' (active in regions 2, 3, 4, and 5) directly clears the flop which latches SAD00 on SlaveALE. This is what generates the 0 in regions 2 and 3 and which provides the 0 when Mode is low in regions 4 and 5. The 1 when Mode is high is provided by TTLHigh at the input of the SLA00 selector section. I mildly disapprove of directly resetting a flop and overriding the clock input, so there may be a cleaner way to do these functions. Bits SLA01-SLA08 of the slave EPROM address are a tristate bus driven by the CPU address bits 1-8 (via EnLocalAddr') in regions 0, 6, and 7, by the Input shift register (via SelISR') in region 1, and by shared memory data (via EnShareRd') in regions 2, 3, 4, and 5. Do we need bus pullups (about 20K) for this tristate bus? I don't think so because there is always something driving it and the only customer is the EPROM. Bits SLA09-SLA12 are generated by the "Address translation ROM", a 256 by 4 bipolar PROM. The eight PROM inputs are the high 7 CPU address bits (SA09-SA15) and a main cpu parallel port bit called, at present, PIOBX. This PROM is very simple. In regions 0, 1, 6, and 7 (the region is selected by SA13-SA15) it passes SA09-SA12 through to the outputs. In regions 2 and 3 it provides "wired address A" which selects a region of the EPROM whose data is the same as address bits SLA01-SLA08. This is the straight through translation table used for region 2 and 3 untranslated reads from shared memory. In regions 4 and 5, the PROM provides either "wired address B" or "wired address C". The selection is made by PIOBX. These addresses select one of two mu-law to linear translation tables in the EPROM which are used for translating a mu-law coded sample in shared memory to a linear coded 16 bit sample. We're unsure of the contents of the tables, but one will be a straightforward mu-law to linear conversion and the other will be the same, but will also incorporate some loss, perhaps 5 dB.
We suspect that in a conference call, the conference bridge attenuates all parties somewhat to reduce the likelihood of clipping. Maybe this isn't needed, but we've got room in the EPROM. Shared memory arbitration: All slave CPU references to regions 2, 3, 4, or 5 generate requests for the shared memory via the signal SelShare'. This signal is inverted to SlaveReq and goes to the main system cascade mode DMA channel. It also directly sets a shared-reference-pending flop. When set, this flop causes the slave system to become Not-Ready waiting for the Shared memory. The arrival of SlaveDAck', after perhaps several flop stages providing wait states for slow memory, directly clears the pending flop and causes, after a Ready-Synchronization, the cpu to go Ready and complete the cycle. The Ready synchronization flop is needed because the 8284A internal ready synchronizers are in use by the main CPU and are not available. Is it really needed? SelShare', the signal that sets the slave system to Not-Ready, is clearly SysClock synchronous since it arrives on the heels of SlaveALE. SlaveDAck', while perhaps not synchronous as it emerges from the DMA controller, is synchronized by the wait state stages. How many wait state stages are needed? The shared memory access delay, once SlaveDAck' appears, is about 150 ns (or less!). The data must travel through an 8282 latch and then through the address access time of the slave system EPROM, which might be 450 ns if we buy slow 2764s. Must the disappearance of SlaveDAck' immediately cause disappearance of the synchronized version (via a bypass gate or by clearing the synchronizer stages)? The danger is that if a new SelShare' appears while the trailing edge of an "old" SlaveDAck' is still working its way through the wait state stages, the slave system might incorrectly be held in a Ready state even though the shared memory really wasn't available (the "new" SlaveDAck' having not arrived).
Let's think about consecutive and non-consecutive shared memory cycles. If it has been a long time since the last shared cycle there is no problem. The LAST shared cycle is no problem, since the trailing edge of SlaveDAck' acts to hold the slave system Ready, which is where it should be. N.B. Given a shared cycle, SlaveDAck' always lasts LONGER than SelShare', so the flop will end up in the Ready state, as it should. The question is if it lasts so long as to overlap the beginning of the next shared cycle. It seems to me that any possible intervening cycle must last at least 4 T cycles, since that is the bus cycle time. I'm changing the drawing as follows: The input to the wait state stages was SlaveDack'. It is now gated with SelShare' using an all-bubbles AND-gate. Provided that there are no more than 4 wait state stages, the reset input will go high before SelShare' returns, if it should happen to go away. Control signals generated by the "control PROM": The inputs are SA15-SA13, which determine the region, SlaveDT, which determines whether a read or write is in progress, and Mode. SelISR' - active in region 1 only (SlaveDT and Mode ignored) SelOSR' - active in region 6 only (SlaveDT and Mode ignored) EnLocalAddr' - active in regions 0, 6, and 7 (SlaveDT and Mode ignored) Actually, one doesn't need a memory address when doing a write, so SlaveDT could be used, but we want to always have some source for SLA1-8 to avoid the need for bus pullups... EnShareRd' - active in regions 2, 3, 4, and 5. (SlaveDT and Mode ignored) Again, one doesn't need... SelShare' - active in regions 2, 3, 4, and 5. (SlaveDT and Mode ignored) Looks like EnShareRd'=SelShare', so we can save a bit NewMode - NewMode = (Region 4 OR Region 5) AND (NOT Mode) AND (NOT SlaveDT) We have no plans to do writes in Region 4 or 5, but it seems a good idea to keep the Mode Flop reset whenever possible. There are three left over output bits, assuming EnShareRd' is eliminated. 
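The NewMode equation above can be sketched as a small truth-table generator. This is only a minimal sketch: the packing of the five PROM inputs into an address (region in the top three bits, then SlaveDT, then Mode) and the names new_mode and build_newmode_column are assumptions for illustration, not the actual PROM layout.

```python
def new_mode(region: int, slave_dt: bool, mode: bool) -> bool:
    """NewMode = (Region 4 OR Region 5) AND (NOT Mode) AND (NOT SlaveDT)."""
    return region in (4, 5) and not mode and not slave_dt


def build_newmode_column() -> list:
    """NewMode output bit for all 32 input combinations.

    Assumed (hypothetical) input packing: bits 4-2 = SA15-SA13 (region),
    bit 1 = SlaveDT (1 = write), bit 0 = Mode.
    """
    column = []
    for addr in range(32):
        region = addr >> 2
        slave_dt = bool(addr & 0b10)
        mode = bool(addr & 0b01)
        column.append(int(new_mode(region, slave_dt, mode)))
    return column


# A word read in region 4 is two consecutive byte reads. Mode is sampled
# during each read and updated to NewMode at the end of SlaveRd'.
mode = False
mode_during_cycle = []
for _ in range(2):
    mode_during_cycle.append(mode)
    mode = new_mode(4, slave_dt=False, mode=mode)
```

Under these assumptions Mode is low for the first byte read, high for the second (the write-zero cycle), and low again afterwards, which is the behavior the notes ask for when only WORD reads are done in regions 4 and 5.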
-Larry PS There are two left over output bits, I've created PreShareWr', which is just SlaveDT'. *start* 01209 00024 US Date: 14 Feb. 1982 4:44 pm PST (Sunday) From: Stewart.PA Subject: ETP redesign files To: VoiceProject^.pa [Indigo]ETP>ETPArch.dm contains: ETPArch01.sil - System busses of Rev-Ac design ETPArch02.sil - System busses of Rev-Ac design modified for Dynamic RAM ETPArch03.sil - System busses for a possible shared memory design (Not exactly the one being used) ETPArch04.sil - Slave system busses, closer, but also not exact ETPArch05.sil - Slave CPU data paths (truth of current proposal) ETPArch06.sil - Slave CPU control (truth of current proposal) ETPArch07.sil - Dynamic RAM array ETPArch08.sil - Dynamic RAM control, refresh, and cascade arbiter ETPArch10.sil - Some notes on the Address translation ROM Sil.lb9 - A version of ETPLB9.sil with macros added for DRAMs, 2964, 2764, etc. This is the correct library EXCEPT it has the old SLC macro in it. The correct SLC macro is in [Indigo]ETP>ETP-Rev-Ad.dm in ETPLB9.sil. There are also Severo's files on [Indigo]ETP> ETPSlvAddressing.sil ETPBigPicture.sil These are not compatible with the [SIL] user.cm section on the Lark Hardware disk as they use some other collection of fonts. *start* 01589 00024 US Date: 14 Feb. 1982 4:57 pm PST (Sunday) From: Stewart.PA Subject: Parts to buy for Etherphone prototypes To: VoiceProject^.pa cc: McCreight, Thacker, TonyWest Bipolar PROMS: We need two sizes: 32x8 and 256x4 with Tri-state outputs 512x8 are handy, so we should have some in stock. Delay lines: We need one, with TTL compatible inputs and outputs and 10 ns taps. PROM information: All seem to be (within their groups) pin compatible although the programming is different and some are default-high and some are default-low. We can program all of these with available ProLog stuff. 32x8 Tristate, 16 pin dip Signetics N82S123 (50 ns) TI TPB18S030 (25 ns) National DM74S288 (in stock, adequate supply?) 
MMI 6331-1 Harris HM7603 (many in stock, but hard to get more?) 256x4 tristate, 16 pin dip Signetics N82S129 (50 ns) TI TPB24S10 (35 ns) National DM74S287 MMI 6301-1 Harris HM7611 Fairchild 93427 (a few in stock, what are they used for?) 512x8 tristate, 20 pin dip Signetics N82S147 (45 ns) (already use, but I can't find any in stock) TI TPB28S42 (35 ns) Harris HM7649 National DM74S472 If left to myself, I'd buy the TI parts because they are the fastest. There might, however, be some virtue in using types we already have. I think we are set for the prototypes, if we don't goof up too many times. The 256x4 Fairchild parts are in shortest supply in Mike's lab. There are about 4 or 5 of them... We should buy 25 each or so of the three shapes of TI parts if they are not much more expensive than the ones we already use around here. *start* 01771 00024 US Date: 14 Feb. 1982 5:10 pm PST (Sunday) From: Stewart.PA Subject: Things to do and questions To: VoiceProject^.pa cc: Stewart Give the list of PROM types to Joe Patti to find out how much they cost. I'd also like to know how fast they all are, but I couldn't find data sheets. Maybe Joe or Mike can get data books for us. I'm not sure where to look for delay lines. Eric Rawson's group might have a couple I think. Will it take less time to clean all the pin numbers etc off the Rev-Ad prints or to draw them all over again? Figure out DRAM write timing. Do the RAMs need the data valid at the time CAS starts? Will we meet that for all sources of writes? I suspect that the main and slave 8088s and the SLC are OK, because they begin to drive the data bus at the same time they assert Wr'. The problem may be DMA Encryption, if the DMA chip asserts IORd' to the Encryption chip at the same time as MWr' to the memory, you have to wait until the encryption chip access time is past before the data is valid. That might be too late. Check WD2001 data sheets. Can we delay the start of a write in this case? 
How many wait state stages needed for shared memory references from Slave CPU? Do we need to buy fast 2764s for the slave EPROM? Ask Intel whether 8088 Word Reads always happen in consecutive cycles, or whether an instruction fetch can creep in. Ask Intel how the 8088 executes an OUT WORD instruction. Does it keep the port address fixed or does it increment? This could speed up DES key loading, for example... Check state of Belleville design. Count number of control signals needed. Do we need all of it? What can be left out? We should definitely add Aux In and Aux Out for plugging in radios, etc. -Larry *start* 01260 00024 US Date: 15 FEB 1982 0010-PST From: GONSALVES.PA Subject: "Voice on an Ethernet: an Experimental Study" - draft report To: voiceInterest^, CSL^ cc: baskett, dalal, shoch, gonsalves Title: Packet-Voice Communication on an Ethernet Local Computer Network: an Experimental Study. Author: Timothy A Gonsalves Abstract: Local computer networks have been used successfully for data applications such as file transfers for several years. Recently, there have been several proposals for using these networks for voice applications. This paper describes a simple voice protocol for use on a packet-switching local network. This protocol is used in an experimental study of the feasibility of using a 3 Mbps experimental Ethernet network for packet-voice communication. The study shows that with appropriately choosen parameters the network is capable of supporting voice communications with acceptable quality up to throughputs of 95% of the network capacity. Filed on: [ivy]voicePaper.press Remarks: The measurements were performed on net #3 in Bldg. 35 over the past few months under the aegis of Forest Baskett. This is not part of the Voice Project. Criticisms, comments and suggestions are welcome. Tim Gonsalves. *start* 02499 00024 US Subject: WD2001 design PrintForm: WindowLetter From: Stewart.pa To: Western Digital Date: 16 Feb. 
1982 11:39 am PST (Tuesday) Ron Benne Western Digital 2445 McCabe Way Irvine, CA 92714 16 Feb. 1982 Dear Mr. Benne: We have incorporated the Western Digital WD2001 Data Encryption Device into a microprocessor system using a 5 MHz Intel 8088 processor and 8237A-5 DMA controller. Attached to this letter is a drawing of the relevant parts of the design. The application we have requires the high speed encryption of 160 byte blocks of data using the electronic code book form of DES. The "encryption driver" software uses programmed I/O to load the WD2001 key register and to set up each of two channels of the 8237 DMA controller. The actual encryption of the 160 byte block is done by DMA, using one DMA channel to load data from the memory system into the WD2001 and another DMA channel to write cipher data from the WD2001 back to memory. We operate the WD2001 using the "PCLK" output of the 8284A clock generator for the 8088. This signal is a square wave at 2.5 MHz. The observed speed of operation in the above mode is 640 Kbps. This figure is derived from an observed time of 2 milliseconds per 160 byte block including key load and DMA setup time. The software overhead appears to be about 200 microseconds, leaving 1.8 milliseconds for DMA encryption. Ignoring the software overhead, the peak encryption rate is about 710 Kbps. This figure is about half what I would expect for the WD2001. Referring to the schematic, we have connected the Data-In-Request and Data-Out-Request outputs from the WD2001 directly to the DMA Request pins of the 8237. Placing the 8237 transfers in "Demand" mode or "Single-Transfer" mode does not make any difference in throughput. The problem seems to be with the Data-Request/Data-Acknowledge handshaking.
When the DMA controller asserts, say, IORd' to the WD2001, the Data-Out-Request goes inactive for the duration of the IORd'. This may be happening at such a point in the 8237 cycle that the DMA controller fails to get consecutive bus cycles, but instead must rearbitrate with the processor for each one-byte transfer. If this is the reason, it could be fixed with external logic that kept DMA request valid for 8 IORd' or IOWr' pulses. The beauty of the WD2001, however, is the minimum amount of external logic needed. Have you any advice? Sincerely, Lawrence C. Stewart cc: S. Ornstein *start* 00545 00024 US Date: 16 Feb. 1982 3:26 pm PST (Tuesday) From: Stewart.PA Subject: Western Digital information To: VoiceProject^.pa Reply-To: Stewart The most useful fellow is Ron Benne at 714-557-3550 The new part numbers are: WD2001E-30 3 MHz ceramic part WD2001F-30 3 MHz plastic WD2001F-30 quantity 1-24 is $75 WD2001F-30 quantity 25-99 is $67.50 New (lower) prices are in the works somewhere. These new part numbers were sent out to the dealers and sales force about 10 days ago. If the reps haven't heard, call Ron Benne. *start* 00956 00024 US Date: 18 Feb. 1982 4:24 pm PST (Thursday) From: Stewart.PA Subject: Intel Questions To: VoicePRoject^.pa cc: Stewart Dear Intel, Prices, availability, and data sheets for 8 MHz 8088 (small quantity (5) now) 8088 Ready line testing. The data sheet says: Tell me about Ready line synchronization requirements. Detail: Ready falling edge is tested 8 ns after falling edge of clock at beginning of T3. (TRYLCL). Is it tested on every sysclock or only that one? 8284A last synchronizer is falling edge clocked. TRYHCH says ready going high must occur 2/3 TCLCL-15 ahead of clock rising edge during T3/TW TCHRYX says 30 ns hold time of Ready after clock rising edge during T3/TW. Can an 8088 word read or write be interspersed with other kinds of cycles (like instruction fetch?) Does an 8088 OUT word instruction increment the address? IN word?
When will fast 2764s be available? Does Intel make a 2564? -Larry&Severo *start* 00554 00024 US Date: 18 Feb. 1982 4:31 pm PST (Thursday) From: Stewart.PA Subject: Delay lines To: VoiceProject^.pa We need a TTL compatible delay line for the memory control. It should have 10 10 ns taps (total 100 ns) and come in a 14 or 16 pin DIP Information gleaned from 81-82 EEM: Bel Fuse Inc. Page 2440 14 pin dip 10 ns/tap pn 0447-0100-10 Hytek Microsystems, Page 2461 14 pin dip 10 ns/tap pn HY-5010-100 Kappa Networks Page 2464 14 pin dip 10 ns/tap pn CC101 Technitrol Page 2470 14 pin, 10 ns/tap pn TTLDD100 *start* 00524 00024 US Date: 19 Feb. 1982 11:31 am PST (Friday) From: Stewart.PA Subject: Intel 8088 answers To: VoiceProject^.pa From Ron Saltzman, 408-987-8086 8088-2, $8 differential for quantity one (e.q. $17) $5 differential at quantity 100 and up, available now. Data sheet in mail When will fast 2764s be available? now. There are 200, 250, 300, and 450 ns parts Intel claims to have 80% market for 64 K EPROMs, 2564 makers are switching over to 2765 pinout. Does Intel make a 2564? no -Larry&Severo *start* 00305 00024 US Date: 19 Feb. 1982 3:44 pm PST (Friday) From: Stewart.PA Subject: More 8088 answers To: Voiceproject^.pa "God couldn't stop an 8088 word reference once it gets going." 8088 does increment OUT word instruction For ready considerations, see Ap note 67 in the 8086 users manual. *start* 00526 00024 US Date: 19 Feb. 1982 3:39 pm PST (Friday) From: ornstein.PA Subject: Intel 2764 Prom Blowing To: VoiceProject^ Reply-To: ornstein McCreight and I determined we need the following Pro-Log parts: 1. Personality Module PM9074 $580 2. Pin-Out Adapter PA28-2 $190 3. Configurator 8K x 8 (EH) $35 Also Tom Chang will have to modify the blowing program. I spoke to him and he's willing and able to do so (over next few weeks) and we're to let him know when we get the parts so he can test it. Severo *start* 00496 00024 US Date: 17 Feb. 
1982 2:34 pm PST (Wednesday) From: Stewart.PA Subject: Re: PreRelease of NetTopology In-reply-to: Hoffarth.wbst's message of 16 Feb. 1982 5:52 pm EST (Tuesday) To: Hoffarth.wbst cc: VoiceProject^.pa Parc: Do you want Nest-Net (173##) on the internet topology. Might as well. Nest-Net is a 1.5 Mb (Half speed 3 Mb) Ethernet in Building 35 at PARC. It has two hosts, no servers, and connects to Net 3 via an Alto I gateway named Dahlia2 (Host 3#112#). *start* 01054 00024 US Date: 19 Feb. 1982 12:01 pm PST (Friday) From: ornstein.PA Subject: Programming Intel 2764's To: McCreight cc: Stewart, ornstein Larry and I have determined that we need to use 250 ns. access 64K EPROMS for our second processor. TI will not make a fast (faster than the unacceptable 450 ns. current version) 2564 until next year. Intel claims that 250 ns. 2764's are available now. So we need to be able to program them and I need to know just what to order - and anything else that must be done to enable us to program them. Help! Last time we talked, I understood we needed: 1. a 2764 Personality card (costing about $550 as I recall). Is there a part number I can find out from somehwere? Is it sensitive to which speed chips we get? 2. Also I gather we need two other items: a) a Configurator b) a pin-out-adapter How do I find out specifically what to order and from whom? Is that all, and if not, what else? What else will have to be done to the driving software to make it all work? Advice requested. Severo *start* 05980 00024 US Date: 26 Feb. 1982 6:14 pm PST (Friday) From: Stewart.PA Subject: 8203 timing To: VoiceProject^ cc: Stewart The maximum operating frequency of the 8203 is 25 MHz. An 8 MHz 8088 needs a 24 MHz crystal. A convenient multiple of 2.94 Mb (Ethernet) is 23.53 MHz. Let's suppose we run everything but the Audio off 23.53 MHz. The CPUs will run at 7.843 MHz... 
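The clock choice above can be checked with a few lines of arithmetic. This is a sketch only; the divide-by-3 between crystal and CPU clock is taken from the "8 MHz 8088 needs a 24 MHz crystal" remark, and the variable names are illustrative.

```python
CRYSTAL_MHZ = 23.53      # proposed common oscillator
ETHERNET_MBPS = 2.94     # Ethernet bit rate quoted in the notes
MAX_8203_MHZ = 25.0      # 8203 maximum operating frequency

# The 8203 runs directly from the oscillator; the 8088 clock is the
# oscillator divided by 3 (24 MHz crystal -> 8 MHz CPU).
tp_ns = 1000.0 / CRYSTAL_MHZ            # 8203 clock period Tp
cpu_mhz = CRYSTAL_MHZ / 3               # 8088 clock frequency
tclcl_ns = 3 * tp_ns                    # 8088 clock period
ethernet_multiple = CRYSTAL_MHZ / ETHERNET_MBPS

print(f"Tp    = {tp_ns:.1f} ns")
print(f"CPU   = {cpu_mhz:.3f} MHz (Tclcl = {tclcl_ns:.1f} ns)")
print(f"ratio = {ethernet_multiple:.2f} x Ethernet rate")
print(f"within 8203 limit: {CRYSTAL_MHZ <= MAX_8203_MHZ}")
```

The oscillator comes out at very nearly 8 times the Ethernet rate, and the rounded Tp of 42.5 ns is the figure the timing numbers that follow are built on.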
Tp    Clock period                            42.5 ns
Trc   memory cycle             10Tp-30        395
Tref  refresh period           264Tp-288Tp    11.22 us to 12.24 us (<15)
Tcr   Rd', Wr' to RAS'    min  Tp+30          72.5
                          max  2Tp+70         155
Tcc   Rd', Wr' to CAS'    min  3Tp+25         152.5
                          max  4Tp+85         255
Tca   Rd', Wr' to SAck    max  2Tp+47         132
Tcx   CAS' to XAck        min  5Tp-25         187.5
                          max  5Tp+20         232.5
When refresh precedes:
Tcs   CAS' to SAck        min  5Tp-25         187.5
                          max  5Tp+40         252.5

Reads:
TI 150 ns RAMs with 100 ns CAS access: Trldv = 252.5 min, 355 max
Mot 150 ns RAMs with 75 ns CAS access: Trldv = 227.5 min, 330 max

Writes:
TI 150 ns RAMs with 100 ns CAS access have 60 ns data hold time
Mot 150 ns RAMs with 75 ns CAS access have 35 ns data hold time

For reads, data buffers and delays in the control signals hurt.

READS:
Trldv for 7.843 MHz 8088, no accounting for latch/buffer delays
2Tclcl-Tclrl-Tdvcl = 255-100-20 = 135
With 1 wait state: 262.5
With 2 wait states: 390
Alternate config. (Early read)
2Tclcl+Tchcl-Tchll-Tdvcl-Logic = 255+42.5-55-20-18 = 204.5
With 1 wait state: 332
With 2 wait states: 459.5
Must use 2 wait states even with Early Read, so why use it?

WRITES:
This ties Wr' line to Wr' pin of 8203 (No ALE triggered write)
Beginning of writes: The data must be there by when the RAMs need it.
Twldv for 7.843 MHz 8088 = Tcldv-Tclctv = [10..70]-[10..60] = [-50..60]
Data buffers make things worse (lengthen Twldv)
Wr' buffers make things better (delay Wr')
Twldv must be less than Tcc min - RAM setup = 152.5 for both TI and MOT. It is.
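The read budget above can be recomputed mechanically. The following is a rough sketch, assuming the same figures quoted in the notes (Tp = 42.5 ns, no latch or buffer allowances); the function names are made up for illustration.

```python
TP = 42.5            # 8203 clock period, ns
TCLCL = 3 * TP       # 8088 clock period at 7.843 MHz, ns (127.5)

TCC_MAX = 4 * TP + 85    # 8203: Rd'/Wr' to CAS', worst case (255)

# 8088 side: time from Rd' low until data must be valid, zero wait states.
NORMAL_READ = 2 * TCLCL - 100 - 20             # 2Tclcl - Tclrl - Tdvcl
EARLY_READ = 2 * TCLCL + 42.5 - 55 - 20 - 18   # ALE-triggered, 18 ns logic


def ram_data_valid(cas_access_ns: float) -> float:
    """Worst-case Rd' low to data valid as delivered by 8203 plus RAM."""
    return TCC_MAX + cas_access_ns


def budget(base_ns: float, wait_states: int) -> float:
    """Time the 8088 allows, each wait state adding one clock period."""
    return base_ns + wait_states * TCLCL


def wait_states_needed(base_ns: float, required_ns: float) -> int:
    """Smallest number of wait states whose budget covers the requirement."""
    n = 0
    while budget(base_ns, n) < required_ns:
        n += 1
    return n


ti_required = ram_data_valid(100)   # TI parts, 100 ns CAS access
for name, base in (("normal", NORMAL_READ), ("early", EARLY_READ)):
    n = wait_states_needed(base, ti_required)
    print(f"{name} read: {n} wait states for TI RAMs ({ti_required} ns)")
```

With the TI parts both the normal and the early read come out at 2 wait states, in line with the remark that Early Read does not save a wait state here.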
End of writes: The data must still be present when the RAMs grab it.
Twhadv for 7.843 MHz 8088, no accounting for latch/buffer delays
Data buffers help by keeping the data longer
Wr' buffers hurt by delaying Wr'
2Tclcl+Tclch+Tchdx-Tclctv = 255+85.4+10-70 = 280.4
With 1 wait state: 407.9
With 2 wait states: 535.4
These times must be greater than Tcc max for 8203 + RAM hold time
TI 150 ns RAMs with 100 ns CAS access have 60 ns data hold time
Mot 150 ns RAMs with 75 ns CAS access have 35 ns data hold time
Totals: TI = 255+60 = 315
        MOT = 255+35 = 290
Looks like 1 wait state is needed.

READY
Ready line outputs: XAck will follow Rd' or Wr' by Tcc+Tcx, so
min = 152.5 + 187.5 = 340
max = 255 + 232.5 = 487.5

Writes: Wr' starts Tclctv (max 70) after the falling edge of clock at the end of T1. To ensure 1 wait state, the ready input to the 8284A must be valid 107.5 ns later, 35 ns before the rising edge of clock in T3. To ensure 2 wait states, the ready input to the 8284A must be valid 235 ns after Wr'. To ensure 3 wait states, ready at the 8284A must be valid 362.5 ns after Wr'.

wait states   Ready at 8284A after Wr'
0             -20   (50 ns after beginning of T2 or 117.5 ns after the addresses valid)
1             107.5
2             235
3             362.5
4             490

So using XAck, with worst case 8088 Wr' timing (and no buffer allowances) would cause usually 4 and sometimes 3 wait states.

Reads: A read could begin (Tclrv) as late as 100 ns after the falling edge of clock at the end of T1. The corresponding table for read therefore has 30 ns smaller numbers.

wait states   Ready at 8284A after Rd'
0             -20   (50 ns after beginning of T2 or 117.5 ns after the addresses valid)
1             77.5
2             205
3             332.5
4             460
5             587.5

So using XAck, with worst case 8088 Rd' timing would cause usually 4 and sometimes 5 wait states.

SACK
The data sheet lists Tca max as 132 ns (for our system). The minimum isn't shown. Ron Saltzman (Intel) says that the minimum is Tp+20, more or less. SACK roughly corresponds to RAS! SAck precedes CAS by 2Tp + 38 at most, = 123 ns.
Add to this the CAS access times, and SAck precedes data by
TI 150 ns RAMs:   223 ns
MOT 150 ns RAMs:  198 ns
Assume we use the slow ones (TI) for now.

Reference edge, end of T1                       0
Earliest Rd', Wr'                               10
Worst ALE                                       12.5
Early read starts (worst ALE)                   30.5   (18 ns gate delay)
Latest Wr'                                      70
Earliest SAck (best 8088, best 8203)            72.5   (norm. read, write)
Rising clock, in T2                             85
Earliest SAck (worst ALE, best 8203)            93     (early read)
Latest Rd'                                      100
Falling clock, end T2                           127.5
Latest SAck for 1 wait state                    139.5
Latest SAck (early read, worst 8203)            172.5  (early read)
Ready valid at 8284A for 1 wait state           177.5
Latest write SAck (worst 8088, worst 8203)      202    (normal write)
Rising clock, in T3                             212.5
Latest read SAck (worst 8088, worst 8203)       232    (normal read)
Falling clock, end T3                           255
Latest SAck for 2 wait states                   267
Ready valid at 8088 for 1 wait state            270
Ready valid at 8284A for 2 wait states          305
Rising clock, in TW1 (t4)                       340
Data valid, 1 wait state read                   362.5
Falling clock, end TW1 (t4)                     382.5
Ready valid at 8088 for 2 wait state            397.5
Ready valid at 8284A for 3 wait states          432.5
Rising clock, in TW2 (t4)                       467.5
Data valid, 2 wait state read                   490
Falling clock, end TW2 (t4)                     510
Ready valid at 8088 for 3 wait state            525
Ready valid at 8284A for 4 wait states          560
Rising clock, in T4                             595
Data valid, 3 wait state read                   617.5
Falling clock, end T4                           637.5

My best guess is that we should use Early read and double synchronize SAck using SysClock' and feed it directly to the 8088 ready input, bypassing the 8284A. The second stage can use the 8284 internal flop however... SAck will nominally catch the SysClock' at milepost 255 and again at 382.5, arriving at the 8088 on time for 2 wait states. It looks like we are guaranteed that all SAcks occur before 255 (The last one is at 232) so, if the 8203 is idle when the 8088 wants it, there will be 2 or fewer wait states. Those SAcks which happen before 127.5 will only incur one wait state!
*start* 07426 00024 US Date: 25 Feb.
1982 3:15 pm PST (Thursday) From: Stewart.PA Subject: Lark main memory control To: VoiceProject^ Lark main memory consists of 8 64K dynamic rams to hold the bits, an Am2964B dynamic RAM controller to generate the address, a delay line to generate RAS and CAS timing, and some small scale logic to sort out the various kinds of cycles. We have ordered RAMs with 120 ns RAS access time, but suspect that 150 ns chips would work in all cases. Timings below are worked out for the 150 ns chips. A DRAM cycle consists of RAS, which must last at least 150 ns. XXX ns after RAS, CAS starts and lasts until slightly after RAS (They can end at the same time). If the WriteEnable RAM pin is activated at the start of CAS, the cycle is a write. In this case, the RAM output remains in tri-state and data at the input is latched at the beginning of CAS. During a read (no WriteEnable), the chip output leaves tristate after the start of CAS and remains enabled until the end of CAS. A refresh cycle does not need a CAS, but may have one, in which case it is a normal read as well. Reads: Main CPU and SLC These devices get special treatment on reads. When IO is low (a memory cycle), and PDT is low (a read), then the signal EarlyRead' is asserted at the end of ALE'. This serves to get the memory cycle started a bit early to compensate for the fast memory requirements of the SLC running at 2.94 Mbps. When the actual Rd' signal comes along, it both takes over the job of providing RAS and clears the EarlyRead flipflop. PDT and IO have the same timing as addresses, so they can be safely latched with ALE. Whenever there is an ALE, there is guaranteed to be a cycle coming, and PDT will always accurately reflect whether it will be a read or a write. We want the system to work with a 2.94 Mbps SLC. The timing requirements for memory read are: SLC: Tlc1 NALE end to NSTROBE 45 Trd NRD low to data valid 105 The formula is Trd = 3/2T - K6, where K6 <= 150 ns.
Frank Nelson mentioned that if it's gonna work at 12 MHz, then K6 is probably smaller, but probably not as small as 75. So from the end of ALE to read data required is at least 150 ns and possibly as much as 200 ns.

8088
End of ALE is <= 85 ns after rising edge of SysClock in T1. Data must be valid 30 ns before the falling edge of the clock at the end of T3. The total time is all of T3 less 30, plus all of T2, plus 1/3 of T1 less 85 = 351 ns. Clearly the 8088 is no problem.

Note: We can use the (faster) 8287 and invert MD by inverting the EPROM data.

The logic chain between ALE' and data on the AD bus is:
              Typ    Max
S74           6      9
S20           3      4.5
S02           3.5    5.5
2964B         14     20
Chips         120    120
8287 xcvr     12     22     200 pf on AD bus
              158.5  181
8286 xcvr     22     32     200 pf on AD bus
              168.5  191

The consequence of this logic is that the 3 Mbps SLC will work only if K6 is somewhat smaller than specified for the 1.5 Mbps part. We should ask MEC I guess. It may also be possible to bypass the 2964 for RAS & CAS, using it only for address multiplexing.

Slave CPU
The slave CPU starts memory reads by asserting Rd' after acquiring the main busses. There are sufficient wait states in the SlaveDAck path to absorb the delay between Rd' and the Slave data bus. The logic chain between GSlaveDAck and data on the SAD bus is:
              Typ    Max
LS125         15     25
S20           3      4.5
S02           3.5    5.5
2964B         14     20
Chips         150    150
8282 latch    22     32     200 pf on AD bus
2764          250    250
Tot:          457.5  487

There are two D-flop stages between GSlaveDAck and SlaveReady, and data is sampled a full cycle after that less 30 ns for an available budget of 570 ns.

DMA (DMA reads memory and writes into Encryption chip)
8237A-5 specs
TDCL    clock high to Wr' or Rd' low   max 190
TDCTR   clock high to Rd' high         max 190
TDCTW   clock high to Wr' high         max 130
The DMA controller must be set up with normal reads and extended writes. In this mode, MRd' and IOWr' are asserted at the same time, and for about 400 ns each. Memory data will be available 150 ns after MRd' starts.
Data is required at the WD2001 130 ns before the end of Wr'. Thus 280 ns of the available 340 is used up. These totals must be less than 400 ns, the nominal Rd' and Wr' widths.

                Typ      Max
TDCL            190      190
LS125            15       25
S20               3        4.5
S02               3.5      5.5
2964B            14       20
Chips           150      150
8286 xcvr        22       32      200 pf on AD bus
2001 setup      130      130
TDCTW          -130     -130      (negative because AFTER 2nd clock)
Tot:            397.5    427

Chips           150      150
8287 xcvr        12       22      200 pf on AD bus
Tot:            387.5    417

Chips           120      120
8286 xcvr        22       32      200 pf on AD bus
Tot:            367.5    397

Chips           120      120
8287 xcvr        12       22      200 pf on AD bus
Tot:            357.5    387

The DMA controller cannot use normal writes because the minimum write pulse width from the 8237A-5 (140 ns) is too short for the minimum write pulse required by the WD2001 (175 ns).

Writes:

Since data is required at the memory chips when CAS starts (setup time = 0), CAS is delayed for writes by switching taps on the delay line. For reads, CAS follows 40 ns behind RAS. For writes, the delay is 100 ns.

Main CPU

For an 8088 (5 MHz), the leading edge of Wr' follows the rising edge of SysClock by 10 to 110 ns. Data is valid after the same edge by 15-110 ns. One assumes these delays track, but even if they don't, the timing still works. There is one bus transceiver between AD and MD, and several gates between PWr' and the RAS pin of the chips, plus the 100 ns delay. We'll count delays to the control signal as negative and delays to the data as positive. The answer should come out negative: it is the amount by which the data is late at the chips. I've reversed the 8286 typical and worst cases because they work the other way...

                Typ      Max
LS125           -15      -25
S20              -3       -4.5
S02              -3.5     -5.5
2964B           -14      -20
CAS            -100     -100
8286 xcvr        32       22      200 pf on MD bus
8088 data       100      100      assumes worst data, best Wr'
Total:           -3.5    -33

SLC

We have been told by El Segundo that the SLC starts to drive the data lines at the same time Wr' is asserted. The SLC writes will work unless the differential skew between the data and Wr' exceeds 100 ns.
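
As a cross-check on the Main CPU write budget tabulated above, the two columns can simply be re-added. This is only a sketch: the delay figures are the ones in the table, while the dictionary layout and the names MAIN_CPU_WRITE and budget_totals are made up for illustration.

```python
# Re-add the Main CPU write budget (all values in ns).
# Control-path delays are negative, data-path delays positive, so a
# negative total means the data reaches the RAM chips before CAS starts.
# The table layout and names here are illustrative assumptions.
MAIN_CPU_WRITE = {
    "LS125":     (-15.0, -25.0),
    "S20":       (-3.0, -4.5),
    "S02":       (-3.5, -5.5),
    "2964B":     (-14.0, -20.0),
    "CAS tap":   (-100.0, -100.0),
    "8286 xcvr": (32.0, 22.0),    # typ/max reversed, as in the table
    "8088 data": (100.0, 100.0),  # worst data, best Wr'
}


def budget_totals(rows):
    """Return (typical, max) column sums for a delay budget."""
    typ = sum(t for t, _ in rows.values())
    worst = sum(m for _, m in rows.values())
    return typ, worst


typ_total, max_total = budget_totals(MAIN_CPU_WRITE)
print("Typ:", typ_total, "Max:", max_total)
```

Both sums come out negative, which is the condition the memo asks for. The same helper could be pointed at the other budgets in this note by swapping in their rows.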
Slave CPU

The slave CPU, when requesting a shared memory write cycle, typically asserts SlaveWr' and the data (on the slave side of the MD drivers) and then waits for GSlaveDAck. This signal simultaneously enables both Wr' and MD' so again, the skew will not exceed 100 ns.

DMA (DMA reads from Encryption chip and writes to memory)

The 8237A-5 must be operated with normal reads and normal (short) writes. Because the minimum width MWr' from the 8237A-5 is 140 ns, and this signal directly generates RAS, and the minimum RAS for 150 ns chips is 150 ns, we have extended RAS in this case by ORing in the 10 ns delay line tap with RAS, gated by EncWrDAck.

The WD2001 delivers data 220 ns after Rd' is activated. The data must get to the memory 100 ns after Wr' is activated. By using the 8237A-5 modes described above, the Wr' pulse is delayed 200 ns after the Rd' pulse. Thus the data from the WD2001 arrives 20 ns after RAS, but is not needed until 100 ns after RAS and all is well. The end of RAS is extended slightly to meet the specs for 150 ns memory chips, but that is a side issue.

                Typ      Max
8237A-5        -200     -200      due to delay of control signal
LS125           -15      -25
S20              -3       -4.5
S02              -3.5     -5.5
2964B           -14      -20
CAS            -100     -100
WD2001          220      220
8286 xcvr        32       22      200 pf on MD bus
Tot:            -83.5   -113

*start* 03471 00024 US Date: 25 Feb.
1982 3:50 pm PST (Thursday)
From: Stewart.PA
Subject: Chip counts for Lark
To: Voiceproject^.pa

40 pin packages:
    8088          1   Main CPU
    8237A-5       1   DMA controller
    8274          3   RS-232 controller
    9513          4   Programmable Timer
    SLC           5   Ethernet controller
    Am2964B       6   Ram address select and refresh counter
    8088          9   Slave CPU
Total: 7

28 pin packages:
    WD2001F-30    4   Encryption
    8259A-2       4   Interrupts
    I2764         7   Main CPU EPROM
    I2764         9   Slave CPU EPROM (250 ns)
Total: 4

24 pin packages:

20 pin packages:
    8282          1   Main CPU and SLC address latch
    8282          1   8237A-5 address latch
    8286          1   Memory data bus transceiver
    8287          2   Offboard bus transceiver
    LS240         2   Offboard control line driver
    LS240         2   Offboard control line driver/receiver
    LS374         7   Onboard PIO
    8282          9   Shared memory address (high)
    8282          9   Shared memory address (low)
    8282          9   Shared memory data
    8282          9   Local address latch
    LS244         9   Shared Write Data
    LS244         9   Shared Zero
    LS299         9   Input shift register
Total: 14

18 pin packages:
    8284A         1   Clock generator
Total: 1

16 pin packages:
    N138          2   IO Address decoding
    32x8 rom      2   Main system control signals
    N123          3   Watchdog timer and reset pulser
    PLAT          3   Timing components for N123
    PLAT          5   Ethernet terminators and pullups
    LS157         6   Cascade DMA arbiter
    S158          6   CASIn' generator
    LS163         6   Refresh burst counter
    64K RAM       8   0
    64K RAM       8   1
    64K RAM       8   2
    64K RAM       8   3
    64K RAM       8   4
    64K RAM       8   5
    64K RAM       8   6
    64K RAM       8   7
    LS166         9   Output shift register
    32x8 rom     10   Slave control ROM
    256x4 rom    10   Slave address translation ROM
    LS157        10   Controlled by Mode flop
Total: 20

14 pin packages:
    15MHz         1   8088 clock
    12.288MHz     4   Audio clock
    11.765MHz     5   Ethernet clock
    100ns         6   Memory control delay line
    1488          3   RS-232 driver
    1489          3   RS-232 receiver
Subtotal: 6

LS00 sections
    10   SlaveReq' OR SynGSlaveDAck'
Subtotal: 1

LS02 sections
     6   RefDAck' AND RefReq'
    10   ShareDLE
    10   SelOSR AND SlaveRd'
    10   OSRClk'
Subtotal: 1

S02 sections
     6   RASIn'
     6   IO' AND PDT
Subtotal: 1

LS04 sections
     1   ALE clock for main cpu address latch
     1   AEN' output enable for DMA address latch
     2   SelROM'
     2   SLCDAck
     2   Reset'
     2   SysClock'
     3   SIOInt to interrupt controller
     5   SelSLC
     6   EncWrDAck
     6   CascadeReq'
     9   SA13'
    10   SynTSN'
    10   SlaveALE'
Subtotal: 3 (use 20 pin bus inverter?)

LS08 sections
     5   EtherIn
     6   Tap20 AND EncWrDAck
     6   RefCy AND SynTSN'
    10   SLA00 AND SelShare'
    10   AudShClk
Subtotal: 2

LS11 sections
     2   RelRAM'
     3   Reset trigger
     4   WD2001 pin A0
     4   WD2001 pin CS'
     6   CascadeReq' AND SynCasDAck' AND SysClk'
Subtotal: 2

S20 sections
     6   RAS (start memory cycle)
Subtotal: 1

LS32 sections
     2   EnOffBoardBus'
     3   KickWDT OR Reset
     7   SelPIO' AND Wr'
    10   GSlaveDAck'
    10   EnShareWr'
    10   EnShareZero'
Subtotal: 2

LS38 sections
     5   XmtData' (Ethernet)
Subtotal: 1

LS74 sections
     3   ManReset SR flop (pushbutton debounce)
     4   EncInt
     5   SLCClk
     6   EarlyRd'
     6   RefReq
    10   LinTblSel
    10   Mode
    10   pre SLA00
Subtotal: 4

LS125 sections
     2   Rd'
     2   Rd'
     2   Wr'
     2   Wr'
     9   SlaveRd'
    10   Rd'
    10   Wr'
Subtotal: 2

LS393 sections
     4   T1Clk and multiples (12.288 MHz crystal)
     6   Cascade DMA arbiter
Subtotal: 1

8T09 sections
     1   PALE -> ALE'
     5   SLC pin R/W' -> PDT
     5   SLCReady'
Subtotal: 1

D-flop sections
    SysClk'    1   Hold synchronizer
    SysClk'    4   SynTSN synchronizer
    SysClk'    4   same
    SysClk'    6   SynCasDAck'
    SysClk'   10   SlaveReady
    SysClk'   10   SynGSlaveDAck'
    SysClk'   10   SynSlaveReq
Subtotal: 1

Total: 29

*start* 01629 00024 US
Date: 26 Feb. 1982 3:10 pm PST (Friday)
From: ornstein.PA
Subject: Timing of Ready for a 8 MHz 8088
To: Stewart
cc: ornstein

Here is my take. Doesn't agree?

- - - - - - - - -

Rdy to the 8284 must become TRUE at least 35 ns. before the LEADING edge of some CLK - say Clk n. (The 8088 timing diagrams make it look like it's before the TRAILING edge but the 8284, which presumably knows the truth, shows it before the LEADING edge.)

Assuming you make it in time, the 8284 then presents Ready to the 8088 on the TRAILING edge of Clk n.

So far as the 8088 is concerned, it must see Ready true at least 68 ns. before the leading edge of a Clk. (If it comes through the 8284, it will be true two thirds of a 125 ns. period [=83 ns.] before the leading edge of Clk n+1 and so will be fine).
In this case, the 8088 will take the data on the TRAILING edge of Clk n+1. For this to be reasonable, the Data must have been valid at least 30 ns. before that (Clk n+1) Trailing edge.

From all this I conclude that the total Minimum delay from the time you say Rdy to the 8284 until the 8088 actually grabs the data is: 35 + 125 + 42 = 202 ns.

    35 = lead time for Rdy to leading edge Clk n
    125 + 42 = time from LEADING edge of Clk n to TRAILING edge of Clk n+1.

The data doesn't have to be valid until 202 - 30 = 172 ns. after Rdy.

CONCLUSIONS:

1. It's OK to say Rdy to the 8284 anywhere up to 172 ns. before you really have the data in hand.

2. It's OK to say Ready to the 8088 anywhere up to 80 ns. (68 + 42 - 30) before you really have the data in hand.

So far I can't reach Frank Nelson re. the SLC.

*start* 01277 00024 US
Date: 26 Feb. 1982 3:47 pm PST (Friday)
From: ornstein.PA
Subject: Timing of Ready for the SLC
To: Stewart
cc: ornstein

I just got through to Frank Nelson and explained to him our needs. He says so far as he knows no-one has really used Ready INTO the SLC - but it should work...... Presumably Ready is synchronized within the chip - but it wasn't clear whether with just one or two flops..... He didn't have all the numbers but will look them up and either he or Jim Okazaki will call us back - probably not 'til next week (it being Friday afternoon and other things to do).

There is a bit of a problem since inside the SLC particular phases of our "12 MHz" clock are used for sampling various things and from the outside, we can't tell WHICH subcycle is being used for what. They CAN, of course, tell us with respect to CLKOUT (and/or they can give us worst case numbers). He said of course it's OK to present Ready simultaneous with the data - but understood that we might want to try to play games with an early ready. I also asked him about writes and he understood the possible problems that could arise with early ready there also.

More anon.
One of them will probably call you next week - I impressed on him that sooner was better.

S.

*start* 01586 00024 US
Date: 26 Feb. 1982 7:17 pm PST (Friday)
From: Stewart.PA
Subject: DMA Ready
To: VoiceProject^.pa
cc: Stewart

Call the 8237A-5 clock PClock.

    Ready setup to falling edge of PClock is 75 ns
    Ready hold after falling edge of PClock is 20 ns

If PClock is the 1/2 speed output of the 8284A, then the transitions of PClock are delayed up to 22 ns after falling edges of SysClock. If Ready is high a setup delay before a falling edge of PClock, then the Rd' and Wr' signals will be removed following the next rising edge of PClock. In our proposed system, PClock is low and high for 255 ns each.

On Reads, SAck from the 8203 precedes valid data by at most 223 ns. I propose synchronizing SAck with PClock, then again with PClock'. These signals are 255 ns apart. Data (for a memory read) will be valid 223 ns after PClock, leaving over 287 ns for the write data setup time required by the WD2001.

For writes (from the WD2001), we will use the normal read and compressed write facility of the 8237A-5. That will guarantee that WD2001 data (with its slow access time) is present at the memory system before CAS starts. The end of this DMA write is not too critical. We could change the first synchronizer to use SysClock (which occurs about 60 ns ahead of PClock') to shorten it, but it doesn't seem important since it only saves potentially 200 ns on 1/4 of all DMA cycles.

My feeling is that 1 IO wait state is adequate to permit the 8088 to talk to the 8237A-5 via programmed IO. Since IO cycles happen so rarely, 2 wait states wouldn't hurt much.

-Larry

*start* 00765 00024 US Date: 26 Feb.
1982 7:34 pm PST (Friday)
From: Stewart.PA
Subject: SLC Ready
To: VoiceProject^.pa
cc: Stewart

    The SLC ClockOut is high for 340 ns and low for 340 ns (1.47 Mbps)   T = 680
    The SLC ClockOut is high for 170 ns and low for 170 ns (2.94 Mbps)   T = 340

    Ready setup to rising edge of ClockOut is T/4 - 100 = (70 @ 1.5 mb)(-15 @ 3)
    Ready hold after rising edge of ClockOut is 0
    Ready valid (e.g. low) by 5/4T - 140 after Rd' or Wr' = (710 @ 1.5)(285 @ 3)

I propose synchronizing SAck with ClockOut', then again with ClockOut. These signals are (340 @ 1.5)(170 @ 3) ns apart. There are at least two cycles of ClockOut between assertions of Wr' or Rd' so there is time for NotReady to get through the synchronizers.

-Larry

*start* 01097 00024 US
Date: 22 Feb. 1982 12:26 pm PST (Monday)
From: ornstein.PA
Subject: Prom Blowing Info
To: VoiceProject^
Reply-To: ornstein

These are some notes from my whiteboard.

There are three 64K EProms of interest - a 24-pin Motorola job, a 28-pin TI-2564, and a 28-pin Intel-2764. McCreight has been using the Motorola chip in his home project(s), but (for reasons I don't recall) it isn't the right thing for us. The TI2564 ($15 in reasonable quantities soon) is now available only in the 450 ns. speed. They say next year they expect to have a faster one. We need a faster one now. Intel already has 2764's in speed ranges down to 200 ns. We are ordering the 250 ns. ones for now. I don't know the price of those at that speed.

The Motorola and TI chips take the same Personality card for the Prom Blower and we have that card already. The Intel chips take a different one and we've just ordered it. All three take different configurators and pin out adapters and we've ordered what we need for the Intel 2764. So presumably we are all set (or will be soon) for now.

Severo

*start* 01182 00024 US
Date: 25-Feb-82 13:59:29 PST (Thursday)
From: Murray.PA
Subject: Crosstalk, Nancy Smith
To: Stewart, Swinehart

------------------------------
Date: 25 Feb.
1982 1:48 pm PST (Thursday)
From: Ladner.PA
Subject: Crosstalk, Nancy Smith
To: SDD^, OSBU-PA^.es
cc: Vincent, Duvall, Spinrad
Reply-To: Ladner

In Palo Alto: March 2 at 1:30 PM in SDD Conference Room 200M
In El Segundo: (will be announced soon)

Speaker: Nancy Smith
Topic: SDD's "Voice Box" and Digital Speech Processing

Abstract: SDD's voice and telephone management project involves a small part of the field of digital speech processing. Nancy will briefly describe some aspects of digital speech processing, and define the scope of SDD's initial product in this context. She will describe the SDD voice product, the project plan, and the technical issues involved. Finally, she will discuss techniques of speech compression, and evaluate potential applications in the Star environment. She will play a tape of speech processed by several different compression schemes at bit rates ranging from 2.4 kb/s to 32 kb/s.

----------------------------------------------------------------

*start* 00368 00024 US
Date: 3 March 1982 1:50 pm PST (Wednesday)
From: Stewart.PA
Subject: New B4 prints
To: VoiceProject^.pa
cc: Stewart

Simon Lau sent me a new set of schematics for the Belleville box. It's occurred to me that the newest Lark design for the CPU board has everything needed to talk either to our own Analog stuff or to Belleville's box.

-Larry

*start* 00439 00024 US
Date: 3 March 1982 2:13 pm PST (Wednesday)
From: Swinehart.PA
Subject: BCPL to C converter
To: Duvall
cc: Stewart, Swinehart

I believe that your converter would help us quite a bit. Would you care to state the terms under which you'd part with a copy for use by our fair voice project? How about a free sample conversion of one or two of my files to help us evaluate its capabilities?

regards,
Dan Swinehart

*start* 02013 00024 US
Date: 4-Mar-82 16:15:57 PST (Thursday)
From: verplank.pa
Subject: Voice volumes
To: voiceinterest^
Reply-To: verplank
cc: verplank

I'm looking for projections of voice mail volumes.
Anyone out there have better projections than these:

from I. Richer et al., "Office communications and the Digital PBX" in COMPUTER NETWORKS 5 (1981).

"...the voice system would be used only for short messages to individuals who did not answer their telephones or whose telephones were busy. In addition, the voice system would be used for some of the text messages currently transmitted via the electronic mail system. On the basis of statistics gathered by BBN's computerized PBX, we estimate that approximately 1500 call attempts (including internal and external calls) either receive no answer, or receive a busy signal, or are picked up by BBN's Message Center. If we assume that about two thirds of these are either repeated calls or would not be forwarded, then 500 messages per day would be stored by the voice mail system, or about one message per person per day, a reasonable quantity. ...

"If we assume that each message lasts about 30 seconds of real time, that each message flows once into storage and once to the recipient, and that the peak traffic is twice the average, then the peak utilization and throughput are 2.1 erlangs and 130 kbps, respectively, for this application. If this traffic were consolidated with the other traffic, three additional voice channels would be required for circuit switching or two additional channels for packet switching.

"However now consider the storage requirements for these messages. According to the above data, approximately 250 minutes worth of messages are stored each day if the digitization rate is 64 kbps, so that approximately 100 megabytes of storage are needed! Clearly, standard pulse code modulation (PCM), which produces this high bit rate, is an inappropriate technology for a voice mail system in a moderate size office. ..."
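
The figures in the quoted passage can be re-derived with a few lines of arithmetic. This is a rough sketch only: the 8-hour day used to turn daily traffic into erlangs is an assumption (the quote does not give the averaging period), and all of the names are made up for illustration.

```python
# Re-derive the voice mail figures quoted above.
# Inputs from the quote: 500 messages/day, 30 s each, 64 kbps PCM,
# each message carried twice (into storage, out to the recipient),
# peak traffic = 2 x average.
# ASSUMPTION (not in the quote): traffic is averaged over an 8-hour day.
MESSAGES_PER_DAY = 500
SECONDS_PER_MESSAGE = 30
PCM_BITS_PER_SECOND = 64_000
PASSES_PER_MESSAGE = 2
PEAK_TO_AVERAGE = 2
WORKDAY_SECONDS = 8 * 3600

stored_seconds = MESSAGES_PER_DAY * SECONDS_PER_MESSAGE
stored_minutes = stored_seconds / 60
stored_bytes = stored_seconds * PCM_BITS_PER_SECOND // 8

average_erlangs = stored_seconds * PASSES_PER_MESSAGE / WORKDAY_SECONDS
peak_erlangs = average_erlangs * PEAK_TO_AVERAGE
peak_kbps = peak_erlangs * PCM_BITS_PER_SECOND / 1000

print(f"stored per day: {stored_minutes:.0f} min, {stored_bytes / 1e6:.0f} million bytes")
print(f"peak load: {peak_erlangs:.1f} erlangs, {peak_kbps:.0f} kbps")
```

Under that 8-hour assumption the peak load lands on the quoted 2.1 erlangs and roughly 130 kbps. Straight multiplication gives about 120 million bytes per day of stored speech, the same order as the "approximately 100 megabytes" in the quote.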
*start* 08464 00024 US
Date: 8 March 1982 1:06 pm PST (Monday)
From: Stewart.PA
Subject: Lark Ready Nets
To: VoiceProject^

Lark main memory consists of 8 64K dynamic RAMs to hold the bits, an I8203 dynamic RAM controller to generate the address, to generate RAS and CAS timing, and to do refresh. There is some small scale logic to sort out the various kinds of cycles and to synchronize the 8203 acknowledge signals.

Because the 8203 may be doing a refresh cycle when someone asserts Rd' or Wr', all bus masters must be prepared for wait states.

There are the following kinds of cycles:

Refresh -- automatically generated by the 8203
Reads:
    Main 8088
    SLC
    Slave 8088
    DMA controlled read from memory (to WD2001)
Writes:
    Main 8088
    SLC
    Slave 8088
    DMA controlled write to memory (from WD2001)

There are two acknowledge signals that come out of the 8203:

SAck' -- System Acknowledge, roughly coincident with RAS. SAck precedes CAS by no more than 123 ns. Using RAMs with CAS access times of 100 ns (TI 150 ns parts), SAck' will precede valid data by no more than 223 ns.

XAck' -- Transfer Acknowledge, data valid precedes XAck'.

Most bus masters use SAck' as an early acknowledge signal. SAck is synchronized to the clock of the appropriate bus master, and by the time that is done and the master looks at the data bus, the data will be valid. (This is what happens on Reads.) For Writes, the data must be valid at the time CAS starts; SAck indicates that a cycle has been started and that Wr' may be turned off whenever convenient.
Timing names:
    Trldv  - Read low to data valid (access)
    Tdvwh  - Data valid to write high (setup)
    Tdvawh - Data valid after write high (hold)
    Trlrh  - Read low to read high (width)
    Twlwh  - Write low to write high (width)

Main 8088 Ready

8088 times for 7.843 MHz 8088, no accounting for latch/buffer delays

8088 read access time: Trldv
    With 0 wait states: 2Tclcl-Tclrl-Tdvcl = 255-100-20 = 135
    With 1 wait state:  135 + 127.5 = 262.5
    With 2 wait states: 135 + 127.5 + 127.5 = 390

8088 Rd' width: Trlrh
    With 0 wait states: 2Tclcl-50 = 255-50 = 205
    With 1 wait state:  205 + 127.5 = 332.5
    With 2 wait states: 205 + 127.5 + 127.5 = 460

8088 Wr' width: Twlwh
    With 0 wait states: 2Tclcl-40 = 255-40 = 215
    With 1 wait state:  215 + 127.5 = 342.5
    With 2 wait states: 215 + 127.5 + 127.5 = 470

Twldv for 7.843 MHz 8088 = Tcldv-Tclctv = [10..70]-[10..60] = [-50..60]
    Data buffers make things worse (lengthen Twldv)
    Wr' buffers make things better (delay Wr')

Twhdx: (Tdvawh) Tclch-30 = 55
Trhdx (read hold time) = 10

RAM reads:

SAck' precedes valid data by no more than 223 ns. In addition, there is an 8287 (or 8286) bus transceiver between the RAM chips and the 8088, which adds an additional 22 (32) ns delay, for a total of 255 ns between SAck' and data valid at the 8088. SAck' will be synchronized first by SysClock, then by SysClock', then by SysClock' again, and fed to the Ready input of the CPU. The double and a half synchronization provides a delay of 170 ns. There is an additional delay of 107.5 ns before the data must be valid at the 8088. The total is 277.5 ns. This array of synchronization is provided by feeding SAck first through the 8284, which provides SysClock followed by SysClock' synchronization, and then through another stage of SysClock' synchronization.

The worst case SAck', given Early Read and a slow 8203, would occur 177.5 ns after the end of T1. The rising clock in T3 occurs at 212.5, permitting operation with 2 wait states (in the absence of refresh).
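
The wait-state arithmetic above is easy to re-run. The sketch below assumes that each wait state adds exactly one 127.5 ns clock period to the zero-wait-state figure, and it takes the 170 ns synchronizer delay and the 107.5 ns Ready-to-sample time directly from this note; the function and variable names are illustrative only.

```python
# 8088 bus timing at 7.843 MHz, all values in ns.
# Formulas and constants are the ones quoted in the note; names are
# illustrative assumptions.
TCLCL = 127.5  # clock period


def with_waits(base_ns, wait_states):
    """Each wait state stretches the cycle by one clock period."""
    return base_ns + wait_states * TCLCL


read_access = 2 * TCLCL - 100 - 20  # Trldv = 2Tclcl - Tclrl - Tdvcl
rd_width = 2 * TCLCL - 50           # Trlrh
wr_width = 2 * TCLCL - 40           # Twlwh

for n in range(3):
    print(n, "wait:",
          with_waits(read_access, n),
          with_waits(rd_width, n),
          with_waits(wr_width, n))

# Ready path for RAM reads: SAck' may lead valid data at the 8088 by up
# to 223 ns plus the slower (8286) transceiver delay of 32 ns.
sack_to_data_max = 223 + 32
sync_delay = 170          # synchronizer delay quoted in the note
ready_to_sample = 107.5   # Ready seen until data must be valid
read_margin = sync_delay + ready_to_sample - sack_to_data_max
print("RAM read margin:", read_margin)
```

With these inputs the synchronized Ready leaves about 22.5 ns to spare over the 255 ns worst case, which is thin but positive.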
RAM writes:

    Twldv for 7.843 MHz 8088 = [-50..60]
    8286 bus transceiver = 32 ns (200 pf load)
    Total: [-18..92]

Tcc min for the 8203 is 152.5 ns, so the CPU gets the data to the memories in plenty of time.

ROM reads:

We have a 250 ns ROM, and the addresses are valid 302.5 ns before data is required, so we could operate with no wait states. If we use the same ready synchronization scheme, then SelROM would have to become valid sometime before the rising edge of clock in T1 to avoid wait states. That is unlikely. It is sure to arrive before the rising edge of SysClock in T2, so ROM will run with no more than 1 wait state.

IO to SLC:

In this case, the SLC provides SLCReady', which can be directly fed to the 8284A. Using the additional SysClock' synchronizer adds an unnecessary wait state, but there is little IO to the SLC so it doesn't matter.

IO to other devices:

Other devices do not generate Ready. In order to give other devices Rd' and Wr' pulses of adequate width, we provide a wait state generator.

                Trldv          Tdvwh   Tdvawh   Trlrh   Twlwh
8255A-5         200 (150pf)    100     30       300     300
8259A-2         120 (100pf)    160      0       160     190
8274            200 (150pf)    150      0       250     250
WD2001F-30      220 (50pf)     130     60 !!    300     175
9513            160 (100pf)     80      0       160     150
8237A-5         140 (100pf)    160     30       200     160

So 1 wait state works in the worst case. By the same logic used for ROM reads, if the decoded addresses generate ready, then if the input to the 8284 is valid after the rising edge of T1 but before the edge of T2, then there will be one wait state. Two wait states would be perfectly acceptable since there are few IO references.

There is a problem with write data hold time for the WD2001. The WD2001 needs 60 ns, and the 8088 can only guarantee 55. One possible solution is to hang the WD2001 on the memory bus (MD) instead of the processor bus (AD). This will provide additional hold time since the bus transceiver would delay the data from the 8088. This would require that we use non-inverting transceivers (e.g. 8286).
The problem does not arise with the 2964 design since the processor runs at 5 MHz and the data hold time is tied to the clock.

SLC Ready:

SLC communication with the CPU was covered above.

    The SLC ClockOut is high for 340 ns and low for 340 ns (1.47 Mbps)   T = 680
    The SLC ClockOut is high for 170 ns and low for 170 ns (2.94 Mbps)   T = 340

    Ready setup to rising edge of ClockOut is T/4 - 100 = (70 @ 1.5 mb)(-15 @ 3)
    Ready hold after rising edge of ClockOut is 0
    Ready valid (e.g. low) by 5/4T - 140 after Rd' or Wr' = (710 @ 1.5)(285 @ 3)

I propose synchronizing SAck with ClockOut', then again with ClockOut. These signals are (340 @ 1.5)(170 @ 3) ns apart. There are at least two cycles of ClockOut between assertions of Wr' or Rd' so there is time for NotReady to get through the synchronizers.

DMA Ready:

Call the 8237A-5 clock PClock.

    Ready setup to falling edge of PClock is 75 ns
    Ready hold after falling edge of PClock is 20 ns

If PClock is the 1/2 speed output of the 8284A, then the transitions of PClock are delayed up to 22 ns after falling edges of SysClock. If Ready is high a setup delay before a falling edge of PClock, then the Rd' and Wr' signals will be removed following the next rising edge of PClock. In our proposed system, PClock is low and high for 255 ns each.

On Reads, SAck from the 8203 precedes valid data by at most 223 ns. I propose synchronizing SAck with PClock, then again with PClock'. These signals are 255 ns apart. Data (for a memory read) will be valid 223 ns after PClock, leaving 287 ns for the write data setup time required by the WD2001.

For writes (from the WD2001), we will use the normal read and compressed write facility of the 8237A-5. That will guarantee that WD2001 data (with its slow access time) is present at the memory system before CAS starts. The end of this DMA write is not too critical.
We could change the first synchronizer to use SysClock (which occurs about 60 ns ahead of PClock') to shorten it, but it doesn't seem important since it only saves potentially 200 ns on 1/4 of all DMA cycles.

Slave CPU Ready

Shared Read:

Data from the main memories must travel through an 8282 from the MD bus to the SLA bus, then through a 250 ns EPROM to get to the SAD bus. Suppose we start with XAck, indicating valid data present on the MD bus, then synchronize it 3 times with SysClock' before presenting it to the Ready input of the Slave CPU. The triple synchronization would provide 255 ns delay, then there is an additional 107.5 ns before the data is needed at the 8088.

Shared Write:

Write data is valid at the inputs of the drivers to the MD bus at the time SlaveDAck is activated. SlaveDAck gates Wr'.

In the 2964 system, once the shared memory was acquired, it stayed ready. In the 8203 system, the shared memory will go not ready for every cycle, since it may do refresh at any time.
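
The Shared Read proposal can be checked the same way. The sketch below assumes the 8282 latch delays (22 ns typical, 32 ns max at 200 pf) and the 250 ns EPROM access time listed earlier in these notes, and reads "triple synchronization" as two full SysClock' periods between the first and third flop; the names are illustrative, and the margin it prints is a back-of-envelope figure rather than a result stated in the memo.

```python
# Slave CPU shared read: time available after XAck versus time the data
# needs to travel MD bus -> 8282 -> 250 ns EPROM -> SAD bus.
# All values in ns; names and the reading of "triple synchronization"
# are assumptions for illustration.
TCLCL = 127.5
SYNC_STAGES = 3

sync_delay = (SYNC_STAGES - 1) * TCLCL  # 255 ns, as stated in the note
ready_to_sample = 107.5                 # Ready seen until data is needed
available = sync_delay + ready_to_sample

LATCH_8282_TYP, LATCH_8282_MAX = 22, 32  # from the earlier delay tables
EPROM_ACCESS = 250                       # 250 ns 2764

path_typ = LATCH_8282_TYP + EPROM_ACCESS
path_max = LATCH_8282_MAX + EPROM_ACCESS

print("available:", available)
print("typ margin:", available - path_typ)
print("max margin:", available - path_max)
```

If those component figures hold, three stages of synchronization leave roughly 80 ns of slack in the worst case.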