Number: 1402

Date: 12-Jun-84 17':00':53

Submitter: Sannella.PA

Source: Mcfarland.henr

Subject: 10 Mnet problems': communications degrade over time

Assigned To: 

Attn: vanMelle

Status: Open

In/By: 

Problem Type: Performance

Impact: Serious

Difficulty: Hard

Frequency: Intermittent

Priority: Perhaps

System: Communications

Subsystem: Other

Machine: 

Disk: 

Lisp Version: 

Source Files: 

Microcode Version: 

Memory Size: 

File Server: 

Server Software Version: 

Disposition: 

Description: '
Date': 12 Jun 84 10':31 EDT'
From': Mcfarland.henr'
Subject': Lisp': 10 Mnet problems'
To': LispSupport.PA, Masinter.PA'
cc': McFarland.henr, KMatysek.henr, Boesl.henr, TBigham.es, AHenderson.PA'
Lisp-System-Date':  1-Mar-84 14':24':22'
Machine-Type': Dolphin'
'
'
Results of tests run with PUP.ECHOUSER':'
'
We ran PUP.ECHOUSER to our file server while observing the packets from a third system acting as a monitor (an 1100 on the 3Mnet running pupwatch). Communications seem fine at first (after a new sysout, or after re-entering the environment). While the Pup echoes were running, I observed that the 10M machine started to miss some of the echoes (the third system verified that the file server was receiving and echoing the packets). Next, all echoes from the file server were missed. Shortly after that, although PUP.ECHOUSER indicated that packets were being sent, the third system showed no packets going to the server. After that, the system goes completely mute and will not send anything. '
'
	This confirms some of the previous observations, in that when communications fail, net monitors cannot pick up any transmissions from the errant machine. PSW usually indicates problems establishing leaf connections; Lafite says "All Down"; then it appears that the whole net has gone down. We have other users (D-lions) on this 10M system, and as Kathy mentioned before, the Tech rep replaced the E-net driver boards.'
'
	Both of us have tried RESTART.ETHER, usually with disastrous results. The system momentarily goes into Raid and then disappears forever, never giving us enough time to read the error message.'
'
Doug McFarland'
'
--------'
Date': 21 Jun 84 12':07 PDT'
From': vanMelle.pa'
'
I have downgraded this to "Perhaps" (though that may still be too optimistic) in the absence of any further info or diagnostic assistance.  I also changed the category to Communications/Other, which is where I''m putting all ARs whose subsystem would be "Low level ether" if we had such a category (NS Protocols is inappropriate--the problem is in the 10mb net driver, not the higher-level protocols).'
'


Workaround: 

Test Case: 

Edit-By: vanMelle

Edit-Date: 21-Jun-84 12':08':09