Number: 

Date: 22-Aug-84 19':54':15

Submitter: JonL.pa

Source: JonL.pa

Subject: DLIONFS does something wrong during MKDIR, which causes DLionDeath on startup

Assigned To: Stansbury

Attn: Release

Status: Fixed

In/By: 

Problem Type: Bug

Impact: Fatal

Difficulty: 

Frequency: Intermittent

Priority: Absolutely

System: Operating System

Subsystem: DLion Disk

Machine: 1100

Disk: 

Lisp Version: 22-Aug-84 05':00':44

Source Files: 

Microcode Version: 5121

Memory Size: 3584

File Server: 

Server Software Version: 

Disposition: '
["Sannella.PA" "23-Aug-84 11':44':07" Status':(New->Open) Priority':(->Hopefully)]'
["Sannella.PA" "28-Aug-84 10':22':51" Subject': Priority':(Hopefully->Absolutely) Description':]'
Date': 11 Sep 84 15':14 PDT'
From': Stansbury.pa'
Subject': AR 1931': DLIONFS does something wrong during MKDIR, which causes DLionDeath on startup'
To': JonL, Lispsupport, Lichtenberg.wbst'
cc': Stansbury.pa'
'
There are actually 3 bugs in this AR, all fixed in the next release of the file system':'
'
1.  The file system is trying to look for a page way off the end of the disk, and this seems to be happening on startup.  Solution': (a) I have rewritten part of that code and now cannot get the problem to recur, so I must have fixed it in the rewrite.  (b) I have put in checks (which can be compiled out) which will ensure that ridiculous disk seeks are caught in a stack environment in which they can be debugged.'
'
2.  When the file system has (for whatever reason) lost its handle on the lisp directory, it can die trying to do a forgetpages on that nonexistent directory handle.  Solution':  (a) Have eliminated most of the things that can cause that handle to be lost, and (b) have put in code to check that the handle exists before doing a forgetpages.'
'
3.  Wanted a way of changing the type of a volume from pilot to nonpilot and vice versa without attempting to purge or create directories.  Solution': There is now a function '
(CHANGEVOLUMETYPE volumeName type),'
where type is one of PILOT or LISP, which will do that.'
'
-- Tayloe.'
["Stansbury" "11-Sep-84 15':18':33" Assigned% To': Attn': Status':(Open->Fixed) Disposition':]
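
Editor's sketch of the two defensive checks described in items 1(b) and 2(b) above. This is Python written for illustration, not the Interlisp-D source of DLIONFS; every name in it (checked_seek, forget_directory_pages, DiskAddressError, total_pages) is hypothetical, and the actual fix may be structured quite differently.

```python
class DiskAddressError(ValueError):
    """Raised when a page address falls outside the disk (hypothetical name)."""


def checked_seek(page_address, total_pages, seek):
    """Range-check a page address before handing it to the low-level seek.

    Failing here, rather than deep inside the seek routine, leaves a stack
    that still shows which caller produced the bad address.
    """
    if not 0 <= page_address < total_pages:
        raise DiskAddressError(
            f"page {page_address} is outside the disk (0..{total_pages - 1})"
        )
    return seek(page_address)


def forget_directory_pages(directory_handle, forget_pages):
    """Call forget_pages only when a directory handle actually exists.

    Returns True if the pages were forgotten, False if there was no handle.
    """
    if directory_handle is None:
        return False
    forget_pages(directory_handle)
    return True
```

With these guards, an address such as the 327360835 quoted in the description would be rejected before the seek, and a missing directory handle would be skipped instead of being passed along.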

Description: '
Date': 22 Aug 84 02':54 PDT'
From': JonL.pa'
Subject': Lisp': DLionDeath under \DL.DISKSEEK'
To': LispSupport.pa'
cc': Sannella,Lichtenberg.wbst,vanMelle,JonL.pa'
'
Lisp System Date': 15-Aug-84 20':42':17'
Machine': Dandelion (25200051507)'
Microcode version': 24,1'
Memory size': 5777'
Frequency': Always'
Impact': Serious'
'
After InstallLisping from Tajo, with /dsx switches, the Lisp will churn around a while doing some of the initialization, and die with illegal.arg to \PUTBASE.UFN from under \DL.DISKSEEK.   '
'
Looking up the stack, from \DL.DISKSEEK through \DL.SHUGART.XFERDISK, the cylinder argument is 2557506, which isn''t a legal value for \PUTBASE (too big).  Perhaps the trouble originates in \PvTransferPageNoSwap, which has an absoluteDiskAddress argument of 327360835.'
'
Taking Mike''s suggestion, I went into Hello, and erase''d the DSK volume; but this had no effect on the problem.  However, a ↑D at the ensuing TeleRaid seems to put things back on course (albeit without a legit DSK volume).'
'
-- JonL --'
'
-----'
Date': Wed, 22 Aug 84 14':19 EDT'
From': Lichtenberg.WBST'
Subject': Re': Lisp': DLionDeath under \DL.DISKSEEK'
In-reply-to': "JonL.pa''s message of 22 Aug 84 02':54 PDT"'
To': JonL.pa'
cc': LispSupport.pa, Sannella.pa, Lichtenberg, vanMelle.pa'
'
JonL,'
'
Yes.. This problem is from DLIONFS, and it usually occurs when the disk''s state'
is smashed somehow.'
'
Have you noticed recently that Othello (&Hello) let you erase a non-pilot volume?  Well... it apparently leaves the volume in its non-pilot state, and zaps'
the Lisp directory that once stood there.  Part of DLionfs''s initialization is to look for a Lisp directory (which it [unfortunately] ASSUMES is there if the volume is non-Pilot).  DLionfs does not avoid trashy volumes very well.  Try ↑Ding a few times from Teleraid (it should eventually let you in).  Set \DFSInitialized to T, and MAKEPILOT(DSK) the disk.  It will probably crash again, but this time the volume will have been converted.  Now get a fresh lisp (or reboot the one that crashed) and try MKDIRing it.'
'
I guess Tayloe should work on a better way of dealing with such things, and I should make a more graceful error exit for DLionfs disk calls.'
'
/Mitch.'
'
-----'
'
Date': 22 Aug 84 19':39 PDT'
From': JonL.pa'
Subject': Re': Lisp': DLionDeath under \DL.DISKSEEK'
In-reply-to': Lichtenberg.WBST''s message of Wed, 22 Aug 84 14':19 EDT'
To': Lichtenberg.WBST'
cc': JonL.pa, LispSupport.pa, Sannella.pa, vanMelle.pa'
'
Thanks for the helpful info, Mitch, I''ve got my DLion back on the air now.  But still, the DESCribe command of Hello lists the DSK volume as non-Pilot.  Is that right, after the call to MAKEPILOT and MKDIR?'
'
Incidentally, in order to get MAKEPILOT to run at all, I had to neuter the call to FORGETPAGES within MAKEPILOT; e.g.'
CHANGENAME(MAKEPILOT FORGETPAGES NILL)'
because of a related bug in DLIONFS, about which I''ll submit a bug report and/or an AR.'
'
-- JonL --'
'
'
Upon startup, the \DFSEventFn function is called as AFTERLOGOUT, and this in turn tries to call FORGETPAGES; unfortunately, it fetches the directory stream from the FDEV and feeds that to FORGETPAGES without noticing that the "stream" is NIL.  This causes a 9305 under FORGETPAGES.  A similar problem exists in MAKEPILOT -- using Mitch''s suggestion, I kludged it so that I could call MAKEPILOT, even though the DSK volume was thoroughly trashed; but MAKEPILOT contains a line'
    (FORGETPAGES (GetDirectory vol))'
and in this case (GetDirectory vol) winds up NIL, leading to 9305 death again.'
'
-----'
'
Date': 23 Aug 84 19':13 PDT'
From': Masinter.pa'
Subject': Lisp': DLionFS crash in latest loadups'
To': LispSupport.pa'
'
Lisp System Date': 23-Aug-84 18':52':49'
Machine': Dorado (Plaza)'
Microcode version': 24,4'
Memory size': 10000'
Frequency': Always'
'
'
This is consistent':'
'
I erased the disk on the pool DLion, and configured it.'
'
I did a MKDIR(DSK) to create a {DSK} volume.'
'
I did a LOGOUT(T)'
'
I started a fresh sysout'
'
The machine dies in a 9318 under \DL.DISKSEEK. The '
absoluteDiskAddress is -1843405693, which is trying to seek to CYL 3 5404'
'
\PvTransferPage was called with the same absoluteDiskAddress, as was \LvGetPage'
'
I think there should be an internal consistency check put into the disk system to check for numbers out of range like this.'
'
'
-----'
'
Date': 24 Aug 84 12':43 EDT'
From': lichtenberg.wbst'
Subject': Re': Lisp': DLionDeath under \DL.DISKSEEK'
In-reply-to': JonL.pa''s message of 22 Aug 84 19':39 PDT'
To': JonL.pa'
cc': Lichtenberg.WBST, LispSupport.pa, Sannella.pa, vanMelle.pa'
'
If you Described the volume after the MAKEPILOT and before the MKDIR, you would have (should have) seen "normal" as the type.  That''s what MAKEPILOT does - changes the type field of the volume.  A better MAKEPILOT is in order (at least) - one that does only one thing': Change the type of the volume.... Better yet - a Hello feature!'
'
/Mitch.'
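
Editor's note on the numbers quoted in the description: the cylinder argument JonL reports (2557506) is exactly his absoluteDiskAddress (327360835) divided by 128, with remainder 67. The sketch below checks that arithmetic in Python. The figure of 128 pages per cylinder is an inference from those two numbers alone, not something stated in this AR, and the function names are hypothetical.

```python
# Assumption: 128 pages per cylinder, inferred from the two values in the report.
PAGES_PER_CYLINDER = 128


def cylinder_of(absolute_disk_address, pages_per_cylinder=PAGES_PER_CYLINDER):
    """Cylinder number for a page address under the assumed geometry."""
    return absolute_disk_address // pages_per_cylinder


def as_unsigned32(n):
    """Reinterpret a signed 32-bit value as an unsigned one."""
    return n & 0xFFFFFFFF


jonl_address = 327360835
print(cylinder_of(jonl_address))          # 2557506, the reported cylinder
print(jonl_address % PAGES_PER_CYLINDER)  # 67, the page within that cylinder

masinter_address = -1843405693
print(as_unsigned32(masinter_address))    # 2451561603
```

Either address is far beyond any plausible disk size, which is consistent with the diagnosis that the bad value originates above \DL.DISKSEEK rather than in the seek code itself.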


Workaround: 

Test Case: 

Edit-By: Stansbury

Edit-Date: 11-Sep-84 15':18':36