Number: 1174

Date: 21-May-84 17:00:52

Submitter: Sannella.PA


Subject: Need way to scavenge/certify Dorado disk w/o using BFS certify

Lisp Version: 

Description:
Date: 14 May 84 17:55 PDT
Subject: [BURNS.PA: Hard disk errors]
cc: vanMelle, Purcell, Sheil
This deserves an AR, impact Serious, attn: Lisp.
Date: Mon, 14 May 84 17:38 PDT
From: BURNS.PA
Subject: Hard disk errors
To: Masinter
cc: Kaehler, BURNS
	As per our previous conversation, I am committing this to writing.  With the advent of the AMS 315 disk drives on the Dorado, there seems to be a rash of hard disk errors.  The present method for getting rid of these errors is to save everything on that partition and then use "BFS certify" to mark out the spot in the bad page table.  This, I believe, is a very time-consuming and painful method for just taking care of a bad spot.  Perhaps someone can gin up a routine that can be used to log out these errors without having to certify and rebuild the whole partition. It may also be useful on the Dlyons as well.
P.S.  I talked to Ted Kaehler about the same problem and he said he would raise the question in their group discussion.
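[Illustration only: a minimal sketch, in Python, of the kind of routine requested above, which records a single bad page so the allocator skips it, with no certify or rebuild. The names BadPageTable, mark_bad and allocate are hypothetical, and the table is a plain in-memory set. It does not reflect the real BFS bad page table format or any existing Dorado code.]

```python
class BadPageTable:
    """Hypothetical in-memory stand-in for a partition's bad page table."""

    def __init__(self, capacity):
        self.capacity = capacity   # assumed fixed number of table slots
        self.bad = set()           # page numbers known to be bad

    def mark_bad(self, page):
        """Record one bad page; return False if it was already recorded."""
        if page in self.bad:
            return False
        if len(self.bad) >= self.capacity:
            raise OverflowError("bad page table is full")
        self.bad.add(page)
        return True


def allocate(free_pages, table, count):
    """Return the lowest `count` free pages that are not marked bad."""
    usable = [p for p in sorted(free_pages) if p not in table.bad]
    if len(usable) < count:
        raise RuntimeError("not enough good pages")
    return usable[:count]


table = BadPageTable(capacity=4)
table.mark_bad(7)                          # log the bad spot in place
pages = allocate({5, 6, 7, 8}, table, 3)   # page 7 is skipped
```

[A real routine would also have to move or give up whatever data already lives on the bad page; the sketch covers only the bookkeeping.]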
Date: 15 May 84 00:19 PST
Subject: Creeping hard disk errors in new 315MByte Dorados
To: LispUsers↑.pa, Raim.pasa
cc: Burns, Willie-sue
There seems to be a tendency for some of the new 315MByte Dorado disks to become flaky over a period of time. This happened to me on the KSA pool Dorado a few weeks back, and Willie-Sue Orr mentioned it again today. [Happily, the T-80 disks, after passing certification, almost never showed such a degradation over time; but then their density etc. is much simpler than that of the AMS 315's.]
These hard disk errors will usually manifest as bad spots in a few disk pages, and Lisp will at least inform you of the situation (when it can catch it) by breaking, or dropping into Raid.  If the bad spots are in the Lisp.VIRTUALMEM file (as happened with the KSA pool), you won't be able to use Lisp to diagnose or recover.
The recovery isn't simple (sigh).  Currently, Brad Burns of PARC Technical Services suggests:
  1) saving the accessible contents of the disk elsewhere;
  2) using the "Certify" command of BFSTest (available from the
     NetExec) which will mark the bad spots in the bad page table;
  3) rebuilding the partition on which the bad spots were found.
Troubles should be reported to
[Marty: The situation could become critical for Dorados installed at customer sites.]
Date: Wed, 16 May 84 13:37 PDT
From: Evans.pasa
Subject: Re: Creeping hard disk errors in new 315MByte Dorados
In-reply-to: "'s message of 15 May 84 00:19 PST"
cc: Raim,,
Note that in the past the Xerox code for reading disks has not used all of the available ways that the Trident drives provide for error recovery.  You might consider trying some of these (+-Offset, read early/late) to get to a place that has "gone bad" to recover the data.
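[Illustration only: a minimal sketch, in Python, of the retry strategy suggested above. It tries a plain read first, then each combination of head offset and early/late read before giving up on the page. The drive is simulated; read_with_recovery, make_fake_drive and the mode names are hypothetical and do not correspond to the actual Trident controller interface or to any existing Xerox disk code.]

```python
from itertools import product

OFFSETS = ("none", "plus", "minus")      # head offset: nominal, +, -
STROBES = ("normal", "early", "late")    # data strobe timing


def recovery_modes():
    """All (offset, strobe) pairs; the plain read comes first."""
    return list(product(OFFSETS, STROBES))


def read_with_recovery(read_page, page, retries_per_mode=2):
    """Try each mode in turn; return (data, mode), or (None, None) on failure."""
    for offset, strobe in recovery_modes():
        for _ in range(retries_per_mode):
            data = read_page(page, offset, strobe)
            if data is not None:
                return data, (offset, strobe)
    return None, None


def make_fake_drive(marginal):
    """Simulated drive. `marginal` maps a page number to the only mode
    that can read it, or to None if the page is unreadable in every mode.
    Pages not listed read fine with a plain read."""
    def read_page(page, offset, strobe):
        needed = marginal.get(page, ("none", "normal"))
        if needed is None:
            return None
        return b"page-%d" % page if (offset, strobe) == needed else None
    return read_page


drive = make_fake_drive({3: ("minus", "late"), 9: None})
data, mode = read_with_recovery(drive, 3)   # succeeds only with -offset, late
```

[Data recovered this way could be copied off before the page is marked bad, which is the point of the suggestion; how many retries per mode are worthwhile on a real drive is not something this sketch addresses.]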


Test Case: 

Edit-By: Sannella.PA

Edit-Date:  1-Jun-84 12:49:31

Attn: Lisp

Assigned To: 



System: Operating System

Subsystem: Dolphin/Dorado Disk



Microcode Version: 

Memory Size: 

File Server: 

Server Software Version: 

Difficulty: Very Hard

Frequency: Everytime

Impact: Serious

Priority: Hopefully

Status: Open

Problem Type: Performance

Source Files: