Number: 2112

Date:  7-Sep-84 16':29':49

Submitter: Dering.pasa

Source: Schoen

Subject: List bound to stack variable gets smashed after GC

Assigned To: JonL.pa

Attn: Release

Status: Fixed

In/By: Harmony

Problem Type: Bug

Impact: Fatal

Difficulty: 

Frequency: Intermittent

Priority: Absolutely

System: Language Support

Subsystem: Storage Formats/Mgt

Machine: 1108

Disk: 

Lisp Version: 21-Jun-84 10':50':28

Source Files: 

Microcode Version: 5124

Memory Size: 7168

File Server: 

Server Software Version: 

Disposition: The "BigRefcnt" mechanism wasn''t setting the stackp bit in the refcnt cell; I fixed it in the sources, and prepared a patch file for the Carol release on [Eris]<Lisp>Carol>Patches>BigRefcntPatch -- JonL 9/10/84'
["JonL.pa" "10-Sep-84 22':53':44" Assigned% To': Attn': Status':(New->Fixed) In/By': Disposition':]'
["Sannella.PA" "11-Sep-84 10':15':08" Description':]'
["vanMelle" "14-Sep-84 15':31':44" Attn':]

Description:  Schoen of SchlErDR reports the following':'
Return-Path': <SCHOEN@SUMEX-AIM.ARPA>'
Received': from SUMEX-AIM.ARPA by Xerox.ARPA ; 06 SEP 84 13':51':16 PDT'
Date': Thu, 6 Sep 84 13':50':00 PDT'
From': Eric Schoen <Schoen@SUMEX-AIM.ARPA>'
Subject': Critical bug encountered in Carol'
To': 1100support.pasa'
'
One of our researchers here has written an application in Lisp which fails nondeterministically under Carol, on either a Dolphin or DTiger (haven''t tried it on a Dorado yet).  The problem is caused by a list bound to a stack variable getting changed IMMEDIATELY after (e.g. during) a garbage collection.  Furthermore, if I examine the smashed variable right after the'
break window pops up (usually as the result of a SETA in which the array argument, derived from the variable in question, is bad), I get some value. If I then pop up a stack backtrace and select a frame to get a stack variable inspector window, the value of the smashed variable changes.  If I didn''t know any better, I''d think the variable were pointing into some portion '
of a CONS cell free list.'
'
and more specifically':'
Return-Path': <SCHOEN@SUMEX-AIM.ARPA>'
Received': from SUMEX-AIM.ARPA by Xerox.ARPA ; 07 SEP 84 07':06':31 PDT'
Date': Fri, 7 Sep 84 07':06':26 PDT'
From': Eric Schoen <Schoen@SUMEX-AIM.ARPA>'
Subject': Follow-up, critical Carol bug reported yesterday'
To': 1100support.pasa'
'
Yesterday, I reported a Carol bug in which lists were getting smashed out from under me.  I have now narrowed the problem down and can reproduce it with this simple function':'
'
(LAMBDA NIL'
   (PROG ((FOO (for X in ''(A) collect (CONS X (ARRAY 10 ''FLOATP))))'
	  TEMP)'
	 (until (OR (NEQ (CAAR FOO) ''A)'
		    (NOT (ARRAYP (SETQ TEMP (CDR (FASSOC ''A FOO))))))'
		do (APPEND (for X in ''(B C D) collect (CONS X TEMP))'
			FOO)'
		finally (RETURN FOO))))'
'
This function should never return; however, immediately after a GC, it does, and FOO is returned as garbage.  I suspect an unfortunate interaction with stack reference counting and APPEND.  The first three entries of the list':'
'
	((B . {ARRAYP}xx,xxxx)(C . {ARRAYP}xx,xxxx)(D . {ARRAYP}xx,xxxx)'
			      (A . {ARRAYP}xx,xxxx))'
'
are collectable immediately following the APPEND.  The last entry is not, since FOO points to it.  Is the stack scanning reference counter missing this binding?'
'
I would consider this an excruciatingly serious bug, since it produces quite unpredictable behavior in the most innocuous Lisp code.'
'
-----'
'
Date': 10 Sep 84 23':09 PDT'
From': JonL.pa'
Subject': AR 2112 -- Critical Bug encountered in Carol'
To': Dering.pasa,LispSupport'
cc': vanMelle, Masinter, Schoen@SUMEX'
'
I''ve just fixed this one, and am starting a new <LispCore>Next> loadup now to incorporate (previous loadup today seems to have failed?).  There is a patch for Carol sysout now on [Eris]<Lisp>Carol>Patches>BigRefcntPatch.'
'
Problem was simply that the "big" refcnt branch of the GCHashTable operation was failing to set and unset the stackp bit.   There was a comment in the code to the effect that it wasn''t necessary -- evidently thinking that the "big" count would be enough to prevent reclamation; but as this case shows, recursive freeing can certainly decrement a count all the way from "big" levels.'
'
Kudos to Eric for tracking it down to so narrow a test case!'
'
-- JonL --'
'
'
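Illustrative sketch (not the Interlisp-D sources): the failure described above can be modelled in a few lines of Python. Everything here is an assumption made for illustration: the names RefTable, scan_stack, del_ref and run, the threshold BIG, and the simplification that an object is reclaimed the moment its heap count reaches zero with the stackp bit clear. The model only shows why a "big" count cannot stand in for the stackp bit: each APPEND result ends in a cell pointing at FOO's list, so that list's count climbs to "big" levels and then falls back to zero when the discarded results are freed recursively.

```python
BIG = 62  # hypothetical threshold at which a count takes the "big" path


class RefTable:
    """Toy reference-count table with a stackp bit (illustrative only)."""

    def __init__(self, buggy):
        self.buggy = buggy
        self.count = {}      # object name -> heap reference count
        self.stackp = set()  # objects flagged as referenced from the stack
        self.freed = []      # objects reclaimed so far

    def add_ref(self, obj):
        self.count[obj] = self.count.get(obj, 0) + 1

    def scan_stack(self, obj):
        # Modelled bug: the "big" branch skips setting the stackp bit.
        if self.buggy and self.count.get(obj, 0) >= BIG:
            return
        self.stackp.add(obj)

    def del_ref(self, obj):
        self.count[obj] -= 1
        # Simplification: reclaim as soon as the count is zero and
        # the object is not flagged as stack-referenced.
        if self.count[obj] == 0 and obj not in self.stackp:
            self.freed.append(obj)


def run(buggy):
    table = RefTable(buggy)
    iterations = BIG + 10
    # Each APPEND result ends in a cell whose CDR is FOO's list, so the
    # heap count of that list grows by one per loop iteration.
    for _ in range(iterations):
        table.add_ref("FOO-list")
    # GC: FOO is still bound on the stack, so the scan should flag it.
    table.scan_stack("FOO-list")
    # Recursive freeing of the discarded APPEND results drops every ref.
    for _ in range(iterations):
        table.del_ref("FOO-list")
    return table.freed
```

In this model run(True) reclaims "FOO-list" even though FOO is still bound on the stack, while run(False), which sets the stackp bit regardless of the count's size, reclaims nothing.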


Workaround: 

Test Case: 

Edit-By: vanMelle

Edit-Date: 14-Sep-84 15':31':44