YggLockWatchdogImpl.mesa
Copyright Ó 1984, 1985, 1988 by Xerox Corporation. All rights reserved.
MBrown on January 30, 1984 5:41:50 pm PST
Kupfer, February 28, 1985 2:16:45 pm PST
Hauser, May 20, 1985 1:40:32 pm PDT
Bob Hagmann May 3, 1988 2:32:47 pm PDT
NOTES
Some care has been taken to make these procedures efficient in the (normal) case that waiting happens rarely, the total number of waiting requests is small, and SkiPatrol logging is turned off for abortTransaction. This allows us to check more frequently, which improves response in cases when the wait can be resolved right away (deadlock and inactive blocking transaction).
DIRECTORY
YggEnvironment,
YggInternal,
BasicTime,
YggDIDMap USING [DocumentFromLockID, GetName],
List,
YggLock,
YggLockControl,
YggLockInternal,
YggDummyProcess,
Rope USING [Cat, Concat, ROPE],
YggMonitoringHooks USING [TransIDFromTransHandle, TransIDToRope],
YggMonitoringLog USING [notice, TransactionAbortInfo],
YggTransactionMap;
YggLockWatchdogImpl: CEDAR PROGRAM
IMPORTS
BasicTime, YggDIDMap, List, YggLock, YggLockInternal, YggDummyProcess, Rope, YggMonitoringHooks, YggMonitoringLog, YggTransactionMap
EXPORTS
YggInternal, YggLockControl
= BEGIN
LockID: TYPE = YggLock.LockID;
nullLockID: LockID = YggLock.nullLockID;
LockMode: TYPE = YggLock.LockMode;
ModeReleasableSet: TYPE = YggLock.ModeReleasableSet;
Lock: TYPE = YggLockInternal.Lock;
LockRep: TYPE = YggLockInternal.LockRep;
Header: TYPE = YggLockInternal.Header;
Request: TYPE = YggLockInternal.Request;
GrantedRequest: TYPE = YggLockInternal.GrantedRequest;
WaitingRequest: TYPE = YggLockInternal.WaitingRequest;
LockTransHeader: TYPE = YggLockInternal.LockTransHeader;
ConflictingTranses: TYPE = RECORD[holder: YggTransactionMap.TransHandle, waiter: YggTransactionMap.TransHandle, lock: LockID]; -- typical information about a lock conflict
LockTransHeaderObject: PUBLIC TYPE = YggLockInternal.LockRep.request.transHeader;
YggInternal.LockTransHeaderObject
ForkWatchdogProcess: PUBLIC PROC [
wakeupPeriod: YggDummyProcess.Milliseconds,
abortWaitingRequestInterval: INT--seconds--,
abortInactiveGrantedRequestInterval: INT--seconds--] = {
YggLockControl.ForkWatchdogProcess.
TRUSTED {YggDummyProcess.Detach[FORK LockWatchdogProcess[wakeupPeriod, abortWaitingRequestInterval, abortInactiveGrantedRequestInterval]]; };
};
LockWatchdogProcess: PROC [
wakeupPeriod: YggDummyProcess.Milliseconds,
abortWaitingRequestInterval: INT--seconds--,
abortInactiveGrantedRequestInterval: INT--seconds--] = {
wakeupPeriodsUntilTimeoutCheck: INT ← 0;
numWaits: INT ← 0;
p: Path ← NIL;
DO
YggDummyProcess.Pause[YggDummyProcess.MsecToTicks[wakeupPeriod] ! ABORTED => GOTO aborted];
For each waiting request, abort its transaction if it has waited for more than abortWaitingRequestInterval seconds. (Of course, don't bother to check for timed-out requests unless a request might be timed out, based on information obtained on the last check.)
IF (wakeupPeriodsUntilTimeoutCheck ← wakeupPeriodsUntilTimeoutCheck-1) <= 0 THEN {
l: LIST OF ConflictingTranses;
logProc: PROC [YggMonitoringLog.TransactionAbortInfo]; -- (used to stop race condition)
[wakeupPeriodsUntilTimeoutCheck, l] ← ChooseTimeoutVictims[wakeupPeriod, abortWaitingRequestInterval];
UNTIL l = NIL DO
lNext: LIST OF ConflictingTranses ← l.rest;
IF (logProc ← YggMonitoringLog.notice.abortTransaction) # NIL THEN {
fileName: Rope.ROPE;  -- name of the file being fought over
message: Rope.ROPE;
message ← IF l.first.holder # NIL THEN
Rope.Cat["transaction timed out on lock held by ",
YggMonitoringHooks.TransIDToRope[YggMonitoringHooks.TransIDFromTransHandle[l.first.holder]]]
ELSE
"transaction timed out; livelock?";
fileName ← YggDIDMap.GetName[YggDIDMap.DocumentFromLockID[l.first.lock]];
message ← message.Cat[
"; waiting for ",
IF fileName = NIL THEN
"(NIL)"
ELSE
fileName
];
logProc[[
transID: YggMonitoringHooks.TransIDFromTransHandle[l.first.waiter],
where: "LockWatchdogImpl.LockWatchdogProcess",
why: watchDog,
message: message
]];
};
YggTransactionMap.AbortUnilaterally[l.first.waiter, timeout];
YggDummyProcess.Yield[];
FREE[@l];
l ← lNext;
ENDLOOP;
};
Collect a list of transactions that hold locks that other transactions need. For each blocking transaction: if it has done no work in the last abortInactiveGrantedRequestInterval seconds, abort it. Repeat until no transaction is aborted by this test.
YggDummyProcess.Yield[];
DO
l: LIST OF ConflictingTranses ← GetBlockingTransactions[];
currentTime: BasicTime.GMT ← BasicTime.Now[];
noAbortDone: BOOLTRUE;
UNTIL l = NIL DO
lNext: LIST OF ConflictingTranses ← l.rest;
elapsedTimeSinceLastStartWork: INT ← BasicTime.Period[from: YggTransactionMap.GetTimeOfLastStartWork[l.first.holder], to: currentTime];
IF elapsedTimeSinceLastStartWork > abortInactiveGrantedRequestInterval THEN {
logProc: PROC [YggMonitoringLog.TransactionAbortInfo];
fileName: Rope.ROPE;  -- name of file being fought over
IF (logProc ← YggMonitoringLog.notice.abortTransaction) # NIL THEN {
fileName ← YggDIDMap.GetName[YggDIDMap.DocumentFromLockID[l.first.lock]];
logProc[[
transID: YggMonitoringHooks.TransIDFromTransHandle[l.first.holder],
where: "LockWatchdogImpl.LockWatchdogProcess",
why: watchDog,
message: Rope.Cat[
"trans is holding up ",
YggMonitoringHooks.TransIDToRope[YggMonitoringHooks.TransIDFromTransHandle[l.first.waiter]],
"; waiting for ",
IF fileName = NIL THEN
"(NIL)"
ELSE
fileName
]
]]
};
YggTransactionMap.AbortUnilaterally[l.first.holder, blockingNewLockRequest];
YggDummyProcess.Yield[];
noAbortDone ← FALSE;
};
FREE[@l];
l ← lNext;
ENDLOOP;
IF noAbortDone THEN EXIT;
ENDLOOP;
Compute a graph whose vertices are transactions with one or more waiting lock requests, containing a directed edge from t1 to t2 if t2 holds a lock that prevents a waiting lock request or t1 from being satisfied. (Don't bother to compute the graph if no lock requests have waited since the last time the graph was computed.) If the graph contains a cycle, find a cycle and abort the lowest-cost transaction in the cycle. Repeat until the graph contains no cycle.
YggDummyProcess.Yield[];
{
g: WaitingForGraph;
[numWaits, g] ← ComputeWaitingForGraph[numWaits];
IF g # NIL THEN {
logProc: PROC [YggMonitoringLog.TransactionAbortInfo]; -- (used to stop race condition)
DO
t: YggTransactionMap.TransHandle; -- transaction to kill
p ← FindCycle[g, p];
IF (t ← ChooseVictim[p]) = NIL THEN GOTO freeGraph;
IF (logProc ← YggMonitoringLog.notice.abortTransaction) # NIL THEN {
Log the transaction ID's of everyone involved in the deadlock.
deadlockList: Rope.ROPENIL;
FOR i: NAT IN [0 .. p.length) DO
deadlockList ← deadlockList.Cat[
" ",
YggMonitoringHooks.TransIDToRope[YggMonitoringHooks.TransIDFromTransHandle[p[i].waitingTrans.trans]]
];
deadlockList ← deadlockList.Cat[
"(",
YggDIDMap.GetName[YggDIDMap.DocumentFromLockID[p[i].waitingTrans.waitingFor]],
")"
];
ENDLOOP;
logProc[[
transID: YggMonitoringHooks.TransIDFromTransHandle[t],
where: "LockWatchdogImpl.LockWatchdogProcess",
why: watchDog,
message: Rope.Concat["deadlock of", deadlockList]
]];
};
YggTransactionMap.AbortUnilaterally[t, deadlock];
YggDummyProcess.Yield[];
g ← EliminateTrans[g, t];
REPEAT
freeGraph => FreeGraph[g];
ENDLOOP;
};
};
ENDLOOP;
EXITS
aborted => NULL
};
Support for timeout and blockingNewLockRequest detection.
WaitMemb: PROC [trans: YggTransactionMap.TransHandle, list: LIST OF ConflictingTranses] RETURNS [BOOLEAN] ~ {
Tells whether "trans" appears as a "waiter" in "list".
UNTIL list = NIL DO
IF trans = list.first.waiter THEN RETURN [TRUE];
list ← list.rest
ENDLOOP;
RETURN [FALSE]
};
BlockMemb: PROC [trans: YggTransactionMap.TransHandle, list: LIST OF ConflictingTranses] RETURNS [BOOLEAN] ~ {
Tells whether "trans" appears as a "holder" in "list".
UNTIL list = NIL DO
IF trans = list.first.holder THEN RETURN [TRUE];
list ← list.rest
ENDLOOP;
RETURN [FALSE]
};
ChooseTimeoutVictims: PROC [wakeupPeriod: INT--msec--, abortWaitingRequestInterval: INT--seconds--] RETURNS [wakeupPeriodsUntilTimeoutCheck: INT, l: LIST OF ConflictingTranses] = {
currentTime: BasicTime.GMT = BasicTime.Now[];
secondsToNextWakeup: INT ← abortWaitingRequestInterval;
NoticeWaitingRequest: PROC [wr: WaitingRequest] RETURNS [stop: BOOL] = {
INTERNAL to LockCoreImpl
m: LockMode = ModeAfterGrantingRequest[wr];
elapsedTimeSinceWait: INT--seconds-- = BasicTime.Period[from: wr.startTime, to: currentTime];
If "wr" is old and not already in our list of things to kill (l), then add it to the list. When we add it, we'd like to record a transaction ID for a transaction that is blocking it (there may be more than one). If we can't find a holder, record NIL as the holder (but kill the waiter anyway).
IF elapsedTimeSinceWait >= abortWaitingRequestInterval THEN {
IF NOT WaitMemb[wr.trans, l] THEN {
wr is old and not already in the list, so find a holder
FOR h: Lock ← wr.requestList, h.requestList UNTIL h=wr DO
WITH h SELECT FROM
grh: GrantedRequest =>
IF grh.trans # wr.trans AND NOT YggLock.Compat[m][grh.mode] THEN {
l ← CONS[first: [holder: grh.trans, waiter: wr.trans, lock: YggLockInternal.LockIDFromRH[wr]], rest: l];
GOTO FoundABlocker
}
ENDCASE;
REPEAT
FoundABlocker => NULL;
FINISHED =>
l ← CONS[first: [holder: NIL, waiter: wr.trans, lock: YggLockInternal.LockIDFromRH[wr]], rest: l];
ENDLOOP;
}
}
ELSE {
wr isn't old; update time for next wakeup, if necessary
secondsToNextWakeup ← MIN[secondsToNextWakeup, abortWaitingRequestInterval-elapsedTimeSinceWait];
};
RETURN [stop: FALSE];
}; -- NoticeWaitingRequest
l ← NIL;
YggLockInternal.GetInfo[waitingRequestEnumProc: NoticeWaitingRequest];
RETURN [(secondsToNextWakeup*1000 + wakeupPeriod-1)/wakeupPeriod, l];
};
GetBlockingTransactions: PROC [] RETURNS [l: LIST OF ConflictingTranses] = {
NoticeWaitingRequest: PROC [wr: WaitingRequest] RETURNS [stop: BOOL] = {
INTERNAL to LockCoreImpl
m: LockMode = ModeAfterGrantingRequest[wr];
Look for transactions that are blocking wr. As one is found, add it to the list if it isn't already in it.
FOR h: Lock ← wr.requestList, h.requestList UNTIL h=wr DO
WITH h SELECT FROM
grh: GrantedRequest =>
IF grh.trans # wr.trans AND NOT YggLock.Compat[m][grh.mode] AND NOT BlockMemb[grh.trans, l] THEN
l ← CONS[first: [holder: grh.trans, waiter: wr.trans, lock: YggLockInternal.LockIDFromRH[wr]], rest: l];
ENDCASE;
ENDLOOP;
RETURN [stop: FALSE];
};
l ← NIL;
YggLockInternal.GetInfo[waitingRequestEnumProc: NoticeWaitingRequest];
RETURN [l];
};
ModeAfterGrantingRequest: PROC [wr: WaitingRequest] RETURNS [LockMode] = {
INTERNAL to LockCoreImpl
FOR h: Lock ← wr.requestList, h.requestList UNTIL h=wr DO
WITH h SELECT FROM
grh: GrantedRequest =>
IF grh.trans = wr.trans THEN RETURN[YggLock.Sup[wr.mode][grh.mode]];
ENDCASE;
ENDLOOP;
RETURN [wr.mode];
};
Support for deadlock detection.
WaitingTransHandle: TYPE = REF WaitingTransObject;
WaitingTransObject: TYPE = RECORD [
trans: YggTransactionMap.TransHandle,
waitingFor: LockID ← ,
edges: LIST OF--WaitingTransHandle--REF ANY,
visited: BOOLFALSE,
onPath: BOOLFALSE
];
WaitingForGraph: TYPE = LIST OF--WaitingTransHandle--REF ANY;
ComputeWaitingForGraph: PROC [numWaits: INT]
RETURNS [newNumWaits: INT, g: WaitingForGraph] = {
If numWaits equals the number of lock waits reported by the lock manager, return newNumWaits = numWaits, g = NIL. Otherwise compute the current waiting-for graph, and return it as g.
NoticeGeneralInfo: YggLockInternal.GeneralInfoProc = {
newNumWaits ← nSetCallsWaited;
};
NoticeWaitingRequest: PROC [wr: WaitingRequest] RETURNS [stop: BOOL] = {
If there are new waiters, add this transaction to the list of waiting transactions.
IF newNumWaits = numWaits THEN RETURN [stop: TRUE]
ELSE {
IF (LookupTrans[g, wr.trans] = NIL) THEN {
wt: WaitingTransHandle ← NEW[WaitingTransObject ← [trans: wr.trans, edges: NIL, waitingFor: YggLockInternal.LockIDFromRH[wr]]];
g ← CONS[first: wt, rest: g];
};
RETURN [stop: FALSE];
};
};
NoticeWaitingRequest2: PROC [wr: WaitingRequest] RETURNS [stop: BOOL] = {
If there are new waiters, create an edge for each transaction blocking this request.
IF newNumWaits = numWaits THEN RETURN [stop: TRUE]
ELSE {
waiting: WaitingTransHandle = LookupTrans[g, wr.trans];
m: LockMode = ModeAfterGrantingRequest[wr];
FOR h: Lock ← wr.requestList, h.requestList UNTIL h = wr DO
WITH h SELECT FROM
grh: GrantedRequest =>
IF grh.trans # wr.trans AND NOT YggLock.Compat[m][grh.mode] THEN {
granted: WaitingTransHandle ← LookupTrans[g, grh.trans];
IF granted # NIL AND NOT List.Memb[granted, waiting.edges] THEN
waiting.edges ← CONS[first: granted, rest: waiting.edges];
};
ENDCASE;
ENDLOOP;
RETURN [stop: FALSE];
};
};
g ← NIL;
YggLockInternal.GetInfo[
generalInfoProc: NoticeGeneralInfo,
waitingRequestEnumProc: NoticeWaitingRequest,
waitingRequestEnumProc2: NoticeWaitingRequest2];
RETURN [newNumWaits, g];
};
PathRecord: TYPE = RECORD [
length: NAT ← 0,
vertices: SEQUENCE maxLength: NAT OF RECORD [
waitingTrans: WaitingTransHandle,
nextEdge: LIST OF--WaitingTransHandle--REF ANY
]
];
Path: TYPE = REF PathRecord;
FindCycle: PROC [g: WaitingForGraph, oldPath: Path] RETURNS [path: Path] = {
If g contains a cycle, returns a cycle as path; returns path.length = 0 if g contains no cycle. Reuses oldPath's storage if it can.
top: LIST OF REF ANY ← g;
pathLast: NAT ← 0;
v: WaitingTransHandle;
pathMaxLength: NAT = List.Length[g];
path ← IF oldPath # NIL AND oldPath.length >= pathMaxLength
THEN oldPath
ELSE NEW[PathRecord[pathMaxLength]];
DO
{ -- EXITS NextTopLevelVertex
v ← NARROW[top.first];
IF v.visited THEN GOTO NextTopLevelVertex;
DO
Add the new vertex v to the end of the path.
path[pathLast] ← [waitingTrans: v, nextEdge: v.edges];
v.visited ← v.onPath ← TRUE;
DO
Explore the next edge out of the vertex at the end of the path.
e: LIST OF REF ANY ← path[pathLast].nextEdge;
IF e = NIL THEN {
Remove a vertex from the end of the path.
path[pathLast].waitingTrans.onPath ← FALSE;
IF pathLast = 0 THEN GOTO NextTopLevelVertex;
pathLast ← pathLast - 1;
}
ELSE {
path[pathLast].nextEdge ← e.rest;
v ← NARROW[e.first];
IF v.onPath THEN GOTO FoundCycle;
IF NOT v.visited THEN {
Add v to the end of the path.
pathLast ← pathLast + 1;
EXIT;
};
};
ENDLOOP;
ENDLOOP;
EXITS
NextTopLevelVertex => IF (top ← top.rest) = NIL THEN {
path.length ← 0;
GOTO done;
};
FoundCycle => {
Move vertices of cycle down to start of sequence, clean up rest, and return.
cycleStart: NAT;
FOR cycleStart ← 0, cycleStart+1
UNTIL path[cycleStart].waitingTrans = v DO ENDLOOP;
FOR j: NAT IN [cycleStart .. pathLast] DO
path[j-cycleStart] ← [path[j].waitingTrans, NIL];
ENDLOOP;
path.length ← pathLast-cycleStart+1;
GOTO done;
};
};
ENDLOOP;
EXITS done =>
FOR j: NAT IN [path.length .. path.maxLength) DO
path[j] ← [NIL, NIL];
ENDLOOP;
};
ChooseVictim: PROC [p: Path] RETURNS [YggTransactionMap.TransHandle] = {
Transactions in p are deadlocked. Choose one to be aborted.
Return NIL iff p is empty.
minUpdateCost: INTLAST[INT];
minUpdateCostTrans: YggTransactionMap.TransHandle ← NIL;
FOR i: NAT IN [0 .. p.length) DO
updateCost: INT ← YggTransactionMap.GetEstimatedUpdateCost[p[i].waitingTrans.trans];
IF updateCost < minUpdateCost OR minUpdateCostTrans = NIL THEN {
minUpdateCost ← updateCost;
minUpdateCostTrans ← p[i].waitingTrans.trans;
};
ENDLOOP;
RETURN [minUpdateCostTrans];
};
LookupTrans: PROC [g: WaitingForGraph, t: YggTransactionMap.TransHandle]
RETURNS [wt: WaitingTransHandle] = {
FOR l: LIST OF REF ANY ← g, l.rest UNTIL l = NIL DO
wt: WaitingTransHandle ← NARROW[l.first];
IF wt.trans = t THEN RETURN [wt];
ENDLOOP;
RETURN [NIL];
};
EliminateTrans: PROC [g: WaitingForGraph, t: YggTransactionMap.TransHandle]
RETURNS [newG: WaitingForGraph]= {
Compute a new graph, eliminating t. Mark all vertices unvisited and not on path.
wt: WaitingTransHandle ← LookupTrans[g, t];
FreeList[wt.edges]; wt.edges ← NIL;
FOR l: LIST OF REF ANY ← g, l.rest UNTIL l = NIL DO
w: WaitingTransHandle ← NARROW[l.first];
w.visited ← w.onPath ← FALSE;
w.edges ← List.DRemove[wt, w.edges];
ENDLOOP;
RETURN [List.DRemove[wt, g]];
};
FreeGraph: PROC [g: WaitingForGraph] = {
NILs things out (superstition).
FOR l: LIST OF REF ANY ← g, l.rest UNTIL l = NIL DO
wt: WaitingTransHandle ← NARROW[l.first];
FreeList[wt.edges];
FREE[@wt];
ENDLOOP;
FreeList[g];
};
FreeList: PROC [l: LIST OF REF ANY] = {
UNTIL l = NIL DO
lNext: LIST OF REF ANY ← l.rest;
FREE[@l];
l ← lNext;
ENDLOOP;
};
END.--LockWatchdogImpl
CHANGE LOG
Changed by MBrown on February 6, 1983 10:03 pm
Bug in GetBlockingTransactions: always returned the waiting transaction instead of the blocking transaction. This meant wait => abort nearly every time.
Edited on July 24, 1984 11:01:18 am PDT, by Kupfer
Add YggMonitoringLog probes.
changes to: LockWatchdogImpl, LockWatchdogProcess
Edited on July 29, 1984 7:08:42 pm PDT, by Kupfer
Add code so that the lockConflict probes can report who the interested parties were.
changes to: LockWatchdogImpl, ChooseTimeoutVictims, NoticeWaitingRequest (local of ChooseTimeoutVictims)
Edited on August 6, 1984 3:31:53 pm PDT, by Kupfer
Remove the possible race condition in YggMonitoringLog probes by assigning the PROC to a temporary variable.
changes to: LockWatchdogImpl
Edited on February 28, 1985 11:35:48 am PST, by Kupfer
Record the name of the files being fought over when the watchdog has to kill some transaction.
changes to: LockWatchdogImpl, NoticeWaitingRequest (local of ChooseTimeoutVictims), NoticeWaitingRequest (local of GetBlockingTransactions), ComputeWaitingForGraph