YggLockWatchdogImpl.mesa
Copyright Ó 1984, 1985, 1988 by Xerox Corporation. All rights reserved.
MBrown on January 30, 1984 5:41:50 pm PST
Kupfer, February 28, 1985 2:16:45 pm PST
Hauser, May 20, 1985 1:40:32 pm PDT
Bob Hagmann July 13, 1988 10:52:40 am PDT
NOTES
Some care has been taken to make these procedures efficient in the (normal) case that waiting happens rarely, the total number of waiting requests is small, and SkiPatrol logging is turned off for abortTransaction. This allows us to check more frequently, which improves response in cases when the wait can be resolved right away (deadlock and inactive blocking transaction).
DIRECTORY
YggEnvironment,
YggInternal,
BasicTime,
YggDIDMap USING [DocumentFromLockID, GetName],
List,
YggLock,
YggLockControl,
YggLockInternal,
Process,
Rope USING [Cat, Concat, ROPE],
YggMonitoringHooks USING [TransIDFromTransHandle, TransIDToRope],
YggMonitoringLog USING [notice, TransactionAbortInfo],
YggTransactionMap;
YggLockWatchdogImpl:
CEDAR PROGRAM
IMPORTS
BasicTime, YggDIDMap, List, YggLock, YggLockInternal, Process, Rope, YggMonitoringHooks, YggMonitoringLog, YggTransactionMap
EXPORTS
YggInternal, YggLockControl
= BEGIN
LockID: TYPE = YggLock.LockID;
nullLockID: LockID = YggLock.nullLockID;
LockMode: TYPE = YggLock.LockMode;
ModeReleasableSet: TYPE = YggLock.ModeReleasableSet;
Lock: TYPE = YggLockInternal.Lock;
LockRep: TYPE = YggLockInternal.LockRep;
Header: TYPE = YggLockInternal.Header;
Request: TYPE = YggLockInternal.Request;
GrantedRequest: TYPE = YggLockInternal.GrantedRequest;
WaitingRequest: TYPE = YggLockInternal.WaitingRequest;
LockTransHeader: TYPE = YggLockInternal.LockTransHeader;
ConflictingTranses: TYPE = RECORD[holder: YggTransactionMap.TransHandle, waiter: YggTransactionMap.TransHandle, lock: LockID]; -- typical information about a lock conflict
LockTransHeaderObject:
PUBLIC
TYPE = YggLockInternal.LockRep.request.transHeader;
YggInternal.LockTransHeaderObject
ForkWatchdogProcess:
PUBLIC
PROC [
wakeupPeriod: Process.Milliseconds,
abortWaitingRequestInterval: INT--seconds--,
abortInactiveGrantedRequestInterval: INT--seconds--] = {
YggLockControl.ForkWatchdogProcess.
TRUSTED {Process.Detach[FORK LockWatchdogProcess[wakeupPeriod, abortWaitingRequestInterval, abortInactiveGrantedRequestInterval]]; };
};
LockWatchdogProcess:
PROC [
wakeupPeriod: Process.Milliseconds,
abortWaitingRequestInterval: INT--seconds--,
abortInactiveGrantedRequestInterval: INT--seconds--] = {
wakeupPeriodsUntilTimeoutCheck: INT ← 0;
numWaits: INT ← 0;
p: Path ← NIL;
DO
Process.Pause[Process.MsecToTicks[wakeupPeriod] ! ABORTED => GOTO aborted];
For each waiting request, abort its transaction if it has waited for more than abortWaitingRequestInterval seconds. (Of course, don't bother to check for timed-out requests unless a request might be timed out, based on information obtained on the last check.)
IF (wakeupPeriodsUntilTimeoutCheck ← wakeupPeriodsUntilTimeoutCheck-1) <= 0
THEN {
l: LIST OF ConflictingTranses;
logProc: PROC [YggMonitoringLog.TransactionAbortInfo]; -- (used to stop race condition)
[wakeupPeriodsUntilTimeoutCheck, l] ← ChooseTimeoutVictims[wakeupPeriod, abortWaitingRequestInterval];
UNTIL l =
NIL
DO
lNext: LIST OF ConflictingTranses ← l.rest;
IF (logProc ← YggMonitoringLog.notice.abortTransaction) #
NIL
THEN {
fileName: Rope.ROPE; -- name of the file being fought over
message: Rope.ROPE;
message ←
IF l.first.holder #
NIL
THEN
Rope.Cat["transaction timed out on lock held by ",
YggMonitoringHooks.TransIDToRope[YggMonitoringHooks.TransIDFromTransHandle[l.first.holder]]]
ELSE
"transaction timed out; livelock?";
fileName ← YggDIDMap.GetName[YggDIDMap.DocumentFromLockID[l.first.lock]];
message ← message.Cat[
"; waiting for ",
IF fileName =
NIL
THEN
"(NIL)"
];
logProc[[
transID: YggMonitoringHooks.TransIDFromTransHandle[l.first.waiter],
where: "LockWatchdogImpl.LockWatchdogProcess",
why: watchDog,
message: message
]];
};
YggTransactionMap.AbortUnilaterally[l.first.waiter, timeout];
Process.Yield[];
FREE[@l];
l ← lNext;
ENDLOOP;
};
Collect a list of transactions that hold locks that other transactions need. For each blocking transaction: if it has done no work in the last abortInactiveGrantedRequestInterval seconds, abort it. Repeat until no transaction is aborted by this test.
Process.Yield[];
DO
l: LIST OF ConflictingTranses ← GetBlockingTransactions[];
currentTime: BasicTime.GMT ← BasicTime.Now[];
noAbortDone: BOOL ← TRUE;
UNTIL l =
NIL
DO
lNext: LIST OF ConflictingTranses ← l.rest;
elapsedTimeSinceLastStartWork: INT ← BasicTime.Period[from: YggTransactionMap.GetTimeOfLastStartWork[l.first.holder], to: currentTime];
IF elapsedTimeSinceLastStartWork > abortInactiveGrantedRequestInterval
THEN {
logProc: PROC [YggMonitoringLog.TransactionAbortInfo];
fileName: Rope.ROPE; -- name of file being fought over
IF (logProc ← YggMonitoringLog.notice.abortTransaction) #
NIL
THEN {
fileName ← YggDIDMap.GetName[YggDIDMap.DocumentFromLockID[l.first.lock]];
logProc[[
transID: YggMonitoringHooks.TransIDFromTransHandle[l.first.holder],
where: "LockWatchdogImpl.LockWatchdogProcess",
why: watchDog,
message: Rope.Cat[
"trans is holding up ",
YggMonitoringHooks.TransIDToRope[YggMonitoringHooks.TransIDFromTransHandle[l.first.waiter]],
"; waiting for ",
IF fileName = NIL
THEN
"(NIL)"
]
]]
};
YggTransactionMap.AbortUnilaterally[l.first.holder, blockingNewLockRequest];
Process.Yield[];
noAbortDone ← FALSE;
};
FREE[@l];
l ← lNext;
ENDLOOP;
IF noAbortDone THEN EXIT;
ENDLOOP;
Compute a graph whose vertices are transactions with one or more waiting lock requests, containing a directed edge from t1 to t2 if t2 holds a lock that prevents a waiting lock request or t1 from being satisfied. (Don't bother to compute the graph if no lock requests have waited since the last time the graph was computed.) If the graph contains a cycle, find a cycle and abort the lowest-cost transaction in the cycle. Repeat until the graph contains no cycle.
Process.Yield[];
{
g: WaitingForGraph;
[numWaits, g] ← ComputeWaitingForGraph[numWaits];
IF g #
NIL
THEN {
logProc: PROC [YggMonitoringLog.TransactionAbortInfo]; -- (used to stop race condition)
DO
t: YggTransactionMap.TransHandle; -- transaction to kill
p ← FindCycle[g, p];
IF (t ← ChooseVictim[p]) = NIL THEN GOTO freeGraph;
IF (logProc ← YggMonitoringLog.notice.abortTransaction) #
NIL
THEN {
Log the transaction ID's of everyone involved in the deadlock.
deadlockList: Rope.ROPE ← NIL;
FOR i:
NAT
IN [0 .. p.length)
DO
deadlockList ← deadlockList.Cat[
" ",
YggMonitoringHooks.TransIDToRope[YggMonitoringHooks.TransIDFromTransHandle[p[i].waitingTrans.trans]]
];
deadlockList ← deadlockList.Cat[
"(",
YggDIDMap.GetName[YggDIDMap.DocumentFromLockID[p[i].waitingTrans.waitingFor]],
")"
];
ENDLOOP;
logProc[[
transID: YggMonitoringHooks.TransIDFromTransHandle[t],
where: "LockWatchdogImpl.LockWatchdogProcess",
why: watchDog,
message: Rope.Concat["deadlock of", deadlockList]
]];
};
YggTransactionMap.AbortUnilaterally[t, deadlock];
Process.Yield[];
g ← EliminateTrans[g, t];
REPEAT
freeGraph => FreeGraph[g];
ENDLOOP;
};
};
ENDLOOP;
};
Support for timeout and blockingNewLockRequest detection.
WaitMemb:
PROC [trans: YggTransactionMap.TransHandle, list:
LIST
OF ConflictingTranses]
RETURNS [
BOOLEAN] ~ {
Tells whether "trans" appears as a "waiter" in "list".
UNTIL list =
NIL
DO
IF trans = list.first.waiter THEN RETURN [TRUE];
list ← list.rest
ENDLOOP;
RETURN [FALSE]
};
BlockMemb:
PROC [trans: YggTransactionMap.TransHandle, list:
LIST
OF ConflictingTranses]
RETURNS [
BOOLEAN] ~ {
Tells whether "trans" appears as a "holder" in "list".
UNTIL list =
NIL
DO
IF trans = list.first.holder THEN RETURN [TRUE];
list ← list.rest
ENDLOOP;
RETURN [FALSE]
};
ChooseTimeoutVictims:
PROC [wakeupPeriod:
INT
--msec--, abortWaitingRequestInterval:
INT
--seconds--]
RETURNS [wakeupPeriodsUntilTimeoutCheck:
INT, l:
LIST
OF ConflictingTranses] = {
currentTime: BasicTime.GMT = BasicTime.Now[];
secondsToNextWakeup: INT ← abortWaitingRequestInterval;
NoticeWaitingRequest:
PROC [wr: WaitingRequest]
RETURNS [stop:
BOOL] = {
INTERNAL to LockCoreImpl
m: LockMode = ModeAfterGrantingRequest[wr];
elapsedTimeSinceWait: INT--seconds-- = BasicTime.Period[from: wr.startTime, to: currentTime];
If "wr" is old and not already in our list of things to kill (l), then add it to the list. When we add it, we'd like to record a transaction ID for a transaction that is blocking it (there may be more than one). If we can't find a holder, record NIL as the holder (but kill the waiter anyway).
IF elapsedTimeSinceWait >= abortWaitingRequestInterval
THEN {
IF
NOT WaitMemb[wr.trans, l]
THEN {
wr is old and not already in the list, so find a holder
FOR h: Lock ← wr.requestList, h.requestList
UNTIL h=wr
DO
WITH h
SELECT
FROM
grh: GrantedRequest =>
IF grh.trans # wr.trans
AND
NOT YggLock.Compat[m][grh.mode]
THEN {
l ← CONS[first: [holder: grh.trans, waiter: wr.trans, lock: YggLockInternal.LockIDFromRH[wr]], rest: l];
GOTO FoundABlocker
}
ENDCASE;
REPEAT
FoundABlocker => NULL;
FINISHED =>
l ← CONS[first: [holder: NIL, waiter: wr.trans, lock: YggLockInternal.LockIDFromRH[wr]], rest: l];
ENDLOOP;
}
}
ELSE {
wr isn't old; update time for next wakeup, if necessary
secondsToNextWakeup ← MIN[secondsToNextWakeup, abortWaitingRequestInterval-elapsedTimeSinceWait];
};
RETURN [stop: FALSE];
}; -- NoticeWaitingRequest
l ← NIL;
YggLockInternal.GetInfo[waitingRequestEnumProc: NoticeWaitingRequest];
RETURN [(secondsToNextWakeup*1000 + wakeupPeriod-1)/wakeupPeriod, l];
};
GetBlockingTransactions:
PROC []
RETURNS [l:
LIST
OF ConflictingTranses] = {
NoticeWaitingRequest:
PROC [wr: WaitingRequest]
RETURNS [stop:
BOOL] = {
INTERNAL to LockCoreImpl
m: LockMode = ModeAfterGrantingRequest[wr];
Look for transactions that are blocking wr. As one is found, add it to the list if it isn't already in it.
FOR h: Lock ← wr.requestList, h.requestList
UNTIL h=wr
DO
WITH h
SELECT
FROM
grh: GrantedRequest =>
IF grh.trans # wr.trans
AND
NOT YggLock.Compat[m][grh.mode]
AND
NOT BlockMemb[grh.trans, l]
THEN
l ← CONS[first: [holder: grh.trans, waiter: wr.trans, lock: YggLockInternal.LockIDFromRH[wr]], rest: l];
ENDCASE;
ENDLOOP;
RETURN [stop: FALSE];
};
l ← NIL;
YggLockInternal.GetInfo[waitingRequestEnumProc: NoticeWaitingRequest];
RETURN [l];
};
ModeAfterGrantingRequest:
PROC [wr: WaitingRequest]
RETURNS [LockMode] = {
INTERNAL to LockCoreImpl
FOR h: Lock ← wr.requestList, h.requestList
UNTIL h=wr
DO
WITH h
SELECT
FROM
grh: GrantedRequest =>
IF grh.trans = wr.trans THEN RETURN[YggLock.Sup[wr.mode][grh.mode]];
ENDCASE;
ENDLOOP;
RETURN [wr.mode];
};
Support for deadlock detection.
WaitingTransHandle: TYPE = REF WaitingTransObject;
WaitingTransObject:
TYPE =
RECORD [
trans: YggTransactionMap.TransHandle,
waitingFor: LockID ← ,
edges: LIST OF--WaitingTransHandle--REF ANY,
visited: BOOL ← FALSE,
onPath: BOOL ← FALSE
];
WaitingForGraph: TYPE = LIST OF--WaitingTransHandle--REF ANY;
ComputeWaitingForGraph:
PROC [numWaits:
INT]
RETURNS [newNumWaits: INT, g: WaitingForGraph] = {
If numWaits equals the number of lock waits reported by the lock manager, return newNumWaits = numWaits, g = NIL. Otherwise compute the current waiting-for graph, and return it as g.
NoticeGeneralInfo: YggLockInternal.GeneralInfoProc = {
newNumWaits ← nSetCallsWaited;
};
NoticeWaitingRequest:
PROC [wr: WaitingRequest]
RETURNS [stop:
BOOL] = {
If there are new waiters, add this transaction to the list of waiting transactions.
IF newNumWaits = numWaits THEN RETURN [stop: TRUE]
ELSE {
IF (LookupTrans[g, wr.trans] =
NIL)
THEN {
wt: WaitingTransHandle ← NEW[WaitingTransObject ← [trans: wr.trans, edges: NIL, waitingFor: YggLockInternal.LockIDFromRH[wr]]];
g ← CONS[first: wt, rest: g];
};
RETURN [stop: FALSE];
};
};
NoticeWaitingRequest2:
PROC [wr: WaitingRequest]
RETURNS [stop:
BOOL] = {
If there are new waiters, create an edge for each transaction blocking this request.
IF newNumWaits = numWaits THEN RETURN [stop: TRUE]
ELSE {
waiting: WaitingTransHandle = LookupTrans[g, wr.trans];
m: LockMode = ModeAfterGrantingRequest[wr];
FOR h: Lock ← wr.requestList, h.requestList
UNTIL h = wr
DO
WITH h
SELECT
FROM
grh: GrantedRequest =>
IF grh.trans # wr.trans
AND
NOT YggLock.Compat[m][grh.mode]
THEN {
granted: WaitingTransHandle ← LookupTrans[g, grh.trans];
IF granted #
NIL
AND
NOT List.Memb[granted, waiting.edges]
THEN
waiting.edges ← CONS[first: granted, rest: waiting.edges];
};
ENDCASE;
ENDLOOP;
RETURN [stop: FALSE];
};
};
g ← NIL;
YggLockInternal.GetInfo[
generalInfoProc: NoticeGeneralInfo,
waitingRequestEnumProc: NoticeWaitingRequest,
waitingRequestEnumProc2: NoticeWaitingRequest2];
RETURN [newNumWaits, g];
};
PathRecord:
TYPE =
RECORD [
length: NAT ← 0,
vertices:
SEQUENCE maxLength:
NAT
OF
RECORD [
waitingTrans: WaitingTransHandle,
nextEdge: LIST OF--WaitingTransHandle--REF ANY
]
];
Path: TYPE = REF PathRecord;
FindCycle:
PROC [g: WaitingForGraph, oldPath: Path]
RETURNS [path: Path] = {
If g contains a cycle, returns a cycle as path; returns path.length = 0 if g contains no cycle. Reuses oldPath's storage if it can.
top: LIST OF REF ANY ← g;
pathLast: NAT ← 0;
v: WaitingTransHandle;
pathMaxLength: NAT = List.Length[g];
path ←
IF oldPath #
NIL
AND oldPath.length >= pathMaxLength
THEN oldPath
ELSE NEW[PathRecord[pathMaxLength]];
DO
{
-- EXITS NextTopLevelVertex
v ← NARROW[top.first];
IF v.visited THEN GOTO NextTopLevelVertex;
DO
Add the new vertex v to the end of the path.
path[pathLast] ← [waitingTrans: v, nextEdge: v.edges];
v.visited ← v.onPath ← TRUE;
DO
Explore the next edge out of the vertex at the end of the path.
e: LIST OF REF ANY ← path[pathLast].nextEdge;
IF e =
NIL
THEN {
Remove a vertex from the end of the path.
path[pathLast].waitingTrans.onPath ← FALSE;
IF pathLast = 0 THEN GOTO NextTopLevelVertex;
pathLast ← pathLast - 1;
}
ELSE {
path[pathLast].nextEdge ← e.rest;
v ← NARROW[e.first];
IF v.onPath THEN GOTO FoundCycle;
IF
NOT v.visited
THEN {
Add v to the end of the path.
pathLast ← pathLast + 1;
EXIT;
};
};
ENDLOOP;
ENDLOOP;
EXITS
NextTopLevelVertex =>
IF (top ← top.rest) =
NIL
THEN {
path.length ← 0;
GOTO done;
};
FoundCycle => {
Move vertices of cycle down to start of sequence, clean up rest, and return.
cycleStart: NAT;
FOR cycleStart ← 0, cycleStart+1
UNTIL path[cycleStart].waitingTrans = v DO ENDLOOP;
FOR j:
NAT
IN [cycleStart .. pathLast]
DO
path[j-cycleStart] ← [path[j].waitingTrans, NIL];
ENDLOOP;
path.length ← pathLast-cycleStart+1;
GOTO done;
};
};
ENDLOOP;
EXITS done =>
FOR j:
NAT
IN [path.length .. path.maxLength)
DO
path[j] ← [NIL, NIL];
ENDLOOP;
};
ChooseVictim:
PROC [p: Path]
RETURNS [YggTransactionMap.TransHandle] = {
Transactions in p are deadlocked. Choose one to be aborted.
Return NIL iff p is empty.
minUpdateCost: INT ← LAST[INT];
minUpdateCostTrans: YggTransactionMap.TransHandle ← NIL;
FOR i:
NAT
IN [0 .. p.length)
DO
updateCost: INT ← YggTransactionMap.GetEstimatedUpdateCost[p[i].waitingTrans.trans];
IF updateCost < minUpdateCost
OR minUpdateCostTrans =
NIL
THEN {
minUpdateCost ← updateCost;
minUpdateCostTrans ← p[i].waitingTrans.trans;
};
ENDLOOP;
RETURN [minUpdateCostTrans];
};
LookupTrans:
PROC [g: WaitingForGraph, t: YggTransactionMap.TransHandle]
RETURNS [wt: WaitingTransHandle] = {
FOR l:
LIST
OF
REF
ANY ← g, l.rest
UNTIL l =
NIL
DO
wt: WaitingTransHandle ← NARROW[l.first];
IF wt.trans = t THEN RETURN [wt];
ENDLOOP;
RETURN [NIL];
};
EliminateTrans:
PROC [g: WaitingForGraph, t: YggTransactionMap.TransHandle]
RETURNS [newG: WaitingForGraph]= {
Compute a new graph, eliminating t. Mark all vertices unvisited and not on path.
wt: WaitingTransHandle ← LookupTrans[g, t];
FreeList[wt.edges]; wt.edges ← NIL;
FOR l:
LIST
OF
REF
ANY ← g, l.rest
UNTIL l =
NIL
DO
w: WaitingTransHandle ← NARROW[l.first];
w.visited ← w.onPath ← FALSE;
w.edges ← List.DRemove[wt, w.edges];
ENDLOOP;
RETURN [List.DRemove[wt, g]];
};
FreeGraph:
PROC [g: WaitingForGraph] = {
NILs things out (superstition).
FOR l:
LIST
OF
REF
ANY ← g, l.rest
UNTIL l =
NIL
DO
wt: WaitingTransHandle ← NARROW[l.first];
FreeList[wt.edges];
FREE[@wt];
ENDLOOP;
FreeList[g];
};
FreeList:
PROC [l:
LIST
OF
REF
ANY] = {
UNTIL l =
NIL
DO
lNext: LIST OF REF ANY ← l.rest;
FREE[@l];
l ← lNext;
ENDLOOP;
};
END.--LockWatchdogImpl
CHANGE LOG
Changed by MBrown on February 6, 1983 10:03 pm
Bug in GetBlockingTransactions: always returned the waiting transaction instead of the blocking transaction. This meant wait => abort nearly every time.
Edited on July 24, 1984 11:01:18 am PDT, by Kupfer
Add YggMonitoringLog probes.
changes to: LockWatchdogImpl, LockWatchdogProcess
Edited on July 29, 1984 7:08:42 pm PDT, by Kupfer
Add code so that the lockConflict probes can report who the interested parties were.
changes to: LockWatchdogImpl, ChooseTimeoutVictims, NoticeWaitingRequest (local of ChooseTimeoutVictims)
Edited on August 6, 1984 3:31:53 pm PDT, by Kupfer
Remove the possible race condition in YggMonitoringLog probes by assigning the PROC to a temporary variable.
changes to: LockWatchdogImpl
Edited on February 28, 1985 11:35:48 am PST, by Kupfer
Record the name of the files being fought over when the watchdog has to kill some transaction.
changes to: LockWatchdogImpl, NoticeWaitingRequest (local of ChooseTimeoutVictims), NoticeWaitingRequest (local of GetBlockingTransactions), ComputeWaitingForGraph