VMStateImpl.mesa
last edited by Levin on December 1, 1983 10:36 am
DIRECTORY
PrincOps USING [flagsNone, flagsDirty, flagsVacant, InterimPageState, PageState, Port],
PrincOpsUtils
USING [
DisableInterrupts, EnableInterrupts, GetReturnLink, IsVacant, PageStateFromFlags, SetPageState, SetReturnLink],
VM USING [nullInterval],
VMBacking USING [BriefPageState],
VMInternal
USING [
AddToFreeList, AllocCount, CleanOutcome, Crash, DataState, GetVMMap, InitializeTables, InOut, Interval, maxPinCount, Outcome, PageCount, PageNumber, PageState, RealPageNumber, RMMapEntry, RMMapEntries, RMEntryPointer, SetVMMap, SwapInOutcome, Victim, VMMapEntry, VMPartition],
VMSideDoor USING [],
VMStatistics USING [];
VMStateImpl:
MONITOR
LOCKS vmStateLock
IMPORTS PrincOpsUtils, VMInternal
EXPORTS VMBacking, VMInternal, VMSideDoor, VMStatistics
SHARES VMInternal =
BEGIN
OPEN VMInternal;
Notes on the implementation:
The representation of the mapping from virtual page number to its associated PageState is partitioned into two pieces. A portion called the VM map is used for virtual memory pages that do not have an associated real memory page. A portion called the RM map is used for virtual memory pages that have an associated real memory page. (The RM map also holds other information, as will be discussed shortly.) Thus, to locate the PageState for a given (virtual) page, one first consults the hardware PageMap. If the flags say "vacant", the PageState is stored in the VM map; otherwise, it is stored in the RM map.
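The two-step lookup described above can be modeled outside Mesa. The following is a minimal Python sketch, not the module's code: MapEntry, locate_page_state, and the dictionary-based maps are hypothetical names introduced only for illustration, and the sketch assumes that "vacant" is encoded as dirty and write-protected both TRUE.

```python
from dataclasses import dataclass


@dataclass
class MapEntry:
    """Hypothetical model of one hardware PageMap entry."""
    dirty: bool
    write_protected: bool
    payload: int  # real page number, or VM-map bits when the entry is vacant

    @property
    def vacant(self) -> bool:
        # Assumption: "vacant" is the combination dirty AND write-protected.
        return self.dirty and self.write_protected


def locate_page_state(page_map, rm_map, vm_page):
    """Return ("vm", bits) for a vacant page, else ("rm", rm_entry)."""
    entry = page_map[vm_page]
    if entry.vacant:
        # No real memory: the state lives in the VM map (the payload bits).
        return ("vm", entry.payload)
    # Real memory present: the state lives in the RM map, indexed by real page.
    return ("rm", rm_map[entry.payload])


# Illustrative data only.
page_map = {
    7: MapEntry(dirty=True, write_protected=True, payload=0b101),
    8: MapEntry(dirty=False, write_protected=False, payload=42),
}
rm_map = {42: "rm entry for real page 42"}
```

In this model, page 7 is vacant and resolves through the VM map, while page 8 resolves through the RM map.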
The representation of information in the VM map can be considerably optimized. First, PageState.hasRealMemory is implicitly FALSE and PageState.pinCount isn't meaningful. Thus, only PageState.readOnly and PageState.dataState need to be explicitly represented, requiring 3 bits per page. Second, we can use a trick that relies on the hardware implementation of the PageMap on all of the machines of interest (Dolphin, Dorado, Dandelion). Although the PrincOps specifies that if both dirty and write-protected are TRUE (indicating "vacant"), the real page number is undefined, the hardware does not enforce this restriction and is capable of retaining a few bits of information (<= 12) in what would otherwise be the real page number. This is simply because the map is implemented as a table lookup in a memory at least 16 bits wide of which at most 4 are used for flags. The VM map can therefore be comfortably stored in the available memory of the PageMap.
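As an illustration of the 3-bit encoding mentioned above, here is a small Python sketch that packs readOnly and dataState into an integer small enough to fit in the spare real-page-number bits. The bit layout (readOnly in bit 2, dataState in bits 0 and 1) and the function names are assumptions made for the example; the actual layout is defined by VMInternal.VMMapEntry and is not shown here.

```python
# The four data states used by this module; their numeric order is assumed.
DATA_STATES = ("none", "undefined", "unchanged", "changed")


def pack_vm_entry(read_only: bool, data_state: str) -> int:
    """Pack readOnly (1 bit) and dataState (2 bits) into 3 bits."""
    return (int(read_only) << 2) | DATA_STATES.index(data_state)


def unpack_vm_entry(bits: int):
    """Inverse of pack_vm_entry: return (read_only, data_state)."""
    return bool((bits >> 2) & 1), DATA_STATES[bits & 0b11]
```

Every packed value is below 8, well within the 12 or so bits that the hardware is said to retain for a vacant entry.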
If RMEntry.rmState is "reclaimable", the real memory is available for reclamation by the real memory allocator (if it does so, it uses the "virtual" field to set the PageMap entry to "vacant"). If RMEntry.rmState is "pinned", the real memory allocator is prohibited from reclaiming this page. Note that the RMMap does not encode the virtual memory location of pinned real memory. This is not necessary, since the operation that changes rmState from "pinned" to "reclaimable" (Unpin) supplies the necessary PageNumber. Furthermore, this representation permits the reservation of specific real memory without necessarily associating it with virtual memory, a convenience in dealing with the Dandelion's display memory. If the needsBackingStoreWrite Boolean is TRUE, the bits in this page of real memory are known not to be present on backing storage. The "or" of needsBackingStoreWrite and the PageMap "dirty" bit is used to decide if the page must be written out as part of reclaiming it (see the AllocateForSwapIn and ConsiderCleaning procedures, below). The dataState field in an RMMap entry should never be "none", but may be "undefined" (implying "killed"). Finally, if RMEntry.rmState is "free", the "next" field is used to chain unallocated real memory pages together. This provides the real memory allocator with a list of prime candidates for allocation.
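The rules above for the three rmState values can be summarized in a short Python model. This is a sketch under stated assumptions: the RMEntry class, its field names, and the two helper functions are hypothetical stand-ins for the Mesa variant record, kept only to show which fields matter in which state and how the write-out decision is formed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RMEntry:
    """Hypothetical flattened model of an RM map entry."""
    rm_state: str  # "free", "reclaimable", or "pinned"
    needs_backing_store_write: bool = False
    virtual: Optional[int] = None    # meaningful only when reclaimable
    pin_count: int = 0               # meaningful only when pinned
    next_free: Optional[int] = None  # meaningful only when free


def may_reclaim(entry: RMEntry) -> bool:
    # Only reclaimable pages may be taken by the real memory allocator.
    return entry.rm_state == "reclaimable"


def must_write_before_reclaim(entry: RMEntry, map_dirty: bool) -> bool:
    # The "or" of needsBackingStoreWrite and the PageMap dirty bit.
    return entry.needs_backing_store_write or map_dirty
```

A pinned entry carries no virtual page number in this model, mirroring the note that Unpin supplies the PageNumber itself.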
All of the procedures in this interface share a monitor that protects the RM map, VM map, and PageMap. AllocateRealMemoryInternal needs to be available to the frame fault handler (via AllocateForLocalFrames) and therefore must be a coroutine, with its two callers, AllocateForLocalFrames and AllocateForSwapIn, entering the monitor to ensure mutual exclusion. To avoid possible recursion in frame fault handling, AllocateRealMemoryInternal must not call any procedures. To avoid deadlock, therefore, all other procedures that enter the monitor must not call procedures either.
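The coroutine arrangement can be pictured with a Python generator: one initialization step runs the body up to its first suspension point, and every later request resumes the same activation rather than creating a new one. This is only a loose analogy under stated assumptions. The allocator function, its free-list argument, and the dictionary it returns are hypothetical, and Python's own call machinery means the sketch does not reproduce the "no procedure calls" property that matters for frame faults.

```python
def allocator(free_list):
    """Generator standing in for the allocation coroutine body."""
    victim = None
    while True:
        # Suspend, handing back the previous result; resume with new arguments.
        vm_page, pin = yield victim
        real_page = free_list.pop()  # take the last page on the list
        victim = {"real_page": real_page, "vm_page": vm_page, "pinned": pin}


# Initialization: run the body up to its first suspension point.
alloc = allocator([3, 2, 1])
next(alloc)

# Each request resumes the same activation; no new frame is created for it.
first = alloc.send((100, False))
second = alloc.send((200, True))
```

The initial next(alloc) plays the role of the one-time initialization call made at module start, and each send corresponds to a call through the port.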
A note on checkout/checkin:
This is not a general-purpose facility. It is intended to permit parallelism between the VM operations that do not require I/O and those that do (SwapIn, Clean). Thus, a page can only be checked out when its hardware map entry says "vacant" (the map is always in this state during swapper I/O). The procedures below that implement non-I/O VM operations all test if the page in question is checked out and, if so, wait until it is checked in (releasing the monitor while they wait). Checkout occurs only as a part of AllocateForSwapIn and ConsiderCleaning, and checkin occurs only as a part of RelocateAfterSwapIn and MarkClean.
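The wait-until-checked-in discipline can be sketched with a lock and a condition variable. The Python model below is an illustration only: PageTable and its methods are hypothetical names, a set stands in for the per-page checkedOut bit, and threading.Condition stands in for the Mesa CONDITION used with WAIT and BROADCAST.

```python
import threading


class PageTable:
    """Hypothetical model of checkout/checkin under one monitor lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._check_in = threading.Condition(self._lock)
        self._checked_out = set()
        self.conflicts = 0  # counts operations that had to wait

    def check_out(self, page):
        with self._lock:
            self._checked_out.add(page)

    def check_in(self, page):
        with self._check_in:
            self._checked_out.discard(page)
            self._check_in.notify_all()  # wake every waiter, like BROADCAST

    def non_io_operation(self, page):
        with self._check_in:
            first_time = True
            while page in self._checked_out:
                if first_time:
                    self.conflicts += 1
                    first_time = False
                self._check_in.wait()  # releases the lock while waiting
            return f"operated on page {page}"


table = PageTable()
table.check_out(5)
results = []
worker = threading.Thread(
    target=lambda: results.append(table.non_io_operation(5)))
worker.start()
table.check_in(5)
worker.join()
```

Whether the worker actually waits depends on thread scheduling, so the conflict counter may end at 0 or 1; the operation completes only once the page has been checked in.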
The following monitor lock is used to protect the data structures of this module. It is exported so that the inline procedures in the VMInternal interface (which logically belong to this implementation module) can use it.
vmStateLock: PUBLIC MONITORLOCK;
The following global variables are protected by vmStateLock.
freeList: PUBLIC RealPageNumber;
freePages: PUBLIC INT ← 0;
rmMap: PUBLIC LONG POINTER TO RMMapEntries;
lastRealPage: PUBLIC RealPageNumber;
index of last entry in rmMap
allocationRover: RealPageNumber ← RealPageNumber.FIRST;
rover for AllocateRealMemoryInternal
cleaningRover: RealPageNumber ← RealPageNumber.FIRST;
rover for GetCleaningCandidate
checkIn: CONDITION ← [timeout: 0];
Exports to VMStatistics
rmReclamations: PUBLIC INT ← 0;
rmFreeList, rmOldClean, rmNewClean, rmDirty: PUBLIC INT ← 0;
rmAllocPasses, rmCleanPasses: PUBLIC INT ← 0;
readOnlyPages: PUBLIC INT ← 0;
pinnedPages, trappedPages: PUBLIC INT ← 0;
checkoutConflicts: PUBLIC INT ← 0;
VirtualAllocation:
PUBLIC
PROC [partition: VMPartition]
RETURNS [pagesAllocated, pagesFreed, pagesInPartition: PageCount] = {
RETURN[
pagesAllocated: allocCounts[partition].pagesAllocated,
pagesFreed: allocCounts[partition].pagesFreed,
pagesInPartition: partitions[partition].count
]
};
Exports to VMSideDoor
AssignSpecialRealMemory:
PUBLIC
SAFE
PROC [interval: Interval] =
TRUSTED {
ERROR -- not yet implemented
};
ReleaseSpecialRealMemory:
PUBLIC SAFE PROC [interval: Interval] =
TRUSTED {
ERROR -- not yet implemented
};
Exports to VMBacking
StateFromMapEntry:
PUBLIC
SAFE
PROC [map: PrincOps.InterimPageState]
RETURNS [VMBacking.BriefPageState] = TRUSTED {
The following works even if we are presented with a non-vacant map entry, but Andrew assures me that we won't!
vmEntry: VMMapEntry ← [
state: PrincOpsUtils.PageStateFromFlags[map.flags],
body: in[real: map.realPage]
];
WITH vmE: vmEntry
SELECT InOut[vmEntry]
FROM
out =>
RETURN[
SELECT vmE.dataState
FROM
none => free,
undefined => killed,
ENDCASE => active
];
in => RETURN[active];
ENDCASE => ERROR;
};
RecoverRealMemory:
PUBLIC
PROC = {
See comment in the interface for a description of this crock. It completely trashes the virtual memory abstraction, except for those pages that already have real memory.
vmPage: PageNumber ← 0;
THROUGH [0..freePages)
DO
realPage: RealPageNumber = freeList;
UNTIL PrincOpsUtils.IsVacant[vmPage] DO vmPage ← vmPage.SUCC; ENDLOOP;
PrincOpsUtils.SetPageState[
vmPage, realPage, PrincOpsUtils.PageStateFromFlags[PrincOps.flagsNone]];
vmPage ← vmPage.SUCC;
WITH rmMap[realPage]
SELECT
FROM
rmE: free RMMapEntry => freeList ← rmE.next;
ENDCASE => Crash[];
ENDLOOP;
};
Exports to VMInternal
Virtual Memory Allocation
partitions: PUBLIC ARRAY VMPartition OF Interval;
--*stats*-- allocCounts: PUBLIC ARRAY VMPartition OF AllocCount ← ALL[[0, 0, 0, 0]];
Real Memory Allocation
AllocateRealMemoryInternal:
PUBLIC PROC [
vmPage: PageNumber, dirtyVictimOK: BOOL ← TRUE, pin: BOOL ← FALSE]
RETURNS [victim: Victim] ← LOOPHOLE[@AwaitAllocateRealMemoryInternal];
This procedure allocates real memory and tentatively assigns it to the specified vmPage. Actually, this is one half of the coroutine linkage whose other half is AwaitAllocateRealMemoryInternal. The actual algorithm is in InitializeAllocateRealMemoryInternal.
Unpin:
PUBLIC ENTRY
PROC [vmPage: PageNumber]
RETURNS [outcome: Outcome ← ok] = {
vmEntry: VMMapEntry = GetCheckedInVMMap[vmPage];
WITH vmE: vmEntry
SELECT InOut[vmEntry]
FROM
in => {
rmE: RMEntryPointer = @rmMap[vmE.real];
WITH rmE: rmE
SELECT
FROM
free => Crash[];
reclaimable => NULL;
pinned =>
Perhaps we should complain if rmE.pinReason ~= normal?
IF rmE.pinReason = normal
THEN
SELECT rmE.pinCount
FROM
0 => Crash[];
1 => {
rmMap[vmE.real].body ← reclaimable[virtual: vmPage];
--*stats*-- pinnedPages ← pinnedPages.PRED;
};
ENDCASE =>
rmE.pinCount ← rmE.pinCount - 1;
ENDCASE;
};
out => IF vmE.dataState = none THEN outcome ← addressFault;
ENDCASE;
};
State Examination and Modification
State:
PUBLIC ENTRY
PROC [vmPage: PageNumber]
RETURNS [state: PageState] = {
It's OK to call GetVMMap instead of GetCheckedInVMMap, since we have the monitor lock and the data structures won't change under us.
vmEntry: VMMapEntry = GetVMMap[vmPage];
WITH vmE: vmEntry
SELECT InOut[vmEntry]
FROM
out => state ← [
dataState: vmE.dataState,
readOnly: vmE.readOnly,
hasRealMemory: FALSE,
needsCleaning: FALSE,
pinCount: 0
];
in => {
rmE: RMEntryPointer = @rmMap[vmE.real];
state ← [
dataState: IF vmE.state.flags.dirty THEN changed ELSE rmE.dataState,
readOnly: vmE.state.flags.readonly,
hasRealMemory: TRUE,
needsCleaning: vmE.state.flags.dirty OR rmE.needsBackingStoreWrite,
pinCount: WITH rmE^ SELECT FROM pinned => pinCount, ENDCASE => 0
];
};
ENDCASE;
};
SetDataState:
PUBLIC ENTRY
PROC [vmPage: PageNumber, dataState: DataState]
RETURNS [outcome: Outcome ← ok] = {
Note: If this procedure were made a coroutine (with an appropriate INLINE entry procedure), it could be invoked directly from the frame fault handler's "FlushLargeFrames" logic. Otherwise, that logic must reside in a separate process, which is allowed to call procedures and therefore take frame faults.
vmEntry: VMMapEntry ← GetCheckedInVMMap[vmPage];
WITH vmE: vmEntry
SELECT InOut[vmEntry]
FROM
out =>
IF vmE.dataState = none THEN outcome ← addressFault
ELSE {vmE.dataState ← dataState; SetVMMap[vmPage, vmE]};
in => {
rmE: RMEntryPointer = @rmMap[vmE.real];
SELECT dataState
FROM
none => {
IF rmE.rmState = free THEN Crash[];
SetVMMap[vmPage,
[state: PrincOpsUtils.PageStateFromFlags[PrincOps.flagsVacant],
body: out[checkedOut: FALSE, readOnly: FALSE, dataState: dataState]]
];
AddToFreeList[vmE.real];
};
undefined => {
WITH rmE: rmE
SELECT
FROM
free => Crash[];
reclaimable => {
SetVMMap[vmPage,
[state: PrincOpsUtils.PageStateFromFlags[PrincOps.flagsVacant],
body: out[checkedOut: FALSE, readOnly: FALSE, dataState: dataState]]
];
AddToFreeList[vmE.real];
};
pinned => {
Strictly speaking, we should disable interrupts, reread the map, clear the dirty bit, and set the map before enabling interrupts. This will avoid losing the referenced bit. However, since the page is being killed, the referenced bit isn't very interesting now anyway.
vmE.state.flags.dirty ← rmE.needsBackingStoreWrite ← FALSE;
SetVMMap[vmPage, vmE];
rmE.dataState ← dataState;
};
ENDCASE;
};
unchanged => {
newEntry: VMMapEntry;
We must do the following atomically to avoid losing the referenced bit.
PrincOpsUtils.DisableInterrupts[];
newEntry ← GetVMMap[vmPage];
WITH newE: newEntry
SELECT InOut[newEntry]
FROM
in =>
IF newE.state.flags.dirty
THEN {
rmE.needsBackingStoreWrite ← TRUE;
newE.state.flags.dirty ← FALSE;
SetVMMap[vmPage, newE];
};
out => Crash[];
ENDCASE;
PrincOpsUtils.EnableInterrupts[];
rmE.dataState ← dataState;
};
changed => rmE.dataState ← dataState;
ENDCASE;
};
ENDCASE;
};
ReadOnly <==> ReadWrite
MakeReadOnly:
PUBLIC ENTRY PROC [vmPage: PageNumber]
RETURNS [outcome: Outcome ← ok] = {
Interrupts must be disabled if the map entry says "present", so that the test for "dirty" and subsequent resetting of "dirty" are atomic. Interrupts need not be disabled if the map entry says "vacant", but it doesn't hurt, since we spend minimal time in that state and it simplifies the code.
vmEntry: VMMapEntry ← GetCheckedInVMMap[vmPage];
PrincOpsUtils.DisableInterrupts[];
vmEntry ← GetVMMap[vmPage]; -- reread map to ensure atomicity
WITH vmE: vmEntry
SELECT InOut[vmEntry]
FROM
out =>
IF vmE.dataState = none THEN outcome ← addressFault
ELSE
IF ~vmE.readOnly
THEN {
vmE.readOnly ← TRUE;
SetVMMap[vmPage, vmE];
--*stats*-- readOnlyPages ← readOnlyPages.SUCC;
};
in =>
IF ~vmE.state.flags.readonly
THEN {
IF vmE.state.flags.dirty
THEN {
rmMap[vmE.real].needsBackingStoreWrite ← TRUE;
vmE.state.flags.dirty ← FALSE;
rmMap[vmE.real].dataState ← changed;
};
vmE.state.flags.readonly ← TRUE;
SetVMMap[vmPage, vmE];
--*stats*-- readOnlyPages ← readOnlyPages.SUCC;
};
ENDCASE;
PrincOpsUtils.EnableInterrupts[];
};
MakeReadWrite:
PUBLIC ENTRY PROC [vmPage: PageNumber]
RETURNS [outcome: Outcome ← ok] = {
Interrupts must be disabled if the map entry says "present", so that the resetting of "readonly" is atomic (otherwise, the "referenced" bit might be lost). Interrupts need not be disabled if the map entry says "vacant", but it doesn't hurt, since we spend minimal time in that state and it simplifies the code.
vmEntry: VMMapEntry ← GetCheckedInVMMap[vmPage];
PrincOpsUtils.DisableInterrupts[];
vmEntry ← GetVMMap[vmPage]; -- reread map to ensure atomicity
WITH vmE: vmEntry
SELECT InOut[vmEntry]
FROM
out =>
IF vmE.dataState = none THEN outcome ← addressFault
ELSE
IF vmE.readOnly
THEN {
vmE.readOnly ← FALSE;
SetVMMap[vmPage, vmE];
--*stats*-- readOnlyPages ← readOnlyPages.PRED;
};
in =>
IF vmE.state.flags.readonly
THEN {
vmE.state.flags.readonly ← FALSE;
SetVMMap[vmPage, vmE];
--*stats*-- readOnlyPages ← readOnlyPages.PRED;
};
ENDCASE;
PrincOpsUtils.EnableInterrupts[];
};
Utilities for VM.SwapIn and VM.Clean
AllocateForSwapIn:
PUBLIC ENTRY PROC [vmPage: PageNumber, kill, pin: BOOL]
RETURNS [outcome: SwapInOutcome, victim: Victim] = {
This is a specialized procedure for VM.SwapIn.
vmEntry: VMMapEntry ← GetCheckedInVMMap[vmPage];
WITH vmE: vmEntry
SELECT InOut[vmEntry]
FROM
out => {
IF vmE.dataState = none THEN RETURN [outcome: addressFault, victim: NULL];
IF kill
THEN
IF vmE.readOnly THEN RETURN [outcome: writeFault, victim: NULL]
ELSE vmE.dataState ← undefined;
outcome ← IF vmE.dataState = undefined THEN noReadNecessary ELSE needsRead;
vmE.checkedOut ← TRUE;
SetVMMap[vmPage, vmE];
victim ← AllocateRealMemoryInternal[vmPage: vmPage, pin: pin];
Note: the rmMap entry has now been updated for eventual association (by SwapInDone) with vmPage.
};
in => {
rmE: RMEntryPointer = @rmMap[vmE.real];
outcome ← alreadyIn;
IF kill
THEN {
Strictly speaking, we should disable interrupts, reread the map, clear the dirty bit, and set the map before enabling interrupts. This will avoid losing the referenced bit. However, since the page is being killed, the referenced bit isn't very interesting now anyway.
IF vmE.state.flags.readonly THEN RETURN [outcome: writeFault, victim: NULL];
vmE.state.flags.dirty ← rmE.needsBackingStoreWrite ← FALSE;
SetVMMap[vmPage, vmE];
rmE.dataState ← undefined;
};
IF pin
THEN
WITH rmE: rmE
SELECT
FROM
free => Crash[];
reclaimable => {
rmMap[vmE.real].body ← pinned[pinCount: 1];
--*stats*-- pinnedPages ← pinnedPages.SUCC;
};
pinned =>
IF rmE.pinCount < maxPinCount THEN rmE.pinCount ← rmE.pinCount + 1
ELSE Crash[];
ENDCASE;
};
ENDCASE;
};
SwapInDone:
PUBLIC
ENTRY
PROC [vmPage, bufferPage: PageNumber, worked: BOOL] = {
This is a specialized procedure for VM.SwapIn. "bufferPage" should be a page whose VMMapEntry is "in" and whose real page matches the one returned for the corresponding "vmPage" by AllocateForSwapIn, above.
vmEntry: VMMapEntry ← GetVMMap[vmPage];
WITH vmE: vmEntry
SELECT InOut[vmEntry]
FROM
out => {
bufferEntry: VMMapEntry ← GetVMMap[bufferPage];
WITH bE: bufferEntry
SELECT InOut[bufferEntry]
FROM
in => {
rmE: RMEntryPointer = @rmMap[bE.real];
IF worked
THEN {
--*stats*--
SELECT rmE.rmState
FROM
free => Crash[];
pinned => pinnedPages ← pinnedPages.SUCC;
reclaimable => NULL;
ENDCASE;
bE.state ← PrincOpsUtils.PageStateFromFlags[
[readonly: vmE.readOnly, dirty: FALSE, referenced: TRUE]];
SetVMMap[vmPage, bE];
}
ELSE {
IF rmE.rmState = free THEN Crash[];
AddToFreeList[bE.real];
vmE.checkedOut ← FALSE;
SetVMMap[vmPage, vmEntry];
};
In either case above, 'vmPage' has been checked back in, so...
BROADCAST checkIn;
Now we make the swap buffer page vacant.
vmE.checkedOut ← vmE.readOnly ← FALSE;
vmE.dataState ← undefined;
SetVMMap[bufferPage, vmE];
};
out => Crash[];
ENDCASE;
};
in => Crash[];
ENDCASE;
};
SwapInDoneWithoutIO:
PUBLIC
ENTRY
PROC [vmPage: PageNumber, victim: Victim] = {
vmEntry: VMMapEntry = GetVMMap[vmPage];
WITH vmE: vmEntry
SELECT InOut[vmEntry]
FROM
out => {
SetVMMap[vmPage, [
state: PrincOpsUtils.PageStateFromFlags[
[readonly: vmE.readOnly, dirty: FALSE, referenced: TRUE]],
body: in[victim.realPage]
]];
This SetVMMap implicitly cleared the "checkedOut" bit, so...
BROADCAST checkIn;
--*stats*--
SELECT rmMap[victim.realPage].rmState
FROM
free => Crash[];
pinned => pinnedPages ← pinnedPages.SUCC;
reclaimable => NULL;
ENDCASE;
};
in => Crash[];
ENDCASE;
};
VictimWriteDone:
PUBLIC
ENTRY
PROC [
vmPage, bufferPage: PageNumber, victim: dirty Victim, worked: BOOL] = {
vmEntry: VMMapEntry ← GetVMMap[vmPage];
WITH vmE: vmEntry
SELECT InOut[vmEntry]
FROM
out => {
IF ~worked
THEN {
Restore victim's map entry, but pin the page (permanently), since it can't be written out. The only way anyone will discover this happened is by looking at the counter in VMStatistics.
victimEntry: VMMapEntry = [
state: PrincOpsUtils.PageStateFromFlags[PrincOps.flagsDirty],
body: in[victim.realPage]];
rmE: RMEntryPointer = @rmMap[victim.realPage];
IF rmE.rmState = free THEN Crash[];
rmE^ ← [
dataState: vmE.dataState, needsBackingStoreWrite: TRUE,
body: pinned[pinReason: cantBeWritten, pinCount: 1]];
SetVMMap[victim.vmPage, victimEntry];
--*stats*-- pinnedPages ← pinnedPages.SUCC;
--*stats*-- trappedPages ← trappedPages.SUCC;
Check the original page back in.
vmE.checkedOut ← FALSE;
SetVMMap[vmPage, vmE];
BROADCAST checkIn;
};
Make the swap buffer page vacant.
vmE.readOnly ← vmE.checkedOut ← FALSE;
vmE.dataState ← undefined;
SetVMMap[bufferPage, vmE];
};
in => Crash[];
ENDCASE;
};
ConsiderCleaning:
PUBLIC ENTRY PROC [vmPage: PageNumber, checkOutClean: BOOL]
RETURNS [outcome: CleanOutcome, real: RealPageNumber] = {
This is a specialized procedure for VM.Clean.
Interrupts must be disabled if the map entry says "present", so that the test for "dirty" and subsequent setting of "vacant" are atomic. Interrupts need not be disabled if the map entry says "vacant", but it doesn't hurt, since we spend minimal time in that state and it simplifies the code.
vmEntry: VMMapEntry;
PrincOpsUtils.DisableInterrupts[];
vmEntry ← GetVMMap[vmPage];
WITH vmE: vmEntry
SELECT InOut[vmEntry]
FROM
out => outcome ← IF vmE.dataState = none THEN addressFault ELSE cantWrite;
in => {
rmE: RMEntryPointer = @rmMap[real ← vmE.real];
WITH rmE: rmE
SELECT
FROM
free => Crash[];
reclaimable => {
dirty: BOOL = vmE.state.flags.dirty OR rmE.needsBackingStoreWrite;
IF dirty
OR checkOutClean
THEN {
newEntry: VMMapEntry = [
state: PrincOpsUtils.PageStateFromFlags[PrincOps.flagsVacant],
body: out[
checkedOut: TRUE,
readOnly: vmE.state.flags.readonly,
dataState: IF vmE.state.flags.dirty THEN changed ELSE rmE.dataState
]];
rmE.referenced ← vmE.state.flags.referenced; -- save until CleanDone
SetVMMap[vmPage, newEntry];
};
outcome ←
SELECT
TRUE
FROM
dirty => needsWrite,
checkOutClean => checkedOutClean,
ENDCASE => cantWrite;
};
pinned => outcome ← cantWrite;
ENDCASE;
};
ENDCASE;
PrincOpsUtils.EnableInterrupts[];
};
CleanDone:
PUBLIC
ENTRY
PROC [vmPage, bufferPage: PageNumber, worked: BOOL] = {
This is a specialized procedure for VM.Clean. "bufferPage" should be a page whose VMMapEntry is "in" and whose real page matches the one returned for the corresponding "vmPage" by ConsiderCleaning, above.
vmEntry: VMMapEntry ← GetVMMap[vmPage];
WITH vmE: vmEntry
SELECT InOut[vmEntry]
FROM
out => {
bufferEntry: VMMapEntry ← GetVMMap[bufferPage];
WITH bE: bufferEntry
SELECT InOut[bufferEntry]
FROM
in => {
rmEntry: RMEntryPointer = @rmMap[bE.real];
WITH rmE: rmEntry
SELECT
FROM
reclaimable => {
newEntry: VMMapEntry = [
state: PrincOpsUtils.PageStateFromFlags[
[readonly: vmE.readOnly, dirty: FALSE, referenced: rmE.referenced]],
body: in[real: bE.real]
];
IF rmE.virtual ~= vmPage THEN Crash[];
IF worked
THEN {
rmE.dataState ← vmE.dataState; -- computed by ConsiderCleaning
rmE.needsBackingStoreWrite ← FALSE;
}
ELSE {
rmEntry^ ← [
dataState: vmE.dataState, needsBackingStoreWrite: TRUE,
body: pinned[pinReason: cantBeWritten, pinCount: 1]];
--*stats*-- pinnedPages ← pinnedPages.SUCC;
--*stats*-- trappedPages ← trappedPages.SUCC;
};
SetVMMap[vmPage, newEntry];
The above SetVMMap implicitly cleared the "checkedOut" bit, so we must wake up any potential waiters.
BROADCAST checkIn;
};
free, pinned => Crash[];
ENDCASE;
};
out => Crash[];
ENDCASE;
vmE.checkedOut ← FALSE;
vmE.dataState ← undefined;
SetVMMap[bufferPage, vmE];
};
in => Crash[];
ENDCASE;
};
Aging
Age:
PUBLIC ENTRY PROC [vmPage: PageNumber]
RETURNS [outcome: Outcome ← ok] = {
Interrupts must be disabled if the map entry says "present", so that the resetting of "referenced" is atomic. Interrupts need not be disabled if the map entry says "vacant", but it doesn't hurt, since we spend minimal time in that state and it simplifies the code.
This is a specialized procedure for VM.Age.
vmEntry: VMMapEntry;
PrincOpsUtils.DisableInterrupts[];
vmEntry ← GetVMMap[vmPage];
WITH vmE: vmEntry
SELECT InOut[vmEntry]
FROM
out => IF vmE.dataState = none THEN outcome ← addressFault;
in => AgeInternal[vmPage, vmE];
ENDCASE;
PrincOpsUtils.EnableInterrupts[];
};
Laundry process support
GetCleaningCandidate:
PUBLIC
ENTRY
PROC [
desired: PageCount, comfortLevel: PageCount, tryHard: BOOL ← FALSE]
RETURNS [interval: Interval ← VM.nullInterval] = {
We could probably get away without claiming the monitor lock.
firstPass: BOOL ← TRUE;
ExpandAroundVP:
PROC [vP: PageNumber]
RETURNS [interval: Interval] =
INLINE {
This procedure constructs an interval surrounding 'vP' whose pages have the same 'dirty' and 'referenced' states as 'vP'. It implicitly uses 'desired' and 'firstPass' and updates 'cleaningRover'.
lowerLimit: PageNumber = (IF vP < desired THEN 0 ELSE vP - desired).SUCC;
p: PageNumber ← vP;
interval.page ← vP;
UNTIL interval.page = lowerLimit
DO
vmPage: PageNumber = interval.page.PRED;
vmEntry: VMMapEntry = GetVMMap[vmPage];
WITH vmE: vmEntry
SELECT InOut[vmEntry]
FROM
in =>
WITH rmMap[vmE.real]
SELECT
FROM
rmE: reclaimable RMMapEntry =>
IF rmE.virtual ~= vmPage
OR
-- in swap buffer
~(vmE.state.flags.dirty OR rmE.needsBackingStoreWrite) OR
(vmE.state.flags.referenced AND firstPass) THEN EXIT;
rmE: pinned RMMapEntry => EXIT;
rmE: free RMMapEntry => Crash[];
ENDCASE;
out => EXIT;
ENDCASE;
interval.page ← vmPage;
ENDLOOP;
UNTIL (p - interval.page).SUCC = desired
OR p = lastRealPage
DO
vmPage: PageNumber = p.SUCC;
vmEntry: VMMapEntry = GetVMMap[vmPage];
WITH vmE: vmEntry
SELECT InOut[vmEntry]
FROM
in =>
WITH rmMap[vmE.real]
SELECT
FROM
rmE: reclaimable RMMapEntry =>
IF rmE.virtual ~= vmPage
OR
-- in swap buffer
~(vmE.state.flags.dirty OR rmE.needsBackingStoreWrite) OR
(vmE.state.flags.referenced AND firstPass) THEN EXIT;
rmE: pinned RMMapEntry => EXIT;
rmE: free RMMapEntry => Crash[];
ENDCASE;
out => EXIT;
ENDCASE;
p ← vmPage;
ENDLOOP;
interval.count ← (p - interval.page).SUCC;
};
cleanPages: PageCount ← 0;
current: RealPageNumber ← cleaningRover;
recentDirtyFound: BOOL ← FALSE;
DO
IF current = RealPageNumber.FIRST
THEN {
current ← lastRealPage;
--*stats*-- rmCleanPasses ← rmCleanPasses.SUCC;
}
ELSE current ← current.PRED;
SELECT
TRUE
FROM
current = cleaningRover =>
We have completed a scan of real memory.
IF firstPass
AND recentDirtyFound
AND tryHard
THEN {
firstPass ← FALSE;
cleanPages ← 0;
}
ELSE EXIT;
~tryHard AND cleanPages >= comfortLevel => EXIT;
ENDCASE;
WITH rmMap[current]
SELECT
FROM
rmE: reclaimable RMMapEntry => {
vP: PageNumber = rmE.virtual;
vmEntry: VMMapEntry = GetVMMap[vP];
WITH vmE: vmEntry
SELECT InOut[vmEntry]
FROM
in => {
refed: BOOL = vmE.state.flags.referenced;
dirty: BOOL = vmE.state.flags.dirty OR rmE.needsBackingStoreWrite;
IF vmE.real ~= current THEN Crash[];
IF dirty
THEN
IF refed AND firstPass THEN recentDirtyFound ← TRUE
ELSE {interval ← ExpandAroundVP[vP]; EXIT}
ELSE cleanPages ← cleanPages.SUCC;
};
out => NULL; -- probably being swapped in
ENDCASE;
};
ENDCASE => NULL;
ENDLOOP;
cleaningRover ← current;
};
INTERNAL procedures
AwaitAllocateRealMemoryInternal:
PORT [victim: Victim]
RETURNS [vmPage: PageNumber, dirtyVictimOK, pin: BOOL];
InitializeAllocateRealMemoryInternal:
PROC RETURNS [victim: Victim] = {
LOOPHOLE[AwaitAllocateRealMemoryInternal, PrincOps.Port].dest ← PrincOpsUtils.GetReturnLink[];
DO
vmPage: PageNumber;
dirtyVictimOK, pin: BOOL;
[vmPage, dirtyVictimOK, pin] ← AwaitAllocateRealMemoryInternal[victim];
The following is to permit the debugger to trace the stack.
PrincOpsUtils.SetReturnLink[LOOPHOLE[AwaitAllocateRealMemoryInternal, PrincOps.Port].dest];
BEGIN
This block is the body of the actual allocation algorithm.
targetVMEntry: VMMapEntry ← GetVMMap[vmPage];
WITH tVM: targetVMEntry
SELECT InOut[targetVMEntry]
FROM
out => {
IF freePages > 0
THEN {
There is something on the free list.
victim ← [realPage: freeList, body: clean[]];
WITH rmMap[freeList]
SELECT
FROM
rmE: free RMMapEntry => {freeList ← rmE.next; freePages ← freePages.PRED};
ENDCASE => Crash[]; -- free list trashed
--*stats*-- rmFreeList ← rmFreeList.SUCC;
}
ELSE {
The free list is empty; reclamation is necessary.
current: RealPageNumber ← allocationRover;
firstPass: BOOL ← TRUE;
secondPassWorthWhile: BOOL ← FALSE;
victimP: PageNumber;
dirtyVictimState: {none, unreferenced, referenced} ← none;
dirtyVictimRP: RealPageNumber;
dirtyVictimDataState: DataState;
DO
IF current = RealPageNumber.FIRST
THEN {
current ← lastRealPage;
--*stats*-- rmAllocPasses ← rmAllocPasses.SUCC;
}
ELSE current ← current.PRED;
IF current = allocationRover
THEN
-- a pass has completed
SELECT
TRUE
FROM
firstPass AND secondPassWorthWhile => firstPass ← FALSE;
dirtyVictimOK =>
IF dirtyVictimState ~= none
THEN {
--*stats*-- rmDirty ← rmDirty.SUCC;
WITH rmMap[dirtyVictimRP]
SELECT
FROM
rmE: reclaimable RMMapEntry => {
victimP ← rmE.virtual;
victim ← [realPage: dirtyVictimRP, body: dirty[vmPage: victimP]];
SetVMMap[victimP, [
state: PrincOpsUtils.PageStateFromFlags[PrincOps.flagsVacant],
body: out[
checkedOut: FALSE,
readOnly: GetVMMap[victimP].state.flags.readonly,
dataState: dirtyVictimDataState]]
];
};
ENDCASE => Crash[]; -- dirtyVictimRP improperly set
EXIT
}
ELSE Crash[]; -- all memory pinned or in transit
ENDCASE => Crash[]; -- frame fault and no unpinned clean memory
WITH rmMap[current]
SELECT
FROM
rmE: free RMMapEntry => Crash[]; -- the free list is supposed to be empty
rmE: reclaimable RMMapEntry => {
victimE: VMMapEntry;
We must disable interrupts to make sure that the dirty bit doesn't get lost during the following machinations.
PrincOpsUtils.DisableInterrupts[];
victimE ← GetVMMap[victimP ← rmE.virtual];
A reclaimable rmMap entry will have an "out VMMapEntry" if it has previously been allocated by this procedure but I/O is still in progress at a higher level. (It will also be checked out.) Such entries are therefore not considered as victims, just like quiescent, swapped out pages.
WITH vE: victimE
SELECT InOut[victimE]
FROM
out => NULL;
in => {
vRefed: BOOL = vE.state.flags.referenced;
vDirty: BOOL = vE.state.flags.dirty OR rmE.needsBackingStoreWrite;
SELECT
TRUE
FROM
vE.real ~= current => Crash[];
vRefed
AND firstPass => {
On the first pass, we convert referenced pages to unreferenced pages, but don't reclaim them.
AgeInternal[victimP, vE];
IF ~vDirty THEN secondPassWorthWhile ← TRUE;
};
vDirty =>
We remember the first unreferenced dirty page or, if none, the first referenced dirty page. Of course, an unreferenced page may be referenced by the time (and if) it actually becomes a victim.
SELECT dirtyVictimState
FROM
none => {
dirtyVictimState ←
IF vRefed THEN referenced ELSE unreferenced;
dirtyVictimRP ← current;
dirtyVictimDataState ←
IF vE.state.flags.dirty THEN changed ELSE rmE.dataState;
};
unreferenced => NULL;
referenced =>
IF ~vRefed
THEN {
dirtyVictimState ← unreferenced;
dirtyVictimRP ← current;
dirtyVictimDataState ←
IF vE.state.flags.dirty THEN changed ELSE rmE.dataState;
};
ENDCASE;
ENDCASE => {
This page is clean and, if this is the first pass, it is unreferenced. We've found our victim.
victim ← [realPage: current, body: clean[]];
SetVMMap[victimP,
[state: PrincOpsUtils.PageStateFromFlags[PrincOps.flagsVacant], body: out[
checkedOut: FALSE,
readOnly: vE.state.flags.readonly,
dataState: rmE.dataState]]
];
PrincOpsUtils.EnableInterrupts[];
--*stats*--
IF firstPass THEN rmOldClean ← rmOldClean.SUCC
ELSE rmNewClean ← rmNewClean.SUCC;
EXIT
};
};
ENDCASE;
PrincOpsUtils.EnableInterrupts[];
};
rmE: pinned RMMapEntry => NULL;
ENDCASE;
ENDLOOP;
allocationRover ← current; -- advance round-robin pointer
};
rmMap[victim.realPage] ←
IF pin
THEN
RMMapEntry[dataState: tVM.dataState, needsBackingStoreWrite: FALSE,
body: pinned[pinCount: 1]]
ELSE
RMMapEntry[dataState: tVM.dataState, needsBackingStoreWrite: FALSE,
body: reclaimable[virtual: vmPage]];
--*stats*-- rmReclamations ← rmReclamations.SUCC;
};
in => Crash[]; -- already has real memory
ENDCASE;
END;
ENDLOOP;
};
AgeInternal:
PROC [vmPage: PageNumber, vmE: in VMMapEntry] =
INLINE {
This procedure must be executed with the monitor held and interrupts disabled (to avoid losing the "referenced" bit). (Because of the peculiar structure of InitializeAllocateRealMemoryInternal, it is awkward to declare this INTERNAL.)
IF vmE.state.flags.readonly
AND vmE.state.flags.dirty
THEN {
rmMap[vmE.real].needsBackingStoreWrite ← TRUE;
vmE.state.flags.dirty ← FALSE;
rmMap[vmE.real].dataState ← changed;
};
vmE.state.flags.referenced ← FALSE;
SetVMMap[vmPage, vmE];
};
GetCheckedInVMMap:
INTERNAL
PROC [vmPage: PageNumber]
RETURNS [vmEntry: VMMapEntry] = INLINE {
firstTime: BOOL ← TRUE;
DO
vmEntry ← GetVMMap[vmPage];
WITH vmE: vmEntry
SELECT InOut[vmEntry]
FROM
in => EXIT;
out =>
A note in VMInternal.VMMapEntry explains the following non-intuitive test.
IF ~vmE.checkedOut OR vmE.dataState = none THEN EXIT;
ENDCASE;
--*stats*--
IF firstTime THEN {checkoutConflicts ← checkoutConflicts.SUCC; firstTime ← FALSE};
WAIT checkIn;
ENDLOOP;
};
Initialization
InitializeTables[];
Initialize the allocator coroutine.
[] ← InitializeAllocateRealMemoryInternal[];
END.