-- Private "cool" procedures
-- The following procedures are logically local to InsertRecords and are not called anywhere else. They are separated out because they are infrequently called and might be packaged separately, should we ever decide to package this code at all.
MakeNewRoot:
PROCEDURE [tree: Tree, pathStk: PathStk] =
-- Makes a new root page given a pathStk now at level 0 and with a non-empty ESL.
BEGIN
wordsToInsert: CARDINAL = EntryIntervalSize[pathStk: pathStk, leftFather: 0];
newRootPage: PageNumber = tree.AllocatePage[];
pagePtr: BTreePagePtr = tree.ReferencePage[newRootPage, write];
IF tree.state.depth >= maxLevelsInTree THEN ERROR Error[depthExceeded];
pagePtr.minPage ← pathStk.path[0].leastSon;
tree.ReleasePage[newRootPage];
IF wordsToInsert > tree.maxFreeWords THEN ERROR Bug[newRootOverflow];
WritePage[tree: tree, pse: @pathStk.path[0], number: newRootPage, words: wordsToInsert];
tree.state.rootPage ← newRootPage;
tree.state.depth ← tree.state.depth+1;
END;
ComplexInsertRecords:
PROCEDURE [tree: Tree, pathStk: PathStk]
RETURNS [rtBroPg1: PageNumber] =
-- Called when not all the entries will fit on the current page. All of this page's entries have been extracted into the ESL for this level.
-- Tries to spill over onto the right brother page, or onto the left brother page if there isn't a right brother, or onto a new page if neither brother exists. Returns rtBroPg1=nilPage if this is successful.
-- Otherwise, repositions the current level of the pathStk (if necessary) so that a right brother exists, and returns the right brother's page number.
-- This procedure is responsible for redistributing the entries between the two pages so as to minimize the size of the entry promoted to the father page. Note that this considers only brothers and not cousins or more distant relatives.
BEGIN
pse: LONG POINTER TO PathStkEntry = @pathStk.path[pathStk.top];
fatherPSE: LONG POINTER TO PathStkEntry = @pathStk.path[pathStk.top-1];
entryTable: REF EntryTable = pathStk.entryTable;
oneBrotherEnough: BOOLEAN ← FALSE;
fatherIndex, bestFatherIndex: EntryOrdinal;
bestFatherSize: CARDINAL ← tree.maxFreeWords+1;
fatherPSE.leastSon ← pse.pageNumber; -- in case this is the root page splitting
rtBroPg1 ← nilPage;
IF pathStk.top>1
THEN
BEGIN
rtBroPg1 ← FindRightBrother[tree: tree, pathStk: pathStk, spaceNeeded: -tree.maxFreeWords];
IF rtBroPg1=nilPage
THEN
-- This may look strange, but see the comment below.
rtBroPg1 ← FindLeftBrother[tree: tree, pathStk: pathStk, spaceNeeded: -tree.maxFreeWords];
END;
IF rtBroPg1=nilPage THEN rtBroPg1 ← tree.AllocatePage[];
-- At this point, we have two pages in hand, pse.pageNumber and rtBroPg1. All of their entries have been extracted into the ESL, so they may be considered blank pages. We will use rtBroPg1 as the right brother of the current page regardless of whether it was formerly the right brother, the left brother, or newly allocated.
IF entryTable.length<3 THEN ERROR Bug[tooFewEntries]; -- there must be at least one entry each from this page, the brother page, and the father page
fatherIndex ← FillLeftPage[tree: tree, pathStk: pathStk];
-- The idea next is to send the shortest entry into the father page such that the current page is at least "pretty" full (if we have such a choice).
DO
pl0, pl1, fatherSize: CARDINAL;
pl1 ← EntryIntervalSize[pathStk: pathStk, leftFather: fatherIndex];
IF pl1 > tree.maxFreeWords THEN EXIT;
pl0 ← EntryIntervalSize[pathStk: pathStk, rightFather: fatherIndex];
IF pl0=0 OR pl0+pl1 > tree.maxFreeWords+tree.awfullyFull THEN EXIT;
-- Still enough room in right brother page. See if this is the shortest father entry, and try moving one more entry into right brother page.
fatherSize ← IndexedEntrySize[pathStk: pathStk, index: fatherIndex];
IF fatherSize<bestFatherSize
THEN
BEGIN
bestFatherIndex ← fatherIndex;
bestFatherSize ← fatherSize;
oneBrotherEnough ← TRUE;
END;
fatherIndex ← fatherIndex-1;
ENDLOOP;
IF oneBrotherEnough
THEN
BEGIN
breakSize: CARDINAL = EntryIntervalSize[pathStk: pathStk, rightFather: bestFatherIndex];
totalSize: CARDINAL = EntryIntervalSize[pathStk: pathStk];
WritePage[tree: tree, pse: pse, number: pse.pageNumber, words: breakSize];
PushEntSeqRecord[pse: fatherPSE, esr: WriteRightBrother[tree: tree, pse: pse, rtBroPg: rtBroPg1, words: totalSize-breakSize]];
rtBroPg1 ← nilPage;
END;
END;
HairyInsertRecords:
PROCEDURE [tree: Tree, pathStk: PathStk, rtBroPg1: PageNumber] =
-- Called when not all the entries will fit on the current page and the right brother page.
-- Pours all the entries into the current and right brother pages and either the second right brother page or the left brother page, creating a new second right brother page if neither exists or there is still not enough space.
-- This procedure is responsible for redistributing the entries among the three pages so as to minimize the sum of sizes of the entries promoted to the father page. Note that this considers only brothers and not cousins or more distant relatives.
BEGIN
-- The heap used below is a 1-origin binary min-heap of candidate right cut points, ordered by entry size: the father of position i is i/2, its sons are 2*i and 2*i+1, and heap.entries[1] is the smallest entry. entryTable.map[e].heapPos records the position of entry e in the heap.
-- Beware of the names: TrickleDown moves an entry toward the root (position 1), whereas SiftUp moves an entry toward the leaves.
AddToHeap:
PROCEDURE [entry: EntryOrdinal] =
BEGIN
heap.length ← heap.length+1;
TrickleDown[emptyIndex: heap.length, entry: entry];
END; -- AddToHeap
RemoveFromHeap:
PROCEDURE [entry: EntryOrdinal] =
BEGIN
heapPos: HeapIndex = entryTable.map[entry].heapPos;
heap.length ← heap.length-1;
IF heapPos <= heap.length
THEN
BEGIN
replacementEntry: EntryOrdinal = heap.entries[heap.length+1];
IF IndexedEntrySize[pathStk: pathStk, index: replacementEntry] <= IndexedEntrySize[pathStk: pathStk, index: entry]
THEN TrickleDown[emptyIndex: heapPos, entry: replacementEntry]
ELSE SiftUp[emptyIndex: heapPos, entry: replacementEntry];
END;
END; -- RemoveFromHeap
TrickleDown:
PROCEDURE [emptyIndex: HeapIndex, entry: EntryOrdinal] =
BEGIN
sonSize: CARDINAL = IndexedEntrySize[pathStk: pathStk, index: entry];
son: HeapIndex ← emptyIndex;
DO
father: HeapIndex ← son/2;
fatherEnt: EntryOrdinal;
IF father<=0 THEN EXIT;
fatherEnt ← heap.entries[father];
IF IndexedEntrySize[pathStk: pathStk, index: fatherEnt] <= sonSize THEN EXIT;
heap.entries[son] ← fatherEnt;
entryTable.map[fatherEnt].heapPos ← son;
son ← father;
ENDLOOP;
heap.entries[son] ← entry;
entryTable.map[entry].heapPos ← son;
END; -- TrickleDown
SiftUp:
PROCEDURE [emptyIndex: HeapIndex, entry: EntryOrdinal] =
BEGIN
entrySize: CARDINAL = IndexedEntrySize[pathStk: pathStk, index: entry];
DO
son: HeapIndex ← emptyIndex*2;
sonEntry: EntryOrdinal;
IF son > heap.length THEN EXIT;
sonEntry ← heap.entries[son];
IF son < heap.length
AND IndexedEntrySize[pathStk: pathStk, index: heap.entries[son+1]] < IndexedEntrySize[pathStk: pathStk, index: sonEntry]
THEN
{ son ← son+1; sonEntry ← heap.entries[son] };
IF IndexedEntrySize[pathStk: pathStk, index: sonEntry] >= entrySize THEN EXIT;
heap.entries[emptyIndex] ← sonEntry;
entryTable.map[sonEntry].heapPos ← emptyIndex;
emptyIndex ← son;
ENDLOOP;
heap.entries[emptyIndex] ← entry;
entryTable.map[entry].heapPos ← emptyIndex;
END; -- SiftUp
entryTable: REF EntryTable = pathStk.entryTable;
heap: REF Heap = pathStk.heap;
pse: LONG POINTER TO PathStkEntry = @pathStk.path[pathStk.top];
fatherPSE: LONG POINTER TO PathStkEntry = @pathStk.path[pathStk.top-1]; -- father's pse
rtBroPg2: PageNumber;
fatherIndex, fatherIndex2, bestFatherIndex, bestFatherIndex2: EntryOrdinal;
minFeasIndex, maxFeasIndex: EntryOrdinal;
bestFatherSizeSum: CARDINAL ← 2*tree.maxFreeWords + 1;
twoBrothersEnough: BOOLEAN ← FALSE;
breakSize1, breakSize2, totalSize: CARDINAL;
fatherESR: REF EntSeqRecord;
-- See how much free space our second brother page would have to contain in order to handle the overflow. This is done by pretending to fill up this page and the first right brother page and seeing what is left over.
fatherIndex ← FillLeftPage[tree: tree, pathStk: pathStk];
fatherIndex2 ← FillLeftPage[tree: tree, pathStk: pathStk, leftFather: fatherIndex];
-- The current page can't be the root, because one brother would surely have been enough in that case; so we don't have to pussyfoot when calling FindRightBrother.
rtBroPg2 ← FindRightBrother[tree: tree, pathStk: pathStk, spaceNeeded: EntryIntervalSize[pathStk: pathStk, leftFather: fatherIndex2] + 2*tree.breathingSpace];
IF rtBroPg2=nilPage
THEN
BEGIN -- no luck, try the left brother
fe2: EntryOrdinal = FillRightPage[tree: tree, pathStk: pathStk];
fe: EntryOrdinal = FillRightPage[tree: tree, pathStk: pathStk, rightFather: fe2];
rtBroPg2 ← FindLeftBrother[tree: tree, pathStk: pathStk, spaceNeeded: EntryIntervalSize[pathStk: pathStk, leftFather: 0, rightFather: fe] + 2*tree.breathingSpace];
IF rtBroPg2=nilPage THEN rtBroPg2 ← tree.AllocatePage[] -- still no luck, allocate new page
ELSE
BEGIN
-- left brother had space, but fatherIndexes are now invalid
fatherIndex ← FillLeftPage[tree: tree, pathStk: pathStk];
fatherIndex2 ← FillLeftPage[tree: tree, pathStk: pathStk, leftFather: fatherIndex];
END;
END;
IF entryTable.length<5 THEN ERROR Bug[tooFewEntries]; -- there must be two entries from the father page and at least one entry each from this page and the two brother pages
-- Now figure out how to divide the entries among the three pages in a way that minimizes the sum of the sizes of the two entries sent to the father page while attempting to keep the pages at least "fairly full". The way this is done is as follows.
-- The left cut point (fatherIndex) is swept leftward from its initial maximum possible value, and all possible right cut points for the initial left cut point are thrown into a heap ordered by entry size.
-- As the left cut point moves left, some possible right cut points are added and some are removed. At each step, the minimum-size entry for the right cut point is on the top of the heap.
-- The sum of that and the entry for the left cut point is computed and the minimum remembered.
heap.length ← 0;
maxFeasIndex ← fatherIndex2;
WHILE EntryIntervalSize[pathStk: pathStk, leftFather: maxFeasIndex] <= tree.fairlyFull
DO
maxFeasIndex ← maxFeasIndex-1;
ENDLOOP;
minFeasIndex ← maxFeasIndex+1;
WHILE EntryIntervalSize[pathStk: pathStk, rightFather: fatherIndex] > (
IF twoBrothersEnough
THEN tree.prettyFull
ELSE 0)
DO
WHILE EntryIntervalSize[pathStk: pathStk, leftFather: fatherIndex, rightFather: minFeasIndex-1] > 0
AND EntryIntervalSize[pathStk: pathStk, leftFather: minFeasIndex-1] <= tree.maxFreeWords
DO
minFeasIndex ← minFeasIndex-1;
IF minFeasIndex <= maxFeasIndex THEN AddToHeap[minFeasIndex];
ENDLOOP;
WHILE EntryIntervalSize[pathStk: pathStk, leftFather: fatherIndex, rightFather: maxFeasIndex] > tree.maxFreeWords
DO
IF maxFeasIndex >= minFeasIndex THEN RemoveFromHeap[maxFeasIndex];
maxFeasIndex ← maxFeasIndex-1;
ENDLOOP;
IF heap.length>0
THEN
BEGIN
fatherSizeSum: CARDINAL;
fatherIndex2 ← heap.entries[1];
fatherSizeSum ← IndexedEntrySize[pathStk: pathStk, index: fatherIndex] + IndexedEntrySize[pathStk: pathStk, index: fatherIndex2];
IF fatherSizeSum<bestFatherSizeSum
THEN
BEGIN
twoBrothersEnough ← TRUE;
bestFatherSizeSum ← fatherSizeSum;
bestFatherIndex ← fatherIndex;
bestFatherIndex2 ← fatherIndex2;
END;
END;
fatherIndex ← fatherIndex-1;
ENDLOOP;
IF ~twoBrothersEnough THEN ERROR Bug[twoBrothersNotEnough];
-- Write the three pages and promote the two father entries to the next level.
breakSize1 ← EntryIntervalSize[pathStk: pathStk, rightFather: bestFatherIndex];
breakSize2 ← EntryIntervalSize[pathStk: pathStk, rightFather: bestFatherIndex2];
totalSize ← EntryIntervalSize[pathStk: pathStk];
WritePage[tree: tree, pse: pse, number: pse.pageNumber, words: breakSize1];
fatherESR ← WriteRightBrother[tree: tree, pse: pse, rtBroPg: rtBroPg1, words: breakSize2-breakSize1];
PushEntSeqRecord[pse: fatherPSE, esr: WriteRightBrother[tree: tree, pse: pse, rtBroPg: rtBroPg2, words: totalSize-breakSize2]];
PushEntSeqRecord[pse: fatherPSE, esr: fatherESR];
END;
FindRightBrother:
PROCEDURE [tree: Tree, pathStk: PathStk, spaceNeeded: INTEGER]
RETURNS [rtBroPg: PageNumber] =
-- Finds the right brother of the current page, and determines whether it has room for at least spaceNeeded additional words. If so, removes the father entry and all right brother entries and appends them to the ESL for this level. Returns nilPage if there is no right brother or it is too full. Passing a spaceNeeded argument of -tree.maxFreeWords will find the right brother if it exists, regardless of how full it is.
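-- Illustration only, with made-up numbers and assuming 16-bit words and that no entry is larger than tree.maxFreeWords: why spaceNeeded = -tree.maxFreeWords always succeeds.
--   Say tree.maxFreeWords = 250, the brother has freeWords = 5, and fatherEntSize = 12.
--   freeWords-fatherEntSize is computed on CARDINALs and wraps around; the LOOPHOLE below reinterprets the result as the INTEGER -7.
--   The test -7 < -250 is FALSE, so the brother is accepted even though it is nearly full.
--   In general freeWords >= 0 and fatherEntSize <= tree.maxFreeWords, so the difference is never less than -tree.maxFreeWords.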
BEGIN
pse: LONG POINTER TO PathStkEntry = @pathStk.path[pathStk.top];
fatherPSE: LONG POINTER TO PathStkEntry = @pathStk.path[pathStk.top-1];
fatherEntSize: CARDINAL;
pagePtr: BTreePagePtr;
fatherESR, rtBroESR: REF EntSeqRecord;
IF fatherPSE.eslFront=NIL
THEN
BEGIN
pagePtr ← tree.ReferencePage[fatherPSE.pageNumber];
IF fatherPSE.offset = nilOffset+(tree.state.pageSize-pagePtr.freeWords)
THEN
{ tree.ReleasePage[fatherPSE.pageNumber]; RETURN [nilPage] }; -- no right brother
fatherEntSize ← tree.BTreeEntrySize[@pagePtr[fatherPSE.offset]];
rtBroPg ← pagePtr[fatherPSE.offset].grPage;
tree.ReleasePage[fatherPSE.pageNumber];
END
ELSE
BEGIN
fatherEntSize ← tree.BTreeEntrySize[fatherPSE.eslFront.entSeqP];
rtBroPg ← fatherPSE.eslFront.entSeqP.grPage;
END;
pagePtr ← tree.ReferencePage[rtBroPg];
IF LOOPHOLE[pagePtr.freeWords-fatherEntSize, INTEGER] < spaceNeeded
THEN
{ tree.ReleasePage[rtBroPg]; RETURN [nilPage] }; -- right brother too full
rtBroESR ← MakeEntSeqRecord[entSeq: @pagePtr.entries, length: tree.maxFreeWords-pagePtr.freeWords];
tree.ReleasePage[rtBroPg];
[esr: fatherESR] ← tree.RemoveEntry[pse: fatherPSE];
AppendEntSeqLengths[tree: tree, pathStk: pathStk, esr: fatherESR];
AppendEntSeqRecord[pse: pse, esr: fatherESR];
AppendEntSeqLengths[tree: tree, pathStk: pathStk, esr: rtBroESR];
AppendEntSeqRecord[pse: pse, esr: rtBroESR];
END;
FindLeftBrother:
PROCEDURE [tree: Tree, pathStk: PathStk, spaceNeeded: INTEGER]
RETURNS [ltBroPg: PageNumber] =
-- Finds the left brother of the current page, and determines whether it has room for at least spaceNeeded additional words. If so, backs up one entry at the father's level, removes the father entry and all left brother entries, and inserts them at the front of the ESL for this level. Returns nilPage if there is no left brother or it is too full. Passing a spaceNeeded argument of -tree.maxFreeWords will find the left brother if it exists, regardless of how full it is.
BEGIN
pse: LONG POINTER TO PathStkEntry = @pathStk.path[pathStk.top];
fatherPSE: LONG POINTER TO PathStkEntry = @pathStk.path[pathStk.top-1];
fatherPagePtr, ltBroPagePtr, rtBroPagePtr: BTreePagePtr;
fatherESR, ltBroESR: REF EntSeqRecord;
fatherEntSize: CARDINAL;
rtBroOfLtBroPg: PageNumber;
IF fatherPSE.offset <= entry1Offset THEN RETURN [nilPage];
fatherPagePtr ← tree.ReferencePage[fatherPSE.pageNumber];
ltBroPg ← fatherPagePtr[fatherPSE.nextToLastOffset].grPage;
rtBroOfLtBroPg ← fatherPagePtr[fatherPSE.lastOffset].grPage;
fatherEntSize ← tree.BTreeEntrySize[@fatherPagePtr[fatherPSE.lastOffset]];
tree.ReleasePage[fatherPSE.pageNumber];
ltBroPagePtr ← tree.ReferencePage[ltBroPg];
IF LOOPHOLE[ltBroPagePtr.freeWords-fatherEntSize, INTEGER] < spaceNeeded
THEN
{ tree.ReleasePage[ltBroPg]; RETURN [nilPage] };
ltBroESR ← MakeEntSeqRecord[entSeq: @ltBroPagePtr.entries, length: tree.maxFreeWords-ltBroPagePtr.freeWords];
fatherPagePtr ← tree.ReferencePage[fatherPSE.pageNumber, write];
fatherPagePtr[fatherPSE.nextToLastOffset].grPage ← rtBroOfLtBroPg;
tree.ReleasePage[fatherPSE.pageNumber];
[esr: fatherESR] ← tree.BackUpAndRemoveEntry[pse: fatherPSE];
rtBroPagePtr ← tree.ReferencePage[rtBroOfLtBroPg, write];
fatherESR.entSeqP.grPage ← rtBroPagePtr.minPage;
rtBroPagePtr.minPage ← ltBroPagePtr.minPage;
tree.ReleasePage[rtBroOfLtBroPg];
tree.ReleasePage[ltBroPg];
PushEntSeqLengths[tree: tree, pathStk: pathStk, esr: fatherESR];
PushEntSeqRecord[pse: pse, esr: fatherESR];
PushEntSeqLengths[tree: tree, pathStk: pathStk, esr: ltBroESR];
PushEntSeqRecord[pse: pse, esr: ltBroESR];
END;
WriteRightBrother:
PROCEDURE [tree: Tree, pse: LONG POINTER TO PathStkEntry, rtBroPg: PageNumber, words: CARDINAL]
RETURNS [fatherESR: REF EntSeqRecord] =
-- Removes words' worth of entries from the front of the ESL for this level, and writes all but the first entry into rtBroPg. Designates the first entry as the (left) father of rtBroPg, and returns a new ESR containing it. Also sets the page's freeWords and minPage fields appropriately.
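-- Illustration only (the sizes are made up, and entSeqLen is assumed to be measured in words): suppose ComplexInsertRecords computes totalSize = 100 and breakSize = 40, and the entry chosen as father occupies 10 words.
--   WritePage[..., number: pse.pageNumber, words: 40] deposits 40 words in the current page, leaving 60 words in the ESL.
--   WriteRightBrother[..., words: 100-40 = 60] removes the 10-word father entry from the front, so words becomes 50.
--   The inner WritePage[..., number: rtBroPg, words: 50] deposits the rest and sets freeWords to tree.maxFreeWords-50.
--   The returned fatherESR holds the 10-word entry with its grPage set to rtBroPg, ready to be pushed onto the father's ESL.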
BEGIN
pagePtr: BTreePagePtr;
minPage: PageNumber;
[esr: fatherESR, grPage: minPage] ← tree.RemoveEntry[pse: pse];
words ← words-fatherESR.entSeqLen;
pagePtr ← tree.ReferencePage[rtBroPg, write];
pagePtr.minPage ← minPage;
tree.ReleasePage[rtBroPg];
WritePage[tree: tree, pse: pse, number: rtBroPg, words: words];
fatherESR.entSeqP.grPage ← rtBroPg;
END;
WritePage:
PROCEDURE [tree: Tree, pse: LONG POINTER TO PathStkEntry, number: PageNumber, words: CARDINAL] =
-- Removes words' worth of entries from the front of the ESL for this level, and writes them into the page designated by number. Sets the page's freeWords appropriately, but does not touch minPage.
BEGIN
pagePtr: BTreePagePtr = tree.ReferencePage[number, write];
DepositESL[tree: tree, pse: pse, block: @pagePtr.entries, length: words];
pagePtr.freeWords ← tree.maxFreeWords-words;
tree.ReleasePage[number];
END;
IndexedEntrySize:
PROCEDURE [pathStk: PathStk, index: EntryOrdinal]
RETURNS [words: CARDINAL] = INLINE
{ RETURN [EntryIntervalSize[pathStk: pathStk, leftFather: index-1, rightFather: index+1]] };
FillLeftPage:
PROCEDURE [tree: Tree, pathStk: PathStk, leftFather, rightFather: EntryOrdinal ← 0]
RETURNS [midFather: EntryOrdinal] =
-- Finds the largest entry ordinal in (leftFather .. rightFather) such that all the entries in (leftFather .. midFather) will fit in one BTree page. If rightFather = 0, it defaults to pathStk.entryTable.length+1.
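-- Illustration only, with made-up numbers: six entries of 30 words each, tree.maxFreeWords = 100, leftFather = 0, rightFather defaulted to 7. Interval endpoints are exclusive, as the definition of IndexedEntrySize implies.
--   midFather starts at 2.
--   2 < 5 and entries 1..2 take 60 words <= 100, so midFather becomes 3.
--   3 < 5 and entries 1..3 take 90 words <= 100, so midFather becomes 4.
--   4 < 5 but entries 1..4 take 120 words > 100, so the loop stops.
-- The result is midFather = 4: entries 1..3 fit on the left page and entry 4 is the candidate father entry.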
BEGIN
IF rightFather=0 THEN rightFather ← pathStk.entryTable.length+1;
midFather ← leftFather+2;
WHILE midFather<rightFather-2
AND EntryIntervalSize[pathStk: pathStk, leftFather: leftFather, rightFather: midFather+1] <= tree.maxFreeWords
DO
midFather ← midFather+1;
ENDLOOP;
END;
FillRightPage:
PROCEDURE [tree: Tree, pathStk: PathStk, leftFather, rightFather: EntryOrdinal ← 0]
RETURNS [midFather: EntryOrdinal] =
-- Finds the smallest entry ordinal in (leftFather .. rightFather) such that all the entries in (midFather .. rightFather) will fit in one BTree page. If rightFather = 0, it defaults to pathStk.entryTable.length+1.
BEGIN
IF rightFather=0 THEN rightFather ← pathStk.entryTable.length+1;
midFather ← rightFather-2;
WHILE midFather>leftFather+2
AND EntryIntervalSize[pathStk: pathStk, leftFather: midFather-1, rightFather: rightFather] <= tree.maxFreeWords
DO
midFather ← midFather-1;
ENDLOOP;
END;
END.