[Cyan]<CedarChest6.1>Cypress>DBStorage.mesa!2

File DBStorage.mesa

Last edited by:

MBrown on December 16, 1982 2:46 pm

Cattell on September 14, 1983 4:12 pm

Willie-Sue on February 21, 1985 6:09:37 pm PST

Widom, September 3, 1985 2:58:54 pm PDT

Donahue, April 11, 1986 2:32:46 pm PST

DIRECTORY

AlpineEnvironment USING [LockOption],

DBCommon,

Rope USING [ROPE];

DBStorage: CEDAR DEFINITIONS = BEGIN

ROPE: TYPE = Rope.ROPE;

SegmentID: TYPE = DBCommon.SegmentID;

VersionOptions: TYPE = DBCommon.VersionOptions;

Segment: TYPE = DBCommon.Segment;

SegmentIndex: TYPE = DBCommon.SegmentIndex;

This interface contains all procedures defined by the storage level and used by the

tuple level.

CONSTANTS and TYPES

TuplesetObjectSize: CARDINAL = 20;

IndexObjectSize: CARDINAL = 20;

FieldObjectSize: CARDINAL = 5;

Sizes of data structures that are shared between storage and tuple levels: tupleset, index, and field "objects". Apart from this size information, these objects are opaque above the storage level. FieldObjects do not change unless DeleteField is done, so these may be cached by the tuple level; FieldObjects are stored in the database, but pointers to in-core copies are always passed to the storage level. Tupleset and Index objects may change due to tuple creation, index insertion, etc; these are stored in the database and accessed directly by the storage level, using the convention that the first-created field in a tuple containing a Tupleset or Index object must contain the object, and have the agreed-upon size.

TupleHandle: TYPE = DBCommon.TupleHandle;

TupleObject: TYPE = DBCommon.TupleObject;

TuplesetHandle: TYPE = TupleHandle;

The first-created field of this tuple is an NWord field of length TuplesetObjectSize, which is used to store the tupleset object (containing information about the tupleset such as the size of a tuple and the list of pages that contain tuples from this tupleset). A tupleset object is never copied, it is always updated in place.

SystemTuplesetHandle: TYPE = REF SystemTuplesetObject;

SystemTuplesetObject: TYPE;

Defined in DBStorageConcrete.

SystemTupleID: TYPE = LONG CARDINAL; -- Really [1..MaxSystemTupleID]

MaxSystemTupleID: CARDINAL = 128;

Supplied as an argument to CreateSystemTupleset. CreateSystemPageTuple assigns this number as tupleset ID, and whenever a tupleset ID is turned into a tuple, it is range-checked to see if it is a system tupleset.

FieldDescriptor: TYPE = RECORD[

variantPart: SELECT basicType: FieldClass FROM

OneWord,TwoWord => NULL,

NWord => [length: CARDINAL],

Number of words in the field

VarWord,VarByte => [lengthHint: CARDINAL],

Number of words/bytes that can be stored inline in the field

Group => [groupID: GroupID],

A GroupID is a TupleHandle. The TID contained therein is considered to be the unique ID of the Group field. Normally this is the TID of a dictionary tuple that describes this attribute. In the case of the attributes of the dictionary tuples themselves, a SystemTupleID is used; this points to no real tuple.

ENDCASE

];--FieldDescriptor

A FieldDescriptor is used only as a parameter to AddField.

FieldClass: TYPE = {OneWord, TwoWord, NWord, VarWord, VarByte, Group};

FieldHandle: TYPE = REF FieldObject;

FieldObject: TYPE;

ListOfFieldHandle: TYPE = LIST OF FieldHandle;

The FieldHandle is used to do field read/writes. The interface implies that a field handle actually contains the FieldDescriptor that was used to create it, as well as other stuff such as the offset within the tuple, etc. The FieldObject is allocated by a call to CreateFieldHandle, then filled in by ReadNWord.

TuplesetScanHandle: TYPE = REF TuplesetScanObject;

TuplesetScanObject: TYPE;

Defined in DBStorageConcrete.

GroupID: TYPE = TupleHandle;

GroupScanHandle: TYPE = REF GroupScanObject;

GroupScanObject: TYPE;

Defined in DBStorageConcrete.

IndexHandle: TYPE = TupleHandle;

The first-created field of this tuple is an NWord field of length IndexObjectSize, which is used to store the index object (information about the index such as the location of the root page). An index object is never copied, it is always updated in place.

IndexScanHandle: TYPE = REF IndexScanObject;

IndexScanObject: TYPE;

Defined in DBStorageConcrete.

Selection: TYPE = RECORD[lowerBound, upperBound: Rope.ROPE,

includeLowerBound, includeUpperBound: BOOLEAN,

lowerBoundInfinity, upperBoundInfinity: BOOLEAN ← FALSE];

Notes on allocation of storage for interface objects

FieldDescriptor, Selection: Alloc/Dealloc by tuple level.

FieldHandle: Created by storage level, copied to database by tuple level, then deallocated by tuple level (who thus must know correct pool). Copied from database, then alloc/dealloc by tuple level.

Initialization

Initialize: PROC [nCachePages: NAT, cacheFileName: ROPE];

Initializes the database system. Use nCachePages pages for cache size.

Segment/Transaction interface

AttachSegment: PROC [fileName: ROPE, s: Segment, segmentIndex: SegmentIndex, lock: AlpineEnvironment.LockOption, readonly: BOOL, version: VersionOptions, nPagesInitial, nPagesPerExtent: NAT] RETURNS[newAttachment: BOOL];

This operation "attaches" given segment to a given section of the database system's internal address space. At most one segment may be attached to a particular section.

When a transaction is opened on this segment, using OpenTransaction (below), the following actions occur. The file with full path name fileName is looked up on the appropriate server. If the file is not there and version = OldFileOnly, ERRORs DBException.BadOperation[FileNotFound]; if it is there and version = NewFileOnly, ERRORs DBException.BadOperation[FileAlreadyExists]. Otherwise, creates a new segment or returns an existing one with the given name, as appropriate. In creating a new segment, at least nPagesInitial pages are allocated, and nPagesPerExtent pages will be allocated whenever the newly-created segment is extended. If readonly, then any subsequent request to write into the segment will raise an ERROR.

Things are organized in this peculiar way so that an application may attach segments just once, even though it may execute many transactions (some of which may abort) on the segments. Note the lack of a "DetachSegment" proc. If the attachment overrides a previous one (or is completely new), newAttachment will be TRUE.

MakeTransaction: PROC[server: Rope.ROPE] RETURNS[t: DBCommon.Transaction];

Return a transaction on the given server. This can be used to build a TransactionHandle to be passed to OpenTransaction below

OpenTransaction: PROC [segment: Segment, handle: DBCommon.TransactionHandle, eraseAfterOpening: BOOL ← FALSE];

Open a transaction for the segment, attaching the segment to the transaction handle provided. If eraseAfterOpening is TRUE, then the segment is emptied after opening and the transaction is marked (committed)

FinishTransaction: PROC [handle: DBCommon.TransactionHandle, abort: BOOL, continue: BOOL];

If NOT abort AND continue, then all segments using the indicated transaction continue to have a transaction, but updates to these segments have been committed.

Tupleset interface

The procedures in this group deal with the definition and manipulation of the entire set of tuples of a given type. Some involve either implicit or explicit sequencing through a tuple set; the search order is system-defined.

1) Definition.

CreateTupleset: PROC [x: TuplesetHandle];

Creates a new tupleset, whose tuples have no fields, in the same segment as x. The parameter x is an existing tuple whose first-created field is an NWord field, TuplesetObjectLength words long. Information describing the new tupleset is placed in this field. Tuple x is returned by a ReadTupleset call (see next page) on any tuple created in this tupleset. Note that the TuplesetHandle is used primarily by CreateTuple, but is also modified by AddField/DeleteField.

CreateSystemTupleset: PROC [x: SystemTupleID] RETURNS [SystemTuplesetHandle];

Creates a new system tupleset (system tuplesets do not live in a single segment and pages containing their tuples are not chained together at storage level).

SetSystemTupleTable: PROC [REF ARRAY [1..MaxSystemTupleID] OF TupleHandle];

The system tuple table defines TupleHandles that represent the system tuples. There are two cases in which a storage level operation needs to return a system tuple. These are ReadTupleset and GetGroupIDs, applied to a dictionary tuple. In these cases the storage level is careful to check for a TID in the range [1..MaxSystemTupleID], and when one is encountered the corresponding entry from the system tuple table is returned.

AddField: PROC [x: TuplesetHandle, y: FieldDescriptor] RETURNS [FieldHandle];

In effect this procedure modifies each tuple of set x, adding a new field of type y. It returns a FieldHandle for the new field.

AddSystemField: PROC [x: SystemTuplesetHandle, y: FieldDescriptor] RETURNS [FieldHandle];

This procedure adds to x the length of y. Returns a FieldHandle for the new field.

CreateFieldHandle: PROC RETURNS [FieldHandle];

Since the FieldObject type is opaque, this procedure is needed to allocate a FieldHandle, to be passed to ReadNWord when retrieving a FieldHandle stored in the data schema.

2) Navigation.

We provide a collection of functions to scan through a tupleset and return all of the tuples that satisfy a predicate called a selection. Informally, a selection is a disjunction of comparisons (equality, inequality, or range) on some subset of fields.

FirstLast: TYPE = DBCommon.FirstLast;

OpenScanTupleset: PROC [x: TuplesetHandle, start: FirstLast] RETURNS [TuplesetScanHandle];

Initializes for a sequential search through all tuples in set x. Starts scan either before first tuple or after last, according to start parameter.

NextScanTupleset: PROC [x: TuplesetScanHandle] RETURNS [TupleHandle];

Returns the next tuple in the scan, or NIL when the scan is empty. The TuplesetScanHandle is modified as a side-effect of the Next call.

PrevScanTupleset: PROC [x: TuplesetScanHandle] RETURNS [TupleHandle];

Returns the previous tuple in the scan, or NIL when the scan is empty. The TuplesetScanHandle is modified as a side-effect of the Prev call.

CloseScanTupleset: PROC [x: TuplesetScanHandle];

Destroys the TuplesetScanHandle x.

FirstTuple: PROC[x: TuplesetHandle] RETURNS[TupleHandle];

Returns the first tuple in the tuple set (ie., the first reached by doing an OpenScanTupleset followed by a Next)

EmptyTupleSet: PROC [x: TuplesetHandle] RETURNS[yes: BOOL];

Returns TRUE iff the tuple set x currently has not tuples in it; this is a more efficient version of:

scanHandle: DBStorage.TuplesetScanHandle = DBStorage.OpenScanTupleset[ts, First];

yes ← DBStorage.NextScanTupleset[scanHandle] = NIL;

DBStorage.CloseScanTupleset[scanHandle]

Tuple interface: The procedures in this group deal with individual tuples.

1) Creation and destruction.

CreateTuple: PROC [x: TuplesetHandle] RETURNS [newTuple: TupleHandle];

Attempts to create a new tuple of type x.

CreateSystemPageTuple: PROC [x: SystemTuplesetHandle, y: TupleHandle, s: Segment] RETURNS [newTuple: TupleHandle];

Attempts to create a new system tuple of type x on the same page as existing system tuple y. If this is not possible, the new tuple placed on an otherwise empty page in the same segment as tuple y. If y = NIL then the new tuple is placed on an otherwise empty page in segment s. Note that s is completely ignored when y # NIL, and that unlike CreateTuple, this procedure ALWAYS creates a tuple.

DestroyTuple: PROC [x: TupleHandle];

Destroys tuple x, then performs FreeTupleHandle[x] (see below).

FreeTupleHandle: PROC [x: TupleHandle];

Logical no-op and not necessary, though may free some storage structures.

NullTuple: PROC[t: TupleHandle] RETURNS[yes: BOOL];

TRUE iff the tuple handle is NIL or has been nullified

ConsSystemTupleObject: PROC [] RETURNS [TupleHandle];

Allocates a new TupleObject with no particular contents.

2) Field access.

We provide individual routines for each value type. Two parameters are common to most of the routines in this section: x, the tuple to be accessed, and f, a FieldHandle describing the position within tuple x containing the value to be read or written.

ReadTupleset: PROC [x: TupleHandle] RETURNS [TuplesetHandle];

Returns a tuple handle for the tuple representing x's tupleset.

Read1Word: PROC [x: TupleHandle, f: FieldHandle] RETURNS [CARDINAL];

Read2Word: PROC [x: TupleHandle, f: FieldHandle] RETURNS [LONG CARDINAL];

ReadNWord: PROC [x: TupleHandle, f: FieldHandle, space: REF ANY];

Returns an n-word record, where n is encoded in f, storing it into the locations space..space+n-1 (beware!).

ReadVarByte: PROC [x: TupleHandle, f: FieldHandle] RETURNS [Rope.ROPE];

Write1Word: PROC [x: TupleHandle, f: FieldHandle, val: CARDINAL];

Write2Word: PROC [x: TupleHandle, f: FieldHandle, val: LONG CARDINAL];

WriteNWord: PROC [x: TupleHandle, f: FieldHandle, val: REF ANY];

The val parameter points to an n-word record, where n is known from f.

WriteVarByte: PROC [x: TupleHandle, f: FieldHandle, val: Rope.ROPE];

Group interface

A group is an ordered list of tuples, used in representing one-many relationships. The relationship is between the group's head, which is a single tuple, and the group itself. The relationship is named; we say that "x belongs to a g-group" if x has a TID field with groupID g, and this field has a non-null value z; z is the head of the group. The groupID g is bound into the field handle f of a TID field when it is created. Thus it also makes sense to call a group an f-group when f is a TID field handle. One tuple may be head of an unlimited number of groups, but only one g-group for any groupID g.

A GroupScanHandle is a handle on a group. It specifies a position that is either before the first tuple, between two consecutive tuples, or after the last tuple. It is used both for scanning through a group and for specifying a position to insert a new tuple in a group.

The tuple level is responsible for using the storage level primitives in ways that preserve stronger invariants. Note also that the storage level does not implement a set structure for groups. Thus it is not possible for the storage level to sequence through all of the groups of a given type, as it can do for tuples.

ReadTID: PROC [--x-- TupleHandle, --f-- FieldHandle] RETURNS [--z-- TupleHandle];

If x belongs to an f-group, ReadTID returns the head of the group; otherwise it returns NIL.

GetGroupIDs: PROC [--z-- TupleHandle] RETURNS [LIST OF GroupID];

Returns a list containing all GroupIDs g such that z is the head of a g-group. The Tuple level is responsible for mapping from GroupIDs to FieldHandles, so that the next procedure can be called when necessary.

OpenScanGroup: PROC [--z-- TupleHandle, --f-- FieldHandle, --start-- FirstLast]

RETURNS [--r-- GroupScanHandle];

Returns a GroupScanHandle positioned before the first (or after the last) tuple of the f-group that has tuple z as head (first or last depending upon the value of start). Returns a non-NIL GroupScanHandle even if there is no group.

WriteTID: PROC [--x-- TupleHandle, --r-- GroupScanHandle];

Makes tuple x join the group specified by r at r's current position (i.e. between the tuple just read and the next tuple that would be read by NextInGroup). A NextInGroup following WriteTID returns the tuple following x; a PrevInGroup returns x itself. Note that: (1) if x's field y had a non-NIL value before the WriteTID call, then WriteTID has the effect of deleting x from some group (using WriteTIDNIL below), (2) WriteTID cannot delete a tuple from one group without inserting it into another, so WriteTIDNil below is necessary for that case, and (3) inserting or deleting a tuple from a group requires the storage level to check all outstanding GroupScanHandles to assure that they stay consistent with the deletion.

WriteTIDNil: PROC [--x-- TupleHandle, --f-- FieldHandle];

Makes x's TID field point to the null tuple, i.e. removes x from whatever f-group it currently belongs.

NextInGroup: PROC [--r-- GroupScanHandle] RETURNS [--x-- TupleHandle];

Returns the tuple x immediately following the position r in a group, and moves r one position forwards. Returns NIL if r was already positioned after the last tuple in the group.

PrevInGroup: PROC [--r-- GroupScanHandle] RETURNS [--x-- TupleHandle];

Returns the tuple x immediately preceding the position r in a group, and moves r one position backwards. Returns NIL if r was already positioned before the first tuple in the group.

CloseScanGroup: PROC [--r-- GroupScanHandle];

TupleForField: PROC[x: TupleHandle, f: FieldHandle] RETURNS[handle: TupleHandle];

This procedure is an efficient means of getting the unique tuple in the database that is the only element in the f-group having x as its head (This is the group analogue of TupleForKey)

SetTupleField: PROC[x: TupleHandle, f: FieldHandle, val: TupleHandle];

This operation is a more efficient version of the following frequent special case:

groupHandle: DBStorage.GroupScanHandle = DBStorage.OpenScanGroup[val, f, Last];

DBStorage.WriteTID[t, groupHandle];

DBStorage.CloseScanGroup[groupHandle]

The group interface may be expanded in the future to allow searches by value within a group, especially if we choose to support ordered groups.

Index interface

1) Definition. An index implements a map from byte string keys to doubleword values. The byte strings are ordered in the obvious way, and the index allows efficient range queries on key values.

Note that the tuple level is responsible for mapping tuple field values into keys that can be compared using the simple-minded routine in the storage level, and for ensuring that all of and only the tuples of a single type are actually entered into an index.

RootIndicesFromSegment: PROC [s: Segment] RETURNS [indices: ARRAY[0..DBCommon.systemIndexCount) OF IndexHandle];

Each segment has a small collection of builtin index handles

CreateIndex: PROC [x: IndexHandle];

Creates a new index. x is a tuple whose first-created field is an NWord field, IndexObjectSize words long. Information about the index is stored in this field. Pages for the index are obtained from the segment containing tuple x.

DestroyIndex: PROC [x: IndexHandle];

Destroys the index x.

2) Maintainance. These routines allow insertion and deletion of associations in an index.

InsertIntoIndex: PROC [x: IndexHandle, k: Rope.ROPE, v: TupleHandle];

Associates value v with key k .

DeleteFromIndex: PROC [x: IndexHandle, k: Rope.ROPE, v: TupleHandle];

Deletes the given association from the index (v is necessary since multiple values may be present for the same key). ERRORs DBException.InternalBug[...] if the association <k, v> is not present in the index x.

3) Navigation. Two means of navigating through indices are provided, either by using the Next and Previous operators to return tuple handles or by enumerating all of the keys in a certain range and then selecting those tuples that are of interest.

OpenScanIndex: PROC [x: IndexHandle, y: Selection, startPosition: FirstLast] RETURNS [IndexScanHandle];

NextScanIndex: PROC [x: IndexScanHandle] RETURNS [tuple: TupleHandle];

PrevScanIndex: PROC [x: IndexScanHandle] RETURNS [tuple: TupleHandle];

PrevScanKey: PROC[x: IndexScanHandle] RETURNS [key: DBCommon.IndexKey];

NextScanKey: PROC[x: IndexScanHandle] RETURNS [key: DBCommon.IndexKey];

ThisScanTuple: PROC[x: IndexScanHandle] RETURNS [tuple: TupleHandle];

IgnoreEntry: PROC[x: IndexScanHandle];

TupleForKey: PROC [x: IndexHandle, key: Rope.ROPE] RETURNS[tuple: TupleHandle];

This will return the first tuple in the index that has the given key; it is the client's responsibility to ensure that the key appears in the index only once

CloseScanIndex: PROC [x: IndexScanHandle];

END. -- DBStorage