Heading:qjk40(635)\gOpenFileMap and LogMap, version 1y756qk40\gPage Numbers: Yes  X: 527  Y: 10.5"qjk40\gInter-Office Memorandumz18592l4445y762\f5bgTo	Alpine project	Date	September 8, 1981z18592l4445d2998e21(0,65535)(1,4445)(5,11684)(6,13664)\f1g2f0t2 1t0 14t6 1f1t0 4f0t7 1t0From	Ed Taft	Location	PARC/CSLz18592l4445d2998e25\f1g4f0t2 1t0 7t6 1f1t0 8f0t7 1t0Subject	OpenFileMap and LogMap,	File	[Ivy]<Alpine>Doc>FileLogMap1.bravoversion 1z21167l4445d2998e25\f1g7f0t2 1t0 23t6 1f1t0 4t7 1t0 34f0XEROX       z18592l508e14(2116)\f2g5f0Attributes:	informal, draft, technical, Alpine, distributed computing, filingz18697l4896d2999e10(0,4904)(1,65535)(5,65535)(6,65535)\f1gAbstract:	OpenFileMap and LogMap are internal Alpine interfaces supporting FileStore.  They maintain volatile state information required for efficient access to files and to the log.z18697l4896d2999e10j\f1gz18697e10j(1270)\gThe OpenFileMap and LogMap are volatile structures supporting the implementation of the Alpine FileStore.  They are accessed through internal interfaces that are not exported (via RPC) to Alpine clients.z18697e10jk40\gThe OpenFileMap maintains state information associated with an OpenFileHandle, which represents a single client's access to a single file under a single transaction.z18697e10jk40\gThe LogMap maintains, in an efficiently accessible form, the differences between the abstract state of the FileStore (including both committed and uncommitted transactions) and the actual contents of the base file system.  The LogMap is an entirely redundant version of the Log, and much of its contents are simply references to Log entries.z18697e10jk40\gOpenFileMapz18697e22k80\bg11BOpening a file, as presently specified in FileStore, serves two purposes:z18697e10jk40\g1.	It defines a point at which file access control and whole-file locking are performed.z18697l4269d3739e10jk40\g2.	It associates a brief handle, the OpenFileMap.Handle, with a considerable amount of state that is thereafter kept by the server and need not be supplied in each call.  This state includes the transaction ID, the client's identity, access rights to the file, and any additional per-file state that FileStore may wish to retain from one file operation to the next.z18697l4269d3739e10jk40\gIt is the function of OpenFileMap to maintain this state information.  However, OpenFileMap understands only the syntax and not the semantics of the state information.  It is the caller's responsibility to authenticate the ClientID, verify the access rights, register the lock mode with the Locks manager, etc.z18697e10jk40\gInterfacez18697e16k80\ig9IOpenFileMap: DEFINITIONS =BEGINz18697l3008e10k40\f8gHandle: TYPE [2];z18697l3424d2999e10k40\f8gA Handle must be unique for all time, to enable detection of dangling references.  However, it is specific to a particular FileStore instance, so it need not be a full UniqueID.  A Handle is not a capability: all uses of it must be validated by the caller, by calling Validate (below), before calling any other procedures exported via OpenFileMap.z18697l4269e10jk40\gConversationID: TYPE = RPC.ConversationID;z18697l3424d2999e10k40\f8gThe handle associated with a (potentially) secure conversation with some client.  It is maintained by RPC and appears as an argument of each procedure exported to remote clients by FileStore.z18697l4269e10jk40\gCreate: PROCEDURE [conversation: ConversationID, client: ClientID, trans: TransID, volume: VolumeID, file: FileID] RETURNS [Handle];z18697l3424d2999e10k40\f8gThe caller asserts that conversation represents an authenticated conversation with the client denoted by ClientID.  The other arguments are not checked but are simply remembered in state associated with the returned Handle.  Remaining fields of that state are initialized to reasonable default values (e.g., no locks, no access, etc.)z18697l4269e10jk40\g24i12IPassing a VolumeID may or may not be necessary, depending on our decision whether or not to provide a way to replicate a file with the same FileID as the original.z18697l4269e10jk40\f1gValidate: PROCEDURE [handle: Handle, conversation: ConversationID];z18697l3424d2999e10k40\f8gRaises a signal if handle is invalid or if conversation does not match the ConversationID associated with handle.  This is intended to be be called upon entry to every procedure exported from FileStore that operates on an open file.  Its implementation is expected to be efficient.z18697l4269e10jk40\g19i6I18i12I51i6IDestroy: PROCEDURE [handle: Handle];z18697l3424d2999e10k40\f8gImmediately destroys the Handle and associated state information.  Should be called when a transaction has been committed or aborted.  (Do we need to keep a file open between committing and completing the transaction?  I think not.)z18697l4269e10jk40\gSetLockMode: PROCEDURE [handle: Handle, lockMode: LockMode];z18697l3424d2999e10k40\f8gSetAccessRights: PROCEDURE [handle: Handle, accessRights: AccessRights];z18697l3424d2999k40\f8g... etc., for setting state information that may vary while a file is open.  (These are perhaps not good examples.)z18697l4269e10jk40\gGetClientID: PROCEDURE [handle: Handle] RETURNS [ClientID];z18697l3424d2999e10k40\f8gGetFileID: PROCEDURE [handle: Handle] RETURNS [FileID];z18697l3424d2999k40\f8g... etc., to retrieve state information associated with the open file.z18697l4269e10jk40\gEnumerate: PROCEDURE [proc: EnumerationProc];z18697l3424d2999e10k40\f8gEnumerateTransaction: PROCEDURE [proc: EnumerationProc, trans: TransID];z18697l3424d2999e10k40\f8gEnumerationProc: TYPE = PROCEDURE [handle: Handle];z18697l3424d2999e10k40\f8gFor each entry in the OpenFileMap (Enumerate), or for each entry corresponding to trans (EnumerateTransaction), calls proc[handle].z18697l4269e10jk40\g82i5I31i4IEnumerateTransaction is used to find all Handles belonging to a transaction; its implementation is expected to be efficient.  It is used twice when preparing to commit a transaction: first to flush (via the FilePageMgr) all pages of files written by the transaction, and then to destroy all the Handles themselves.z18697l4269e10jk40\gEND.z18697l3705d2999e10k40\f8gImplementationz18697e16k80\ig14IIt is possible to assign realistic yet relatively modest bounds on the total storage required for the OpenFileMap; e.g., 200 simultaneous clients, with an average of 20 open files each, at 50 words per OpenFileMap entry (a wildly generous allowance), amounts to only 200K words.  Therefore it is reasonable to maintain this data structure entirely in virtual memory, presumably using Cedar collectible storage for the values.z18697e10jk40\g270f1 1f0The map representation should be highly efficient for lookup, since a lookup must be performed on nearly every FileStore operation.  Efficiency of insertion and deletion are of secondary importance, since those operations are performed only for OpenFile and Commit/Abort.  There is no ordering relationship among Handles.  Consequently, a hash table is probably the best data structure for the OpenFileMap.  (Since the representation of a Handle is private to the implementation, it can use low-order bits as an index into the hash table.)z18697e10jk40\gLogMapz18697e22k80\bg6BThe LogMap serves two purposes:z18697e10jk40\g1.	It keeps track of all the logged intentions required to carry out a transaction after it is committed.  Consulting the LogMap for this purpose is logically equivalent to searching the log for all the transaction's entries, but is intended to be much more efficient.z18697l4269d3739e10jk40\g2.	It maintains information required to divert or modify FileStore actions when the abstract state of the FileStore is represented in the log rather than in the base file system.  The most common example of such a diverted action is a read of data previously written in the same transaction.z18697l4269d3739e10jk40\gFor the first purpose, the LogMap needs to be remember only the existence of a log entry.  Consequently, every deferred action written into the log must be recorded in the LogMap.z18697e10jk40\g64i9I32i73IThe LogMap could also record entries for immediate actions, i.e., those that must be undone if the transaction is aborted.  This would eliminate the need to search the log (backward) for such entries.  We believe that aborting a transaction is sufficiently uncommon that the cost of such searches is acceptable.z18697e10jk40\f1gFor the second purpose, the LogMap additionally needs to remember things about the log entry.  That is, it needs to understand enough about the semantics of the action specified by the log entry so that it can maintain an up-to-date model of the virtual state of the FileStore.z18697e10jk40\g73i5IThe decision whether or not the LogMap understands about certain types of entries is somewhat arbitrary.  We might consider structuring the LogMap as two levels, the lower of which just registers uninterpreted log entries and the upper of which understands the semantics of log entries.z18697e10jk40\f1gAssumptions and limitationsz18697e16k80\ig27IIn its present form, the LogMap deals only with log entries for file-level activity.  It might be extended to deal with higher-level (logical) logging, but I don't have a good understanding of what is required at higher levels.z18697e10jk40\gThis design of the LogMap maintains at most two virtual'' states of any given piece of the FileStore: the one seen by the (single) transaction currently updating that piece, and the one seen by all other transactions.  That is, if an intention to change some piece of the FileStore exists in the log, and the transaction has not been either carried out or aborted, and a new request is made to access the same piece of the FileStore, then a client participating in the transaction should see the state recorded in the log, while any other client should see the state of the base.z18697e10jk40\gThis arrangement is intended to support update mode'' transactions, where reads and writes do not conflict at the time they are requested but rather are serialized at commit time.  However, there must be at most one deferred write pending to any given piece of the FileStore, so that the LogMap need not maintain arbitrarily many virtual'' states.  This depends on the correct operation of Locks.z18697e10jk40\gThis version of LogMap assumes only full-page updates; but extending it to handle partial-page updates should be straightforward.z18697e10jk40\gInterfacez18697e16k80\ig9ILogMap: DEFINITIONS =BEGINz18697l3008e10k40\f8gLogID: TYPE = Log.ShortLogRecordID;z18697l3424d2999e10k40\f8gGeneral types and operations.  These are for registering uninterpreted log entries whose semantics are not known to the LogMap.z18697e16jk80\bg29BEntityKey: TYPE = RECORD [file: FileID,SELECT type: * FROM  filePage => [page: PageNumber],  property => [property: Property],  size => NULL,  ...  ENDCASE];z18697l3424d2999e10k40\f8gWe refer to each independently updatable piece of the FileStore (file page, file property, etc.) as an Entity (need a better term??).  For the purposes of the LogMap, every Entity is addressable by means of an EntityKey.  For every distinguishable type of deferred file-level log record there must be a distinct EntityKey.type.z18697l4269e10jk40\g110b22BLocation: TYPE = {base, log};z18697l3424d2999e10k40\f8gThe current state of any given Entity (as viewed under some transaction) is located either in the base or in the log; the LogMap's primary function is to remember which.z18697l4269e10jk40\g98i4I11i3IRegisterUpdate: PROCEDURE [entity: EntityKey, trans: TransID, logID: LogID];z18697l3424d2999e10k40\f8gRegisters with the LogMap a log entry, identified by logID, for performing a deferred update to entity under transaction trans.z18697l4269e10jk40\g53i5I38i6I19i5IIf RegisterUpdate is called multiple times on the same entity during one transaction, the latest update overrules the earlier ones in the sense that it is the only one accessible via LocateEntity (below); however, the earlier updates remain in the LogMap and are revealed by EnumerateTransaction (below).  Exception: if the LogMap knows enough about the semantics of the updates to determine that a later update has entirely overruled the effects of an earlier one, then the earlier LogMap entry is deleted entirely.z18697l4269e10jk40\gIt is illegal to register multiple updates to the same entity under different transactions.z18697l4269e10jk40\gAn important consequence of these semantics is that after a transaction is committed, any deferred update to an Entity under that transaction must be carried out before any new updates to the same Entity may be registered.  It is the caller's responsibility to ensure this.  Perhaps LogMap should export a procedure to assist this.z18697l4269e10jk40\f1gLocateEntity: PROCEDURE [entity: EntityKey, trans: TransID] RETURNS [location: Location, logID: LogID];z18697l3424d2999e10k40\f8gDetermines the location of the current state of entity as viewed by transaction trans.  If the LogMap contains no entry for entity, or if the last update to entity was under a different transaction than trans and that transaction has not been committed, then returns location=base; logID is undefined.  If the last update to entity was under the same transaction as trans or that transaction has been committed, returns location=log; logID identifies the log entry for that update.z18697l4269e10jk40\g48i6I26i5I39i6I27i6I40i5I59i8I7i5I38i6I35i5I49i8I6i5IEnumerateTransaction: PROCEDURE [proc: EnumerationProc, trans: TransID];z18697l3424d2999e10k40\f8gEnumerationProc: TYPE = PROCEDURE [logID: LogID] RETURNS [delete: BOOLEAN];z18697l3424d2999e10k40\f8gFor each entry in the LogMap corresponding to transaction trans, calls proc[logID], where logID identifies the log record.  After proc returns, if delete=TRUE then the entry is deleted from the LogMap.  The LogMap is enumerated in the order in which the entries were originally registered (i.e., in ascending order of logID).z18697l4269e10jk40\g58i5I8i4I15i5I35i4I13i6I1f1 4f0EnumerateTransaction is used by FileStore when carrying out the deferred actions of a committed transaction.  In this context, proc should return delete=TRUE always; delete=FALSE may be useful for checkpointing transactions.z18697l4269e10jk40\g127i4I15i6I1f1 4f0 9i6I1f1 5f0Specialized types and operations.  These are for registering log entries whose semantics are known to the LogMap.  This specialization is useful for two reasons.  First, interactions between updates can be taken account of.  Second, significant optimizations in internal representation are possible; for example, compound updates to consecutive file pages can be represented as single LogMap entries.z18697e16jk80\bg33BPageRun: TYPE = FilePageMgr.PageRun;-- = RECORD [firstPage: PageNumber, count: CARDINAL _ 1];z18697l3424d2999e10k40\f8gRegisterWritePages: PROCEDURE [file: FileID, pageRun: PageRun, trans: TransID, logID: LogID];z18697l3424d2999e10k40\f8gRecords in the LogMap the fact that data intended to be written to file at pageRun has been entered into the log as a record identified by logID.  FileStore should call this procedure with the result returned by Log.RecordDeferredPageWrite.z18697l4269e10jk40\g67i4I4i7I57i5IWriting pages previously written during the same transaction is OK; for any given page, the LogMap remembers only the LogID of the most recent write.z18697l4269e10jk40\g131i11IMappedPageRun: TYPE = RECORD [SELECT location: Location FROM  base => [pageRun: PageRun],  log => [logID: LogID, offset: PageCount],  ENDCASE];z18697l3424d2999e10k40\f8gLocateReadPages: PROCEDURE [file: FileID, pageRun: PageRun, trans: TransID] RETURNS [LIST OF MappedPageRun];z18697l3424d2999e10k40\f8gDetermines whether the data described by file and pageRun, as viewed under transaction trans, resides in the base or in the log.  The result MappedPageRun can then be used as an argument to FilePageMgr.ReadPages or Log.PageRead, depending on variant.z18697l4269e10jk40\g41i4I5i7I30i5ISince part of the run might reside in the base and part in the log, LocateReadPages may break the run into pieces and return multiple MappedPageRuns.  Also, since the run might begin in the middle of a log record, the log variant of MappedPageRun includes a page offset.z18697l4269e10jk40\gLocateReadPages consults only the LogMap, not the base.  Thus it will not detect errors such as reading from a nonexistent file or page in the base; the FilePageMgr is expected to detect such errors.  However, it will detect and raise a signal for errors such as reading from a file that has been deleted according to the log, since in this case the abstract state of the FileStore differs from the base.z18697l4269e10jk40\g213i4IRegisterSize: PROCEDURE [file: FileID, pageCount: PageCount, trans: TransID, logID: LogID];z18697l3424d2999e10k40\f8gRecords in the LogMap a deferred change in the size of a file; pageCount is the new size.  Deleting a file is registered as setting its size to zero.  If the file's size is decreased, LogMap may forget earlier deferred writes to the deleted portion of the file.z18697l4269e10jk40\g63i9I119i3ILocateSize: PROCEDURE [file: FileID, trans: TransID] RETURNS [location: Location, pageCount: PageCount, logID: LogID];z18697l3424d2999e10k40\f8gLocates the size of file as viewed under transaction trans.  If location=base, the file's size is in the base, and pageCount and logID are undefined.  If location=log, pageCount is the current virtual size of the file and logID identifies the log record that most recently changed it.z18697l4269e10jk40\g20i4I29i5I6i8I43i9I5i5I20i8I6i9I45i5IEND.z18697l3705d2999e10k40\f8gImplementationz18697e16k80\ig14ISince every update operation can potentially give rise to a new LogMap entry, it is impossible to assign reasonable bounds to the total size of the LogMap without restricting the number of file operations per transaction.  On the other hand, good performance is crucial, so the implementation of the LogMap must be very efficient, at least for the most common cases.z18697e10jk40\gThis argues for a hybrid implementation that uses an efficiently manageable internal data structure for well-behaved transactions (those will relatively few deferred updates), but that automatically overflows to some less efficient external data structure (e.g., a B-tree) when necessary.z18697e10jk40\gThe domain of the LogMap is unordered at the file level but ordered below the file level; and references to the LogMap can be expected to exhibit a high degree of locality.  Lookups, insertions, and deletions must all be relatively efficient.  These considerations suggest the use of a hash table for FileIDs, with each entry rooting some sort of tree in which the sub-file EntityKeys are stored.  The hash table and the internal tree (e.g., a red-black tree) are mapped directly onto VM.  The overflow B-tree is built on an external file; access to that file is managed by the FilePageMgr.z18697e10jk40\gThe LogMap entries for a transaction must be maintained in a list so that they may be enumerated in order of LogID.  Additionally, a separate map keyed by TransID must be kept, to anchor these lists.  It's likely that we will need to keep other things on a per-transaction basis as well; so a separate TransactionMap interface may be called for.z18697e10jk40\gLogMap is a module monitor that protects the private LogMap data structures.  Additionally, the caller of procedures such as LocateReadPages is assumed to have acquired a lock on the current transaction to ensure the continued validity of information about entity location.z18697e10jk40\gMiscellaneous remarksz18697e22k80\bg21BA number of the TYPEs now defined in FileStore (e.g., PageRun, Property) will need to be moved to some independent definitions module, to avoid circular compilation dependencies. z18697e10jk40\g16f1 4f0It seems to me that we need a Log.PageRead operation that reads the data written by Log.RecordDeferredPageWrite.  FileStore needs this to perform ReadPages for data located in the log rather than the base.z18697e10jk40\gPerhaps the various LogMap procedures that accept both a FileID and a LogID should instead be passed an OpenFileMap.Handle.  However, there isn't any other reason for the LogMap to know about open files, so maybe this isn't a good idea.z18697e10jk40\gWe need to develop an Alpine-wide policy on how to do enumerations.  The examples here accept an enumeration procedure that is called for every instance; but the Pilot-style stateless enumeration'' may be better.  (The main issue is that of monitor locks.)z18697e10jk40\gChange historyz18697e22k80\bg14BSeptember 8, 1981.  OpenFileMap: Create and Validate take a ConversationID as an argument, and the caller is expected to call Validate before calling any other procedures.  LogMap: Only deferred actions are kept in the LogMap.  Clients inside and outside a transaction may see different Locations for an EntityKey.  EnumerateTransaction enumerates in the forward direction only.  Implementation uses a hash table and (normally) an internal tree, rather than a B-tree.z18697e10jk40\g20b11B142b6B