CSL Notebook Entry
To: Alpine users
Date: June 7, 1983
From: Mark Brown and Ed Taft
Location: Palo Alto
Subject: Alpine public interfaces (version 9)
Organization: PARC/CSL
XEROX
Release as [Indigo]<Cedar>Documentation>AlpineInterfaces.tioga
Draft [Indigo]<Alpine>Doc>AlpineInterfaces.tioga
Last edited by Taft, June 7, 1983 5:36 pm
Abstract
These are the lowest-level public interfaces to the Alpine file system. This memo describes the functions provided by Alpine at this level, and accompanies the Mesa definitions files themselves in describing the interfaces AlpineEnvironment, AlpineFile, AlpineTransaction, and several others.
See the end of this memo for a change history and a list of unfinished business. <We use angle brackets to enclose comments, like this one, that should disappear once uncertainty in the design is resolved.>
1. Introduction
This memo documents the lowest-level file system interfaces seen by Alpine clients. The Alpine interfaces are intended to support higher-level facilities such as the Cedar data base and the Cedar file system, and not to support application-level clients directly. Multiple instances of these interfaces may be imported by a single client, some representing local file systems and others representing remote servers that are accessed through remote procedure calls (RPC).
An Alpine file system consists of a single log, used to implement atomic transactions on data, and a set of logical disk volumes containing files. A log is stored on disk for online recovery from client-aborted transactions and soft failures, and is transferred to offline storage for use by the backup system in case of hard media failures. A volume may be quiesced and moved from one server machine to another, or stored offline for extended periods.
These interfaces are the only ones seen by clients of an Alpine server. This includes clients that may share the server machine, such as a directory or data base system.
2. Organization
All the interfaces and other files required by a client of Alpine are made available via [Indigo]<Cedar>Top>AlpineUser.df. The primary documentation for the Alpine interfaces is the interface source files themselves, which are heavily commented. The document you are reading is intended principally as a high-level overview of Alpine's concepts and facilities, though it also contains more detailed elaborations on the semantics of some of the operations. (If you are reading this document in Press form, you will find listings of the public interfaces at the end.)
There are two layers of public Alpine interfaces. The ones in the lower layer are the ones directly exported (remotely or locally) by an Alpine instance itself. When using these interfaces, the client program is required to interact directly with the RPC facilities, to deal with the complexities of binding to multiple instances of the same interfaces (when necessary), and to explicitly manage various Alpine objects such as transactions and open files.
The higher layer is a thin veneer over the lower one called the AlpineUser package. It is implemented by the AlpineUserImpls configuration, which runs on the client machine. It hides most of the complexities of RPC and binding, and it provides automatically managed object-oriented access to all the operations exported by an Alpine instance. Most client programs are expected to use the AlpineUser interfaces instead of the direct Alpine interfaces. Nevertheless, the direct Alpine interfaces are of interest to client programmers because they alone contain the authoritative documentation of the semantics of the Alpine operations.
2.1. Alpine interfaces
An Alpine instance directly exports four public Alpine interfaces. The interfaces AlpineFile and AlpineTransaction contain all the operations most clients will require. AlpineOwner and AlpineVolume consist primarily of administrative operations, though there are a few that may usefully be called by ordinary clients. (Alpine exports other interfaces as well, but they are intended only for administrators or wizards and are not documented here.)
Additionally, there is a public definitions file, AlpineEnvironment, that declares public types, constants, and other things used in one or more of the interfaces. These definitions are also declared (indirectly) in the interfaces that use them, so most clients should not need to refer to AlpineEnvironment directly; thus, for example, AlpineTransaction.TransID and AlpineFile.TransID are the same type, which is actually defined in AlpineEnvironment. This is likewise true of SIGNALs and ERRORs with the same name in multiple interfaces; for example, AlpineFile.Unknown and AlpineOwner.Unknown have the same value, so enabling one will also catch occurrences of the other.
AlpineFile provides all operations on files themselves, including creating and deleting files, reading and writing data, and manipulating any of a fixed set of file properties.
AlpineTransaction contains everything to do with transactions. An Alpine server may be asked to be the coordinator for a new transaction or a worker for an existing one; this interface includes operations for both purposes.
AlpineOwner deals with a data structure called the owner data base, whose main purpose is to provide a client root file and administrative controls such as disk space accounting.
AlpineVolume provides operations for interrogating and (for qualified clients) changing the set of volumes in operation on a server.
For each procedure in each interface, the list of SIGNALs and ERRORs that the procedure can raise is given as a stylized comment introduced by "!" immediately following the declaration. For signals that take enumerated arguments, the possible values of the arguments are given (but are omitted if all the values are possible). Additionally, RPC.CallFailed can be raised during any remote call; see section 3.4.
2.2. AlpineUser package
The AlpineUser package exports three main interfaces of interest to ordinary clients. They are distinguished from the direct Alpine interfaces in that their names begin with "Alp" rather than "Alpine".
AlpInstance contains an operation for obtaining a handle on a specific Alpine instance, local or remote. It manages the interactions with RPC and deals with all the problems of dynamic instantiation of interfaces.
AlpTransaction contains the same operations as AlpineTransaction. However, the operation that creates a new transaction returns a handle for a local object that embodies both the transaction itself and its associated AlpInstance handle; this handle is then passed to the other operations. AlpTransaction also contains the operations defined in AlpineOwner and AlpineVolume.
AlpFile likewise parallels AlpineFile, but the operations that open or create a file return a handle for a local object representing the file, the transaction, and the Alpine instance.
3. Accessing Alpine
The public interfaces exported from an Alpine server are accessed using the Cedar Remote Procedure Call (RPC) facility. For comprehensive information on this facility, you should read the Lupine User's Guide [Lupine, 1982]. Authentication of user credentials depends on Grapevine, and is provided more-or-less automatically by RPC. Control of access to data stored in the server is based on the caller's membership in Grapevine groups associated with the data.
The Alpine system can also be instantiated on a local workstation and used to manage files there. In this situation, the same public interfaces are used, but calls to them do not involve RPC. Authentication is automatic, but access control works just the same as in the remote case.
Sections 3.1 and 3.2 below describe how to access the Alpine interfaces directly. Section 3.3 describes how to access Alpine via the AlpineUser package, which is what most clients are expected to do. Since the AlpineUser package takes care of RPC binding and conversation establishment, an understanding of the first two sections is not essential for using the AlpineUser package, though it may be helpful when things go wrong. The information in the subsequent sections applies regardless of whether Alpine is being called directly or via the AlpineUser package.
3.1. Binding
A client binds to a local stub implementation of a remotely-exported interface using the normal Mesa facilities (binder, loader, etc.). For example, the stub implementation for AlpineFile is called AlpineFileRpcClientImpl, and is instantiated on the client machine.
The stub implementation does not correspond to any real server instance until the client invokes the appropriate RPC runtime machinery. Each stub implementation, in addition to exporting the intended interface, also exports an auxiliary interface used to invoke RPC runtime operations on behalf of that stub. For example, corresponding to the AlpineFile interface there exists an AlpineFileRpcControl interface.
Each stub exports an ImportInterface procedure to its RpcControl interface which, when called, causes the stub to be bound to a specific instance of a real implementation identified by an RPC.InterfaceName passed as an argument. By convention, the InterfaceName.instance specifies the Grapevine RName of the desired Alpine server instance (sometimes referred to as a FileStore); the type specifies "i.alpine", where i is the interface name (for example, "AlpineFile.alpine"); and the version is the (manually-maintained) version number of the interface, which changes only when incompatible changes are made.
Note that each remote interface must be imported in this way; so, for example, to import AlpineFile and AlpineTransaction from a particular Alpine server requires invoking the ImportInterface procedure in each of the corresponding RpcControl interfaces.
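The naming convention can be illustrated with a small Python model. This is only a sketch, not Mesa and not the real RPC package: the InterfaceName tuple, the alpine_interface_name helper, the server RName "Example.alpine", and the version number 1 are all hypothetical, and the version is simplified to a single integer.

```python
from typing import NamedTuple


class InterfaceName(NamedTuple):
    """Toy stand-in for RPC.InterfaceName (not the real Mesa record)."""
    type: str      # "<interface>.alpine" by convention
    instance: str  # Grapevine RName of the Alpine server (FileStore)
    version: int   # manually maintained interface version number


def alpine_interface_name(interface: str, server: str, version: int) -> InterfaceName:
    """Build the name a client would hand to ImportInterface for one interface."""
    return InterfaceName(type=f"{interface}.alpine", instance=server, version=version)


# Hypothetical server RName and version number, for illustration only.
# Each remote interface is imported separately, so two interfaces need two names.
names = [
    alpine_interface_name(i, "Example.alpine", 1)
    for i in ("AlpineFile", "AlpineTransaction")
]
```

A client that also needed AlpineOwner or AlpineVolume would build and import one further name for each.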
A runtime binding may be broken by calling the UnimportInterface procedure, which causes the local and remote RPCs to forget about the association between the client and implementation. A client that knows it is finished dealing with a particular instance of the interface should invoke this to conserve resources in the RPC machinery.
The foregoing description assumes static binding to exactly one instance of an Alpine interface, e.g., by means of a Mesa .config file or the simple binding performed by the loader (though the determination of which instance of the interface is being used need not be made until execution time). Dynamic instantiation of arbitrary numbers of instances (both local and remote) is substantially more complicated. The implementation of the AlpineUser package provides one example of how this may be done.
3.2. Conversation establishment
Calling ImportInterface is strictly a runtime binding operation; it does not give rise to any explicit communication with the server. Before calling any Alpine procedure, it is first necessary to establish a conversation with the server machine. The conversation embodies the authenticated identities of both the client and the server, and it is also the basis for (optional) encryption-based secure communication.
A conversation is established by calling RPC.StartConversation, passing the client's RName and password (ordinarily obtained from Cedar's UserCredentials interface) and the server's RName. StartConversation first communicates with a Grapevine authentication server to obtain a conversation key that will be known only to the client and server RPC machinery, and it associates a newly-allocated RPC.Conversation (whose purpose will be explained shortly) with this key. This is all that is required to carry on a secure, authenticated conversation with the server [Needham & Schroeder, 1978].
Conversation maintenance is facilitated by the following RPC convention: if the first argument of a remote procedure call is of type RPC.Conversation, then RPC will ensure that the communication between client and server machines (for that call) is encrypted by the key corresponding to the Conversation. Thus, all procedures exported by Alpine interfaces require a Conversation as their first argument. This Conversation serves two purposes: it causes RPC to encrypt the communication, as just explained; and on the server side it identifies the client of each call.
A few remarks may be made about this organization. A Conversation represents nothing more than an authenticated communication path between a pair of principals; there is no inherent association between conversations and anything else (interfaces, processes, transactions, etc.). So a single Conversation can be (and typically will be) used to make calls to multiple remote interfaces, such as the AlpineFile and AlpineTransaction interfaces exported by one Alpine server. A single remote implementation can support conversations with multiple clients, each identified by its own Conversation. Multiple client programs on the same machine can communicate with a given Alpine server using the same or different Conversations, depending on whether they are mutually cooperative or suspicious.
Calling RPC.EndConversation causes the remote and local RPCs to forget about a conversation (i.e., about an association between Conversation and conversation key). A client that knows it is finished with a conversation should invoke this to conserve resources in the RPC machinery.
<Need to say something here about local Conversations, in particular, how they get hooked up with the identity of the caller in ClientMap.>
3.3. AlpineUser package operation
To access Alpine via the AlpineUser package, the first step is always to call AlpInstance.Create. The only required argument is the name of the Alpine instance (or FileStore) to be accessed. The name "Local.alpine" refers to the Alpine instance on the local machine, if one exists. Optional arguments are the name and password of the caller; ordinarily these should be defaulted, in which case the credentials of the logged-in user (obtained from UserCredentials) are used.
The successful result of this operation is an AlpInstance.Handle, which refers both to a bound set of interfaces to that Alpine instance and to an authenticated RPC conversation. This handle may in turn be passed to several of the procedures described below, either for performing operations or for obtaining other, more specialized handles. The object referenced by the handle is automatically managed and will be reclaimed after all references to it are gone.
The next step is usually to create a transaction. AlpTransaction.Create takes an AlpInstance.Handle, creates a transaction on the designated Alpine instance, and returns an AlpTransaction.Handle that designates both the Alpine instance and the transaction. The other operations in AlpTransaction all take an AlpTransaction.Handle as their first argument. This means that those operations are members of the Handle's "cluster", so they may be invoked with object notation; for example, if h is a Handle then the transaction may be committed by:
h.Finish[commit];
All operations on files require first that the file be opened. The procedures AlpFile.Open and AlpFile.Create take an AlpTransaction.Handle, along with various other arguments, and return an AlpFile.Handle that designates the Alpine instance, the transaction, and the file. The other operations in AlpFile take such a handle as their first argument and may be invoked with object notation as described above.
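The way the three kinds of handle nest can be sketched in Python. This is a toy model under stated assumptions, not the AlpineUser package: the class and method names are hypothetical, the file is identified by a plain string rather than the real arguments to AlpFile.Open, and no RPC or authentication takes place.

```python
from dataclasses import dataclass


@dataclass
class InstanceHandle:
    """Models AlpInstance.Handle: bound interfaces plus a conversation."""
    file_store: str

    def create_transaction(self) -> "TransactionHandle":
        return TransactionHandle(instance=self)


@dataclass
class TransactionHandle:
    """Models AlpTransaction.Handle: the instance and one transaction."""
    instance: InstanceHandle
    outcome: str = "active"

    def open_file(self, file_id: str) -> "FileHandle":
        return FileHandle(transaction=self, file_id=file_id)

    def finish(self, outcome: str) -> None:
        if outcome not in ("commit", "abort"):
            raise ValueError(f"unknown outcome: {outcome}")
        self.outcome = outcome


@dataclass
class FileHandle:
    """Models AlpFile.Handle: the instance, the transaction, and the file."""
    transaction: TransactionHandle
    file_id: str

    @property
    def instance(self) -> InstanceHandle:
        return self.transaction.instance


inst = InstanceHandle("Local.alpine")   # first step: name the instance
trans = inst.create_transaction()       # then create a transaction on it
f = trans.open_file("example-file")     # hypothetical file identifier
trans.finish("commit")                  # object notation, like h.Finish[commit]
```

Because each handle carries the ones it was derived from, a caller holding only the file handle can still reach the transaction and the instance.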
AlpTransaction.Handles and AlpFile.Handles are managed automatically in the sense that they designate Cedar reclaimable objects, so no explicit freeing operation is required. However, when such an object is reclaimed because no references to it exist, but the transaction or open file corresponding to it is still active, no change occurs in the state of that transaction or open file (i.e., it is not automatically terminated or closed). <Perhaps the implementation should establish finalization of these objects.>
3.4. Calling
The Alpine public interfaces restrict themselves to the procedure call semantics imposed by RPC. In particular, address-containing arguments and results are dereferenced by RPC (to a single level) and passed by value. (This is complicated by the VALUE and RESULT conventions, discussed in section 4.4.) ROPEs and LISTs are flattened and sent in their entirety. Exported variables and the type REF ANY are not supported. Procedure arguments are not supported now, though they may be eventually. SIGNALs and ERRORs work across remote interfaces in the normal way, with one variation. In order to be passed through to the client, a signal must be defined in the interface that exports the procedure raising the signal. Special measures have been taken in the Alpine implementation to ensure that a signal with a given name is exported with the same value from every interface that can raise it.
The signal RPC.CallFailed may be raised by any remote call; it indicates a breakdown in communication or in the binding with the remote Alpine instance. This will not be mentioned in the interface descriptions in the remainder of this document. CallFailed indicates a persistent failure (that is, the RPC machinery has already tried very hard to make the call succeed); so in general it is not appropriate for the client simply to retry the call. Recovery at higher levels of client software may succeed if it reestablishes the binding to the same or a different Alpine instance, perhaps after intervention by the human user.
Nearly all procedures in the public interfaces are "safe" in the Cedar sense. However, the interfaces directly exported from Alpine are not declared as CEDAR DEFINITIONS because this declaration is not yet supported by RPC; so all the procedures are in fact UNSAFE at the present time and must be called from within TRUSTED or UNCHECKED blocks. On the other hand, the AlpineUser package interfaces are CEDAR DEFINITIONS, and all procedures defined in them, with the exception of AlpFile.ReadPages and AlpFile.WritePages, are SAFE.
3.5. Access control
Access to files and other objects is controlled by access control lists, which consist of Grapevine RNames that may be individuals or groups. Client membership in an access control list is determined by first checking whether the client's RName is present in the list and then consulting Grapevine to determine whether the client's RName is a member of any of the groups in the list.
Each file has two access control lists that control reading and modifying that file. These access control lists are file properties that may be manipulated via the AlpineFile interface, described in section 4. An access control list is simply a list of RNames which are ordinarily Grapevine individuals or groups. There is a special pseudo-RName "World", which includes any authenticated client as a member. The amount of space available for storing access control lists is fixed for each file. Alpine guarantees that there is enough room for an average of two maximum-length RNames per access control list, but does not dictate the distribution of space among the lists.
Each file has an owner property, which is a single name (usually but not necessarily an RName). A client whose RName matches the file's owner property (or is a member of the owner's create list; see section 3.6) is permitted to change the file's access control lists.
Each Alpine system has a special "Alpine wheels" access control list whose RName is part of the file system's global state. Membership in this group permits unlimited access to files and other objects, regardless of their access control lists; additionally, there are some administrative and debugging functions that can be invoked only by Alpine wheels. To prevent accidental use of this dangerous capability, the client's membership in the Alpine wheels group is not noticed until explicitly enabled. Wheel membership is enabled or disabled by calling AlpineTransaction.AssertAlpineWheel.
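The membership rules of this section can be sketched as a Python model. It rests on assumptions: Grapevine is replaced by an in-memory dictionary of flat (non-nested) groups, and the function names, RNames, and group names are hypothetical.

```python
from __future__ import annotations

WORLD = "World"  # pseudo-RName that admits any authenticated client


def is_member(client: str, acl: list[str], groups: dict[str, set[str]]) -> bool:
    """Check the list itself first, then its groups (a stand-in for Grapevine)."""
    if client in acl or WORLD in acl:
        return True
    return any(client in groups.get(name, set()) for name in acl)


def may_access(client: str, acl: list[str], groups: dict[str, set[str]],
               wheels_group: str, wheel_enabled: bool) -> bool:
    """Wheel membership counts only after it has been explicitly enabled."""
    if wheel_enabled and client in groups.get(wheels_group, set()):
        return True
    return is_member(client, acl, groups)


# Hypothetical RNames and groups, for illustration only.
groups = {"Readers^.pa": {"Alice.pa"}, "Wheels^.pa": {"Bob.pa"}}
acl = ["Readers^.pa"]
```

In this model Bob.pa, a wheel who is not on the list, is refused until wheel_enabled is TRUE, mirroring the explicit enabling step described above.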
3.6. Owner data base
Consumption of disk space in an Alpine server is controlled on the basis of information in an owner data base. An owner is simply a name (usually but not necessarily an RName) associated with a fixed set of information: a disk space quota, present consumption, root file ID (see below), and two access control lists specifying who may create files for that owner (i.e., allocate disk space against the owner's quota) and who may change the owner information itself.
For the purpose of disk space accounting, several volumes on a given server may be grouped into a single volume group, which has associated with it a single owner data base. Permission to allocate space on behalf of an owner applies to any volume in the group. Additionally, when creating a file, a client may identify a volume group rather than an individual volume, leaving the choice of volume up to the server; this is discussed in the AlpineFile section.
Also associated with each owner is a root file ID, which is intended for client use in identifying a directory or other data structure for locating all files belonging to that owner.
4. AlpineFile interface
The AlpineFile interface (and AlpFile in the AlpineUser package) includes all operations on files. This section should be read in conjunction with AlpineFile.mesa itself. Before calling any of these procedures, one must first start a transaction, using operations that will be presented in section 5.
4.1. File system organization
The protocol described in section 3 binds a client to a specific Alpine instance, identified by an RName. Each Alpine instance consists of a log, used to implement atomic transactions, and some number of volumes, organized into volume groups. A volume is analogous to a Pilot "logical volume": in its quiescent state it is a self-contained file system. The set of volumes composing an Alpine instance may change dynamically as volumes are mounted and dismounted, though the rate of change is expected to be slow.
A file is a sequence of fixed-size pages stored in a single volume. A file is identified by a FileID, and its pages are numbered consecutively from zero. Associated with a file is a fixed set of properties that may be manipulated by separate operations.
4.2. Open files
A client desiring to gain access to a file must first Open it. Opening a file serves two purposes. First, it defines a point at which file access control and whole-file locking are done. Second, it associates a brief handle, the OpenFileID, with a considerable amount of state that is thereafter kept by the server and need not be supplied in each call. That is, an OpenFileID represents a single conversation's access to a single file under a single transaction.
Note that Open requires the client to supply a UniversalFile, which contains both a VolumeID and a FileID, even though in current Pilot a file is uniquely identified by a FileID alone. This is because in the forthcoming Cedar Nucleus (as well as in future versions of Pilot), FileIDs will not be universally unique but only unique relative to a particular VolumeID. In anticipation of this, Alpine now requires that the client present the correct volume for the file, and raises Unknown[fileID] if it is incorrect.
A FileID consists of both the file identifier itself, which is permanently unique, and some location information, which is treated as a hint and is subject to change during the lifetime of the file. Open returns a FileID whose location hint may differ from the hint in the FileID that was presented. If the FileIDs do differ, the client is encouraged to retain the revised FileID in its own data structures so that subsequent Open calls may be performed more efficiently.
The entire file is locked in the mode specified by the lock argument. If the file cannot be locked in that mode, Open either waits until the lock can be set or raises LockFailed[conflict] immediately, depending on lock.ifConflict. (The semantics of LockModes are discussed in section 5.2.)
<The following option is not yet implemented. Ordinarily, modifications to the file are protected against transaction aborts, crashes, and media failure by the log mechanism; logging may be disabled by passing recoveryOption=noLog. If a file was previously updated with recoveryOption=noLog and that transaction was aborted, Open will raise the signal PossiblyDamaged; this signal may be RESUMEd by the client. This feature exists primarily for the benefit of a (future) file replication facility: file replication does not require the protection provided by the log mechanism, and should not incur its cost. It may prove useful to have files with this behavior for other reasons as well. There may be more restrictions on use of noLog; it may require an enabling file property.>
4.3. File creation
The Create procedure creates and opens a new file. The initial size of the file is specified by initialSize; clients are encouraged to create files of the required size rather than growing them piecemeal. The space is charged against owner in the owner data base; the client is required to be a member of owner's create access control list, and the allocation must not exceed owner's quota. Note that allocation is consumed at the moment a file is created; but when a file is deleted, the allocation is not credited with the freed pages until transaction commit time.
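The asymmetry between creation and deletion in the space accounting can be shown with a Python model. It is a sketch only: the OwnerAccount class and QuotaExceeded exception are hypothetical, a single transaction is assumed, and abort is not modelled because this memo does not describe how it affects the accounting.

```python
class QuotaExceeded(Exception):
    """Raised by this toy model when a creation would exceed the quota."""


class OwnerAccount:
    def __init__(self, quota_pages: int, used_pages: int = 0) -> None:
        self.quota = quota_pages
        self.used = used_pages
        self._freed_in_transaction = 0

    def create_file(self, pages: int) -> None:
        if self.used + pages > self.quota:
            raise QuotaExceeded(f"{self.used + pages} > {self.quota}")
        self.used += pages  # charged at the moment of creation

    def delete_file(self, pages: int) -> None:
        self._freed_in_transaction += pages  # not credited yet

    def commit(self) -> None:
        self.used -= self._freed_in_transaction  # credited at commit time
        self._freed_in_transaction = 0
```

One consequence in this model: an owner near quota who deletes a large file and creates a replacement of the same size in one transaction is refused, whereas committing the deletion first lets the creation succeed.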
The file is created with a Pilot FileType of type <presumably restricted by Alpine; will this property be interesting with the Cedar Nucleus?>, and with Pilot attributes immutable=FALSE and temporary=FALSE (support for immutable files is being eliminated from Pilot; and since Alpine is transaction-based, the Pilot temporary attribute is not of any use). All file properties are set to default values; they may be changed by means of the procedures described below.
4.4. Operations on open files
The procedures that access the volatile state associated with an OpenFileID are fully documented in the interface, as are the Delete and SetSize operations.
When a transaction terminates, all files opened under that transaction are ordinarily closed automatically. However, if there is a continuation transaction (i.e., continue=TRUE and outcome=commit in the call to AlpineTransaction.Finish) then the files are left open.
The ReadPages and WritePages procedures access a file's data in units of PageRuns, which are sequences of consecutive complete pages. The PageBuffer arguments, which designate blocks of memory in the caller's virtual address space, deserve some explanation. The types VALUEPageBuffer and RESULTPageBuffer are equivalent from Mesa's point of view, but are treated differently by RPC (this treatment is invoked by the special prefixes VALUE and RESULT). A VALUEPageBuffer is dereferenced at the time of the call; i.e., the contents of the array designated by the DESCRIPTOR are sent. This is consistent with normal RPC call semantics. However, a RESULTPageBuffer designates a place where a result is to be put at return time. No data is passed at call time, but a block of data of the correct size is returned by the procedure and stored in the designated array.
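The two buffer directions can be illustrated with a Python model. It is only a sketch: ToyServer, client_write, and client_read are hypothetical names, the page size is a toy value chosen to keep the example small, and the "remote" call is an ordinary local call.

```python
PAGE_SIZE = 4  # toy page size in bytes, chosen only to keep the example small


class ToyServer:
    def __init__(self, page_count: int) -> None:
        self.pages = [bytes(PAGE_SIZE) for _ in range(page_count)]

    def write_pages(self, first: int, data: bytes) -> None:
        if len(data) % PAGE_SIZE:
            raise ValueError("data must be a whole number of pages")
        for i in range(len(data) // PAGE_SIZE):
            self.pages[first + i] = data[i * PAGE_SIZE:(i + 1) * PAGE_SIZE]

    def read_pages(self, first: int, count: int) -> bytes:
        return b"".join(self.pages[first:first + count])


def client_write(server: ToyServer, first: int, value_buffer: bytearray) -> None:
    # VALUE: the buffer contents are copied and sent at call time.
    server.write_pages(first, bytes(value_buffer))


def client_read(server: ToyServer, first: int, result_buffer: bytearray) -> None:
    # RESULT: nothing is sent; the returned block is stored into the buffer.
    count = len(result_buffer) // PAGE_SIZE
    result_buffer[:] = server.read_pages(first, count)
```

Because the VALUE buffer is copied at call time, the caller may reuse or overwrite it as soon as the call returns without affecting what the server stored.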
Pages are ordinarily locked automatically as a side-effect of calling ReadPages and WritePages, and unlocked by terminating the transaction (see section 5.2 for details). However, it is occasionally useful to lock pages in advance of doing I/O to them, in order to be assured that locking will be possible; this may be accomplished by calling LockPages. UnlockPages may be called to remove a read lock prior to the end of the transaction; it is the client's responsibility to ensure consistency by avoiding any subsequent updates that depend on the data formerly protected by the read lock. Update and write locks cannot be removed by UnlockPages; attempts to do so are ignored without error indication.
4.5. File properties
Associated with each file is a fixed set of file properties (see AlpineEnvironment.Property and PropertyValuePair). These include the underlying Pilot type property and several additional properties that are expected to be generally useful: access control lists, byte length, create time, string name, and version.
File properties are read and written under a transaction, just like file data; properties may be individually locked or be locked implicitly by a file lock. In the current version of Alpine, all properties with the exception of the version are treated together with respect to locking. So, for example, writing any property will lock out reads of any property of the same file by another transaction.
The byte length is a LONG INTEGER associated with the file. Alpine does not enforce any properties of a file's byte length; in particular, the byte length has no inherent relation to the file's size (number of pages) or to the high water mark property described below.
The create time is a System.GreenwichMeanTime, which by convention identifies the time at which the information in the file was created. Alpine does not enforce any properties of the create time. We discourage the use of create times to identify files; we provide versions (below) for that purpose.
A file may contain pages whose contents are undefined. For example, when a file is created, all its pages have undefined contents; and when SetSize is used to increase the length of a file, the newly-allocated pages have undefined contents. The boundary between the defined and undefined portions of a file is called its high water mark. (Precisely, the high water mark is the number of pages of defined contents, or equivalently the page number of the first page with undefined contents.)
Alpine maintains the high water mark so as to gain an important performance advantage: page writes beyond the high water mark are performed immediately rather than being deferred until the transaction is committed. This substantially reduces the number of disk transfers that the server must perform; it is intended to improve the performance of sequential bulk transfers. This optimization occurs only when the entire file has been opened with an update or write lock.
The high water mark is advanced automatically by page writes beyond the previous high water mark; it may also be adjusted by calling WriteProperties. Decreasing the high water mark is a way of declaring that some portion (or all) of a file's contents are no longer interesting. Note that changing the high water mark, like all destructive update operations, is deferred until the end of the transaction. So for a decrease in the high water mark to have the performance benefit described above, the transaction must first be committed.
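The effect of the high water mark on write timing can be sketched as a Python model. Several things here are assumptions rather than statements about the implementation: the class names are hypothetical, a write to page p advances the tentative mark to p+1, and the new mark takes effect only at commit.

```python
class ToyFile:
    def __init__(self, size_pages: int) -> None:
        self.disk = [None] * size_pages  # None marks undefined contents
        self.high_water_mark = 0         # number of pages with defined contents


class ToyTransaction:
    def __init__(self, file: ToyFile, whole_file_write_lock: bool) -> None:
        self.file = file
        self.whole_file_write_lock = whole_file_write_lock
        self.deferred = {}                    # page -> data, applied at commit
        self.new_mark = file.high_water_mark  # tentative mark for this transaction

    def write_page(self, page: int, data: str) -> None:
        beyond_mark = page >= self.file.high_water_mark
        if beyond_mark and self.whole_file_write_lock:
            self.file.disk[page] = data       # performed immediately
        else:
            self.deferred[page] = data        # deferred until commit
        self.new_mark = max(self.new_mark, page + 1)

    def commit(self) -> None:
        for page, data in self.deferred.items():
            self.file.disk[page] = data
        self.file.high_water_mark = self.new_mark
```

In this model an abort needs no undo for the immediate writes: they lie at or beyond the committed high water mark, where contents are undefined anyway.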
Each file has an owner, which is a single OwnerName, and read and modify access control lists, each of which is a list of RNames. These were described in sections 3.5 and 3.6.
The string name of a file consists of at most maxStringNameChars characters. Alpine does not enforce any properties of the string name.
The version of a file is a LONG INTEGER whose value is the number of committed transactions that have modified the file in any way (via WritePages, SetSize, WriteProperties, etc.). A single transaction may also increment a file's version by a specified amount. The version property is provided to allow remotely stored copies of files to be validated; and multiple incrementing of the version property is provided to allow out-of-date replicas to be updated in a single transaction.
As mentioned above, the version property is locked separately from the others. Committing a transaction that has updated a file sets a write lock on that file's version property in order to increment it.
Reading the version property with ReadProperties locks it in read mode. UnlockVersion releases a read lock on the version property. This allows the file to be updated by another transaction without conflict over the version. Note that if the current transaction has locked the entire file in read mode, that lock covers the version property, so it is useless to call UnlockVersion. For higher concurrency, the client must lock the file in intendRead, intendUpdate, or intendWrite mode and use UnlockVersion to unlock the version property after reading it.
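The counting rule for the version property, and its use for validating a remotely stored copy, can be sketched in Python. This is a toy model with hypothetical names; it ignores locking and the option of incrementing the version by a specified amount.

```python
class VersionedFile:
    def __init__(self) -> None:
        self.version = 0  # committed transactions that have modified the file


class Txn:
    def __init__(self, file: VersionedFile) -> None:
        self.file = file
        self.modified = False

    def write(self) -> None:
        self.modified = True  # any modification marks the transaction

    def commit(self) -> None:
        if self.modified:
            self.file.version += 1  # once per committed modifying transaction

    def abort(self) -> None:
        self.modified = False  # aborted work never reaches the version


def replica_is_current(replica_version: int, file: VersionedFile) -> bool:
    """A remote copy is valid if it was taken at the file's current version."""
    return replica_version == file.version
```

Several writes within one transaction advance the version by one in this model, and read-only or aborted transactions leave it unchanged.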
5. Transactions and locks
All Alpine actions are carried out under transactions that ensure atomicity and consistency of updates in the face of concurrent requests from multiple clients, crashes of servers or clients, etc. The AlpineTransaction interface provides client access to the transaction facilities.
Briefly, a client first requests that some Alpine instance create a transaction; that instance subsequently serves as the coordinator for that transaction. If actions are to be carried out in other Alpine instances under the same transaction, the client must cause those instances to become workers for that transaction.
Reads and writes are performed under the transaction. The transaction machinery ensures that other clients will not see the state of any of the Alpine instances in a partially-updated (inconsistent) state; this is done by setting locks on files or parts of files, as discussed in section 5.2.
Eventually the client either commits or aborts the transaction. Committing causes all writes occurring under the transaction to be made permanent in the state of the file system and to be made visible to other transactions. Aborting causes all the writes to be abandoned, and makes the permanent state be as if those writes had never occurred. A transaction can also be aborted by a server or client crash or by a detected deadlock among transactions attempting to lock data in conflicting ways.
5.1. Transaction management
A transaction is begun by a client calling AlpineTransaction.Create on some Alpine instance. This returns a permanently unique transaction ID that may be passed to other operations. (Programmers should note that the TransID is a rather bulky object, which is why very few procedures take TransID arguments directly.)
A TransID may be treated as a capability, since it contains enough bits of unpredictable information to make it extremely difficult to forge. Therefore a transaction is shared among clients or private to the initiator according to whether or not the initiator hands the TransID to other clients. Alpine enforces no restrictions on who may present a particular TransID.
In the AlpineUser package, AlpTransaction.Create returns a new AlpTransaction.Handle referencing an object that holds the TransID. This Handle, being a Cedar REF, is of course valid only on the machine that created it. If clients on other machines need to participate in the same transaction, the caller must extract the TransID from the object and pass it to those clients. They may then call AlpTransaction.CreateWithTransID, which creates a local AlpTransaction.Handle for the same transaction.
The client must also call CreateWorker on each Alpine instance that is to participate in the transaction prior to issuing any other requests to that instance. This call informs the worker of the identity of the transaction's coordinator.
It is standard practice to create a transaction on the Alpine instance containing the data to be accessed, or on one such instance if multiple servers are to be accessed. To accommodate this normal case, Create automatically starts a worker on the same Alpine instance as the coordinator unless instructed not to. A client on a workstation should not normally use its local Alpine instance to coordinate transactions involving multiple servers, since a crashed coordinator can tie up resources on other file systems. It is perfectly ok for a workstation's Alpine instance to coordinate its own transactions, or to act as a worker in a multiple file system transaction, since in these cases the only resources that it can tie up in a crash are its own.
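The coordinator and worker setup described above can be sketched as a small Python model. This is an illustration only, not the Alpine interface: the class AlpineInstance, its methods, the create_local_worker parameter, and the use of LookupError for a request to an instance that has no worker are all assumptions made for the example.

```python
import secrets


class AlpineInstance:
    """Toy model of one Alpine instance (hypothetical, for illustration)."""

    def __init__(self, name):
        self.name = name
        # trans_id -> name of the instance coordinating that transaction
        self.coordinator_of = {}

    def create(self, create_local_worker=True):
        """Begin a transaction; this instance becomes its coordinator."""
        # An unpredictable ID, so that it can be treated as a capability.
        trans_id = secrets.token_hex(16)
        if create_local_worker:
            # Normal case: the coordinator also holds the data.
            self.create_worker(trans_id, coordinator=self.name)
        return trans_id

    def create_worker(self, trans_id, coordinator):
        """Make this instance a worker and tell it who the coordinator is."""
        self.coordinator_of[trans_id] = coordinator

    def read(self, trans_id, what):
        """Any request requires that a worker already exists here."""
        if trans_id not in self.coordinator_of:
            raise LookupError("no worker for this transaction on " + self.name)
        return self.name + ":" + what


server_a = AlpineInstance("A")
server_b = AlpineInstance("B")
trans = server_a.create()             # A coordinates and is also a worker
server_b.create_worker(trans, "A")    # required before any request to B
print(server_a.read(trans, "page0"), server_b.read(trans, "page0"))
```

In this model, omitting the create_worker call on B makes the read on B fail, which mirrors the requirement that CreateWorker precede any other request to an instance.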
When the client has completed performing the desired operations, it should call Finish to terminate the transaction. Normally it should specify requestedOutcome=commit; but to undo all the operations performed under the transaction it may specify requestedOutcome=abort. The actual outcome may differ from the requested one if Finish has already been called or if the transaction has been aborted for some other reason (server crash, lock deadlock, etc.).
Long-running transactions are discouraged. A transaction should encompass the minimum number of operations required to maintain the consistency of the higher-level abstraction represented by the data accessed during the transaction. Transactions that run for long periods of time tie up excessive resources and are likely to be aborted spontaneously, thereby wasting large amounts of work.
There is an option to terminate a transaction without losing the state that has been established (open files and locks). If Finish is called with continue=TRUE and the transaction finishes with outcome=commit, a new transaction is created to which the open files and locks are transferred. This is a convenient way to break up a long sequence of operations that don't need to be treated together as a single atomic unit. In this case, AlpineTransaction.Finish returns the TransID for the new transaction, and the old TransID becomes invalid. AlpTransaction.Finish assigns the new TransID to the existing Handle, so the Handle remains valid.
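The effect of Finish with continue=TRUE can be sketched with a toy Python model. The names Trans, Handle, finish, and continue_ are hypothetical, and the integer IDs stand in for real TransIDs. The model also assumes that calling finish on a transaction that has already terminated simply reports the earlier outcome; the text above says only that the actual outcome may differ from the requested one.

```python
import itertools
from dataclasses import dataclass, field
from typing import Optional

_next_id = itertools.count(1)  # stand-in for unique TransID generation


@dataclass
class Trans:
    trans_id: int = field(default_factory=lambda: next(_next_id))
    open_files: set = field(default_factory=set)
    locks: dict = field(default_factory=dict)
    outcome: Optional[str] = None  # None while the transaction is active


def finish(t, requested_outcome="commit", continue_=False):
    """Return (actual_outcome, successor_transaction_or_None)."""
    if t.outcome is not None:
        # Already finished or aborted: the requested outcome is ignored.
        return t.outcome, None
    t.outcome = requested_outcome
    files, locks = t.open_files, t.locks
    t.open_files, t.locks = set(), {}
    if requested_outcome == "commit" and continue_:
        # Open files and locks move to a new transaction with a new ID.
        return "commit", Trans(open_files=files, locks=locks)
    return requested_outcome, None


class Handle:
    """Models the AlpTransaction behavior: the Handle stays valid."""

    def __init__(self):
        self.trans = Trans()

    def finish(self, requested_outcome="commit", continue_=False):
        outcome, successor = finish(self.trans, requested_outcome, continue_)
        if successor is not None:
            self.trans = successor  # Handle now refers to the new transaction
        return outcome
```

A caller holding a Handle can therefore keep using it after a continuing commit, while a caller holding only the old ID must switch to the new one.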
5.2. Semantics of locks
This is a brief functional overview of Alpine file locks; for background material and further detail, see "Alpine lock manager concepts".
All locking is performed implicitly, as a side-effect of read and write operations on files and other objects. Consequently, there is no public interface for dealing explicitly with locks; however, most operations provide optional means for a certain amount of client control over locks. (Local clients, i.e., ones located on the same machine as the Alpine instance, can call the lock manager directly for the purpose of setting higher-level "logical" locks.)
When an object such as a file page is accessed, it is first locked in some mode. A read access must lock the object in read, update, or write mode, while a modify access must lock the object in update or write mode. Locking in update or write mode during a read access is sometimes useful when the client knows that it will later modify the object during the same transaction and wishes to be assured that locking will be possible. The distinction between update and write modes will be explained shortly.
Once a lock is set on an object under some transaction, it will prevent certain types of access by any other transaction until the transaction holding the lock has terminated. Note that locks arbitrate access only among different transactions; operations performed within the transaction that set the locks proceed unhindered. (This is true even where multiple client processes, conversations, etc., are involved. In particular, concurrent calls to Alpine procedures will be performed in an arbitrary order if they are all part of the same transaction.)
The interaction between locks is defined by the function Compat[r, e], where r is a lock mode being requested and e is the mode of an existing lock on the same object by another transaction; it returns TRUE if r may be set immediately and FALSE if r may not be set until e has been removed.
                       existing:
requested:    read     update   write
read          TRUE     TRUE     FALSE
update        TRUE     FALSE    FALSE
write         FALSE    FALSE    FALSE
The interaction between read and write locks is conventional: multiple readers or at most one writer (but not both) can coexist. An update lock is effectively a read lock at the time it is set but is converted automatically into a write lock at the time the transaction is committed (which is when any modifications to the locked object are logically performed); thus an update can proceed in parallel with any ongoing reads, but committing the update may require waiting until all read locks have been removed.
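The Compat function can be written out directly as a lookup table. The Python below is a sketch for illustration; the names COMPAT, compat, and can_set are not part of any Alpine interface, and the rule that a request must be compatible with every lock held by other transactions is an assumption that follows from the definition above.

```python
# COMPAT[requested][existing], transcribed from the table above.
COMPAT = {
    "read":   {"read": True,  "update": True,  "write": False},
    "update": {"read": True,  "update": False, "write": False},
    "write":  {"read": False, "update": False, "write": False},
}


def compat(requested, existing):
    """True if `requested` may be set immediately while another
    transaction holds `existing` on the same object."""
    return COMPAT[requested][existing]


def can_set(requested, held_by_others):
    """A request is granted only if it is compatible with every lock
    that other transactions hold on the object."""
    return all(compat(requested, e) for e in held_by_others)


# An update lock can be set alongside existing readers ...
print(can_set("update", ["read", "read"]))
# ... but its conversion to write at commit must wait for them.
print(can_set("write", ["read", "read"]))
```

The two calls at the end correspond to the two halves of the update rule: setting the lock succeeds in the presence of readers, while the commit-time conversion to write does not.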
A transaction may upgrade one of its own locks by converting it to a stronger lock, with relative strength given by read < update < write. This is performed automatically, for example, when an object is read and later written during a single transaction. Of course, this upgrade may be blocked by locks set by other transactions.
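A lock upgrade can be modeled as taking the stronger of the held and requested modes, then checking the result against locks held by other transactions. This is a sketch under that reading of the text; STRENGTH, mode_after_request, and upgrade_blocked are hypothetical names.

```python
# Relative strength: read < update < write.
STRENGTH = {"read": 0, "update": 1, "write": 2}

# COMPAT[requested][existing] for locks held by *other* transactions.
COMPAT = {
    "read":   {"read": True,  "update": True,  "write": False},
    "update": {"read": True,  "update": False, "write": False},
    "write":  {"read": False, "update": False, "write": False},
}


def mode_after_request(held, requested):
    """The mode a transaction holds after asking for `requested` on an
    object it already holds in `held`: never weaker than before."""
    return max(held, requested, key=STRENGTH.__getitem__)


def upgrade_blocked(new_mode, held_by_others):
    """True if some other transaction's lock conflicts with `new_mode`."""
    return any(not COMPAT[new_mode][e] for e in held_by_others)


# Read, then write, the same object while another transaction reads it.
new_mode = mode_after_request("read", "write")
print(new_mode, upgrade_blocked(new_mode, ["read"]))
```

Note that a transaction's own locks never appear in held_by_others, since locks arbitrate access only among different transactions.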
Files are locked at two levels: the entire file may be locked in some mode, or pages and properties of the file may be individually locked. Locking the entire file substantially reduces the amount of work the server must do, since individual locks then need not be applied during each operation on the file; this is the appropriate style of locking for bulk file transfers and transactions involving private files. On the other hand, locking individual pages is required for shared data bases to maintain adequate concurrency.
The whole-file lock modes are read, update, and write, just as for other objects; these locks effectively "cover" individual operations whose modes are no stronger than the whole-file lock.
When pages of a file are to be locked individually as operations are performed, the entire file is locked in a weaker intention mode that specifies the strongest page lock expected; these modes are intendRead, intendUpdate, and intendWrite. For all x,y, Compat[intendx, intendy]=TRUE, but Compat[intendx, y]=Compat[x, y] and Compat[x, intendy]=Compat[x, y]. This enables detection of potential conflicts between page and file locks at the time the file is locked.
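The three rules for intention modes extend the basic compatibility table mechanically. The following Python sketch covers only intendRead, intendUpdate, and intendWrite together with the plain modes; the function names are hypothetical, and the combination modes are deliberately left out.

```python
# BASE[requested][existing] for the plain modes.
BASE = {
    "read":   {"read": True,  "update": True,  "write": False},
    "update": {"read": True,  "update": False, "write": False},
    "write":  {"read": False, "update": False, "write": False},
}

PREFIX = "intend"


def split(mode):
    """'intendWrite' -> (True, 'write'); 'write' -> (False, 'write')."""
    if mode.startswith(PREFIX):
        return True, mode[len(PREFIX):].lower()
    return False, mode


def file_compat(requested, existing):
    """Compatibility of two file-level lock modes."""
    r_is_intent, r = split(requested)
    e_is_intent, e = split(existing)
    if r_is_intent and e_is_intent:
        # Two intention locks never conflict at the file level;
        # any real conflict is detected on individual pages.
        return True
    # Otherwise an intention mode behaves like the mode it announces.
    return BASE[r][e]


print(file_compat("intendWrite", "intendWrite"))  # page-level locking
print(file_compat("read", "intendWrite"))         # whole-file reader vs. page writer
```

The second call shows the early detection mentioned above: a whole-file read lock conflicts with an existing intendWrite lock when the file is locked, before any page is touched.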
If an individual operation would require a lock stronger than the one implied by an intention-mode file lock, the file lock is automatically upgraded; this upgrade can of course be blocked by file locks set by other transactions.
Additionally, there are combination modes readIntendUpdate and readIntendWrite, which immediately lock the entire file in read mode with intention to perform individual operations requiring update or write locks. Compat[readIntendUpdate, readIntendUpdate]=FALSE, because it is not useful for it to be TRUE (there would be a guaranteed deadlock at commit time).
Most operations have an optional lockOption argument, which permits the client to specify the lock mode to be used and the action to be taken if a conflict occurs; the default values of these arguments are appropriate for most applications. If lockOption.mode is too weak for the operation, it is ignored and the default used instead.
If the lock cannot immediately be set and lockOption.ifConflict=fail, the operation immediately raises the error LockFailed[conflict]; otherwise the operation waits until the lock can be set, raising LockFailed[timeout] if a timeout occurs. If there is a detected deadlock among two or more transactions attempting to acquire the same set of locks, one of the transactions is automatically aborted, and its operation fails with the error Unknown[transID].
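The conflict handling just described can be sketched as follows. This is a simplified Python model, not the Alpine implementation: acquire, try_set, the "wait" value for if_conflict, and the polling loop are all assumptions, and deadlock detection (which would abort the transaction) is not modeled.

```python
import time


class LockFailed(Exception):
    """Models the LockFailed error; `why` is 'conflict' or 'timeout'."""

    def __init__(self, why):
        super().__init__(why)
        self.why = why


def acquire(try_set, if_conflict="wait", timeout_s=1.0, poll_s=0.01):
    """Set a lock using `try_set`, a callable that returns True once the
    lock has been granted.

    if_conflict="fail": raise LockFailed('conflict') at once on conflict.
    otherwise:          keep retrying, raising LockFailed('timeout') when
                        `timeout_s` seconds have passed.
    """
    if try_set():
        return
    if if_conflict == "fail":
        raise LockFailed("conflict")
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        time.sleep(poll_s)
        if try_set():
            return
    raise LockFailed("timeout")


# A lock that is never available, with the client asking not to wait.
try:
    acquire(lambda: False, if_conflict="fail")
except LockFailed as e:
    print("LockFailed:", e.why)
```

A real lock manager would block the caller and wake it when the conflicting lock is released, rather than poll; polling is used here only to keep the example short.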
6. Other interfaces
AlpineOwner contains the operations for accessing a volume group's owner data base, whose purpose was described in section 3.6. See the interface source file for complete information.
AlpineVolume contains procedures for dealing with the set of volumes and volume groups accessible to an Alpine instance. Many aspects of this interface are speculative, and implementation is incomplete at this time.
The AlpineUser package exports all the AlpineOwner and AlpineVolume operations through the AlpTransaction interface. Since these operations take an AlpTransaction.Handle as their first argument, they may be invoked with object notation.
References
[Alpine]
Alpine documentation is kept on-line in the [Indigo]<Alpine>Doc> directory. In the file names, "*" stands for a number; successive releases of a document are assigned the next higher number.
"Alpine file server overview", AlpineOverview.press
"Alpine lock manager concepts", LockConcepts*.bravo
"FileStore interface internals", FileStoreInternal*.bravo
[Lupine, 1982]
Bruce Nelson and Andrew Birrell, "Lupine user's guide: an introduction to remote procedure calls in Cedar", [Indigo]<Cedar>Documentation>LupineUsersGuide.press, July 1982.
[Needham & Schroeder, 1978]
Roger M. Needham and Michael D. Schroeder, "Using encryption for authentication in large networks of computers", Communications of the ACM, vol. 21 no. 12, December 1978.
Change history
Version 9; June 7, 1983 5:36 pm. Convert to Tioga format. Remove all excerpts from the public interfaces; the interfaces themselves are now the primary documentation. Bring RPC description up-to-date. Document the AlpineUser package.
Version 8; October 13, 1981 5:29 PM. Add AlpineFile.Close, LockPages, UnlockPages. Finishing a transaction with commitAndContinue now returns a new TransID rather than permitting additional operations under the old transaction. Substantially simplify interface to RPC runtime machinery.
Version 7; September 21, 1981 8:38 AM. Further refine to conform with RPC design. Add high water mark mechanism. Delete the individual property read/write procedures. Document semantics of locks. Change LockMode to LockOption. Change to read/write owner entry in same style as file properties.
Version 6; September 14, 1981 8:46 AM. Change style of this memo to make it serve better as client programmer's documentation for the FileStore public interface; remove internal details (implementation strategies and the like) to a separate memo, "FileStore interface internals". Split the former FileStore interface into four pieces: AlpineEnvironment, AlpineAccess, AlpineFile, and AlpineTransaction. Bring conversation initiation and authentication into conformity with the current RPC design. Make a first cut at specifying the signals.
Version 5; August 25, 1981 12:41 PM. Make interface handle-oriented, since RPC requires it; see Handle, Create, Destroy. A Handle corresponds to a client talking to a server, not to a volume or a machine; some procedures take a volume (or volume group) parameter. Make the identification FileStore = log, and introduce a separate notion of logical disk volume and volume group. Add no-logging option to Open/CreateFile; this takes the place of noTransaction, since it handles the one situation in which we can see that clients will want to give up recovery to reduce the size of the log (large-scale replication of files, where the home copy is kept logged and other copies are unlogged.) OpenFileID -> OpenFileID, AccessListID -> AccessList (a list of RNames.) Add section on owner database (CreateOwner, ... , DestroyOwner), and expanded file properties section to include all properties we now plan to support.
Version 4; May 29, 1981 5:36 PM. Introduced notion of file owner, who is charged/credited for pages used/released in Create/DeleteFile and SetLength.
Version 3; May 15, 1981 11:02 AM. Introduced types OpenFileID, ClientID, AccessListID, ... . Added noTransaction. More detail on file opening and creation; support file create with and without an externally supplied ID (used to be only with.) Length -> ByteLength throughout. List-oriented file property operations. Much more work required on property interface, including locking issues.
Version 2; March 12, 1981 11:22 AM. Added more detail to the "Files" section, including procedures to manipulate file attributes. Made lock modes consistent with LockDesign0. Made minor changes to Transaction section: RegisterWorker takes a "restart ID" to allow unilateral abort of worker; FinishTransaction and FinishWorker take a requiredOutcome that may be commitAndContinue or commitAndTerminate; expanded comments on Transaction section operations.
Version 1; February 25, 1981 3:54 PM. Lock stuff moved out (to LockConcepts0.bravo, LockDesign0.bravo.) Sierra -> Alpine.
Version 0; February 19, 1981 3:16 PM. Interface name changed from "BFS". We are attempting to erase the distinction between the regular interface and "on the wire". The volatile structures "Coordinator", "Worker", and "File" have disappeared. The type LockID has been defined, following Ed Taft's suggestion.
Unfinished business
Log/recovery interface.
Cache registration (at what level should this be done?)
Lock interface (no conceptual difficulty, details can be deferred to implementation time.)
ContentID property?
Interfaces to backup and archive systems.
Indefinitely deferred business
Validation of remote page caches (through a history database that is coordinated with file page updates.)
Transaction save points?