Tenex, a Paged Time Sharing System for the PDP-10
Communications of the ACM, March 1972, Volume 15, Number 3
Daniel G. Bobrow, Jerry D. Burchfiel, Daniel L. Murphy, and Raymond S. Tomlinson, Bolt Beranek and Newman Inc.
TENEX is a new time sharing system implemented on a DEC PDP-10 augmented by special paging hardware developed at BBN. This report specifies a set of goals which are important for any time sharing system. It describes how the TENEX design and implementation achieve these goals. These include specifications for a powerful multiprocess large memory virtual machine, intimate terminal interaction, comprehensive uniform file and I/O capabilities, and clean flexible system structure. Although the implementation described here required some compromise to achieve a system operational within six months of hardware checkout, TENEX has met its major goals and provided reliable service at several sites and through the ARPA network.
Storage organization and management in TENEX
Daniel L. Murphy
AFIPS ’72 (Fall, part I) Proceedings of the December 5-7, 1972, fall joint computer conference, part I
Pages 23-32, Anaheim, California
The first of these two papers discusses TENEX as a whole; much of it is not about file systems, but about a page and a half covers the TENEX file system. The second paper goes into greater detail about storage – including the file system – for TENEX. I have picked these papers for several reasons:
- They demonstrate the impact of the MULTICS work on the systems that follow (certainly beyond the obvious UNIX work).
- They introduce the concept of virtual memory integration with the file system
- They introduce the concept of copy-on-write
- They show the fundamental drive to maintain backwards compatibility
- They introduce the concept of a suffix (or extension) as a means of identifying the purpose of a file
- They delve into the details of open file state management
A TENEX file has a compound name structure:
A powerful and versatile directory and file naming facility is provided in which a particular file is identified by a fixed-depth path which includes device, directory name, file name, extension, and version.
The fixed-depth path is a limitation the TENEX developers chose to implement for backwards compatibility with existing PDP-10 programs, an early example of how application compatibility is often a critical concern for operating systems development. The authors do note they are considering expanding upon this to make it arbitrary depth – a feature of MULTICS.
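To make the fixed-depth idea concrete, here is a minimal Python sketch that splits a path into exactly the five components the paper lists. The surface syntax `DEVICE:<DIRECTORY>NAME.EXT;VERSION` is an assumption made for illustration, and `FilePath` and `parse_path` are hypothetical names that do not come from the papers.

```python
import re
from dataclasses import dataclass

# Assumed surface syntax (illustrative only): DEVICE:<DIRECTORY>NAME.EXT;VERSION
_PATH = re.compile(
    r"^(?P<device>[^:<>]+):<(?P<directory>[^<>]+)>"
    r"(?P<name>[^.;<>]+)\.(?P<extension>[^.;<>]*);(?P<version>\d+)$"
)


@dataclass(frozen=True)
class FilePath:
    """The five fixed components of a path; there is no room for nesting."""
    device: str
    directory: str
    name: str
    extension: str
    version: int


def parse_path(text: str) -> FilePath:
    """Split a path into its five parts, or raise ValueError if it does not fit."""
    m = _PATH.match(text)
    if m is None:
        raise ValueError(f"not a five-part path: {text!r}")
    return FilePath(m["device"], m["directory"], m["name"],
                    m["extension"], int(m["version"]))
```

Because the depth is fixed, a single pattern is enough: there is exactly one directory component, so a path with a second directory level is rejected rather than walked recursively as it would be in an arbitrary-depth hierarchy.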
Both papers also discuss the Job concept, the idea of a set of related processes. The implication is that processes within a single job can share resources, thus providing more of that “balance between sharing and isolation” that operating systems have to handle. When a path name is successfully resolved to a file, a Job File Number (JFN) is created in a table. The JFN encapsulates the information about how to find the given file, so subsequent operations use an index value instead of the full name – in other words, a file descriptor or file handle. “Once the initial association of JFN and file has been established, the JFN is used for all ensuing operations on the file, including sequential reading and writing, opening, closing, etc.”
TENEX then allows random access to the file by combining the JFN with an index identifying the desired element. The authors point out that this is more flexible than previous systems in which the file was not random access.
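The handle idea and the (JFN, index) form of random access can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: files are in-memory `bytearray` objects, and `JobFileTable` and its methods are hypothetical names, not the TENEX monitor calls.

```python
class JobFileTable:
    """Per-job table of files; a JFN is simply an index into `entries`."""

    def __init__(self):
        self.entries = []  # slot -> file contents, or None once released

    def get_jfn(self, path, files):
        """Resolve the path once and hand back a small integer handle."""
        data = files[path]  # the only place the path name is consulted
        self.entries.append(data)
        return len(self.entries) - 1

    def read(self, jfn, index):
        """Random access: any element is named by the pair (JFN, index)."""
        return self.entries[jfn][index]

    def write(self, jfn, index, value):
        self.entries[jfn][index] = value

    def release(self, jfn):
        """Drop the association; the slot is not reused in this sketch."""
        self.entries[jfn] = None
```

After `get_jfn` returns, the path name never appears again: every later operation carries only the integer handle plus an index, which is the same shape as a modern file descriptor combined with an offset.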
This flexibility matters when describing the page map for a given process. The Process Map points from an entry in the virtual address space to a corresponding JFN and index (offset). Thus, the contents of that page can be retrieved on demand from the underlying file system.
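A minimal sketch of that mapping, assuming a toy page size and files modeled as Python lists, looks like this. The names `Process`, `map_page`, and `load` are hypothetical, and the fault handler is an ordinary method call rather than a hardware trap.

```python
PAGE_SIZE = 4  # illustrative value, kept tiny so the arithmetic is easy to follow


class Process:
    def __init__(self, jfn_table):
        self.jfn_table = jfn_table  # jfn -> file contents (a list of words)
        self.page_map = {}          # virtual page -> (jfn, file page)
        self.resident = {}          # virtual page -> contents loaded so far
        self.faults = 0

    def map_page(self, vpage, jfn, file_page):
        """Record where the page lives; nothing is read yet."""
        self.page_map[vpage] = (jfn, file_page)

    def load(self, address):
        """Read one word, pulling the page in from its file on first touch."""
        vpage, offset = divmod(address, PAGE_SIZE)
        if vpage not in self.resident:
            self._fault(vpage)
        return self.resident[vpage][offset]

    def _fault(self, vpage):
        jfn, file_page = self.page_map[vpage]  # KeyError means an unmapped page
        start = file_page * PAGE_SIZE
        self.resident[vpage] = self.jfn_table[jfn][start:start + PAGE_SIZE]
        self.faults += 1
```

Mapping a page costs nothing until the process touches it; the first access takes a fault, and later accesses to the same page are served from the resident copy.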
None of this should look particularly surprising to anyone familiar with modern operating systems, of course. This just happens to be part of the path to get to where we are today. The papers go into greater detail here, including access control, but that isn’t germane to my file systems focus.
Since the file path names identify files over the domain of all jobs in the system, it is evident that our naming and mapping procedures readily provide a means for sharing storage. Using the appropriate path names (including legality checks), processes in two or more different jobs can identify the same file, and each can obtain a JFN for it. Nothing in the mapping procedures specified above requires that either process be aware of the other’s access, and so each process constructs an identifier and places it in its process map (Figure 4).
Sharing at this level would be particularly important because of the limited address space and desire to share code – the papers discuss this, and point out the benefits of this form of sharing.
This leads to their inclusion of copy-on-write. “One other important TENEX feature which facilitates sharing is a type of page access called copy-on-write. To our knowledge, this facility was first developed and used on the BBN-LISP system for the XDS-940.” Thus, while not original to TENEX, this is a logical extension beyond what MULTICS had described. Copy-on-write is a mainstay of both modern operating systems and some file systems.
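The behavior is easy to show in a small Python sketch. It assumes pages are plain lists and replaces the hardware write trap with an explicit check; `Frame` and `PageMap` are hypothetical names used only for illustration.

```python
class Frame:
    """A physical page that several page maps may reference at once."""

    def __init__(self, words):
        self.words = list(words)


class PageMap:
    def __init__(self):
        self.entries = {}  # virtual page -> [frame, copy_on_write flag]

    def map_shared(self, vpage, frame):
        """Share the frame; a private copy is made only if we ever write."""
        self.entries[vpage] = [frame, True]

    def read(self, vpage, offset):
        return self.entries[vpage][0].words[offset]

    def write(self, vpage, offset, value):
        entry = self.entries[vpage]
        if entry[1]:
            # First write to a shared page: copy it, then write to the copy.
            entry[0] = Frame(entry[0].words)
            entry[1] = False
        entry[0].words[offset] = value
```

Readers keep sharing a single frame indefinitely. Only the writer pays for a copy, and only for the page it actually modifies, which is why the technique suits shared code and mostly-read data.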
Interestingly, TENEX seems to implement a rudimentary page cache as well:
To implement the file sequential monitor calls (e.g., byte-in, byte-out) the monitor maintains a number of “window” pages in a separate map invisible to the user process. For each file with sequential operations in progress, the monitor maps the file page which is to receive or provide the next byte. Each call from the user causes one or more bytes to be loaded from or stored into this page, and a count updated to determine if a new page should be mapped. Movement through the file is accomplished by mapping successive pages, and the sequential access module does not have to be aware of the physical device on which the page resides nor interface with I/O driver modules to read or write it. This modularity is very satisfying from an operating system design point of view.
Thus, byte level access to block level devices is managed via this window page mechanism. The files are not strictly memory mapped, though, so this is more like a buffer cache than a page cache.
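Here is a rough Python sketch of the window-page idea for sequential input. It assumes a toy page size and a caller-supplied `fetch_page` function standing in for the monitor's mapping step; `SequentialReader` and `byte_in` are hypothetical names, not the actual monitor calls.

```python
PAGE_SIZE = 4  # illustrative value only


class SequentialReader:
    def __init__(self, fetch_page):
        self.fetch_page = fetch_page  # page number -> bytes (the mapping step)
        self.page_no = -1             # no page mapped into the window yet
        self.window = b""             # the currently mapped window page
        self.count = 0                # bytes already consumed from the window
        self.maps = 0                 # how many times a page was mapped

    def byte_in(self):
        """Return the next byte, mapping the following page when one runs out."""
        if self.count == len(self.window):
            self.page_no += 1
            self.window = self.fetch_page(self.page_no)
            self.count = 0
            self.maps += 1
            if not self.window:
                raise EOFError("no more pages in file")
        value = self.window[self.count]
        self.count += 1
        return value
```

The reader never learns where a page physically lives. It only asks for "page n", which mirrors the modularity the paper calls out: the sequential access code is independent of the device and its driver.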
They also use the file system to implement inter-process communications – a form of file-backed shared memory.
Page management is tightly tied to this implementation as well, though the description involves what we would likely consider the memory management unit and page fault handling logic as well as the page to file/offset mapping necessary to provide the system’s demand paging.
Two other interesting aspects of their file system model are a pair of extra mapping layers: one that maps a logical storage address to a physical storage location, and another that maps multiple distinct page references to a single storage block.
The underlying rationale here is that this permits relocating the storage to different locations, typically from higher speed storage (when warm/hot) to slower speed storage (when cold).
This mechanism doesn’t involve changing the actual description of the storage and instead moves to a logical storage addressing model. It was interesting to me to see this level of indirection added in such an early system, but clearly the mismatch in speeds between various types of storage dictated the importance of this scheme. Once again, it is interesting to see how little the problems we face have actually changed.
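The relocation layer can be sketched as a single table of indirection. This Python example is a simplified illustration under my own assumptions: devices are just string labels, and `StorageMap`, `store`, and `migrate` are hypothetical names rather than anything from the papers.

```python
class StorageMap:
    """Maps stable logical addresses to wherever the data currently lives."""

    def __init__(self):
        self.location = {}  # logical address -> (device, physical block)
        self.blocks = {}    # (device, physical block) -> contents

    def store(self, logical, device, block, contents):
        self.location[logical] = (device, block)
        self.blocks[(device, block)] = contents

    def read(self, logical):
        """Callers name data by logical address only."""
        return self.blocks[self.location[logical]]

    def migrate(self, logical, device, block):
        """Move the data to another device; the logical address is unchanged."""
        old = self.location[logical]
        self.blocks[(device, block)] = self.blocks.pop(old)
        self.location[logical] = (device, block)
```

Anything that holds the logical address, such as a page map entry, stays valid across a migration because only the one table entry changes. That is what makes it practical to move cold pages to slower storage without touching every reference to them.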
The data sharing model also uses an extra level of indirection. I’m familiar with this model from my own work in Windows, where shared memory is indirectly mapped in a similar fashion. That this mechanism was around in the early 1970s is once again a reminder of how little operating systems have fundamentally changed.
There are many aspects of these papers that I have glossed over, in no small part because they don’t really apply to modern systems – we don’t have to worry about drum memories, for example, any more than we need to worry about punch card readers. These two papers, however, clearly lay out a deeper realization of the file system than I have seen in prior work.
TENEX differed from MULTICS in a number of ways and the two systems remained competitors for many years. TENEX ultimately would become TOPS-20 and in turn be supported by Digital Equipment Corporation. It was an important part of the early (pre-VAX) ARPANET and survived for many years as a viable system.
If you would like to read more about this, I’d recommend Dan Murphy’s Origins and Development of TOPS-20 post. It provides further fascinating background on how TENEX evolved and how systems evolved. I leave you with the final words from that post:
Although this book is about DEC’s 36-bit architecture, it is clear now that hardware CPU architectures are of declining importance in shaping software. For a long time, instruction set architectures drove the creation of new software. A new architecture would be introduced, and new software systems would be written for it. The 36-bit architecture was large in comparison to most other systems which permitted interactive use. It was also lower in cost than most other systems of that size. These factors were important in the creation of the kind of software for which the architecture is known.
Now, new architectures are coming along with increasing frequency, but they are simply “slid in” under the software. The software systems are far too large to be rewritten each time, and a system which cannot adapt to new architectures will eventually suffer declining interest and loss of competitive hardware platforms. TOPS-20 didn’t pass that test, although it did contribute a number of ideas to the technology of interactive systems. How far these ideas ultimately spread is a story yet to be told.
There is considerable insight in this for me, particularly the admonishment “software systems are far too large to be rewritten each time” as it resonates with (one of) my own research directions.