Category Archives: File Systems

Challenges of Capturing System Activity

A key aspect of the work I am doing for Indaleko is to “capture system activity” so that it can be used to form “activity contexts” that can then be used to inform the process of finding relevant information. As part of that, I have been working through the work of Daniela Vianna. While I have high-level descriptions of the information she collected and used, I need to reconstruct this. She collects data from a variety of sources. The most common source of information comes from web APIs to services such as Google and Facebook. In addition, she also uses file system activity information.

Since my background is file systems, I decided to start on the file system activity front first. Given that I’ve been working with Windows for three decades now, I decided to leverage my understanding of Windows file systems to collect such information. One nice feature of the NTFS file system on Windows is its support for a form of activity log known as the “USN Journal.” Of course, one of my handicaps is that I am used to using the native operating system API, not the libraries that are implemented on top of it. This is because when building file systems on Windows I have always been interested in testing the full kernel file systems interface. While there are a few specific features that cannot be exercised with just applications, there are still a number of interfaces that cannot be tested using the typical Win32 API that can be tested using the native API. In recent years the number of features that have been hidden from the Win32 API has continued to decrease, which has diminished the need to use the native API. I just haven’t had any strong need to learn the Win32 API – why start now?

I decided the model I want to use is a service that pulls data from the USN journal and converts it into a format suitable for storing in a MongoDB database. I decided to go with Mongo because that is what Vianna used for her work. The choice at this point is somewhat arbitrary but MongoDB makes sense because it tends to work well with semi-structured data, which is what I will be handling.

Similarly, I decided that I’d write my service for pulling USN Journal data from the NTFS file system(s) in C# since I have written some C# in the past, it makes doing some of the higher level tasks I have much easier, and is well-supported on Windows. I have made my repository public though I may restructure and/or rename it at some point (currently I call it CSharpToNativeTest because I was trying to invoke the native API as unmanaged code from C#). The most common approach to this is to utilize a specific mechanism (the “PInvoke” mechanism) but after a bit of trial-and-error I decided I wanted something that would be easier for me to debug, so instead of pulling the native routine directly from ntdll.dll I load it from my own DLL (written in C) and that then invokes the real native call. This allows me to see how data is being marshaled and delivered to the C language wrapper. I also tried to make the native API “more C# friendly.” I am sure it could be more efficient, but I wanted to support a model that I could extend and hopefully it will be easier to make it more efficient should that prove necessary.

One thing I did was to script the conversion of all the status values in ntstatus.h into a big C# enum type. The benefit of this is that when debugging I can automatically see the mnemonic name of the status code as well as its numeric value. I then decided to provide the layer needed to map between the various volume names used on Windows: device names, device IDs, and symbolic links (drive letters). While I have not yet added it, I wrote things so that it should be fairly straightforward to add a background thread which wakes up when devices arrive or disappear. As I have noted before, “naming is hard.” This is just one more example of the flexibility and challenges with aliasing and naming.
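To give a flavor of what scripting that conversion might look like, here is a minimal sketch in Python. It is not the script I actually used. It assumes the defines in ntstatus.h follow the shape `#define STATUS_X ((NTSTATUS)0x...L)`, and the names `parse_status_defines`, `emit_csharp_enum`, and `NtStatus` are made up for illustration.

```python
import re

# Assumed line shape:  #define STATUS_ACCESS_DENIED ((NTSTATUS)0xC0000022L)
# Anything that does not match this shape is skipped.
DEFINE_RE = re.compile(
    r"^\s*#define\s+(STATUS_\w+)\s+\(\(NTSTATUS\)\s*(0x[0-9A-Fa-f]{1,8})L?\)"
)


def parse_status_defines(header_text):
    """Return (name, value) pairs for every STATUS_* define found."""
    pairs = []
    for line in header_text.splitlines():
        match = DEFINE_RE.match(line)
        if match:
            pairs.append((match.group(1), int(match.group(2), 16)))
    return pairs


def emit_csharp_enum(pairs, enum_name="NtStatus"):
    """Render the pairs as the source text of a C# enum backed by uint."""
    lines = [f"public enum {enum_name} : uint", "{"]
    for name, value in pairs:
        lines.append(f"    {name} = 0x{value:08X},")
    lines.append("}")
    return "\n".join(lines)


# Two defines in the assumed shape, standing in for the real header.
SAMPLE = """
#define STATUS_SUCCESS ((NTSTATUS)0x00000000L)
#define STATUS_ACCESS_DENIED ((NTSTATUS)0xC0000022L)
"""

if __name__ == "__main__":
    print(emit_csharp_enum(parse_status_defines(SAMPLE)))
```

Backing the enum with `uint` is a choice made for this sketch so that the error codes, which have the high bit set, fit without a cast.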

Finally, I turned my attention to the USN journal. I found some packages for decoding USN journal entries; most were written to parse the data from the drive, while a few managed dynamic access. Since I want this to be a service that monitors the USN journal and keeps adding information into the database, I decided to write C# code to use the API for retrieving that information. At this point, what I have is the ability to scan all the volumes on the machine – even if they do not have drive letters – and query them to see if they support a USN journal. I do this properly – I query the file system attributes (using the NtQueryVolumeInformationFile native API) and check if the bit showing USN journal support is marked. I do not use the file system name, an approach I’ve always considered to be a hack, especially since I have been in the habit of writing file systems that support NTFS features, including named data streams, extended attributes, and object IDs. In fact, the ReFS file system on Windows also supports USN journals, so I’m not just being my usual pedantic developer self in this instance.
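The attribute check itself is just a bit test. The sketch below, in Python rather than my C#, shows the idea on a raw attribute value such as the `FileSystemAttributes` field returned by the volume query. The flag values are the ones defined in the Windows SDK headers. The helper names and the sample value are hypothetical.

```python
# File system attribute flags, as defined in the Windows SDK (winnt.h).
FILE_SUPPORTS_OBJECT_IDS = 0x00010000
FILE_NAMED_STREAMS = 0x00040000
FILE_SUPPORTS_EXTENDED_ATTRIBUTES = 0x00800000
FILE_SUPPORTS_USN_JOURNAL = 0x02000000

# Labels are for display only and are not part of any Windows API.
FEATURE_FLAGS = {
    "object IDs": FILE_SUPPORTS_OBJECT_IDS,
    "named streams": FILE_NAMED_STREAMS,
    "extended attributes": FILE_SUPPORTS_EXTENDED_ATTRIBUTES,
    "USN journal": FILE_SUPPORTS_USN_JOURNAL,
}


def supports_usn_journal(fs_attributes):
    """True when the volume advertises USN journal support."""
    return bool(fs_attributes & FILE_SUPPORTS_USN_JOURNAL)


def describe_features(fs_attributes):
    """List the labels of every feature bit set in the attribute value."""
    return [label for label, flag in FEATURE_FLAGS.items() if fs_attributes & flag]


if __name__ == "__main__":
    # Hypothetical attribute value, not read from a real volume.
    sample = FILE_SUPPORTS_USN_JOURNAL | FILE_NAMED_STREAMS
    print(supports_usn_journal(sample), describe_features(sample))
```

Because the test looks only at the advertised capability, it works the same way for NTFS, ReFS, or any other file system that sets the bit.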

At this point, I am able to identify volumes that support USN journals, open them and find out if USN is turned on (it is by default on the system volume, which is almost always the “C:” drive, though I enjoy watching things break when I configure a system to use some other drive letter.) I then extract the information and convert it to in-memory records. At the moment I just have it wait a few seconds and pull the newest records, but my plan is to evolve this into a service that I can run and it can keep pulling data and pushing it into my MongoDB instance.
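As an illustration of the "convert it to in-memory records" step, here is a hedged Python sketch that decodes one version 2 USN record from a raw buffer into a dictionary, which is roughly the shape a MongoDB document would take. The field layout follows the documented `USN_RECORD_V2` structure. The dictionary keys, the function name, and the subset of reason flags shown are my own choices for this example, and my actual service is written in C#.

```python
import struct
from datetime import datetime, timedelta, timezone

# Fixed-size part of USN_RECORD_V2 (60 bytes, little-endian):
# RecordLength, MajorVersion, MinorVersion, FileReferenceNumber,
# ParentFileReferenceNumber, Usn, TimeStamp, Reason, SourceInfo,
# SecurityId, FileAttributes, FileNameLength, FileNameOffset.
USN_V2_HEADER = struct.Struct("<IHHQQqqIIIIHH")

# FILETIME counts 100-nanosecond ticks since 1601-01-01 UTC.
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)

# A subset of the USN_REASON_* flags, enough for illustration.
REASONS = {
    0x00000001: "DATA_OVERWRITE",
    0x00000002: "DATA_EXTEND",
    0x00000004: "DATA_TRUNCATION",
    0x00000100: "FILE_CREATE",
    0x00000200: "FILE_DELETE",
    0x00001000: "RENAME_OLD_NAME",
    0x00002000: "RENAME_NEW_NAME",
    0x80000000: "CLOSE",
}


def parse_usn_record_v2(buf, offset=0):
    """Decode one USN_RECORD_V2 starting at offset into a plain dict."""
    (length, major, minor, file_ref, parent_ref, usn, filetime, reason,
     source_info, security_id, attributes, name_len, name_off) = (
        USN_V2_HEADER.unpack_from(buf, offset))
    if major != 2:
        raise ValueError(f"expected a version 2 record, got version {major}")
    # FileNameOffset is relative to the start of the record.
    start = offset + name_off
    name = buf[start:start + name_len].decode("utf-16-le")
    return {
        "record_length": length,  # the next record starts at offset + length
        "file_reference": file_ref,
        "parent_reference": parent_ref,
        "usn": usn,
        "timestamp": FILETIME_EPOCH + timedelta(microseconds=filetime // 10),
        "reasons": [label for flag, label in REASONS.items() if reason & flag],
        "file_attributes": attributes,
        "file_name": name,
    }
```

A caller would walk a buffer holding several records by advancing `offset` by each record's `record_length`. The resulting dictionaries could then be handed to a database driver, though the document schema is still something I need to work out.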

At this point, I realized I do not really know that much about MongoDB so I have decided to start learning a bit more about it. Of course, I don’t want to be a MongoDB expert, so I also have been looking more carefully at Daniela Vianna’s work, trying to figure out what her data might have looked like and think about how I’m going to merge what she did into what I am doing. This is actually exciting because it means I’m starting to think of what we can do with this additional information.

This afternoon I had a great conversation with one of my PhD supervisors about this and she was making a couple of suggestions about ways to consume this data. That she was suggesting things I’d also added to my list was encouraging. What are we thinking:

  • We can consider using “learned index structures” as we begin to build up data sets.
  • We can use techniques such as Google BERT to facilitate dealing with the API data that Vianna’s work used. I pointed out that the challenges of APIs that Vianna pointed out are similar to languages: they have meaning and those meanings can be expressed in multiple ways.
  • The need for being able to efficiently find things is growing rapidly. She was explaining some work that indicates our rate of data growth is outstripping our silicon capabilities. In other words, there is a point at which “brute force search” becomes impractical. I liked this because it suggests what we are seeing with our own personal data is a leading indicator of the larger problem. This idea of storing the meta-data independent of the data is a natural one in a world where the raw information is too abundant for us to just go looking for an item of interest.

So, my work continues, mostly mundane and boring, but there are some useful observations even at this early stage. Now to figure out what I want the data in my database to look like and start storing information there. Then I can go figure out what I did right, what I did wrong, and how to improve things.

Aside: one interesting aspect of the BERT work was their discussion of “transducers.” This reminded me of Gifford’s Semantic File System work, where he used transducers to suck out semantic information from existing files.

Brainiattic: Remember more with your own Metaverse enhanced brain attic

Connecting devices and human cognition

I recently described the idea of “activity context” and suggested that providing this new type of information about data (meta-data) to applications would improve important tasks such as finding. My examining committee challenged me to think about what I would do if my proposed service – Indaleko – already existed today.

This is the second idea that I decided to propose on my blog. My goal is to find how activity context can be used to provide enhanced functionality. My first idea was fairly mundane: how can we improve the “file browsing” experience in a fashion that focuses on content and similarity by combining prior work with the additional insight provided by activity context.

My initial motivation for this second idea was my mental image of a personal library, but I note that there’s a more general model here: displaying digital objects as something familiar. When I recently described this library instantiation of my brain attic to someone, they said “but I don’t think of digital objects as being big enough to be books.” To address this point: I agree, another person’s mental model for how they want to represent digital data in a virtual world need not match my model. That’s one of the benefits of virtual worlds – we can represent things in forms that are not constrained by what things must be in the real world.

In my recent post about file browsers I discussed Focus, an alternative “table top” browser for making data accessible. One reason I liked Focus is that the authors observed how hierarchical organization does not work in this interface. They also show how the interface is useful and thus it is a concrete argument as to at least one limitation of the hierarchical file/folder browser model. Another important aspect of the Focus work was their observation that a benefit of the table top interface is it permits different users to organize information in their own way. A benefit of a virtual “library” is that the same data can be presented to different users in ways that are comfortable to them.

Of course, the “Metaverse” is still an emerging set of ideas. In a recent article about Second Life, Philip Rosedale points out that existing advertising-driven models don’t work well. This raises the question – what does work well?

My idea is that by having a richer set of environmental information available, it will be easier to construct virtual models that we can use to find information. Vannevar Bush had Memex, his extended memory tool. This idea turns out to be surprisingly ancient in origin, from a time before printing when most information was remembered. I was discussing this with a fellow researcher and he suggested this is like Sherlock Holmes’ Mind Palace. This led me to the model of a “brain attic” and I realized that this is similar to my model of a “personal virtual library.”

The Sherlock Holmes article has a brilliant quotation from Maria Konnikova: “The key insight from the brain attic is that you’re only going to be able to remember something, and you can only really say you know it, if you can access it when you need it.”

This resonates with my goal of improving finding, because improving finding improves access when you need it.

Thus, I decided to call this mental model “Brainiattic.” It is certainly more general than my original mental model of a “personal virtual library,” yet I am also permitted to have my mental model of my pertinent digital objects being projected as books. I could then ask my personal digital librarian to show me works related to specific musical bands, or particular weather. As our virtual worlds become more capable – more like the holodeck of Star Trek – I can envision having control of the ambient room temperature and even the production of familiar smells. While our smart thermostats are now capturing the ambient room temperature and humidity level and we can query online sources for external temperatures, we don’t actively use that information to inform our finding activities, despite the reality that human brains do recall such things: “it was cold out,” “I was listening to Beethoven,” or “I was sick that day.”

Thus, having additional contextual information can be used at least to improve finding by enabling your “brain attic.” I suspect that, once activity context is available we will find additional ways to use it in constructing some of our personal metaverse environments.

Using Focus, Relationship, Breadcrumbs, and Trails for Success in Finding

As I mentioned in my last post, I am considering how to add activity context as a system service that can be useful in improving finding. Last month (December 2021) my examination committee asked me to consider a useful question: “If this service already existed what would you build using it?”

The challenge in answering this question was not finding examples, but rather finding examples that fit into the “this is a systems problem” box that I had been thinking about while framing my research proposal. It has now been a month and I realized at some point that I do not need to constrain myself to systems. From that, I was able to pull a number of examples that I had considered while writing my thesis proposal.

The first of these is likely what I would consider the closest to being “systems related.” This hearkens back to the original motivation for my research direction: I was taking Dr. David Joyner’s “Human-Computer Interaction” course at Georgia Tech and at one point he used the “file/folder” metaphor as an example of HCI. I had been wrestling with the problem of scope and finding and this simple presentation made it clear why we were not escaping the file/folder metaphor – it has been “good enough” for decades.

More recently, I have been working on figuring out better ways to encourage finding, and that is the original motivation for my thesis proposal. The key idea of “activity context” has potentially broader usage beyond building better search tools.

In my research I have learned that humans do not like to search unless they have no other option. Instead, they prefer to navigate. The research literature says that this is because searching creates more cognitive load for the human user than navigation does. I think of this as meaning that people prefer to be told where to go rather than being given a list of possible options.

Several years ago (pre-pandemic) Ashish Nair came and worked with us for nine weeks one summer. I worked with him to look at building tools to take existing file data across multiple distinct storage domains and present them based upon commonality. By clustering files according to both their meta-data and simply extracted semantic context, he was able to modify an existing graph data visualizer to permit browsing files based on those relationships, regardless of where they were actually stored. While simple, this demonstration has stuck with me.

Ashish Nair (Systopia Intern) worked with us to build an interesting file browser using a graph data visualizer.

Thus, pushed to think of ways in which I would use Indaleko, my proposed activity context system, it occurred to me that using activity context to cluster related objects would be a natural way to exploit this information. This is also something easy to achieve. Unlike some of my other ideas, this is a tool that can demonstrate an associative model because “walking a graph” is an easy to understand way to walk related information.

There is a small body of research that has looked at similar interfaces. One that stuck in my mind was called Focus. While the authors were thinking of tabletop interfaces, the basic paradigm they describe starts with a “primary file” (the focus) and then shows similar files (driven by content and meta-data) along the edges. This is remarkably like Ashish’s demo.

The exciting thing about having activity context is that it provides interesting new ways of associating files together: independent of location and clustered together by commonality. Both the demo and Focus use existing file meta-data and content similarity, which is useful. With activity context added as well, there is further information that can be used both to refine similar associations and to cluster along a greater number of axes.
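To make the idea concrete, here is a toy sketch in Python of how a Focus-style browser might rank neighbors of the focus file. It is not how the demo or Focus actually worked. Every file is described by a set of tags, and the tag names, file names, and the use of Jaccard similarity are all assumptions made for illustration. The point is only that activity context tags (such as what music was playing) sit alongside meta-data and content tags as extra axes.

```python
def jaccard(a, b):
    """Jaccard similarity of two tag sets: shared tags over all tags."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def related(focus, catalog, top_n=3):
    """Rank the other files in the catalog by similarity to the focus file."""
    focus_tags = catalog[focus]
    scored = [
        (jaccard(focus_tags, tags), name)
        for name, tags in catalog.items()
        if name != focus
    ]
    # Drop unrelated files, then sort by score (high first) and name.
    scored = [(score, name) for score, name in scored if score > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return scored[:top_n]


# Hypothetical catalog spanning two storage locations. The "music:" and
# "weather:" tags stand in for activity context.
CATALOG = {
    "laptop:/budget.xlsx": {"topic:budget", "type:spreadsheet", "music:beethoven"},
    "gdrive:/analysis.ipynb": {"topic:budget", "type:notebook", "music:beethoven"},
    "laptop:/notes.txt": {"topic:budget", "type:text"},
    "gdrive:/photo.jpg": {"type:image", "weather:rain"},
}

if __name__ == "__main__":
    for score, name in related("laptop:/budget.xlsx", CATALOG):
        print(f"{score:.2f}  {name}")
```

In this toy catalog the notebook ranks above the text file only because it shares the activity context tag with the focus. Where each file is stored plays no part in the ranking.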

Thus, I can show off the benefits of Indaleko’s activity context support by using a Focus-style file browser.

Better Finding: Combine Semantic and Associative Context with Indaleko

Last month I presented my thesis proposal to my PhD committee. My proposal doesn’t mean that I am done, rather it means that I have more clearly identified what I intend to make the focus of my final research.

It has certainly taken longer to get to this point than I had anticipated. Part of the challenge is that there is quite a lot of work that has been done previously around search and semantic context. Very recent work by Daniela Vianna relates to the use of “personal digital traces” to augment search. It was Dr. Vianna’s work that provided a solid theoretical basis for my own proposed work.

Our computer systems collect quite an array of information, not only about us but also about the environment in which we work.

In 1945 Vannevar Bush described the challenges to humans of finding things in a codified system of records. His observations continue to be insightful more than 75 years later:

Our ineptitude in getting at the record is largely caused by the artificiality of systems of indexing. When data of any sort are placed in storage, they are filed alphabetically or numerically, and information is found (when it is) by tracing it down from subclass to subclass. It can be in only one place, unless duplicates are used; one has to have rules as to which path will locate it, and the rules are cumbersome. Having found one item, moreover, one has to emerge from the system and re-enter on a new path.

The human mind does not work that way. It operates by association. With one item in its grasp, it snaps instantly to the next that is suggested by the association of thoughts, in accordance with some intricate web of trails carried by the cells of the brain. It has other characteristics, of course; trails that are not frequently followed are prone to fade, items are not fully permanent, memory is transitory. Yet the speed of action, the intricacy of trails, the detail of mental pictures, is awe-inspiring beyond all else in nature.

I find myself returning to Bush’s observations. Those observations have led me to ask whether it is possible for us to build systems that get us closer to this ideal.

My thesis is that collecting, storing, and disseminating information about the environment in which digital objects are being used provides us with new context that enables better finding.

So, my proposal is about how to collect, store, and disseminate this type of external contextual information. I envision combining this with existing data sources and indexing mechanisms to allow capturing activity context in which digital objects are used by humans. A systems level service that can do this will then enable a broad range of applications to exploit this information to reconstruct context that is helpful to human users. Over my next several blog posts I will describe some ideas that I have with what I envision being possible with this new service.

The title of my proposal is: Indaleko: Using System Activity Context to Improve Finding. One of the key ideas from this is the idea that we can collect information the computer might not find particularly relevant but the human user will. This could be something as simple as the ambient noise in the user’s background (“what music are you listening to?” or “Is your dog barking in the background”) or environmental events (“it is raining”) or even personal events (“my heart rate was elevated” or “I just bought a new yoga mat”). Humans associate things together – not in the same way, nor the same specific elements – using a variety of contextual mechanisms. My objective is to enable capturing data that we can then use to replicate this “associative thinking” that helps humans.

Ultimately, such a system will help human users find connections between objects. My focus is on storage because that is my background: in essence, I am interested in how the computer can extend human memory without losing the amazing flexibility of that memory to connect seemingly unrelated “things” together.

In my next several posts I will explore potential uses for Indaleko.

… intricacy of trails, the detail of mental pictures, is awe-inspiring beyond all else in nature. This is as true in 2021 as it was in 1945. Thus, the question that motivates my research is: “Can we build systems that get us closer to that ideal?”

Laundry Baskets: The New File System Namespace Model

A large pile of laundry in a laundry basket, with a cat sleeping on the top.
The “Laundry Basket” model for storage.

While I’ve been quiet about what I’ve been doing research-wise, I have been making forward progress. Only recently have ideas been converging towards a concrete thesis and the corresponding research questions that I need to explore as part of verifying my thesis.

I received an interesting article today showing that my research is far more relevant than I’d considered: “FILE NOT FOUND”. The article describes how the predominant organizational scheme for “Gen Z” students is the “Laundry Basket,” into which all of their files are placed. This is coming as a surprise to people who have been trained in the ways of the hierarchical folder metaphor.

While going through older work, I have found it intriguing that early researchers did not see the hierarchical design as being the pinnacle of design; rather they saw it as a stop-gap measure on the way to richer models. Researchers have explored richer models. Jeff Mogul, now at Google Research, did his PhD thesis around various ideas about improving file organization. Eno Thereska, now at Amazon, wrote an intriguing paper while at Microsoft Research entitled “Beyond file systems: understanding the nature of places where people store their data” in which he and his team pointed out that the rise of cloud storage was creating a tension with traditional file systems. The article from the Verge that prompted me to write this post logically makes sense in the context of what Thereska was saying back in 2014.

The challenge is to figure out what comes instead. Two summers ago I was fortunate enough to have a very talented young intern working with me for a couple months and during that time one of the interesting things he built was a tool that viewed files as a graph rather than a tree. The focus was always at the center, but then it would be surrounded by related files. Pick one of those files and it became the central focus, with a breadcrumb trail showing how you got there but also showing other related files.

The relationships we used were fairly simple and extracted from existing file meta-data. What was actually quite fascinating about it though was that we constructed it to tie two disjoint storage locations (his local laptop and his Google Drive) together into a single namespace. It was really an electrifying demonstration and I have been working to figure out how to enable that more fully – what we had was a mock-up, with static information, but the visualization aspects of “navigating” through files was quite powerful.

I have been writing my thesis proposal, and as part of that I’ve been working through and identifying key work that has already been done. Of course my goal is to build on top of this prior work and while I have identified ways of doing this, I also see that to be truly effective it should use as much of the prior work as possible. The idea of not having directories is a surprisingly powerful one. What I hadn’t heard previously was the idea of considering it to be a “laundry basket” yet the metaphor is quite apt. Thus, the question is how to enable building tools to find the specific thing you want from the basket as quickly as possible.

For example, the author of the Verge article observed: “More broadly, directory structure connotes physical placement — the idea that a file stored on a computer is located somewhere on that computer, in a specific and discrete location.” Here is what I recently wrote in an early draft of my thesis proposal: “This work proposes to develop a model to separate naming from location, which enables the construction of dynamic cross-silo human usable name-spaces and show how that model extends the utility of computer storage to better meet the needs of human users.”

Naming tied to location is broken, at least for human users. Oh, sure, we need to keep track of where something is stored to actually retrieve the contents, but there is absolutely no reason that we need to embed that within the name we use to find that file. One reason for this is that we often choose the location due to external factors. For example, we might use cloud storage for sharing specific content with others. People that work with large data sets often use storage locations that are tuned to the needs of that particular data set. There is, however, no reason why you should store the Excel spreadsheet or Python notebook that you used to analyze that data in the same location. Right now, with hierarchical names, you need to do so in order to put them into the “right directory” with each other.

That’s just broken.

However, it’s also broken to expect human users to do the grunt work here. The reason Gen Z is using a “laundry basket” is because it doesn’t require any effort on their part to put something into that basket. The work then becomes when they need to find a particular item.

This isn’t a new idea. Vannevar Bush described this idea in 1945:

“Consider a future device for individual use, which is a sort of mechanized private file and library. It needs a name, and, to coin one at random, “memex” will do. A memex is a device in which an individual stores all his books, records, and communications, and which is mechanized so that it may be consulted with exceeding speed and flexibility. It is an enlarged intimate supplement to his memory.”

He also did a good job of explaining why indexing (the basis of hierarchical file systems) was broken:

“Our ineptitude in getting at the record is largely caused by the artificiality of systems of indexing. When data of any sort are placed in storage, they are filed alphabetically or numerically, and information is found (when it is) by tracing it down from subclass to subclass. It can be in only one place, unless duplicates are used; one has to have rules as to which path will locate it, and the rules are cumbersome. Having found one item, moreover, one has to emerge from the system and re-enter on a new path.

“The human mind does not work that way. It operates by association. With one item in its grasp, it snaps instantly to the next that is suggested by the association of thoughts, in accordance with some intricate web of trails carried by the cells of the brain. It has other characteristics, of course; trails that are not frequently followed are prone to fade, items are not fully permanent, memory is transitory. Yet the speed of action, the intricacy of trails, the detail of mental pictures, is awe-inspiring beyond all else in nature.”

We knew it was broken in 1945. What we’ve been doing since then is using what we’ve been given and making it work as best we can. We knew it was broken. Seltzer wrote “Hierarchical File Systems Are Dead” back in 2009. Yet, that’s what computers still serve up as our primary interface.

The question then is what the right primary interface is. Of course, while I find that interesting I work with computer systems and I am equally concerned about how we can build in better support, using the vast amount of data that we have in modern computer systems, to construct better tools for navigating through the laundry basket to find the correct thing.

How I think that should be done will have to wait for another post, since that’s the point of my thesis proposal.

Where has the time gone?

It’s been more than a year since I last posted; it’s not that I haven’t been busy, but rather that I’ve been trying to do too many things and have been (more slowly than I’d like) cutting back on some of my activities.

Still, I miss using this as a (one way) discussion about my own work. In the past year I’ve managed to publish one new (short) paper, though the amount of work that I put into it was substantial (it was just published in Computer Architecture Letters). This short article (letter) journal normally provides at most one revise and resubmit opportunity, but they gave me two such opportunities, then accepted the paper, albeit begrudgingly over the objections of Reviewer #2 (who agreed to accept it, but didn’t change their comments).

Despite the lack of clear publications to demonstrate forward progress, I’ve been working on a couple of projects to push them along. Both were presented, in some form, at Eurosys as posters.

Since I got back from a three month stint at Microsoft Research (in the UK) I’ve been working on one of those, evolving the idea of kernel bypasses and really analyzing why we keep doing these things; this time through the lens of building user mode file systems. I really should write more about it, since that’s on the drawing board for submission this fall.

The second idea is one that stemmed from my attendance at SOSP 2019. There were three papers that spoke directly to file systems:

Each of these had important insights into the crossover between file systems and persistent memory. One of the struggles I had with that short paper was explaining to people “why file systems are necessary for using persistent memory”. I was still able to capture some of what I’d learned, but a fair bit of it was sacrificed to adding background information.

One key observation was around the size of memory pages and their impact on performance; it convinced me that we’d benefit from using ever larger page sizes for PMEM. Some of this is because persistent memory is, well, persistent and thus we don’t need to “load the contents from storage”. Instead, it is storage. So, we’re off testing out some ideas in this area to see if we can contribute some additional insight.

The other area – the one that I have been ignoring too long – is the thesis of this PhD work in the first place. Part of the challenge is to reduce the problem down to something that is tractable and can be finished in a reasonable amount of time.

Memex

One of the questions (and the one I wanted to explore when I started writing this) stems from a rather famous article from 1945 entitled As We May Think. Vannevar Bush described something quite understandable, yet we have not achieved it, though we have been trying – one could argue that hypertext stems from these ideas, but I would argue that hypertext links are a pale imitation of the rich assistive model Bush lays out when he describes the Memex.

Thus, to the question, which I will reserve for another day: why have we not achieved this yet? What prevents us from having this, or something better, and how can I move us towards this goal?

I suspect, but am not certain, that one culprit may be the fact we decided to stick with an existing and well-understood model of organization:

Maybe the model is wrong when the data doesn’t fit?

360° Semantic File System: Augmented Directory Navigation for Nonhierarchical Retrieval of Files

360° Semantic File System: Augmented Directory Navigation for Nonhierarchical Retrieval of Files, Syed Rahman Mashwani and Shah Khusro, in IEEE Access, January 29, 2019.

360° Perspective

This paper makes some interesting observations that resonate with my own research observations, though I will end up arguing (in a future blog post) that they don’t go far enough. But they do a good job of laying out the problem and why some solutions do not work. One of the common themes I have heard when discussing my own work is an insistence there really isn’t a problem, though usually a longer conversation ends up with us agreeing things could be done better.

The abstract is a bit long, but clearly describes the essence of the paper:

The organization of files in any desktop computer has been an issue since their inception. The file systems that are available today organize files in a strict hierarchy that facilitates their retrieval either through navigation, clicking directories and sub-directories, in a tree-like structure or by searching (which allows for finding of the desired files using a search tool). Research studies show that the users rarely (4% – 15%) use the latter approach, thus leaving navigation as the main mechanism for retrieving files.
However, navigation does not allow a user to retrieve files nonhierarchically, which makes it limited in terms of time, human effort, and cognitive overload. To mitigate this issue, several semantic file systems (SFSs) have been periodically proposed that have made the nonhierarchical navigation of files possible by exploiting some basic semantics but no more than that. None of these systems consider aspects such as time, location, file movement, content similarity, and territory together with learning from user file retrieval behaviors in identifying the desired file and accessing it in less time and with minimum human and cognitive efforts.
Moreover, most of the available SFSs replace the existing file system metaphor, which is normally not acceptable to users. To mitigate these issues, this research paper proposes 360 SFS that exploits the SFS ontology to capture all the possible relevant file metadata and learns from user browsing behaviors to semantically retrieve the desired files both easily and timely. Based on user studies, the evaluation results show that the proposed 360 SFS outperforms the existing traditional directory navigation and recently open files.

Paper Abstract

Of course, the problem existed before the appearance of the desktop computer: the original UNIX contained the permuted index server, which suggests to me that even in 1973 people were struggling to find things. What I do find interesting is the observation that people really do not like search – I recently described this insightful (to me at least) observation. Here it is once again, complete with references (in the paper) to prior work demonstrating this point. Indeed, it suggests to me one alternative explanation as to why the Google Desktop Search project was ultimately cancelled – not because it was “no longer necessary” but rather because it wasn’t useful to most computer users.

Another thing that I have observed, repeatedly, when working with students, is there seems to be a natural aversion to searching for answers. Thus, students will post on class forums (such as Piazza, with which I am most familiar) asking questions, even if the question has already been asked and answered. Searching for the answer does not seem to come naturally to such students. Indeed, I often find myself using search engines to find the answer and giving the results back to the original questioner. I have wondered about this “laziness” in the past and with this new insight I wonder if it is just because people prefer to navigate – and being told where to go is certainly one form of navigation.

Interestingly, like the Graph File System paper I described previously, these authors also argue that we cannot abandon the hierarchical view of files. I am not convinced of this, but I can understand the appeal of starting from it as a basic premise. We have been doing some work recently on a graph visualization model for the file system, more as a prototype, but it is surprisingly functional and encouraging us to look at alternative visualizations of file system data for navigation. In other words, thinking of the problem as a search problem ultimately seems to be the wrong path – yet that is the point of things like semantic file systems, to improve search.

The paper has an extensive review of prior work, much of which I’ve also previously described, though there were a few systems I had not previously seen. Table 1 of the paper has a comparison of features across the various file systems. The authors distinguish themselves from prior work by focusing on providing enhanced functionality, using auxiliary directories, in which they display related content. They focus on:

  • Temporal characteristics – they focus on when files are being used, not merely frequency. This is an idea we’ve been exploring as well.
  • Geographical location – this is intriguing; identifying where the user was when they accessed a given file.
  • File movement – when files are reorganized and moved around.
  • File access patterns – they cluster files as related based upon the temporal proximity of their access; another idea we’ve been exploring. I found it insightful they describe this as a “relationship” though they do not explore a broader range of relationships.
  • Content similarity – files that are identical or substantially similar can be associated together; this is another technique that we’ve been actively investigating.
  • Manual tagging
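
The access-pattern idea in the list above (files are related when they are used close together in time) can be sketched in a few lines. This is only an illustration of the general idea, not the authors' implementation: the function name `cluster_by_time`, the `(timestamp, path)` event format, and the 60-second gap threshold are all assumptions made for the example.

```python
from typing import Iterable, List, Tuple

# Hypothetical event format: (timestamp in seconds, file path).
Event = Tuple[float, str]


def cluster_by_time(events: Iterable[Event], max_gap: float = 60.0) -> List[List[str]]:
    """Group file paths whose consecutive accesses are at most max_gap seconds apart."""
    clusters: List[List[str]] = []
    last_time = None
    for timestamp, path in sorted(events):
        # Start a new cluster when the gap since the previous access is too large.
        if last_time is None or timestamp - last_time > max_gap:
            clusters.append([])
        # Record each file once per cluster, in order of first access.
        if path not in clusters[-1]:
            clusters[-1].append(path)
        last_time = timestamp
    return clusters


# Made-up access log used only to exercise the function.
accesses = [
    (0.0, "thesis.tex"),
    (20.0, "refs.bib"),
    (45.0, "thesis.tex"),
    (3600.0, "budget.xlsx"),
    (3630.0, "invoice.pdf"),
]
clusters = cluster_by_time(accesses)
print(clusters)
```

With this made-up log the two writing files land in one cluster and the two finance files in another. A real system would need a less brittle notion of "close together" than a single fixed threshold.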

They describe the file system they implement, which is essentially a layered file system in which they add two virtual directories: NOW and TAGS. They describe the interface for adding this information as well, which I found cumbersome, but it is in keeping with their goal of not deviating from the existing hierarchical interface. They also permit the creation of custom virtual directories, though that is only briefly mentioned in the paper.

One of the problems they highlight, which resonated with me because we’ve been discussing the same problem, is how much information to display to users – in essence, when confronted with too many options, users quickly become overwhelmed.

Their evaluation focuses on the amount of time it took users to locate their files when using their enhanced file systems model and they lay out a case for the fact their system works well for their study group. One limitation of their study group is that it is based upon an experienced computer user group, but this is reflective of their environment.

One interesting comment was that while they used Linux for their evaluation, Windows would have been a better platform because of its broader usage within their organization. I have wondered how much the use of Linux tends to create a bias in an evaluation of this type, since most people are using Windows or Apple computers. Would the results be similar?

The authors do point to their open source implementation of their file system. I have not yet evaluated it, but it is definitely something on my (all too long) list of things to do.

ZUFS

After one of my earlier posts on FUSE file system performance, someone mentioned this project to me – the Zero copy Userspace File System project (ZUFS), which appears to be a NetApp-sponsored project.

Sometimes Zero is best.

There have been a variety of talks about this project, including at the Linux Plumbers Conference (which was held next door to me – I can see the venue from my window as I write this), as well as at the SNIA Persistent Memory Summit in 2018. The NetApp repositories on GitHub contain both a file system reflector (zufs-zuf), which appears to be similar to the FUSE kernel driver, as well as the user mode server (zufs-zus) which handles dispatching the kernel level requests to the user mode file system implementations.

Their concern appears to be eliminating the copy of any data between kernel and user mode, which makes sense given their objective of supporting persistent memory, such as the new Intel Optane DC Persistent Memory that has recently become commercially available.

Persistent memory benefits from a direct access model, in which traditional file data caching is eschewed in favor of direct access. Thus, data is read from or written to the underlying persistent memory directly, rather than copied through a buffer cache.
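
One way to picture the direct access model is a memory mapping: the program reads and writes the mapped bytes in place instead of calling read() and write() into a separate buffer. The sketch below uses Python's standard `mmap` module on an ordinary temporary file, which is an assumption made so the example runs anywhere. On such a file the mapping is still backed by the page cache; only a DAX-capable file system on real persistent memory would map the media itself.

```python
import mmap
import os
import tempfile

PAGE = mmap.PAGESIZE

# An ordinary temporary file stands in for a file on a DAX-mounted file system.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"\0" * PAGE)
    path = f.name

with open(path, "r+b") as f:
    with mmap.mmap(f.fileno(), PAGE) as region:
        # Store directly into the mapping; there is no explicit write() call.
        region[0:5] = b"hello"
        region.flush()
        # Load directly from the mapping; there is no explicit read() call.
        direct = bytes(region[0:5])

# For comparison, the conventional path copies the data into a new buffer.
with open(path, "rb") as f:
    copied = f.read(5)

os.unlink(path)
print(direct, copied)
```

Both paths see the same bytes. The difference that matters for persistent memory is how many copies sit between the application and the media.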

There are a few persistent memory file systems, including UCSD’s NOVA file system, though usually they were developed using emulation of persistent memory. In such systems, there is no benefit to copying the data from persistent memory into DRAM and back; indeed, it is a significant performance impediment.

What is not currently present in the NetApp repository is an implementation of a user mode persistent file system (they have a dummy file system implementation, which appears to be the base from which one could build a real file system). This definitely presents an interesting alternative to using traditional FUSE.

FUSE vs ZUFS Performance (from NetApp SNIA presentation)

I have not had an opportunity to play with this new system yet, but it certainly does seem to be intriguing – and the performance graph from the SNIA presentation is rather compelling, given the massive improvement in scalable performance.

There sure are quite a few alternatives to traditional FUSE to consider…

A Comparison of Two Network-Based File Servers

A Comparison of Two Network-Based File Servers
James G. Mitchell and Jeremy Dion, in Communications of the ACM, April 1982, Volume 25, Number 4.

Pair of File Servers

I previously described the Cambridge File Server (CFS).  In this 1981 SOSP paper the inner details of it and the Xerox Distributed File System (XDFS) are compared.  This paper provides an interesting insight into the inner workings of these file servers.

Of course, the scale and scope of a file server in 1982 was vastly smaller than the scale and scope of file servers today.  In 1982 the disk drives used for their file servers were as large as 300MB.

SD Cards

This stands in stark contrast to the sheer size of modern SD cards; I think of them as slow but compared to the disk drives of that era they are quite a bit faster, not to mention smaller.  I suspect the authors of this paper might be rather surprised at how the scale has changed, yet many of the basic considerations they were making back in the early 1980s are still important today.

 

  • Access Control (Security) – CFS was, of course, a capability based system. XDFS was an identity based system; most systems today are identity based systems, though we find aspects of both in use.
  • Storage Management – the interesting challenge here is how to ensure that storage is not wasted. The naive model is to shift responsibility for proper cleanup to the clients. Of course, the reality is that this is not a good model; even in the simple case of a client that crashes, it is unlikely the client will robustly ensure that space is reclaimed in such circumstances. CFS handles this using a graph file system and performing garbage collection in which an unreachable node is deemed subject to reclamation. XDFS uses the more naive model, but mitigates this by providing a directory service that can handle proper cleanup for clients – thus clients can “do it right” with minimal fuss, but are not constrained to do so.
  • Data Consistency – the authors point to the need to have some form of transactional update model. They observe that both CFS and XDFS offer atomic transactions; this represents the strong semantic end of the design spectrum for network file servers and we will observe that one of the most successful designs (Sun’s NFS) went to a much weaker end of the design spectrum. Some of this likely reflects the database background of the authors.
  • Network Protocols – I enjoyed this section, since this is very early networking, with CFS using the predecessor of token ring and XDFS using the 3Mb/s version of Ethernet. They discuss the issues inherent in the network communications: flow and error control (so message exchange and exception/error handling) and how the two respective systems handle them.
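
The storage management approach attributed to CFS in the list above (treat the file system as a graph and reclaim whatever cannot be reached) can be sketched as a simple reachability pass. This is a generic illustration, not CFS's actual algorithm or data layout: the dictionary-based graph, the node names, and the functions `reachable` and `collect_garbage` are all hypothetical.

```python
from typing import Dict, List, Set


def reachable(graph: Dict[str, List[str]], root: str) -> Set[str]:
    """Return every node that can be reached from root by following edges."""
    seen: Set[str] = set()
    stack = [root]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(graph.get(node, []))
    return seen


def collect_garbage(graph: Dict[str, List[str]], root: str) -> Set[str]:
    """Delete all nodes not reachable from root and return their names."""
    garbage = set(graph) - reachable(graph, root)
    for node in garbage:
        del graph[node]
    return garbage


# Hypothetical object graph: the orphaned index and file still point at each
# other, but nothing reachable from the root points at them.
graph = {
    "root": ["index_a"],
    "index_a": ["file_1"],
    "file_1": [],
    "orphan_index": ["orphan_file"],
    "orphan_file": ["orphan_index"],
}
garbage = collect_garbage(graph, "root")
print(sorted(garbage))
```

The orphaned pair forms a cycle on purpose: each node still has an incoming reference, so a reference count alone would never drop to zero, while the reachability pass reclaims both.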

The authors also compare details of the implementation:

  • They describe a scheme in CFS in which small files use a direct block, and larger files use indirect blocks (blocks of pointers to direct blocks). This means that small files are faster. It is similar to the model that we see in other (later) file systems. XDFS, by contrast, uses a B-tree to track the allocation of blocks to files and a bitmap to record free/used space.
  • They discuss redundancy, with an eye towards handling (partial) disk failures. Like any physical device, the disk drives of that era did wear out and fail.
  • They discuss their transaction log and how each system guaranteed consistency: they both use shadow pages, but their implementation of them is different. Ultimately, they both have similar issues, and similar impact. Shadow pages are a technique that we still use.
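
The direct versus indirect block scheme from the first item above is easy to show with a toy model. The sketch below is an assumption-laden illustration rather than the CFS on-disk format: the "disk" is a Python dictionary, and the function `read_file` and its `indirect` flag are hypothetical. It counts block reads to show why small files are cheaper to access.

```python
from typing import Dict, List, Tuple, Union

# A block holds either file data or a list of pointers to data blocks.
Block = Union[bytes, List[int]]


def read_file(disk: Dict[int, Block], root: int, indirect: bool) -> Tuple[bytes, int]:
    """Return (file contents, number of blocks read from the toy disk)."""
    if not indirect:
        # Small file: the root block is the data itself, so one read suffices.
        return disk[root], 1
    # Larger file: read the pointer block, then each data block it names.
    pointers = disk[root]
    data = b"".join(disk[p] for p in pointers)
    return data, 1 + len(pointers)


# Hypothetical disk contents: block 1 is a small file, block 2 is a pointer block.
disk = {
    1: b"tiny",
    2: [3, 4],
    3: b"first half, ",
    4: b"second half",
}
small = read_file(disk, 1, indirect=False)
large = read_file(disk, 2, indirect=True)
print(small, large)
```

The small file costs one block read, while the larger file costs one read for the pointer block plus one per data block.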

The evaluation is interesting: it is not so much a measure of performance but rather insights into the strengths and weaknesses of each approach. For XDFS they note that their transaction support has been successful and it permits database transactions (in essence, XDFS becomes a form of simple database service). They point to the lack of support for both normal and special files; from their description a special file is one with guaranteed write semantics. They also observe that ownership of files is easily lost, which in turn leads to inefficient storage utilization. They observe that it is not clear whether the B-tree is a win or a loss for XDFS.

For CFS they point to the performance requirements as being a strength, though it sounds more like a design constraint that forced the CFS developers to make “hard choices” to optimize for performance. Similarly, they observe that the directed graph model of CFS is successful and capabilities are simple to implement. Interestingly, they also point to the index as well as string of names and access rights as being a success point. They also point to the fact that CFS generalizes well (“[t]wo quite different filing systems built in this way coexist on the CFS storage.”) They also point to automatic garbage collection as being a net win for CFS, though they also point out that CFS uses a reference count in addition to the garbage collection model. They list the CFS limitation of transactions to a single file or index as being one of its shortcomings and point to real-world experience porting other operating systems to use CFS as an indicator of the cost of this limitation. The specific limitation they cite is that “… since file directories are implemented as an index with an associated file, it is currently impossible to update both structures in a single transaction.” They conclude by arguing that XDFS has a better data layout, arguing that XDFS’s strategy of page allocation and intention logging is ultimately better than CFS’s cylinder maps: “… the redundancy function of cylinder maps does not seem to be as successful as those of page allocation and intentions logging; the program to reconstruct a corrupted block is not trivial.”

Ensuring correct recovery in a transactional system is certainly challenging in my experience, so I can understand the authors’ concerns about simplicity and scalability.
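
Shadow pages, which both servers use in some form, are one way to keep recovery simple: an update never overwrites a committed page, so a crash before commit leaves the old state untouched. The following is a minimal sketch of the general technique only, not of either server's implementation. The class `ShadowPagedStore` and its methods are hypothetical, and swapping a Python reference stands in for the atomic on-disk update of the page table root.

```python
from typing import Dict


class ShadowPagedStore:
    """Toy page store: writes go to fresh pages and commit swaps the page table."""

    def __init__(self) -> None:
        self.pages: Dict[int, bytes] = {}  # physical page id -> contents
        self.table: Dict[int, int] = {}    # committed map: logical -> physical
        self.next_page = 0

    def begin(self) -> Dict[int, int]:
        # A transaction works on a private copy of the committed page table.
        return dict(self.table)

    def write(self, txn: Dict[int, int], logical: int, data: bytes) -> None:
        # Allocate a new (shadow) page instead of overwriting the committed one.
        self.pages[self.next_page] = data
        txn[logical] = self.next_page
        self.next_page += 1

    def commit(self, txn: Dict[int, int]) -> None:
        # A single reference swap makes every write in the transaction visible.
        self.table = txn

    def read(self, logical: int) -> bytes:
        return self.pages[self.table[logical]]


store = ShadowPagedStore()
t1 = store.begin()
store.write(t1, 0, b"v1")
store.commit(t1)

t2 = store.begin()
store.write(t2, 0, b"v2")
before_commit = store.read(0)  # still the old contents; a crash here loses nothing
store.commit(t2)
after_commit = store.read(0)
print(before_commit, after_commit)
```

The sketch never reclaims superseded pages. A real system has to do that too, which is part of what makes the recovery code less trivial than it looks here.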

Overall, it is an interesting read as I can see many of the issues described here as being file systems issues; many of the techniques they describe in this paper show up in subsequent file systems. The distinction between file system and file server also becomes more clearly separated in future work.