In mulling over the depths of semantic knowledge and file systems, it occurs to me that one thing that differs between Unix/Linux file systems and Windows file systems is where directory search happens: in Unix/Linux environments, searching a directory's contents is done in the shell (or application), while in Windows it is a service of the file system.
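To make the Unix/Linux side of that distinction concrete, here is a minimal Python sketch of matching done by the caller: the file system hands back every name in the directory, and the pattern is applied in user space. The function name `match_in_user_space` and the sample file names are made up for illustration; this is not how any particular shell is implemented.

```python
import fnmatch
import os
import tempfile


def match_in_user_space(directory, pattern):
    """List every entry, then filter by pattern in the caller (hypothetical helper)."""
    names = os.listdir(directory)  # the file system returns all names, unfiltered
    # The wildcard matching happens here, in the application, not in the file system.
    return sorted(n for n in names if fnmatch.fnmatchcase(n, pattern))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        for name in ("notes.txt", "todo.txt", "photo.jpg"):
            open(os.path.join(d, name), "w").close()
        print(match_in_user_space(d, "*.txt"))  # ['notes.txt', 'todo.txt']
```

In the Windows model described above, the pattern would instead travel down with the directory query, and the string matching would be the file system's job.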
I admit, when I first started working on Windows file systems, I thought this was an annoying decision, since it involved quite a bit of work inside the file system related to string handling and matching. Even as I write this, I still think that it is a lot of work that really doesn’t belong in the kernel. Having said that, this distinction is one reason why a Unix/Linux file systems developer might not think of adding semantic support to a file system as something logical – after all, the purpose of the file system is to manage the storage of files and their associated metadata, not to find things. Having experience in the Windows file systems space, I can understand why it might not be a great idea to do this in kernel mode. After all, C is not a language well-known for its strength and safety in handling strings, and the kernel is not an environment well-known for its tolerance of C runtime errors.
But I digress. The point is this: when we begin to embed semantic knowledge inside the file system, we exploit a model in which the file system is involved in the search function and this would seem to be anathema to normal file systems behavior. This is a good challenge: does this need to be done in the file system? If not, perhaps there is instead an abstraction that the file system itself must be able to provide.
Each time I tackle this problem, my general sense is that the model I want is a case in which each file has a set of attributes. Ideally, what I want is some way to quickly and efficiently find things based upon those attributes. After all, how hard could this be?
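One way to picture "each file has a set of attributes" with fast lookup is an inverted index that maps each (attribute, value) pair to the set of files carrying it. The sketch below is only an illustration under that assumption: the class name `AttributeIndex`, its methods, and the sample attributes are all hypothetical, and it sidesteps persistence and updates entirely.

```python
from collections import defaultdict


class AttributeIndex:
    """Hypothetical inverted index: (attribute, value) -> set of file names."""

    def __init__(self):
        self._index = defaultdict(set)

    def add(self, name, attributes):
        """Record every attribute/value pair of one file."""
        for key, value in attributes.items():
            self._index[(key, value)].add(name)

    def find(self, **wanted):
        """Return the files that carry every requested attribute value."""
        if not wanted:
            return set()
        sets = [self._index.get((k, v), set()) for k, v in wanted.items()]
        return set.intersection(*sets)


index = AttributeIndex()
index.add("paper.pdf", {"type": "pdf", "topic": "file systems"})
index.add("slides.pdf", {"type": "pdf", "topic": "databases"})
index.add("notes.txt", {"type": "text", "topic": "file systems"})
print(sorted(index.find(type="pdf", topic="file systems")))  # ['paper.pdf']
```

Each lookup is a handful of set intersections rather than a scan of every file, which is the "quickly and efficiently" part; how hard it is to keep such an index correct as files change is a separate question.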
One benefit of the current search paradigm, the one users have been trained on, is that it does not promise reproducible search results. Thus, nobody will really be surprised if they repeat a search today and get back different results than they got yesterday.
Hence, I keep coming back to this paradigm. It also gives me the sense that there are different characteristics of such a system – there are persistent attributes, like the timestamp, and ephemeral attributes, like semantic tags.
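The split between persistent and ephemeral attributes could be sketched as follows, assuming the timestamp is read back from the file system on demand while semantic tags live only in memory. The names `FileView` and `modified_since_with_tag` are hypothetical, and keeping tags in a plain Python set is just one possible reading of "ephemeral".

```python
import os
import tempfile
from dataclasses import dataclass, field


@dataclass
class FileView:
    """Hypothetical view of one file with both kinds of attributes."""
    path: str
    tags: set = field(default_factory=set)  # ephemeral: held only in memory

    @property
    def mtime(self):
        # Persistent: stored by the file system and read back each time.
        return os.stat(self.path).st_mtime


def modified_since_with_tag(views, since, tag):
    """Combine a persistent attribute (mtime) with an ephemeral one (tag)."""
    return [v.path for v in views if v.mtime >= since and tag in v.tags]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "draft.txt")
        open(path, "w").close()
        view = FileView(path, {"thesis"})
        print(modified_since_with_tag([view], 0, "thesis") == [path])  # True
```

If the process exits, the tags are gone while the timestamp survives, which is one reason repeated searches over ephemeral attributes need not return the same results.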
Plenty to think about, but this idea of where to draw the line of search is an important one. In either case, though, I need to determine efficient ways of rapidly finding files based upon these attributes.
This week I had the opportunity to attend ACM SIGOPS HotOS ’17. This is a workshop that focuses on discussing work in some stage of progress. The papers are not as polished as one would find at a full conference and in some cases they are more about being provocative – encouraging the reader to consider some existing problem in a new light.
It is a small function, with a limited number of people invited. Because I am working with one of the people involved in organizing it, when an opportunity to help out arose I was invited to attend. My role was to help out with various functions, including assisting in the live blogging of the workshop (hotos17.tumblr.com) as well as leading tours, handling registration, and generally being part of making the participants feel welcome.
It also provided an opportunity to begin learning more about members of the research community. Several things surprised me somewhat: (1) these are highly specialized researchers. The field is broad enough that there were only ever a handful of experts in any one aspect of it as the discussions and papers flowed around. (2) I was actually able to follow a couple of the talks. One of them made me angry, partially because it was an attack on people I know who were not present. Still, many of the talks were in areas where I had only a general understanding, not a specific understanding of the subject matter.
The one talk that annoyed me? Transactional file systems!
The decision deadline was last weekend and I confirmed my decision, accepting one of the offers and declining the other. I’m excited that I’ll be starting my new program officially near the end of summer.
I’m already starting to think about the fact that this journey is really about to begin. I’d be lying if I said that I weren’t still in a bit of shock – euphoric shock, for sure, but still shock. Despite everything that I had going against me, I managed to make this happen – and no amount of thinking about it makes me view it as anything other than quite a (wonderful) surprise.
For example, if I hadn’t been in the position to take the research assistant position with UBC, I suspect I’d have been less likely to have been offered admission there. Of course, had I not been summarily dismissed from my job just a couple of weeks earlier, I wouldn’t have been in a position to do so. Pretty exciting, indeed!