Lots of stuff to do

Computers, Programming Languages and Operating Systems

Sunday, July 09, 2006

Time to Save - What are Files?

Files as time-evolvers and Orthogonal persistence
If you think about it, all a file does is sit there. A file exists so that you can save things between visits, and while it sits there, it does nothing. No wonder people suggest replacing files with orthogonal persistence. What does that mean? Well, orthogonal persistence is two things:
  1. persistence - where all information persists, even after you shut down the computer
  2. orthogonal - where the programmer does not have to do any special coding to take advantage of this
So an operating system like that would have no close button for programs, and programs would have no save button (although they might have a "version" button to let you create versions such as final, draft, editing etc). Why? Well, when you close a program, the OS automatically "remembers" it for you, and when you restart the program, you get back everything that was open, just as you left it. That way, you never actually start or quit a program, you just unfreeze it or freeze it. And the programmer does not have to move a muscle to get this added bonus - the operating system provides the orthogonal bit too. In fact, the tunes.org wiki says that "IBM evaluated overhead of explicit fetching and storing of data to 30% of total program code, not taking into account check, conversion, and recovery of data".
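To make the freeze/unfreeze idea a bit more concrete, here is a rough Python sketch. It is only an illustration under my own assumptions: `freeze` and `unfreeze` stand in for a service the operating system would provide, `Editor` is a made-up program, and the snapshot is a plain JSON file rather than a real memory image. The point to notice is that `Editor` itself contains no save or load code at all.

```python
import json
import os
import tempfile


class Editor:
    """A toy program. Note that it has no save or load code of its own."""

    def __init__(self):
        self.open_documents = {}

    def type_text(self, name, text):
        self.open_documents[name] = self.open_documents.get(name, "") + text


def freeze(program, image_path):
    """Hypothetical OS service: snapshot the program's whole state to disk."""
    with open(image_path, "w") as image:
        json.dump(vars(program), image)


def unfreeze(program_class, image_path):
    """Hypothetical OS service: resume from a snapshot, or start fresh."""
    program = program_class()
    if os.path.exists(image_path):
        with open(image_path) as image:
            vars(program).update(json.load(image))
    return program


image_path = os.path.join(tempfile.mkdtemp(), "editor.image")

editor = unfreeze(Editor, image_path)   # first visit: nothing to resume yet
editor.type_text("draft", "Hello")
freeze(editor, image_path)              # what "closing" would do behind the scenes

editor = unfreeze(Editor, image_path)   # a later visit picks up where we left off
editor.type_text("draft", ", world")
print(editor.open_documents["draft"])   # Hello, world
```

A real system would snapshot far more than one object's attributes, but the division of labour is the same: the persistence lives outside the program.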

Of course, this might get a little silly if you have, say, a thousand documents, and you don't actually want to mess around with the nine hundred that are not important to you at the moment. A more serious problem is the difficulty of making such a system both stable and efficient. The efficiency trouble is easy to see - we might end up loading what is not required, or loading memory that contains uninitialised garbage. The stability trouble is a bit less obvious - it is caused by the fact that the context around the frozen program has changed. That is like freezing an accountant in the 1950s, unfreezing them now in 2005, and expecting them to work properly. They will not understand their context - names, technologies, people, interfaces and more will have changed. This would lead to a very unstable accountant, or program for that matter.
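One way to picture the stability problem is to store, next to the frozen state, a note of the context the program was frozen in, and compare it with the present context when thawing. The sketch below assumes exactly that; the function names and the context keys are invented for the accountant example and do not come from any real system.

```python
def freeze_with_context(state, context):
    """Store the state together with the context it was frozen in."""
    return {"state": dict(state), "context": dict(context)}


def stale_assumptions(image, current_context):
    """List every context key whose value changed or vanished since freezing."""
    frozen = image["context"]
    return sorted(
        key for key in frozen
        if current_context.get(key) != frozen[key]
    )


# The accountant, frozen with what they knew about the world at the time.
image = freeze_with_context(
    {"ledger_total": 1200},
    {"tool": "paper ledger", "employer": "Acme", "office": "Room 4"},
)

# The world they wake up in.
today = {"tool": "spreadsheet", "employer": "Acme Holdings", "office": "Room 4"}

print(stale_assumptions(image, today))  # ['employer', 'tool']
```

This only detects that the context has moved on. Deciding what the thawed program should do about it is the hard part.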

We could solve these problems if we tried hard enough. For example, we could build environments where all contexts are essentially constant. However, that is too hard for me to discuss ^^

Files as naked information
Well, on the other hand, let's see what exactly we put into data structures. Here is a quick list:
  1. We put plain information into what we call data
  2. We put information describing what that plain information is into what we call metadata
  3. We put information describing relationships between information into what we call pointers
  4. We put juxtapositions (unlike information put together) of these kinds of information into what we call types
  5. We put concatenations (alike information put together) into what we call arrays or lists
  6. We put juxtapositions of these types with functions into what we call objects
  7. We put juxtapositions of related information on a hard disk and call the result a file
  8. We put juxtapositions of files into what we call a file system.
So there, we have a nice slice-of-life of information in its local context - the array, type, object, file, file system. Of course, this goes well with my idea of viewers, where you simply give the computer the information (whether it be files or otherwise) and metainformation about this information, and it generates an interface. There has been an interesting movement in the Smalltalk community to create a visual information display system (from my limited reading of it) called Morphic. There has also been another interesting movement called Naked Objects, along similar lines.
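Here is a tiny sketch of what I mean by a viewer: a function that is handed some information plus metainformation describing it, and produces an interface (here just text) without knowing anything else about the data. The names `render_view`, `label`, `unit` and `kind` are my own inventions for this example. This is not how Morphic or Naked Objects work internally.

```python
def render_view(information, metainformation):
    """Build a plain-text 'interface' from data plus a description of it."""
    lines = []
    for field, value in information.items():
        meta = metainformation.get(field, {})
        label = meta.get("label", field)       # fall back to the raw field name
        if meta.get("kind") == "list":
            lines.append(f"{label}:")
            lines.extend(f"  - {item}" for item in value)
        else:
            unit = meta.get("unit", "")
            lines.append(f"{label}: {value}{unit}")
    return "\n".join(lines)


# The information (data) ...
song = {"title": "Example Song", "length": 215, "tags": ["demo", "test"]}

# ... and the information about that information (metadata).
song_meta = {
    "title": {"label": "Title"},
    "length": {"label": "Length", "unit": " s"},
    "tags": {"label": "Tags", "kind": "list"},
}

print(render_view(song, song_meta))
```

Swap in different metainformation and the same viewer draws a different interface, which is the whole appeal.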

The next step, now that we have a nice street directory of information, is to create names for the streets of information. That is, how do we assign names to this endless jungle of information? Well, the current way is to make a file path. An analogy would be that this file path is a street address. To access what is inside the house, you have to look up who owns the house (i.e. what program owns the filetype), ring them up (execute the program), and then ask the owner to fetch the information inside it (navigate the program). It would be better if the owner gave you a map of the house (programs extract or find metadata about the information) and allowed you direct access via another resource locator.

In this way, the arbitrary limits of files will be removed and replaced with what I call transform points. For example, suppose you wanted to access your music. All your music is located inside a file called music. When you open that file, you hit a transform point: a program comes along, reads the file, and gives you a description of its contents. Without opening another program, your file explorer has stepped inside the file. Sort of like having a hardlink to a folder, but much more powerful. Sorry about the bad wording of this paragraph. Needs reworking.
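Maybe a sketch says it better than my wording does. Below, the file system is a nested dictionary, a file is a `(file_type, content)` pair, and a transform point is a reader function registered for a file type. All of this is assumed for illustration: the `TRANSFORMS` registry, the `resolve` function and the one-entry-per-line music format are made up. When the locator walks into a file, the reader turns it into something the explorer can keep walking through.

```python
# Hypothetical registry: file type -> function that describes the file's inside.
TRANSFORMS = {}


def transform_point(file_type):
    """Register a reader as the transform point for one file type."""
    def register(reader):
        TRANSFORMS[file_type] = reader
        return reader
    return register


@transform_point("music")
def read_music(content):
    """Made-up music format: one 'artist - title' entry per line."""
    inside = {}
    for line in content.splitlines():
        artist, title = line.split(" - ", 1)
        inside.setdefault(artist, {})[title] = f"<audio for {title}>"
    return inside


def resolve(filesystem, locator):
    """Walk a locator, crossing a transform point whenever we reach a file."""
    node = filesystem
    for part in locator.strip("/").split("/"):
        node = node[part]
        if isinstance(node, tuple):              # a file: (file_type, content)
            file_type, content = node
            node = TRANSFORMS[file_type](content)  # unknown types raise KeyError
    return node


filesystem = {
    "home": {
        "music": ("music", "Artist A - Song One\nArtist A - Song Two\nArtist B - Song Three"),
    },
}

# The locator keeps going past the file boundary, straight into the file.
print(sorted(resolve(filesystem, "/home/music/Artist A")))  # ['Song One', 'Song Two']
```

From the explorer's side there is no difference between a folder and the inside of the music file, which is the behaviour I was trying to describe.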
