Lots of stuff to do

Computers, Programming Languages and Operating Systems

Wednesday, July 05, 2006

Speeding up small programs

The use of small functions that each do one specific job well is an important way to gain flexibility. However, one of the troubles of working with such small programs is the efficiency loss that comes from pipelines that write to memory. There are a few sources:
  1. Program instantiation is much slower than a function call - the OS has to create a new process, allocate and page memory
  2. Programs need to pass information to each other, and the only way is to write to a shared memory space, or something that emulates such a space (e.g. a temporary file)
  3. Programs might need only specific information and discard the rest, making it a waste to compute everything
  4. Programs that run on a list of inputs need to be instantiated once for each element of the list
  5. Programs might be called several times with the same information, and optimisations that rely on recognising repeated calls cannot be made
  6. The programs I am proposing do not do any independent input/output - they are very much just processing boxes, as in a hexagonal architecture, and they all have to use the same input/output "ports"
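Source 1 is easy to see empirically. Here is a rough Python sketch (timings are machine-dependent and the helper names are my own, not anything from a real tool): upper-casing a few words with a function call per item versus a freshly spawned interpreter per item.

```python
import subprocess
import sys
import time

def upper(s):
    return s.upper()

words = ["alpha", "beta", "gamma"] * 3

# Function calls: everything stays inside one process.
start = time.perf_counter()
in_process = [upper(w) for w in words]
func_time = time.perf_counter() - start

# One program instantiation per item: each call pays the full
# process-creation and interpreter-startup cost.
start = time.perf_counter()
via_programs = [
    subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.argv[1].upper())", w],
        capture_output=True, text=True,
    ).stdout.strip()
    for w in words
]
proc_time = time.perf_counter() - start

print(in_process == via_programs)   # same answers either way
print(proc_time > func_time)        # process spawning dominates
```

Same output, wildly different cost - which is exactly the overhead the rest of this post is trying to remove.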
The reasons that normal programs do not suffer these problems are:
  1. Function calls are fast, as long as they are to the same block of memory.
  2. Functions that are compiled can use registers to store information. Reading/writing to registers is faster than to memory, and avoids caching problems.
  3. Functions can have callback hooks
  4. Functions can be designed to have input-output loops. Also, the coder could recognise which variables need to be reset to starting values.
  5. Function results can be stored, and the way the function responds can be individually optimised
  6. Functions can do anything - i.e. they are not restricted in any way. That means that they can be optimised for whatever they are doing.
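Point 3 above is the one that separates functions from standalone programs most sharply: a function can simply accept another function. A minimal Python sketch (the names are mine, purely for illustration):

```python
def process(items, on_each):
    # Call the hook for every item - a standalone program has no
    # equivalent channel back into its caller.
    return [on_each(item) for item in items]

tens = process([1, 2, 3], lambda x: x * 10)
print(tens)   # [10, 20, 30]
```

A pipeline of separate programs can only approximate this by agreeing on some serialised protocol over a pipe, which is precisely the memory-writing overhead described earlier.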
My idea for making functions work faster is to allow:
  1. work at a bytecode level
  2. laziness, ranging from extreme to novice (explained later)
  3. fusion
  4. stitching
The primary idea is that the programs/functions are all written at a bytecode level. This of course offers us the benefits of portability - like that of Java. However, the type of bytecode I am interested in is at a slightly higher level - it must allow lazy bytecoding and fusion to occur. This leads to our concept of laziness, that is, you don't do anything until you need to. Laziness will basically work by allowing whatever needs the information to query the function for it, rather than execute the function outright. There are several levels of laziness, as some functions can process only a little information to answer a specific query, while others need to process all the information before returning results. The levels of laziness are:
  1. A program that takes a file and gives back a file of the same structure, but with different information in some places, where you know exactly which places will differ. For example, a program that changes an element in an array.
  2. You can ask for a specific piece of information at (almost) no overhead cost. For example, a program that tells you the value of an element in an array.
  3. You can ask for information in a linear way. For example, a program that gives running sums of an array, or a Fibonacci function.
  4. You can only ask for all the information at once. For example, Floyd's algorithm.
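To make "query the function rather than execute it" concrete, here is a minimal Python sketch of levels 2 and 3 (illustrative only - the actual proposal operates on bytecode, not Python objects): level 2 is plain indexing, and level 3 is a generator that produces running sums only as they are demanded.

```python
# Level 2 laziness: a specific element at (almost) no overhead cost.
def element_at(xs, i):
    return xs[i]

# Level 3 laziness: information delivered linearly, on demand.
# Nothing past the last requested sum is ever computed.
def running_sums(xs):
    total = 0
    for x in xs:
        total += x
        yield total

data = [3, 1, 4, 1, 5]
print(element_at(data, 2))      # 4

sums = running_sums(data)       # no work has happened yet
print(next(sums), next(sums))   # 3 4 - only the first two sums computed
print(list(sums))               # [8, 9, 14] - the rest, when asked for
```

Level 4 has no such shortcut: an all-pairs algorithm like Floyd's must touch everything before any answer is trustworthy, which is why it sits at the bottom of the laziness scale.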
Now, depending on the level of laziness, it is possible to fuse the bytecode together (remember, we are using high-level bytecode), and then turn this bytecode into some form of native assembler code. This would give you what appears to be a fully optimised program, composed of many small blocks that would be slow if not for laziness.
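The effect of fusion can be imitated in Python with generators (again a sketch of the idea only, with made-up stage names): chaining two small stages yields one demand-driven pass, with no intermediate buffer written to memory between them.

```python
# Two small "programs", each doing one job.
def doubled(xs):
    for x in xs:
        yield 2 * x

def shifted(xs):
    for x in xs:
        yield x + 1

# Composing them behaves like the single fused loop
# [2 * x + 1 for x in range(5)], but is still built
# from the two separate stages above.
fused = list(shifted(doubled(range(5))))
print(fused)   # [1, 3, 5, 7, 9]
```

Each element flows through both stages before the next element is even produced - the pipeline never materialises the intermediate doubled list.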

As to how this will fix many of our problems, in the same order:
  1. Programs are stitched together in one address space
  2. Shared information is handled via laziness, and since we are using information querying, it is possible to store the results of each query in registers or in small temporary memory areas
  3. Laziness removes the problem of discarded information: what is not needed is never computed
  4. It is possible for the bytecode to be high-level enough (like J functions are) to be automatically ranked, and thus work on lists without extra manipulation
  5. Memoising bytecode? (doubtful)
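Whether memoising works at the bytecode level is the doubtful part, but at the function level the idea from point 5 of the problem list is easy to sketch in Python:

```python
from functools import lru_cache

calls = 0

@lru_cache(maxsize=None)
def fib(n):
    """Naive Fibonacci; the cache turns repeated calls into lookups."""
    global calls
    calls += 1
    return n if n < 2 else fib(n - 1) + fib(n - 2)

print(fib(20))   # 6765
print(calls)     # 21: each of fib(0)..fib(20) computed exactly once
```

Without the cache the same call would make thousands of recursive calls; with it, repeated queries for the same information cost a table lookup - exactly the optimisation a pipeline of separate program invocations can never make.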
Basically, one interesting idea is that we can have the ideal world, where everything is reflected, and then write optimised code for the special cases; in most cases, where we do not use any of the reflective power, the optimised code is run instead of the reflected bytecode. So we do not lose out in the flexibility-speed tradeoff.

Here is an interesting idea similar to mine. They use a functional language with precompiled functions. Type checking is done, and then the precompiled programs are literally stitched next to each other. This provides the best of both worlds. It is called Esther. It is fast because it involves neither interpretation nor compilation. That makes it possibly faster than C in terms of compiling (^^), and faster than Hugs in terms of running (^^).
