{ |one, step, back| } 1 of 1 article WikiSyndicate: full/short

Slowing Down Calculations   08 Dec 04
[ print link all ]
Normally you want your calculations to go as fast as possible. But not this time. Here we will look at how to slow them down. Slow them waaay down.

Time for Some Fun

If you are following the Ruby mailing list, you will see we just made a new release of RubyGems. For the past week, most of my spare time has gone into making that happen. Now its time to get back into some other projects that I’m woefully behind on …

But before we do, it’s time to have a little fun with Ruby.

A Language Based SpreadSheet

Patrick Logan wonders aloud why spread sheets haven’t made it into programming languages as first class objects. He then goes on to speculate what it might look like in Smalltalk. And along the way, he bumps into some interesting concepts about evaluating expressions.

Translating Patrick’s Smalltalk code into Ruby turns out to be quite straight forward …

  ss = SpreadSheet.new
  ss[1,1].value = 5
  ss[1,2].value = ss[1,1] * 6
  ss[1,3].value = ss[1,2] * 7
  puts ss[1,3].value             # => 210

ss is our spread sheet object. Indexing it with row and column indicies yields cells, which can hold formulas. The first cell gets a five. The second cell (at [1,,2]) gets 6 times that (30), and the final cell gets 7 times that (210).

Straight forward arithmetic, right?

Wrong!

Remember that this is a spread sheet. Changing the values in cells should effect cells that derive from it. In other words, if we change the value of ss[1,1] and then ask for the value of ss[1,3], we should get a new value.

  ss[1,1].value = 10
  puts ss[1,3].value     # Should now print 420!

The naive implementation of just storing values in arrays as we calculate them just isn’t going to hack it.

Ummm … Why Don’t We Use Lambdas?

My first thought was that we would want to just wrap the expressions in a lambda. Putting them in a lambda has the effect of deferring the evaluation of the expression until we really want it. And we can reevaluate the expression as needed as its dependents change.

The result would look like this …

  ss = SpreadSheet.new
  ss[1,1].value = 5
  ss[1,2].value = lambda { ss[1,1] * 6 }
  ss[1,3].value = lambda { ss[1,2] * 7 }
  puts ss[1,3].value             # => 210

In fact, a commenter to Patrick’s blog suggested the same thing.

Patrick responds.

The problem then is the programmer has to know where to put the blocks to delay computation and where to force the revaluations. Instead of blocks, though, my intention is to use some new object, a Formula, say, and to make those objects implicit as much as possible.

Deferring Expressions in Ruby

How would this work out in Ruby? In Ruby, all computation is accomplished by sending messages. To defer a computation, just capture its message and replay it later at your convenience. We will start with a formula object that turns any message sent to it into a deferred object.

  class Formula < Builder::BlankSlate
    def method_missing(sym, *args, &block)
      Deferred.new(self, sym, args, block)
    end
  end

Since you generally want every message sent to a formula object to be recorded, I based Formula on the BlankSlate class I’ve mentioned earlier. Deferred is also fairly straight forward to write. The initialize method just records the receiver of a message, the message name, arguments and any blocks.

  class Deferred < Formula
    def initialize(target, operation, args, block)
      @target = target
      @operation = operation
      @args = args
      @block = block
    end
    # ...

We derive Deferred from Formula because we want operations against a deferred object to also be deferred, and Formula handles that perfectly.

Now, when we want the value of a deferred operation, we need to ask it nicely. We will use a method named formula_value for that purpose.

    def formula_value
      @target.formula_value.send(@operation, *eval_args, &@block)
    end
    # ...

To replay the deferred operation, we need to get the target (reciever) of the message. Since the target was a formula, we need to ask for its formula value. We then send it the original operation (message name) and a list of arguments. Just one subtle note about the arguments. They too might (or might not) be formula objects, in which case we need to evaluate them before passing them to send. eval_args just creates a new list of formula values from the original argument list.

    def eval_args
      @args.collect { |a| a.formula_value }
    end
  end  # of class Deferred

Formulas in Action

Suppose I had a Formula object, and tried to add 1 to it. What would I get …

  result = some_formula + 1

result is a new deferred object, capturing the sending of +1 to the formula. To get the real value, we ask the result for its formula_value:

  result.formula_value

which recursively gets the formula value of some_formula and sends it a +1 message.

Something is Missing

Great. We see how to get deferred evaluations if we have a formula object, but where does that original formula object come from?

There are several options. One is to create a Formula wrapper around a normal Ruby value.

  class Const < Formula
    attr_reader :formula_value
    def initialize(value)
      @formula_value = value
    end
  end

Now we can say:

  result = Const.new(5) + 3    # Returns a deferred formula object

and later ask:

  result.formula_value         # Returns 8

Mirror, Mirror

Another question: I can see how formula+1 works. The + message is sent to a formula object and gets handled specially. But what if we write:

  1 + formula

Won’t 1 get confused because it doesn’t know how to handle a formula object?

Well, yes and no. The + operation in Integer will get confused, but it has a backup procedure to follow when it doesn’t know how to add itself to an arbitrary object: It asks the object how to do it.

Integer + will call coerce on the Formula object and expect Formula to figure out how to do the addition. Coerce should return a two element list whose elements can be added together.

Here is coerce for Formula:

  class Formula
    def coerce(other)
      [Const.new(other), self]
    end
  end

Formula just wraps the original number in a Formula wrapper and then lets the wrapper handle deferring the addition.

The SpreadSheet Cells

The Cells of the spreadsheet are Formula objects too. The only catch here is that they can hold different Formula objects over time. value should calculate the value of a cell (and will in turn call formula_value to do so). value=(val) should set the internal formula to whatever the user specifies.

Here is Cell in all its glory.

  class Cell < Formula
    def initialize
      @formula = 0
    end

    def value
      @formula.formula_value
    end

    def value=(new_value)
      @formula = new_value
    end

    def formula_value
      @formula.formula_value
    end
  end

Making a Cell object a formula is pretty key to the whole deferred calculation design. If a cell contains only normal Ruby objects, then its value can be calculated immediately. However, we want to defer calculations that depend upon other cells, and since adding a cell object to any expression will automatically defer it, we get the effect we want automatically.

Hey! Wait a Minute!

The astute observer will note that I initialize @formula in Cell to 0, but later in the code send a formula_value message to @formula. Won’t this cause problem, sending formula message to a non-formula object?

Well, yes it would. But it simplifies the code greatly if we just pretend all objects are formulas and can respond to formula_value. Non-formula objects just respond by returning themselves.

This is easy to accomplish.

  module Kernel
    def formula_value
      self
    end
  end

(And this is why I chose a rather awkward name for fetching a formula’s value. Since all objects will implement this method, I wanted it to avoid potential name conflicts).

The SpreadSheet

The final piece of the puzzle is the actual spread sheet object. Its only real function is to act as a lookup container for the Cell objects. My implementation uses a cheesy "convert two integers to a string key" technique. It works ok, but it bothers me.

  class SpreadSheet
    def initialize
      @hash = Hash.new
    end

    def [](r,c)
      @hash["#{r}@#{c}"] ||= Cell.new
    end
  end

That’s it. The guts of a spread sheet in under 100 lines of code. You can see the complete listing on my wiki.

Other possibilities

This technique of deferring calculations has some really interesting possibilities. Imagine that instead of replaying deferred calculations, we examined the deferred expression and performed some operation on them. Like what? Well, maybe optimizing the expression, maybe analyzing the variable references, maybe generating SQL from them. Remember the Criteria library … it uses this technique to translate ruby code into SQL conditions.

Limitations

Be careful with this technique. Although quite powerful, there are a couple of places that can cause problems. In particular, the && and || operators in Ruby are difficult to capture in this way.


 

Formatted: 30-Aug-08 05:29
Feedback: jim@weirichhouse.org