Let’s start by looking reviewing variation 2 in KataTwoRecursive. When we recurse, we need to carefully pass in the position of the subarray in the whole array so that the final result is correctly positioned in the whole array. The recursive calls looks like this …
if list[mid] > target
chop_recursive(target, list[0...mid], offset)
else
chop_recursive(target, list[mid..-1], offset+mid)
end
Note that if we didn’t have to worry about the returning a -1 when the target value is not found, we could have written the recursive calls like this.
if list[mid] > target
chop_recursive(target, list[0...mid]) # [1]
else
chop_recursive(target, list[mid..-1]) + mid # [2]
end
As chop_recursive unwinds the stack, the correct offsets are added one by one until the final result correctly represents the correct value for the whole array.
There is an interesting difference between the recursive call at [1] and the recursive call at [2]. The call at [1] is tail recursive. This means that the result returned by the call at [1] is the result for the entire function. Once the call at [1] terminates, there is nothing to do but immediately return the result we just received.
A smart optimizing compiler can turn tail recursive calls into a jump to the beginning of the function (after reassigning the function parameters). In other words, tail recursion can be implemented using iteration without any stack growth.
There are a few languages where the tail recursion optimization is guaranteed by the language. Scheme is one such language. In Scheme, you never have to write a loop, the language will turn your tail recursive calls into loops automatically.
Unfortunately, the recursive call at [2] is not tail recursive. When [2] returns, it must continue to execute adding mid to result before returning. This little bit of work that must be done between the return of the call at [2], and the return of the main function call is called the continuation of [2]. It is where call [2] continues when it is done.
Wouldn’t it be nice if we could turn the call at [2] into a tail recursive call? Then our entire function would be tail recursive and it could be implemented without stack growth.
This is actually not hard to do. Suppose we had a function that handled that little "+ mid" code for us. When we are ready to return a value, then we just need to let the contination handle any of the "adjustments" before returning the value. We could pass in the continuation function as an argument to the function and call it when we need to return.
Something like this …
def chop_cps(target, list, continuation)
if list.size <= 1
(list[0] == target) ? continuation.call(0) : XXXXX
...
This says return the value +0+ if the list[0] matches the target value. The continuation function will apply any adjusts specified by the calling context. (The "XXXXX" represents how we handle the -1 return value. We will come back to this in a bit.)
Now we just need to create the continuations at the right points.
If we are recursing on the first half of the array (line [1] above), then there is no need to add anything. Our contination will just return the input value directly.
chop_cps(target, list[0...mid], proc{|x| continuation.call(x) }) # [1a]
Remember, continuation is the continuation of our current function call. In other words it is the code to execute when our current fuction returns. So, in order to properly return a value, we need to call continuation, passing in our result.
If we need to adjust the returned result by adding mid (line [2] above), then the continuation we pass needs to look like this …
chop_cps(target, list[mid..-1], proc{|x| continuation.call(x+mid) }) # [2a]
One final optimization. We notice that in [1a] we are just passing the return value without changing it. We can shorten that to just …
chop_cps(target, list[0...mid], continuation) # [1b]
Ok, we are ready to write the CPS style chop function … except for one thing! We don’t know how to handle the case where the target value is not found!
The spec for the chop function says that we will return a negative one whenever the target value is not found. Unfortunately, if we return a negative one, the continuations from the calling methods clobber our error return by (possibly) adding the valud of mid to it (remember [2a] above?).
What we need is an alternative continuation where no adjustment is done. No problem! Since we are just using functions for continuations, we can pass in an "error reporting" continuation to handle failure. Our error return continuation would look like this:
proc { -1 }
Here’s the code to the recursive version using CPS. Since we are passing two continuations into our recursive function, we will call the continuation for the normal return found and the continuation for the error return not_found.
def chop_cps(target, list)
sub_chop_cps(target, list, proc{|x| x}, proc{-1})
end
def sub_chop_cps(target, list, found, not_found)
if list.size <= 1
(list[0] == target) ? found.call(0) : not_found.call
else
mid = list.size/2
if list[mid] > target
sub_chop_cps(
target,
list[0...mid],
found,
not_found)
else
sub_chop_cps(
target,
list[mid..-1],
proc{|x| found.call(x+mid)},
not_found)
end
end
end
class TestChopCpsRecursive < Test::Unit::TestCase
alias chop chop_cps
include ChopTests
end
This implementation turned out to be surprisingly easy. No problems were encountered during the implementation.
I find the introduction of the not_found continuation very interesting. Why does the chop function return a negative one as an error value? Why doesn’t it throw an exception?
With the not_found continuation, we can decide at runtime how errors are to be reported. For example, to switch to an exception throwing version of chop, all we have to do is …
def chop_cps(target, list)
sub_chop_cps(
target,
list,
proc{|x| x},
proc{ fail "Value #{target} not found." })
end
This is cool!
Some of you are no doubt surprised that this entire discussion of continuations never touched on Ruby’s callcc mechanism. We will save callcc for the next session, along with two or three more variations on the CPS variation introduced here.