{ |one, step, back| }

Using FlexMock to Test Computational Fluid Dynamics Code   03 Aug 07

This is a fun example of using FlexMock.

Andrew Sweeney Asks:

Andrew Sweeney emailed me with the following question:

I am currently working on a ruby project in which I think flexmock would be a good fit for unit testing. I have read the documentation and gone over the examples however fail to wrap my head around how to apply flexmock to my own app. I was hoping that you could give me some guidance and get me started or point me in the right direction.

You can find his original source code here.

I thought his problem was interesting enough to write it up as an example of using FlexMock. Andrew and his mentor, Bil Kleb, gave permission for me to reproduce the code in my blog. The F3DQueue class is part of a Computational Fluid Dynamics project (http://fun3d.larc.nasa.gov) at NASA.

Quick Code Review

The F3DQueue class is small, so there’s not a lot of code we need to wade through. We see it uses a second class named AutoF3D, but the only clues we have to what AutoF3D might do are the four method calls on the “job” object in the run method.

It looks like the main interface to the queue object is the add_to_queue method. There is a thread started that pulls jobs (i.e. AutoF3D objects) from the queue and processes them in turn. There are some server delays built into the system. I presume that Computational Fluid Dynamics is, ummm, computationally complex and the delays are just there to make sure the workload does not eat up all the CPU time on the server.

Starting Testing

When writing new code, I always like to approach it in a Test-First manner. Because I won’t write solution code without a test that forces me to write it, I have a high confidence that the code is well covered with tests.

Unfortunately, dealing with legacy code means that the code is already written and the test-first approach won’t work. That’s ok, I have a little trick that I use. Just comment out the bodies of all the methods in the class you are about to test. Then write the tests that force you to uncomment the code. Uncomment only enough to get the tests to pass; don’t uncomment anything you don’t have to. You have enough tests when all the code has been uncommented. The technique is almost as good as doing real test-first.

The Commented Out Version

Here is the code base as I started the test.

An Existence Test

I almost always start out with an existence test. Existence tests basically prove the proper files are included and the object can be created. Normally I delete these after a few tests have been written. But I left this one in for an example.

  def test_initial_conditions
    q = F3DQueue.new
    assert_not_nil q
  end

Nothing really exciting here. Let’s move on …

Proving FIFO Queue Order

The first thing I want to prove is that items put into the queue are removed in FIFO order. Since add_to_queue creates an AutoF3D object, I mock out the new method on the class object and tell FlexMock to expect new to be called twice: once with :a, :b, and :c as parameters, then again with :x, :y, and :z as parameters. Each invocation of new will return a different symbol (:first and :second) so we can easily test that the items are pulled off the queue in FIFO order.

Notice that I pass in simple symbols for the arguments to add_to_queue. Our code doesn’t interpret the values of the arguments, they are merely passed directly to the AutoF3D constructor. All we do is verify that the AutoF3D (mocked) constructor does indeed receive the arguments we pass in.

Here’s the test:

  def test_adding_to_queue_is_removed_in_fifo_order
    flexmock(AutoF3D).should_receive(:new).once.with(:a, :b, :c).and_return(:first).ordered
    flexmock(AutoF3D).should_receive(:new).once.with(:x, :y, :z).and_return(:second).ordered
    q = F3DQueue.new

    q.add_to_queue(:a, :b, :c)
    q.add_to_queue(:x, :y, :z)

    assert_equal :first, q.remove_from_queue
    assert_equal :second, q.remove_from_queue
  end

This test caused three changes. First, the add_to_queue method needed lines uncommented:

  def add_to_queue(modelLoc, params, gridFile)
     autoF3D = AutoF3D.new(modelLoc, params, gridFile)
     @queue.push autoF3D
#     $log.info 'Request added to queue'
  end

(Notice I didn’t uncomment the log. The logger is not needed to pass the test, and doesn’t contribute to the actual functionality of the method. I will not be testing the logger for the purposes of this article.)

Also the remove_from_queue needed its body uncommented:

  def remove_from_queue
    @queue.pop
  end

And finally, the initializer code needed to create the queue array:

  def initialize  
     @queue = []
#     Thread.new{ process }
  end

Notice that the Thread.new line is left commented. We will deal with that in a bit.

So now we run the test:

$ ruby test_f3dqueue.rb
Started
F.
Finished in 0.010184 seconds.

  1) Failure:
test_adding_to_queue_is_removed_in_fifo_order(TestF3DQueue) [test_f3dqueue.rb:23]:
<:first> expected but was
<:second>.

2 tests, 2 assertions, 1 failures, 0 errors

Oops! This test uncovered the first bug. The code as written has stack behavior (i.e. LIFO). The naming seems to indicate that we want FIFO.

No problem. That’s an easy fix.

  def remove_from_queue
    @queue.shift
  end
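
The difference between the two removal methods is easy to see in plain Ruby, with neither FlexMock nor F3DQueue involved. This is only a small sketch of Array behavior; drain_with is a made-up helper name used for illustration.

```ruby
# Array#push appends to the end. pop takes from that same end (LIFO),
# while shift takes from the front (FIFO).
def drain_with(removal_method)
  items = []
  items.push(:first)
  items.push(:second)
  [items.public_send(removal_method), items.public_send(removal_method)]
end

p drain_with(:pop)    # => [:second, :first]
p drain_with(:shift)  # => [:first, :second]
```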

Now the tests run clean:

$ ruby test_f3dqueue.rb
Started
..
Finished in 0.001925 seconds.

2 tests, 3 assertions, 0 failures, 0 errors

Proving that Running a Job Works

Now when I run a job, I need to show that the proper four methods are called once each and in the proper order. This is very straightforward using FlexMock.

  def test_running_a_job_will_call_the_right_stuff_in_the_right_order
    job = flexmock("job")
    job.should_receive(:generate_geometry_and_grid).once.ordered
    job.should_receive(:partition_grid_and_initialize_flow).once.ordered
    job.should_receive(:run_flow_solver).once.ordered
    job.should_receive(:post_process_solution).once.ordered
    q = F3DQueue.new

    q.run(job)
  end

Uncommenting the body of run is all that is needed here:

  def run( job )
#     $log.info 'Request being processed'
     job.generate_geometry_and_grid
#     $log.info 'Created Geometry'
     job.partition_grid_and_initialize_flow
#     $log.info 'Partitioned Grid'
     job.run_flow_solver
#     $log.info 'Flow Solver Completed'
     job.post_process_solution
#     $log.info 'Post process Completed'
#     $log.info 'Request completed'
  end

The tests are now showing:

3 tests, 3 assertions, 0 failures, 0 errors

Processing an Empty Queue

Ok, now it gets interesting. I want to show that attempting to process a job when the queue is empty will cause the process to sleep for the check queue interval.

This is one spot where I changed the code to make it easier to test. It is difficult to test endless loops in unit tests (it tends to make the tests run a bit long), so I broke out the logic for a single pass through the loop into a method called process_one_job. We can then test this logic without dealing with the looping at the same time.

Note: It is possible to test endless loops and an example will be given below. But it is slightly tricky and this allows us to concentrate on proving the logic.

If there are no jobs to be processed, then all the code should do is sleep for a particular amount of time. We will locally mock out the sleep method on the queue object and insist that it will be called exactly once with the expected interval.

  def test_processing_with_no_jobs_will_sleep_the_check_interval
    q = F3DQueue.new
    flexmock(q).should_receive(:sleep).once.with(F3DQueue::CHECK_QUEUE_INTERVAL)

    q.process_one_job
  end
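
Mocking sleep on the queue object works because a bare sleep call inside an instance method is dispatched on self, so a method defined on that one object takes precedence over Kernel#sleep. The following plain-Ruby sketch shows the idea without FlexMock. SketchPoller, its interval value, poll_once, and the helper method are all hypothetical stand-ins, not part of the F3DQueue code.

```ruby
# Hypothetical stand-in for a class that sleeps between polls.
class SketchPoller
  CHECK_QUEUE_INTERVAL = 5 # assumed value, for illustration only

  def poll_once
    sleep CHECK_QUEUE_INTERVAL # implicit receiver is self
  end
end

# Replace sleep on a single object and record the requested durations
# instead of actually pausing.
def poller_with_recorded_sleep(recorded)
  poller = SketchPoller.new
  poller.define_singleton_method(:sleep) { |seconds| recorded << seconds }
  poller
end

naps = []
poller_with_recorded_sleep(naps).poll_once
p naps # => [5]
```

Other SketchPoller instances are unaffected, which is roughly what makes this kind of per-object mocking safe to use in a test.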

Here is process_one_job with just two lines uncommented so that the test will pass.

  def process_one_job
#       execution_attempts = 0
    job = remove_from_queue
#       begin
#         if job
#       run job
#           execution_attempts = 0
#           sleep SERVER_RECOVERY_TIME
#         else
    sleep CHECK_QUEUE_INTERVAL
#         end
#       rescue
#         $log.warn 'An error occurred during execution'
#         $log.warn $ERROR_INFO
#         $log.debug $ERROR_POSITION
#         sleep SERVER_RECOVERY_TIME
#         if execution_attempts > MAX_EXECUTION_ATTEMPTS
#           $log.error 'Too many failed execution_attempts: aborting'
#           raise
#         else
#           execution_attempts += 1
#           retry
#         end
#       end
   end

There’s a lot of code still left commented in that method. Now we need a test to force us to uncomment more code.

Handling a Single Job

Ok, now what happens when a single job is in the queue? We will assume the happy path (i.e. no exceptions) so we expect run to be called with the queued object, and then a sleep with the recovery interval.

A couple of things to note. First, we mock out AutoF3D again so that when we request something added to the queue, we control what kind of object is returned. We could return a mock object and then mock out the four methods that run will be calling.

However, I chose a slightly different approach. AutoF3D is mocked so that it returns a simple symbol. Then I mock out the run method to do nothing (but it is expected to be called once). This is slightly controversial because I am actually mocking a method on the object under test. But the run method is fairly simple, and we know that run works because of our previous test, so in the end we get clearer and simpler code.

Also note that the mocks for the run and sleep methods are ordered. This means run must be called first, then sleep.

  def test_processing_with_a_single_job_will_run_the_job_and_pause_for_recovery
    q = F3DQueue.new
    flexmock(AutoF3D).should_receive(:new).once.and_return(:job)
    flexmock(q).should_receive(:run).once.with(:job).ordered
    flexmock(q).should_receive(:sleep).once.with(F3DQueue::SERVER_RECOVERY_TIME).ordered
    q.add_to_queue(:a, :b, :c)

    q.process_one_job
  end

Now we get to uncomment even more lines in process_one_job.

  def process_one_job
#       execution_attempts = 0
    job = remove_from_queue
#       begin
    if job
      run job
#           execution_attempts = 0
      sleep SERVER_RECOVERY_TIME
    else
      sleep CHECK_QUEUE_INTERVAL
    end
#       rescue
#         $log.warn 'An error occurred during execution'
#         $log.warn $ERROR_INFO
#         $log.debug $ERROR_POSITION
#         sleep SERVER_RECOVERY_TIME
#         if execution_attempts > MAX_EXECUTION_ATTEMPTS
#           $log.error 'Too many failed execution_attempts: aborting'
#           raise
#         else
#           execution_attempts += 1
#           retry
#         end
#       end
   end

That just leaves the error handling code to be uncommented. So that will be next.

Handling a Job With Errors

Now we want to test the case where processing a job raises an exception. This test exercises the exception recovery code in the original code base. The technique is similar to the last test, but this time we specify two mock calls for run. The first time run is called, it will raise an exception. The second time it is called, it will complete normally.

Notice that we have ordered run and sleep so that they interleave execution with each other.

  def test_if_a_job_fails_retry_after_recovery_time
    q = F3DQueue.new
    flexmock(AutoF3D).should_receive(:new).once.and_return(:job)
    flexmock(q).should_receive(:run).once.with(:job).and_raise(RuntimeError).ordered
    flexmock(q).should_receive(:sleep).once.with(F3DQueue::SERVER_RECOVERY_TIME).ordered
    flexmock(q).should_receive(:run).once.with(:job).ordered
    flexmock(q).should_receive(:sleep).once.with(F3DQueue::SERVER_RECOVERY_TIME).ordered
    q.add_to_queue(:a, :b, :c)

    q.process_one_job
  end

I was showing this test code to one of my coworkers and they were a little surprised that the second expectation on run didn’t override the first expectation. FlexMock is explicitly designed to allow you to stack expectations like this. When searching for an expectation during mocking, FlexMock will use the first matching one it finds. When an expectation has been used its designated number of times (in the above test, the once method designates that the expectation should only be used once), FlexMock will begin to use matching expectations that are defined later.

The upshot is that it is easy to define mock behavior for multiple calls to the same method.
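
To make the stacking rule concrete, here is a tiny hand-rolled stand-in written in plain Ruby. It is not FlexMock's implementation, only a sketch of the "use the first expectation that still has uses left" rule described above. StackedStub and its methods are hypothetical names, and argument matching is left out entirely.

```ruby
# Hypothetical sketch of stacked expectations; not FlexMock's actual code.
class StackedStub
  Expectation = Struct.new(:remaining, :action)

  def initialize
    @expectations = []
  end

  # Register an action that may be used at most `times` times.
  def expect(times, &action)
    @expectations << Expectation.new(times, action)
    self
  end

  # Use the earliest-defined expectation that still has uses left.
  def call(*args)
    expectation = @expectations.find { |e| e.remaining > 0 }
    raise "no expectation left for this call" unless expectation
    expectation.remaining -= 1
    expectation.action.call(*args)
  end
end

run = StackedStub.new
run.expect(1) { |_job| raise "first attempt fails" }
run.expect(1) { |job| "ran #{job}" }

begin
  run.call(:job)
rescue RuntimeError => e
  puts "first call raised: #{e.message}"
end
puts run.call(:job)
```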

Here’s the latest process_one_job method with some more lines uncommented. We are getting close to the end with this one.

  def process_one_job
#       execution_attempts = 0
    job = remove_from_queue
    begin
      if job
        run job
#           execution_attempts = 0
        sleep SERVER_RECOVERY_TIME
      else
        sleep CHECK_QUEUE_INTERVAL
      end
    rescue
#         $log.warn 'An error occurred during execution'
#         $log.warn $ERROR_INFO
#         $log.debug $ERROR_POSITION
      sleep SERVER_RECOVERY_TIME
#         if execution_attempts > MAX_EXECUTION_ATTEMPTS
#           $log.error 'Too many failed execution_attempts: aborting'
#           raise
#         else
#           execution_attempts += 1
      retry
#         end
    end
  end

Processing Jobs that Continually Fail

Finally we test the case where the job will continually raise an exception until the error recovery code gives up and passes the exception on to the caller. I didn’t bother ordering the run/sleep calls here, making it easy to just specify that each is called four times. I believe that the previous test adequately specified interleaving.

I used a RuntimeError for my testing. If you have a specific error in mind, you might want to test explicitly for it. Generally raising the most general error you intend to handle is a good way of testing the boundary conditions on your rescue clause.

  def test_too_many_failures_will_pass_along_exception
    q = F3DQueue.new
    flexmock(AutoF3D).should_receive(:new).once.and_return(:job)
    flexmock(q).should_receive(:run).with(:job).and_raise(RuntimeError.new("XYZZY")).times(4)
    flexmock(q).should_receive(:sleep).with(F3DQueue::SERVER_RECOVERY_TIME).times(4)
    q.add_to_queue(:a, :b, :c)

    ex = assert_raise RuntimeError do
      q.process_one_job
    end
    assert_equal "XYZZY", ex.message
  end

Note that the exception needs to be raised four times. I suspect this is a bug in the error handling logic. I left the logic as is and just made sure the test will pass. The code base specifies a retry count of “2”. This seems to imply that we try run twice, or perhaps three times (if the initial attempt doesn’t count as a retry). In any case, four times seems too much.
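
The count of four can be checked by tracing the retry logic in isolation. The sketch below mirrors the shape of the rescue clause in plain Ruby, with a job that always fails. attempts_before_giving_up is a hypothetical helper, and the default limit of 2 is the retry count mentioned above. It also shows how a >= comparison would change the count, which is one possible fix and not necessarily the one the project wants.

```ruby
# Counts how many times a permanently failing job gets run before the
# retry logic gives up. `comparison` is :> (as in the code base) or :>=.
# Instead of re-raising, the sketch returns the number of runs.
def attempts_before_giving_up(comparison, max_execution_attempts = 2)
  runs = 0
  execution_attempts = 0
  begin
    runs += 1
    raise "job failed"
  rescue RuntimeError
    if execution_attempts.public_send(comparison, max_execution_attempts)
      return runs
    else
      execution_attempts += 1
      retry
    end
  end
end

p attempts_before_giving_up(:>)   # => 4
p attempts_before_giving_up(:>=)  # => 3
```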

So, here is the code for process_one_job with most of its lines uncommented.

  def process_one_job
    execution_attempts = 0
    job = remove_from_queue
    begin
      if job
        run job
#           execution_attempts = 0
        sleep SERVER_RECOVERY_TIME
      else
        sleep CHECK_QUEUE_INTERVAL
      end
    rescue
#         $log.warn 'An error occurred during execution'
#         $log.warn $ERROR_INFO
#         $log.debug $ERROR_POSITION
      sleep SERVER_RECOVERY_TIME
      if execution_attempts > MAX_EXECUTION_ATTEMPTS
#           $log.error 'Too many failed execution_attempts: aborting'
        raise
      else
        execution_attempts += 1
        retry
      end
    end
  end

Again note that this test surfaced a (rather minor) bug. There is an extra assignment that clears the execution attempt counter after a successful run of job. Since a successful run will exit the loop, clearing it has no effect (unless it is the sleep command that fails, that would be an interesting test scenario).

Since we haven’t shown the test results for a while, here’s how we stand at this point:

7 tests, 5 assertions, 0 failures, 0 errors

Processing Multiple Jobs

Now we know that we can handle a single job successfully. Now let’s make sure that we can handle multiple jobs. Remember that we broke process into two methods: process_one_job and a much shorter process that will call process_one_job in a loop.

Here’s what the original process method is looking like at the moment:

  def process
#     loop do
#     end
  end

We pulled out its guts and left the still commented loop there. We haven’t even bothered to have it call process_one_job yet. So let’s write a test that will force us to fix that.

We will just mock out process_one_job so that it must be called 10 times. On the eleventh call it throws a symbol that we catch in the test. Throwing a symbol is the trick that breaks us out of the infinite loop. By throwing a symbol (rather than raising an error), we don’t interact with the error handling logic of the code under test.

This is actually the trick referenced earlier. By breaking the body of the loop into a separate method, we only have to use this trick once rather than on each of the process job tests.

  def test_process_calls_process_one_job_in_a_loop
    q = F3DQueue.new
    flexmock(q).should_receive(:process_one_job).times(10)
    flexmock(q).should_receive(:process_one_job).and_return { throw :done }

    assert_throws(:done) do 
      q.process
    end
  end

To get this to pass, we implement the process method as follows:

  def process
    loop do
      process_one_job
    end
  end
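
The throw/catch escape works in plain Ruby too, independent of FlexMock. In this sketch (run_until_thrown is a hypothetical name), a rescue clause stands in for the error handling of the code under test, and the thrown symbol passes straight through it.

```ruby
# Runs an otherwise endless loop whose body throws :done on its fourth pass.
# Returns [number_of_passes, whether_the_rescue_clause_ran].
def run_until_thrown
  calls = 0
  rescued = false
  catch(:done) do
    loop do
      begin
        calls += 1
        throw :done if calls > 3
      rescue StandardError
        rescued = true # never reached: throw is not an exception
      end
    end
  end
  [calls, rescued]
end

p run_until_thrown # => [4, false]
```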

Threading Issues

Finally we need to make sure a thread is started. Here is another place I changed the code to make testing easier. The original code base started a thread in the initializer of the object. This means that every F3DQueue object ran in its own thread. This would mean every test would have to deal with multithreading issues. Yuck!

I changed the code so that a thread is started only when explicitly calling the start method. I like this better for real objects anyway. Although it is an extra step, it gives you more control over when the threads are started. If you really want to start a thread at object creation, you can just say:

queue = F3DQueue.new.start

Since I really don’t want to start a Thread in the test (I just want to make sure that the Thread.new method is called), I mock out Thread.new so that it must be called once and when called will execute the given block.

I then mock out the process method so that it must be called once. The combination of these two mocks will ensure that start will start a new thread that calls process.

And finally, I ensure that the return value of start will be the queue object. This makes sure that the F3DQueue.new.start idiom works.

  def test_start_will_start_a_process_thread
    q = F3DQueue.new
    flexmock("thread", Thread).should_receive(:new).with(Proc).once.
      and_return { |block| block.call }
    flexmock(q).should_receive(:process).once

    return_value = q.start
    assert_equal q, return_value
  end

And here is the little start method that needed to be written for the test. The Thread.new line is moved from the initialize method to here.

  def start
    Thread.new do process end
    self
  end
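
The chaining idiom depends only on start returning self. Here is a minimal plain-Ruby sketch using a hypothetical SketchWorker class rather than F3DQueue. Unlike the start method above, it keeps a reference to the thread so that the example can join it and produce a deterministic result.

```ruby
# Hypothetical stand-in showing the "new.start" idiom.
class SketchWorker
  attr_reader :thread, :processed

  def initialize
    @processed = false # note: no thread is started here
  end

  def process
    @processed = true
  end

  def start
    @thread = Thread.new { process }
    self # returning self is what allows SketchWorker.new.start
  end
end

worker = SketchWorker.new.start
worker.thread.join # wait for the background thread to finish
p worker.processed # => true
```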

Here’s our final test run:

9 tests, 7 assertions, 0 failures, 0 errors

Code Coverage

We know that TDD gives pretty good code coverage stats out of the box. How did our “Comment-out First” approach do with regards to code coverage?

Here is the RCov report:

+----------------------------------------------------+-------+-------+--------+
|                  File                              | Lines |  LOC  |  COV   |
+----------------------------------------------------+-------+-------+--------+
|AutoF3D.rb                                          |     5 |     2 | 100.0% |
|f3dqueue.rb                                         |    82 |    53 | 100.0% |
|test_f3dqueue.rb                                    |   100 |    76 | 100.0% |
+----------------------------------------------------+-------+-------+--------+
|Total                                               |   187 |   131 | 100.0% |
+----------------------------------------------------+-------+-------+--------+
100.0%   3 file(s)   187 Lines   131 LOC

Wow! 100% on the first try.

Final Code Samples

You can find the final versions of the F3DQueue object and its tests here:

Future Directions

Now that the F3DQueue object is well tested, it is time to take a step back and think about the overall design of the class. There are a couple of things that stick out in my mind about this code.

(1) First Item

We did a lot of mocking on the F3DQueue object itself while it was being tested. Although this is a valid technique, you must be careful that you don’t end up just testing your own mocks. What it does indicate is that the object you are testing might be trying to do too many things. Perhaps the class needs to be broken up into smaller classes, or perhaps some functionality needs to move into other classes.

With this in mind, the run method seems to know an awful lot about the workings of an AutoF3D job object. It seems a bit out of place. Why don’t we move the run method to the job itself? Moving run into the AutoF3D job object would allow us to write the following code fragment (in the process_one_job method):

...
    job = remove_from_queue
    begin
      if job
        job.run                       # was: run job
        sleep SERVER_RECOVERY_TIME
...
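
On the job side, the relocated method might look something like the following sketch. SketchJob is a hypothetical stand-in: the step names are the four calls from the queue's run method, but here each step only records that it happened, since the real AutoF3D class is not shown in this article.

```ruby
# Hypothetical job class; the real AutoF3D steps do actual CFD work.
class SketchJob
  attr_reader :steps

  def initialize
    @steps = []
  end

  def generate_geometry_and_grid
    @steps << :geometry
  end

  def partition_grid_and_initialize_flow
    @steps << :partition
  end

  def run_flow_solver
    @steps << :solve
  end

  def post_process_solution
    @steps << :post_process
  end

  # The method moved over from the queue: the job knows its own step order.
  def run
    generate_geometry_and_grid
    partition_grid_and_initialize_flow
    run_flow_solver
    post_process_solution
  end
end

job = SketchJob.new
job.run
p job.steps # => [:geometry, :partition, :solve, :post_process]
```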

Now, our queue class is one method shorter and is just concerned with the scheduling of the jobs and not the details of running the job itself. This is good …

Except for the following little piece of code, which leads us into the second thing that bothered me:

  def add_to_queue(modelLoc, params, gridFile)
    autoF3D = AutoF3D.new(modelLoc, params, gridFile)
    @queue.push autoF3D
  end

(2) Second Item

Here we have direct knowledge of the AutoF3D class. If we remove the reference to AutoF3D, then our queue will suddenly become much more general, and usable in situations where we might want to process a different kind of job.

I would recommend changing the above code to:

  def add_to_queue(job)
    @queue.push job
  end

This does mean that adding a job to the queue would now have to create the job object explicitly. So, instead of:

   queue.add_to_queue(loc, param, grid)        

you would have to write:

   queue.add_to_queue(AutoF3D.new(loc, param, grid))

If you don’t like to manually create an AutoF3D object all the time (and I don’t), then the following solution is an easy fix to that:

  queue = F3DQueue.new
  def queue.add_job(loc, params, grid)
    add_to_queue(AutoF3D.new(loc, params, grid))
  end

The more traditionally minded of us might want to just subclass the F3DQueue class and add the add_job method in the subclass rather than in the singleton class. That works too. Either way, it is easy to do.
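
For comparison, the subclass variant could be sketched as below. SketchQueue and SketchAutoF3D are hypothetical stand-ins so that the example runs on its own; only the shape of add_job is taken from the code above.

```ruby
# Hypothetical stand-ins so the sketch is self-contained.
SketchAutoF3D = Struct.new(:loc, :params, :grid)

class SketchQueue
  def initialize
    @queue = []
  end

  def add_to_queue(job)
    @queue.push job
  end

  def remove_from_queue
    @queue.shift
  end
end

# Subclass variant: the job-specific convenience method lives here,
# leaving the base queue free of any knowledge about the job class.
class SketchF3DQueue < SketchQueue
  def add_job(loc, params, grid)
    add_to_queue(SketchAutoF3D.new(loc, params, grid))
  end
end

queue = SketchF3DQueue.new
queue.add_job(:loc, :params, :grid)
p queue.remove_from_queue.grid # => :grid
```

The singleton-method version affects only one queue instance, while the subclass gives every instance the convenience method. Which one fits better depends on how many queues the application creates.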

Recap

I hope this was useful for you. Here is a recap of some of the important ideas from this exercise:

  • Comment-First is not a bad way to handle legacy code.
  • Test scenarios, not methods. Note that I didn’t just pick a method in F3DQueue and write a single test for it. I chose scenarios that would exercise different sections of the code base. Start with the simple (e.g. “a Job that Doesn’t Fail”). Then pick increasingly harder scenarios (e.g. “a Job that Fails Once”, “a Job that Fails Multiple Times”).
  • Don’t be afraid to refactor to make testing easier. Breaking out process_one_job was a great idea that not only made testing much easier, but made the code easier to read.
  • The “Use Symbols as Cheap Mocks” is an idea I stole from Stu Halloway in his “Refactoring of the Week” presentation. If a method takes arguments that you don’t want to deal with, try passing in symbols. If the arguments aren’t used, the symbols work great. If an argument is actually used, the error message will identify the symbol at fault. At that point, just replace the symbol with the appropriate mock. This technique saves you lots of time and makes the tests easier to read.
  • If you want to break out of an infinite loop in the code under test, throw a symbol from your mocks and catch it in your test. This generally doesn’t interfere with any exception handling code in your code under test.
  • Always take a step back and look for ways of improving the code. A well tested module is fairly easy to change with confidence. Don’t be afraid to improve things.
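
As a small illustration of the “Use Symbols as Cheap Mocks” idea, the following plain-Ruby sketch uses two hypothetical methods: one that merely forwards its arguments and one that actually uses one of them. The exact wording of the resulting error message may vary between Ruby versions, so the sketch inspects the exception's name and receiver rather than its message.

```ruby
# Hypothetical methods: one only forwards its arguments, the other uses one.
def forward_only(loc, params, grid)
  [loc, params, grid]
end

def uses_params(_loc, params, _grid)
  params.fetch(:mach_number) # Symbol has no fetch method
end

# Symbols are fine as long as nobody calls anything on them.
p forward_only(:loc, :params, :grid) # => [:loc, :params, :grid]

# Once an argument is really used, the failure points at the symbol.
begin
  uses_params(:loc, :params, :grid)
rescue NoMethodError => e
  puts "#{e.name} was called on #{e.receiver.inspect}"
end
```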

More Samples

Do you have a bit of code that you are having trouble testing? If so, go ahead and send it to me. If your code is interesting enough, I’ll take a look at it and post the results here (so don’t send anything you aren’t willing to see published in this blog). I can’t look at everything, but I’ll try to find some interesting examples.



Formatted: 04-Jul-09 04:02
Feedback: jim@weirichhouse.org