This is a fun example of using FlexMock
Andrew Sweeney Asks:
Andrew Sweeney emailed me with the following question:
I am currently working on a ruby project in which I think
flexmock would be a good fit for unit testing. I have read the
documentation and gone over the examples however fail to wrap my head
around how to apply flexmock to my own app. I was hoping that you
could give me some guidance and get me started or point me in the
right direction.
You can find his original source code here.
I thought his problem was interesting enough to write it up as an
example of using FlexMock. Andrew and his mentor, Bil Kleb, gave
permission for me to reproduce the code in my blog. The F3DQueue
class is part of a Computational Fluid Dynamics project
(http://fun3d.larc.nasa.gov) at NASA.
Quick Code Review
The F3DQueue class is small, so there’s not a lot of code we need to
wade through. We see it uses a second class named AutoF3D, but the
only clues we have to what AutoF3D might do are the four method calls
on the “job” object in the run method.
It looks like the main interface to the queue object is the
add_to_queue method. There is a thread started that pulls jobs
(i.e. AutoF3D objects) from the queue and processes them in turn.
There are some server delays built into the system. I presume that
Computational Fluid Dynamics is, ummm, computationally complex and the
delays are just there to make sure the workload does not eat up all the
CPU time on the server.
Starting Testing
When writing new code, I always like to approach it in a Test-First
manner. Because I won’t write solution code without a test that
forces me to write it, I have a high confidence that the code is well
covered with tests.
Unfortunately, dealing with legacy code means that the code is already
written and the test-first approach won’t work. That’s ok, I have a
little trick that I use. Just comment out the bodies of all the
methods in the class you are about to test. Then write the tests that
force you to uncomment the code. Uncomment only enough to get
the tests to pass; don’t uncomment anything you don’t have to. You
have enough tests when all the code has been uncommented. The
technique is almost as good as doing real test-first.
The Commented Out Version
Here is the code base as I started the test.
An Existence Test
I almost always start out with an existence test. Existence tests
basically prove the proper files are included and the object can be
created. Normally I delete these after a few tests have been written.
But I left this one in for an example.
def test_initial_conditions
  q = F3DQueue.new
  assert_not_nil q
end
Nothing really exciting here. Let’s move on …
Proving FIFO Queue Order
The first thing I want to prove is that items put into the queue are
removed in FIFO order. Since add_to_queue creates an AutoF3D object,
I mock out the new method on the class object and tell FlexMock to
expect new to be called twice: once with :a, :b, and :c as
parameters, then again with :x, :y, and :z. Each invocation of
new will return a different symbol (:first and :second) so we can
easily test that the items are pulled off the queue in FIFO order.
Notice that I pass in simple symbols for the arguments to
add_to_queue. Our code doesn’t interpret the values of the
arguments, they are merely passed directly to the AutoF3D constructor.
All we do is verify that the AutoF3D (mocked) constructor does indeed
receive the arguments we pass in.
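If it helps to see what mocking new amounts to, here is a rough hand-rolled sketch in plain Ruby, without FlexMock. StandInJob is a made-up class standing in for AutoF3D. Unlike the FlexMock version, nothing here verifies call counts or ordering; it only shows the canned return values and the captured arguments.

```ruby
# Hypothetical stand-in for AutoF3D; the real class is not reproduced here.
class StandInJob
  def initialize(model_loc, params, grid_file)
    raise "the real constructor should not run in this sketch"
  end
end

# Hand-rolled replacement for StandInJob.new: record the arguments and
# return the next canned symbol instead of building a real object.
received = []
canned = [:first, :second]
StandInJob.define_singleton_method(:new) do |*args|
  received << args
  canned.shift
end

a = StandInJob.new(:a, :b, :c)
b = StandInJob.new(:x, :y, :z)

p a         # => :first
p b         # => :second
p received  # => [[:a, :b, :c], [:x, :y, :z]]
```

FlexMock does the same bookkeeping for you and also fails the test when the calls don’t match the expectations.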
Here’s the test:
def test_adding_to_queue_is_removed_in_fifo_order
  flexmock(AutoF3D).should_receive(:new).once.with(:a, :b, :c).and_return(:first).ordered
  flexmock(AutoF3D).should_receive(:new).once.with(:x, :y, :z).and_return(:second).ordered
  q = F3DQueue.new
  q.add_to_queue(:a, :b, :c)
  q.add_to_queue(:x, :y, :z)
  assert_equal :first, q.remove_from_queue
  assert_equal :second, q.remove_from_queue
end
This test caused three changes. First, the add_to_queue method
needed lines uncommented:
def add_to_queue(modelLoc, params, gridFile)
  autoF3D = AutoF3D.new(modelLoc, params, gridFile)
  @queue.push autoF3D
  # $log.info 'Request added to queue'
end
(Notice I didn’t uncomment the log. The logger is not needed to pass
the test, and doesn’t contribute to the actual functionality of the
method. I will not be testing the logger for the purposes of
this article.)
Also the remove_from_queue needed its body uncommented:
def remove_from_queue
  @queue.pop
end
And finally, the initializer code needed to create the queue array:
def initialize
  @queue = []
  # Thread.new{ process }
end
Notice that the Thread.new line is left commented. We will deal
with that in a bit.
So now we run the test:
$ ruby test_f3dqueue.rb
Started
F.
Finished in 0.010184 seconds.
1) Failure:
test_adding_to_queue_is_removed_in_fifo_order(TestF3DQueue) [test_f3dqueue.rb:23]:
<:first> expected but was
<:second>.
2 tests, 2 assertions, 1 failures, 0 errors
Oops! This test uncovered the first bug. The code as written has
stack behavior (i.e. LIFO). The naming seems to indicate that we want FIFO.
No problem. That’s an easy fix.
def remove_from_queue
  @queue.shift
end
Now the tests run clean:
$ ruby test_f3dqueue.rb
Started
..
Finished in 0.001925 seconds.
2 tests, 3 assertions, 0 failures, 0 errors
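The difference between the two versions is easy to see on a plain array. This is just a small illustration of Array#pop versus Array#shift, not code from the project:

```ruby
items = []
items.push :first
items.push :second

lifo = items.dup
fifo = items.dup

last_in  = lifo.pop    # pop takes from the end: stack (LIFO) behavior
first_in = fifo.shift  # shift takes from the front: queue (FIFO) behavior

p last_in   # => :second
p first_in  # => :first
```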
Proving that Running a Job Works
Now when I run a job, I need to show that the proper four methods are
called once each and in the proper order. This is very
straightforward using FlexMock.
def test_running_a_job_will_call_the_right_stuff_in_the_right_order
  job = flexmock("job")
  job.should_receive(:generate_geometry_and_grid).once.ordered
  job.should_receive(:partition_grid_and_initialize_flow).once.ordered
  job.should_receive(:run_flow_solver).once.ordered
  job.should_receive(:post_process_solution).once.ordered
  q = F3DQueue.new
  q.run(job)
end
Uncommenting the body of run is all that is needed here:
def run( job )
  # $log.info 'Request being processed'
  job.generate_geometry_and_grid
  # $log.info 'Created Geometry'
  job.partition_grid_and_initialize_flow
  # $log.info 'Partitioned Grid'
  job.run_flow_solver
  # $log.info 'Flow Solver Completed'
  job.post_process_solution
  # $log.info 'Post process Completed'
  # $log.info 'Request completed'
end
Tests are now showing:
3 tests, 3 assertions, 0 failures, 0 errors
Processing an Empty Queue
Ok, now it gets interesting. I want to show that attempting to
process a job when the queue is empty will cause the process to sleep
for the check queue interval.
This is one spot where I changed the code to make it easier to test.
It is difficult to test endless loops in unit tests (it tends to make
the tests run a bit long), so I broke out the logic for a single
pass through the loop into a method called process_one_job. We can
then test this logic without dealing with the looping at the same
time.
Note: It is possible to test endless loops and an example will be
given below. But it is slightly tricky and this allows us to
concentrate on proving the logic.
If there are no jobs to be processed, then all the code should do is
sleep for a particular amount of time. We will locally mock out the
sleep method on the queue object and insist that it will be called
exactly once with the expected interval.
def test_processing_with_no_jobs_will_sleep_the_check_interval
  q = F3DQueue.new
  flexmock(q).should_receive(:sleep).once.with(F3DQueue::CHECK_QUEUE_INTERVAL)
  q.process_one_job
end
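Mocking sleep on the queue object works because sleep is an ordinary Kernel method that gets called on self, so a method of the same name defined on that one object wins the lookup. Here is a rough hand-rolled equivalent in plain Ruby. NapQueue is a made-up class, and the interval of 30 is an assumed value, since the real constant is not shown in this article.

```ruby
# Hypothetical stand-in for the queue; only the empty-queue path is modeled.
class NapQueue
  CHECK_QUEUE_INTERVAL = 30 # assumed value, for illustration only

  def initialize
    @queue = []
  end

  def process_one_job
    job = @queue.shift
    sleep CHECK_QUEUE_INTERVAL unless job
  end
end

q = NapQueue.new
naps = []

# Replace sleep on this one object so the test records the interval
# instead of actually pausing.
q.define_singleton_method(:sleep) { |seconds| naps << seconds }

q.process_one_job
p naps # => [30]
```

Other NapQueue objects, and the rest of the program, still get the real sleep.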
Here is process_one_job with just two lines uncommented so that the
test will pass.
def process_one_job
  # execution_attempts = 0
  job = remove_from_queue
  # begin
    # if job
      # run job
      # execution_attempts = 0
      # sleep SERVER_RECOVERY_TIME
    # else
      sleep CHECK_QUEUE_INTERVAL
    # end
  # rescue
    # $log.warn 'An error occurred during execution'
    # $log.warn $ERROR_INFO
    # $log.debug $ERROR_POSITION
    # sleep SERVER_RECOVERY_TIME
    # if execution_attempts > MAX_EXECUTION_ATTEMPTS
      # $log.error 'Too many failed execution_attempts: aborting'
      # raise
    # else
      # execution_attempts += 1
      # retry
    # end
  # end
end
There’s a lot of code still left commented in that method. Now we
need a test to force us to uncomment more code.
Handling a Single Job
Ok, now what happens when a single job is in the queue? We will
assume the happy path (i.e. no exceptions) so we expect run to be
called with the queued object, and then a sleep with the recovery
interval.
A couple of things to note. First, we mock out AutoF3D again so that
when we request something added to the queue, we control what kind of
object is returned. We could return a mock object and then mock out
the four methods that run will be calling.
However, I chose a slightly different approach. AutoF3D is mocked so
that it returns a simple symbol. Then I mock out the run method to
do nothing (but it is expected to be called once). This is slightly
controversial because I am actually mocking a method on the object
under test. But the run method is fairly simple, and we know that
run works because of our previous test, so in the end we get clearer
and simpler code.
Also note that the mocks for the run and sleep methods are ordered. This
means run must be called first, then sleep.
def test_processing_with_a_single_job_will_run_the_job_and_pause_for_recovery
  q = F3DQueue.new
  flexmock(AutoF3D).should_receive(:new).once.and_return(:job)
  flexmock(q).should_receive(:run).once.with(:job).ordered
  flexmock(q).should_receive(:sleep).once.with(F3DQueue::SERVER_RECOVERY_TIME).ordered
  q.add_to_queue(:a, :b, :c)
  q.process_one_job
end
Now we get to uncomment even more lines in process_one_job.
def process_one_job
  # execution_attempts = 0
  job = remove_from_queue
  # begin
    if job
      run job
      # execution_attempts = 0
      sleep SERVER_RECOVERY_TIME
    else
      sleep CHECK_QUEUE_INTERVAL
    end
  # rescue
    # $log.warn 'An error occurred during execution'
    # $log.warn $ERROR_INFO
    # $log.debug $ERROR_POSITION
    # sleep SERVER_RECOVERY_TIME
    # if execution_attempts > MAX_EXECUTION_ATTEMPTS
      # $log.error 'Too many failed execution_attempts: aborting'
      # raise
    # else
      # execution_attempts += 1
      # retry
    # end
  # end
end
That just leaves the error handling code to be uncommented. So that will be next.
Handling a Job With Errors
Now we want to test the case where processing a job raises an
exception. This test exercises the exception recovery code in the
original code base. The technique is similar to the last test, but
this time we specify two mock calls for run. The first time run
is called, it will raise an exception. The second time it is called, it will
complete normally.
Notice that we have ordered run and sleep so that their calls must
interleave with each other.
def test_if_a_job_fails_retry_after_recovery_time
  q = F3DQueue.new
  flexmock(AutoF3D).should_receive(:new).once.and_return(:job)
  flexmock(q).should_receive(:run).once.with(:job).and_raise(RuntimeError).ordered
  flexmock(q).should_receive(:sleep).once.with(F3DQueue::SERVER_RECOVERY_TIME).ordered
  flexmock(q).should_receive(:run).once.with(:job).ordered
  flexmock(q).should_receive(:sleep).once.with(F3DQueue::SERVER_RECOVERY_TIME).ordered
  q.add_to_queue(:a, :b, :c)
  q.process_one_job
end
I was showing this test code to one of my coworkers and they were a
little surprised that the second expectation on run didn’t override
the first expectation. FlexMock is explicitly designed to allow you
to stack expectations like this. When searching for an expectation
during mocking, FlexMock will use the first matching one it finds.
When an expectation has been used its designated number of times (in
the above test, the once method designates that the expectation
should only be used once), FlexMock will begin to use matching
expectations that are defined later.
The upshot is that it is easy to define mock behavior for
multiple calls to the same method.
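To make the lookup rule concrete, here is a toy model of it in plain Ruby. This is not FlexMock’s implementation, and ToyExpectations is a made-up class; it only illustrates the idea of using the earliest expectation that still has uses left.

```ruby
# A toy model of the lookup rule described above. NOT FlexMock's code.
class ToyExpectations
  Entry = Struct.new(:remaining, :behavior)

  def initialize
    @entries = []
  end

  # Register a behavior that may be used `times` times.
  def expect(times, &behavior)
    @entries << Entry.new(times, behavior)
    self
  end

  # Use the earliest registered entry that still has uses left.
  def call(*args)
    entry = @entries.find { |e| e.remaining > 0 }
    raise "no expectation left" unless entry
    entry.remaining -= 1
    entry.behavior.call(*args)
  end
end

run = ToyExpectations.new
run.expect(1) { raise RuntimeError, "first call fails" }
run.expect(1) { :ok }

first = begin
  run.call(:job)
rescue RuntimeError => e
  e.message
end
second = run.call(:job)

p first   # => "first call fails"
p second  # => :ok
```

The second registration does not replace the first; it simply waits its turn.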
Here’s the latest process_one_job method with some more lines
uncommented. We are getting close to the end with this one.
def process_one_job
  # execution_attempts = 0
  job = remove_from_queue
  begin
    if job
      run job
      # execution_attempts = 0
      sleep SERVER_RECOVERY_TIME
    else
      sleep CHECK_QUEUE_INTERVAL
    end
  rescue
    # $log.warn 'An error occurred during execution'
    # $log.warn $ERROR_INFO
    # $log.debug $ERROR_POSITION
    sleep SERVER_RECOVERY_TIME
    # if execution_attempts > MAX_EXECUTION_ATTEMPTS
      # $log.error 'Too many failed execution_attempts: aborting'
      # raise
    # else
      # execution_attempts += 1
      retry
    # end
  end
end
Processing Jobs that Continually Fail
Finally we test the case where the job will continually raise an
exception until the error recovery code gives up and passes the
exception on to the caller. I didn’t bother ordering the
run/sleep calls here, making it easy to just specify that each is
called four times. I believe that the previous test adequately
specified interleaving.
I used a RuntimeError for my testing. If you have a specific
error in mind, you might want to test explicitly for it.
Generally, raising the most general error you intend to handle is a
good way of testing the boundary conditions on your rescue clause.
def test_too_many_failures_will_pass_along_exception
  q = F3DQueue.new
  flexmock(AutoF3D).should_receive(:new).once.and_return(:job)
  flexmock(q).should_receive(:run).with(:job).and_raise(RuntimeError.new("XYZZY")).times(4)
  flexmock(q).should_receive(:sleep).with(F3DQueue::SERVER_RECOVERY_TIME).times(4)
  q.add_to_queue(:a, :b, :c)
  ex = assert_raise RuntimeError do
    q.process_one_job
  end
  assert_equal "XYZZY", ex.message
end
Note that the exception needs to be raised four times. I suspect this
is a bug in the error handling logic. I left the logic as is and
just made sure the test will pass. The code base specifies a retry
count of “2”. This seems to imply that we try run twice, or perhaps
three times (if the initial attempt doesn’t count as a retry). In any
case, four times seems too much.
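The count of four falls out of the comparison used in the rescue clause. The sketch below is a stripped-down stand-in for the retry logic, not the real method. It assumes a limit of 2 (the retry count quoted above), and the >= variant is only a hypothetical alternative shown for comparison, not a recommended fix.

```ruby
# Counts how many times the job body runs when it always fails.
# `give_up` decides, from the attempts recorded so far, whether to stop.
def runs_before_giving_up(give_up)
  runs = 0
  execution_attempts = 0
  begin
    runs += 1
    raise "job failed"
  rescue RuntimeError
    return runs if give_up.call(execution_attempts)
    execution_attempts += 1
    retry
  end
end

max_attempts = 2 # assumed to match the retry count quoted in the article

as_written  = runs_before_giving_up(->(n) { n >  max_attempts })
alternative = runs_before_giving_up(->(n) { n >= max_attempts })

puts "with >  : #{as_written} runs"   # prints "with >  : 4 runs"
puts "with >= : #{alternative} runs"  # prints "with >= : 3 runs"
```

Because the counter starts at zero and is compared before it is incremented, the strict comparison allows the initial attempt plus three retries.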
So, here is the code for process_one_job with most of its lines
uncommented.
def process_one_job
  execution_attempts = 0
  job = remove_from_queue
  begin
    if job
      run job
      # execution_attempts = 0
      sleep SERVER_RECOVERY_TIME
    else
      sleep CHECK_QUEUE_INTERVAL
    end
  rescue
    # $log.warn 'An error occurred during execution'
    # $log.warn $ERROR_INFO
    # $log.debug $ERROR_POSITION
    sleep SERVER_RECOVERY_TIME
    if execution_attempts > MAX_EXECUTION_ATTEMPTS
      # $log.error 'Too many failed execution_attempts: aborting'
      raise
    else
      execution_attempts += 1
      retry
    end
  end
end
Again note that this test surfaced a (rather minor) bug. There is an
extra assignment that clears the execution attempt counter after a
successful run of job. Since a successful run will exit the loop,
clearing it has no effect (unless it is the sleep command that fails;
that would be an interesting test scenario).
Since we haven’t shown the test results for a while, here’s how we
stand at this point:
7 tests, 5 assertions, 0 failures, 0 errors
Processing Multiple Jobs
Now we know that we can handle a single job successfully. Now let’s
make sure that we can handle multiple jobs. Remember that we broke
process into two methods: process_one_job and a much shorter
process that will call process_one_job in a loop.
Here’s what the original process method is looking like at the
moment:
def process
  # loop do
  # end
end
We pulled out its guts and left the still commented loop there. We
haven’t even bothered to have it call process_one_job yet. So let’s
write a test that will force us to fix that.
We will just mock out process_one_job so that it must be called 10
times. On the eleventh call it throws a symbol that we catch in the
test. Throwing a symbol is the trick that breaks us out of the
infinite loop. By throwing a symbol (rather than raising an error),
we don’t interact with the error handling logic of the code under
test.
This is actually the trick referenced earlier. By breaking the body of
the loop into a separate method, we only have to use this trick once
rather than on each of the process job tests.
def test_process_calls_process_one_job_in_a_loop
  q = F3DQueue.new
  flexmock(q).should_receive(:process_one_job).times(10)
  flexmock(q).should_receive(:process_one_job).and_return { throw :done }
  assert_throws(:done) do
    q.process
  end
end
To get this to pass, we implement the process method as follows:
def process
  loop do
    process_one_job
  end
end
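The throw trick also works without any mocking library. Here is a small plain-Ruby sketch with a made-up LoopingWorker class. I added a rescue clause to its loop, which the real process method does not have, purely to show that a thrown symbol passes straight through error handling.

```ruby
# Hypothetical stand-in for the queue; only the looping part is modeled.
class LoopingWorker
  attr_reader :rescued

  def initialize
    @rescued = 0
  end

  def process
    loop do
      begin
        process_one_job
      rescue StandardError
        @rescued += 1 # never reached by a throw
      end
    end
  end

  def process_one_job
    # real work would go here
  end
end

worker = LoopingWorker.new
calls = 0

# Replace process_one_job on this one object: ten normal calls,
# then throw on the eleventh to escape the endless loop.
worker.define_singleton_method(:process_one_job) do
  calls += 1
  throw :done if calls > 10
end

catch(:done) { worker.process }

p calls           # => 11
p worker.rescued  # => 0
```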
Threading Issues
Finally we need to make sure a thread is started. Here is another
place I changed the code to make testing easier. The original
code base started a thread in the initializer of the object. This
means that every F3DQueue object ran in its own thread, which
would mean every test would have to deal with multithreading issues.
Yuck!
I changed the code so that a thread is started only when
explicitly calling the start method. I like this better for real
objects anyway. Although it is an extra step, it gives you more
control over when the threads are started. If you really want to
start a thread at object creation, you can just say:
queue = F3DQueue.new.start
Since I really don’t want to start a Thread in the test (I just
want to make sure that the Thread.new method is called), I mock
out Thread.new so that it must be called once and when called will
execute the given block.
I then mock out the process method so that it must be called once.
The combination of these two mocks will ensure that start will
start a new thread that calls process.
And finally, I ensure that the return value of start will be the
queue object. This makes sure that the F3DQueue.new.start idiom
works.
def test_start_will_start_a_process_thread
  q = F3DQueue.new
  flexmock("thread", Thread).should_receive(:new).with(Proc).once.
    and_return { |block| block.call }
  flexmock(q).should_receive(:process).once
  return_value = q.start
  assert_equal q, return_value
end
And here is the little start method that needed to be written for the
test. The Thread.new line is moved from the initialize method to
here.
def start
  Thread.new do process end
  self
end
Here’s our final test run:
9 tests, 7 assertions, 0 failures, 0 errors
Code Coverage
We know that TDD gives pretty good code coverage stats out of the box.
How did our “Comment-out First” approach do with regard to code
coverage?
Here is the RCov report:
+----------------------------------------------------+-------+-------+--------+
| File | Lines | LOC | COV |
+----------------------------------------------------+-------+-------+--------+
|AutoF3D.rb | 5 | 2 | 100.0% |
|f3dqueue.rb | 82 | 53 | 100.0% |
|test_f3dqueue.rb | 100 | 76 | 100.0% |
+----------------------------------------------------+-------+-------+--------+
|Total | 187 | 131 | 100.0% |
+----------------------------------------------------+-------+-------+--------+
100.0% 3 file(s) 187 Lines 131 LOC
Wow! 100% on the first try.
Final Code Samples
You can find the final versions of the F3DQueue object and its tests here:
Future Directions
Now that the F3DQueue object is well tested, it is time to take a
step back and think about the overall design of the class. There are a couple of things
that stick out in my mind about this code.
(1) First Item
We did a lot of mocking on the F3DQueue object itself while it was
being tested. Although a valid technique, you must be careful so
that you don’t end up just testing your own mocks. What it does
indicate is that the object you are testing might be trying to do too
many things. Perhaps the class needs to be broken up into smaller
classes, or perhaps some functionality needs to move into other
classes.
With this in mind, the run method seems to know an awful lot about
the workings of an AutoF3D job object. It seems a bit out of place.
Why don’t we move the run method to the job itself? Moving run
into the AutoF3D job object would allow us to write the following code
fragment (in the process_one_job method):
  ...
  job = remove_from_queue
  begin
    if job
      job.run # was: run job
      sleep SERVER_RECOVERY_TIME
  ...
Now, our queue class is one method shorter and is just concerned with
the scheduling of the jobs and not the details of running the job
itself. This is good …
Except for the following little piece of code, which leads us into the
second thing that bothered me:
def add_to_queue(modelLoc, params, gridFile)
  autoF3D = AutoF3D.new(modelLoc, params, gridFile)
  @queue.push autoF3D
end
(2) Second Item
Here we have direct knowledge of the AutoF3D class. If we remove the
reference to AutoF3D, then our queue will suddenly become much more
general, and usable in situations where we might want to process a
different kind of job.
I would recommend changing the above code to:
def add_to_queue(job)
  @queue.push job
end
This does mean that adding a job to the queue would now have to create
the job object explicitly. So, instead of:
queue.add_to_queue(loc, param, grid)
you would have to write:
queue.add_to_queue(AutoF3D.new(loc, param, grid))
If you don’t like to manually create an AutoF3D object all the time
(and I don’t), then the following solution is an easy fix to that:
queue = F3DQueue.new
def queue.add_job(loc, params, grid)
  add_to_queue(AutoF3D.new(loc, params, grid))
end
The more traditionally minded of us might want to just subclass the
F3DQueue class and add the add_job method in the subclass rather
than in the singleton class. That works too. Either way, it is easy
to do.
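Here is a runnable sketch of both options side by side. GenericQueue, FakeJobQueue, and FakeJob are made-up names standing in for the generalized queue and for AutoF3D, so the example runs without the real classes.

```ruby
# Hypothetical stand-ins so the sketch runs on its own.
FakeJob = Struct.new(:loc, :params, :grid)

class GenericQueue
  def initialize
    @queue = []
  end

  def add_to_queue(job)
    @queue.push job
  end

  def remove_from_queue
    @queue.shift
  end
end

# Option 1: a singleton method on one particular queue object.
queue = GenericQueue.new
def queue.add_job(loc, params, grid)
  add_to_queue(FakeJob.new(loc, params, grid))
end

# Option 2: a subclass that knows which kind of job to build.
class FakeJobQueue < GenericQueue
  def add_job(loc, params, grid)
    add_to_queue(FakeJob.new(loc, params, grid))
  end
end

queue.add_job(:loc, :params, :grid)
from_singleton = queue.remove_from_queue

sub = FakeJobQueue.new
sub.add_job(:loc, :params, :grid)
from_subclass = sub.remove_from_queue

p from_singleton == from_subclass # => true
```

The singleton version affects only that one object; a second GenericQueue.new would not have add_job. The subclass gives every instance the convenience method.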
Recap
I hope this was useful for you. Here is a recap of some of the
important ideas from this exercise:
- Comment-First is not a bad way to handle legacy code.
- Test scenarios, not methods. Note that I didn’t just pick a method
in F3DQueue and write a single test for it. I chose scenarios that
would exercise different sections of the code base. Start with the
simple (e.g. “a Job that Doesn’t Fail”). Then pick increasingly harder
scenarios (e.g. “a Job that Fails Once”, “a Job that Fails Multiple
Times”).
- Don’t be afraid to refactor to make testing easier. Breaking out
process_one_job was a great idea that not only made testing much
easier, but made the code easier to read.
- The “Use Symbols as Cheap Mocks” trick is an idea I stole from Stu
Halloway in his “Refactoring of the Week” presentation. If a method
takes arguments that you don’t want to deal with, try passing in
symbols. If the arguments aren’t used, the symbols work great. If
an argument is actually used, the error message will identify the
symbol at fault. At that point, just replace the symbol with the
appropriate mock. This technique saves you lots of time and makes
the tests easier to read.
- If you want to break out of an infinite loop in the code under test,
throw a symbol from your mocks and catch it in your test. This
generally doesn’t interfere with any exception handling code in your
code under test.
- Always take a step back and look for ways of improving the code. A
well tested module is fairly easy to change with confidence. Don’t
be afraid to improve things.
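As a small illustration of the symbols-as-cheap-mocks point, here is what happens when a symbol stand-in actually gets used. The kick_off method is made up for this sketch. The exact wording of the error message differs between Ruby versions, so the sketch reads the offending object and method name from the exception itself rather than from the message text.

```ruby
# Hypothetical method under test; it really does use its argument.
def kick_off(job)
  job.run_flow_solver
end

begin
  kick_off(:job) # a bare symbol standing in for a real job
rescue NoMethodError => e
  culprit = e.receiver # the object that could not respond
  missing = e.name     # the method that was called on it
  puts "#{culprit.inspect} needs to respond to #{missing}"
end
```

That tells you which symbol to swap for a proper mock, and which method the mock needs to expect.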
More Samples
Do you have a bit of code that you are having trouble testing? If so,
go ahead and send it to me. If your code is interesting enough, I’ll
take a look at it and post the results here (so don’t send anything
you aren’t willing to see published in this blog). I can’t look at
everything, but I’ll try to find some interesting examples.