I wrote this article on Design by Contract and Unit Testing to go along with the EiffelUnit documentation. I'm no longer maintaining EiffelUnit (I recommend using Gobo Eiffel Test instead), but thought the article was worth publishing independently. Hence this blog entry.The article was written three years ago, so I added a postscript to the conclusions that reflects my current experience.
I've been using Eiffel for almost three years, and have been exploring some Extreme Programming ideas for the past 6 months. In particular, I've been using XP's "Test-First" desgin process in conjunction with Eiffel. Here are some of my thoughts on how unit testing and Design by Contract fit together. The presentation is in the form of a question/answer session. If you have comments on this topic, I would be glad to here them at jweirich@one.net.
The first part of this page covers Design by Contract for those who are new to the idea. Eiffel programmers may want to skip directly to the section on Unit testing. Those whose time is limited may want to jump directly to the conclusions.
In short, Design by Contract starts by specifying the conditions under which it is legal to call a method. It is the responsibility of the client software to ensure these conditions (called preconditions) are met.
Given that the preconditions are met, the method in the supplier software guarantees that certion other conditions will be true when the method returns. These are called postcondition, and are the responsibility of the supplier code in ensure.
Ok, seriously. Contracts are inherited by child classes. If a child class overrides a method in a parent class, it needs to make sure the method's new implementation conforms to the contract of the parent class. By conforming, we mean that the child class is not allowed to tighten the preconditions of the parent class, or is it allowed to loosen the postconditions. We remember this by using the slogan...
Require no more, promise no less.
By the way, Eiffel automatically handles preconditions and postconditions in child classes without any need to copy and paste them from the parent.
Note that the checking (assertions) are merely a way of verifying that everyone is playing by the rules. We should be able to remove the assertions without effecting program correctness. In fact, Eiffel allows you to specify several different levels of assertion checking.
From a Design by Contract perspective, this is not testing preconditions. Preconditions define when it is legal to invoke a particular method. According to DbC, calling a method when its preconditions are not established results in undefined behavior. How can you test for undefined behavior?
What the unit tester has really done is changed the contract on the tested code. They replaced the existing precondition with True (the universal precondition ... methods with a precondition of True may be called at any time). They also changed the postcondition to a stronger, more complicated one.
push (value: INTEGER)
require
not is_full
ensure
count = old count + 1
top = value
When you do robustness checking in a unit test, you generally transform the contract into something slightly different. This is often called "defensive programming".
push (value: INTEGER)
-- Note: no precondition
ensure
success: (not old is_full) implies
((count = old count + 1) and
(top = value))
failure: (old is_full) implies
((count = old count) and
(top = old top))
error_set: old is_full = error
In this version, we added an error flag. We could have also specified an exception should be thrown.
Unfortunately, that's not necessarily the case.
-- Design by Contract version
if not stack.is_full then
stack.push (value)
else
-- handle full stack
end
|
-- Defensive Programming Version
stack.push (value)
if stack.error then
-- handle full stack
end
|
Defensive programming can also hide problems in the code. By ignoring bad values, bad data can propagate further in a program.
By enabling contract precondition checking, we can immediately detect interface irregularities whenever they happen, whether in function tests or unit tests.
Although they are checking the same kind of behavior, their focus is different. DbC postconditions are general and answer the question of how a method responds under all possible legal conditions. Unit tests focus on how a method responds under certain, specific situations specified in the test.
Because DbC postconditions are general, they are able to check every invocation of the method for the proper behavior. They can be run during unit testing, but they also are in effect during functional testing (and normal program execution, assuming they aren't disabled), checking each and every invocation of the method. In contrast, unit tests only check a limited number of specific invocations.
Because they are more general, DbC postconditions are more difficult to write. In fact, sometimes the postcondition of a method can be more complicated that the actual code of the body.
resize (min_index, max_index: INTEGER)
-- Resize to bounds `min_index' and `max_index'.
-- Do not lose any item whose index is in both
-- `lower..upper' and `min_index..max_index'.
require
valid_bounds: min_index <= max_index + 1
ensure
new_lower: lower = min_index
new_upper: upper = max_index
default_if_too_small:
min_index < old lower implies subarray
(min_index, max_index.min (old lower - 1)).all_default
default_if_too_large:
max_index > old upper implies subarray
(min_index.max (old upper + 1), max_index).all_default
stable_in_intersection:
subarray ((min_index.max (old lower)).min(old upper + 1),
(max_index.min (old upper)).max(old lower - 1)).same_items
(old subarray ((min_index.max (lower)).min(upper + 1),
(max_index.min (upper)).max(lower - 1)))
The number of lines in the test case is actually longer that the postcondition above. But each individual test is simple and direct, and easy to verify that it is correct.
array: ARRAY[INTEGER]
test_same_range is
do
array := <<1, 2, 3>>;
array.resize (1,3)
assert_array (1, 3, <<1, 2, 3>>, array)
end
test_included_range is
do
array := <<1, 2, 3>>;
array.resize (1,1)
assert_array (1, 1, <<1>>, array)
end
test_big_range is
do
array := <<1, 2, 3>>;
array.resize (0, 5)
assert_array (0, 5, <<0, 1, 2, 3, 0, 0>>, array)
end
test_high_overlap is
do
array := <<1, 2, 3>>;
array.resize (2, 5)
assert_array (2, 5, <<2, 3, 0, 0>>, array)
end
test_low_overlap is
do
array := <<1, 2, 3>>;
array.resize (-1, 1)
assert_array (-1, 1, <<0, 0, 1>>, array)
end
test_no_overlap is
do
array := <<1, 2, 3>>;
array.resize (5, 7)
assert_array (5, 7, <<0, 0, 0>>, array)
end
Note for non-Eiffel programmers: The form <<1, 2>> is a manifest array of two elements.
I'm glad you asked. Consider the push and pop methods on a STACK. push is fairly easy to specify using DbC...
push (value: INTEGER)
ensure
count = old count + 1
top = value
But pop is more difficult.
pop
ensure
count = old count - 1
-- What is the value of `top'
What is the value of the top of the stack after pop completes? Well, it depends upon the history of pushing and popping leading up to a given call to pop. The logic required to write a postcondition specifying the value of pop
Unit tests don't suffer the same problem. Since we are only testing a specific scenario, we know what the value of the top of the stack should be.
test_push_and_pop is
do
stack.push(5)
assert_equal ("push5", 5, stack.top)
stack.push(123)
assert_equal ("push123", 123, stack.top)
stack.pop
assert_equal ("pop5", 5, stack.top)
stack.pop
assert ("empty", stack.is_empty)
end
We see that although unit tests and Design by Contract assertions have a lot of overlap, but each has a different focus and different strong points. DbC preconditions are useful for checking inter-module correctness. Unit testing and DbC postconditions validate individual module behavior, with unit testing focusing on specific situations an DbC postconditions focusing on general behavior.
So, what does this mean in practice? I find that when I am doing Eiffel, I tend to specify preconditions along with a good set of unit tests. If the postconditions are obvious and easy to write, I will include them (mainly because postcondition are included in the short form view of a class). If the postconditions are difficult to specify in general, I might include them in comment form for the short view.
When I'm using something Java, or some other language with weaker support for Design by Contract, I definitely would use unit tests for checking individual modules. Depending on the availability of tools (e.g. some kind of assert mechanism), I would still do some level of precondition specification.
At the time of writing the original text (sometime in 2000), I was doing a lot of DbC and Eiffel programming and had just started using XP style unit tests. Today (sometime in 2003) I am no longer writing Eiffel code and doing very little "formal" DbC. I find that I still think of interfaces in terms of contracts with pre/post conditions, but instead using Eiffel-like assertions to define the pre/post-condition assertions, I now use unit tests to drive and define the shape of the software.
Although the unit tests have a different focus than assertions (as noted in the article), I find that they are quite suitable for my needs and rarely find it necessary to supplement the coce with any additional pre/post-condition assertions.