Do not unit test bugs
Before getting to the topic of the title let’s have a simple programming sample. On the programming task I will demonstrate some bad coding style and based on that it will be easier for me to explain why the same style is bad in unit tests. Well, now that I wrote this sentence this seems to be a obvious statement. Why something would be good in unit testing when this is not good in programming. One thing is that it is not always the way like that, and the other is that the same mistake may not be so obvious when we create unit tests.
1.1. Demo task
The demo task is very simple. Let’s write a class to decide if an integer number > 1 is prime. The algorithm is simple. Check all the numbers starting with 2 until the square root of the number. If the number is not prime we will find a number that divides the number integer times, if we do not find a divisor then the number is prime.
public class PrimeDecider {
final int number;
public PrimeDecider(int number) {
this.number = number;
}
public boolean isPrime() {
for (int n = 2; n * n < number; n++) {
if (number % n == 0) {
return false;
}
}
return true;
}
}
The unit test is
import static org.junit.Assert.assertFalse;
import static org.junit.Assert.assertTrue;
import org.junit.Test;
public class PrimDeciderTest {
@Test
public void sample_2_IsPrime() {
PrimeDecider decider = new PrimeDecider(2);
boolean itIsPrime = decider.isPrime();
assertTrue(itIsPrime);
}
@Test
public void sample_17_IsPrime() {
PrimeDecider decider = new PrimeDecider(17);
boolean itIsPrime = decider.isPrime();
assertTrue(itIsPrime);
}
@Test
public void sample_10_IsNotPrime() {
PrimeDecider decider = new PrimeDecider(10);
boolean itIsPrime = decider.isPrime();
assertFalse(itIsPrime);
}
}
This is a great test, readable, some copy paste and most of all it gives us 100% code coverage. Believe me:
It is all green. There can go nothing wrong! We happy.
1.2. Bug appears
One day, however, somebody gets the strange idea to test if 9 is prime. Believe it or not, our program says that 9 is prime. So the tester (or, if you are not lucky a customer) opens a bug ticket:
BGTCKT17645329-KT The method Prime does not give the correct answer for the numbers that are multiplications of three. For example it results true for an object that represents 9.
Then comes the tedious work of bug fixing. What a joy it is usually. First of all you overcome your feeling that whispers into your ear that "the customer is stupid". Obviously the customer is stupid because he wanted to use the class to test the number 9 it was never meant to be… hahh!!! and because the bug description is simply wrong. There is no method Prime
! And the code correctly detects for example the number 3 (which is a multiplication of 3 itself) is prime. And it also detect correctly that 6 and 12 are not prime number. So how does a customer dare to craft such a bug report. Thoughts in your brain like that may help you calm down but do not help business, which is the first priority for a professional like you.
After calming down you admit that the code does not really work for the number 9 and you start to debug and fix it. First there comes a unit test that fails. That is the way we have to do TDD:
@Test
public void demonstrationOf_BGTCKT17645329() {
PrimeDecider decider = new PrimeDecider(9);
boolean itIsPrime = decider.isPrime();
assertFalse(itIsPrime);
}
and you deliver the fix:
public boolean isPrime() {
if (number == 9)
return false;
for (int n = 2; n * n < number; n++) {
if (number % n == 0) {
return false;
}
}
return true;
}
I am just kidding!!!… or not
Actually I have seen fixes like that in real production code. When you are under time pressure and since life is finite you are, you may come up with a fix like that even when you know what the proper solution would be. In this case it is as simple as inserting a =
in front of the <
sign in the loop condition to test that the number is actually not the square of a prime number. Essentially the code
for (int n = 2; n * n =&lt; number; n++) {
would be nice.
In real production cases this may be a real and huge refactoring and if these special cases appear rarely since the code is usually used for numbers less than 25 then this fix is (may be) commercially OK.
1.3. Realistic fix for the bug
Be more realistic and assume that you realize that problem is not limited to the number 9 but to all square numbers and you apply the fix:
public class PrimeDecider {
final int number;
public PrimeDecider(int number) {
this.number = number;
}
public boolean isPrime() {
if (isASquareNumber(number))
return false;
for (int n = 2; n * n &lt; number; n++) {
if (number % n == 0) {
return false;
}
}
return true;
}
private boolean isASquareNumber(int number) {
double d = Math.sqrt(number);
return Math.floor(d) == d;
}
}
This is ugly, but it works. Real word code with god classes containing a few thousand lines do not get any better than this even after refactoring.
Are we finished? Not really. Lets look at the unit tests again. It documents that the code
sample 2 is prime
sample 17 is prime
sample 10 is not prime
demonstration of BGTCKT17645329
Thats is not really meaningful, especially the last line. The bug was reported (in addition to some false statement) that the number 9 is not handled properly. But the actual bug was that the program did not handle properly the numbers that were squares of prime numbers. If you know ITIL the first one is the incident and the second one is the problem. We created a unit test for the incident and it was good we did that. It helped the debugging. But when we identified the problem, before applying the fix we did not create one to test the fix for the problem. This was not real TDD and because there was a unit test for the incident but we did not create it to test the fix.
The proper test would have a name something like
some sample square number is not prime
(with the appropriate camel casing in the method name) and it would have some square numbers, like 9, 25, 36 as test data.
1.4. Conclusion
When fixing bug be careful with TDD. You may apply it wrong. TDD says to write the unit test before you code. The unit test you write will define what you want to code. This is not the unit test that demonstrate the bug. You can use that as a tool to debug and find the root cause. But that is not the part of TDD. When you know what to write, no matter how eager you are to fix the code: do write the unit test that will test the functionality that you are going to write.
This is what I wanted to imply (in an attention catching way) in the title: write a unit test for the functionality or functionality change that fixes the bug instead of the bug.
Comments imported from Wordpress
Gábor Schermann 2015-02-05 11:43:57
The difficulity is always to find the real domain that you want to test. In this case (primes) you should have tested the factors: 0 factor (the '1'), 1 factor (primes), 2 factors (separate cases for the same number (square numbers) or different numbers), 3 factors (all combinations, including 3rd power), etc. Then, if you receive a bug (e.g '9'), you could categorize it, and detect that your code is not working for that category. You may define other categories (e.g. 4th power tests) during the bug report analysis.
hron84 2015-02-04 16:34:13
The operator =< is wrong, as i know it is ⇐. The opposite operator is >=, but the equal sign should be the second, if i remember correctly.
Peter Verhas 2015-02-04 16:45:04
You are right. That would have been revealed by the compiler even before the unit test phase :-)
Martin Grajcar 2015-02-10 05:06:34
In case of such a small domain like
int
, exhaustive testing makes sense (assuming you can verify the result by other means). But this is another story.The other thing is pseudo-random testing. For example, generate two random factors greater than one, assure that the multiplication can’t overflow, and assert that the product is not a prime. It’s not as nice as one number per test, but it covers many more cases for just a bit computer time (and a minimum amount of code).
Peter Verhas 2015-02-10 10:35:56
The major problem with pseudo-random testing is that it can not be reproduced. We usually like tests reproducible. This is not a dogmatic issue though. There is no statement saying "don’t use random tests because they are not reproducible". There is a cost associated with the non-reproducibility of the tests and if it smaller than the advantage they give on the other side of the scale then you can go for it.
For the case, your suggested pseudo random tests are very well suiting and I wish it was that easy to create test data automatically for real life commercial tests.
Comments
Please leave your comments using Disqus, or just press one of the happy faces. If for any reason you do not want to leave a comment here, you can still create a Github ticket.