1. Introduction

Recently I modified a feature in Jamal. I kept the original functionality for backward compatibility, but I added a new feature. However, the use of the old feature is deprecated, and it will be removed in the next version. I also wrote in the document that version 3.0.0, which is somewhere in the future, will not support the old feature.

How can I ensure that the feature gets deleted in that release?

In this article, I describe what I did. It may not be the best solution. You may come up with better ideas, and you are very much welcome to do that in the comment section.

In the following chapters, I will dig a bit into what I changed, to give some background and then the tests I created. In the end, I will also tell you what I do not like in this solution.

During this way I will also reiterate the most important features of unit tests, which surely is discussed in many other places, but it does not hurt to repeat.

2. The feature

Jamal uses macros. After all, Jamal is a meta markup language with built-in and user defined macros, so that is a core feature. The macros are identified by a name, that contains [$_:a-zA-Z0-9] characters not starting with a number. This is fairly standard.

Some solutions import text from some source where the natural name of the source does not comform to this rule. To overcome this, there is a macro named macro that returns a temporary alias for the macro named irregularly. That way, instead of the syntactically incorrect

{this.cannot.be.a.macro}

can be used as

{{@macro [alias]this.cannot.be.a.macro}}

The macro macro can create aliases for user defined as well as for built-in macros. To get an alias for a built-in macro the option builtin has to be used.

{@{@macro [builtin alias]this.cannot.be.a.macro}}

The old version used a different format

{@{@macro [type="builtin" alias]this.cannot.be.a.macro}}

defining the type of the macro as a string. As Jamal was developed the parameter options can now be Java enum values. When you can select builtin or userdefined this is a better choice than using a string. The part type= is only noise with no extra information. It is evident what builtin or the default userdefined means.

The parameter option type will be deprecated in the next release (2.5.0), but should not be deleted. It will be removed in the release 3.0.0. Backward incompatibility has to be kept to minimal and is usually not allowed for minor version increase.

3. Problem Statement

I wrote in the documentation:

Note
The parop type with string parameters is supported for compatibility reasons only and is deprecated. It will be removed in release 3.0.0

How can I ensure that the feature gets deleted in that release? When we need something to be ensured, the best way is that we write a test about it. The tests run automatically for each build, and if the test fails, the build fails. It is more or less a trivial idea to have a test for the feature.

4. Designing the Test

The functionality of the test is fairly simple. It has to check the version of the Jamal library and if it 3.0.0 or higher than it has to check that the type parameter is not supported.

To do that it is simple: we have to process a macro with the type parop, and it has to throw an exception. Since Jama has a support library, you can easily write:

TestThat.theInput(
  "{@define a=yayy}{#ident {@define a=value of a}" +
  "{@macro [global type=\"user defined\"]a}}").throwsBadSyntax();

Jamal can also provide the running instance version calling Processor.jamalVersionString(), therefore, it is also not an issue. The whole test is

@Test
@DisplayName("Macro parop 'type' is deprecated and has to be removed in version 3 and later")
void testDeprecation() throws Exception {
    final var v = Processor.jamalVersionString();
    if (!v.startsWith("2")) {
          TestThat.theInput(
            "{@define a=yayy}{#ident {@define a=value of a}" +
            "{@macro [global type=\"user defined\"]a}}").throwsBadSyntax();
    }
}

That is the technical part, but I also want this unit test to be a "real" unit test. Is it possible?

Unit tests have to be descriptive, fast, isolated, repeatable, small, self-validating, maintainable, trustable, focused, thorough.

to name a few of the qualities.

Descriptive means that reading the code of the unit test makes it evident what the test does. Many times this feature is mentioned as readable. In the case of JUnit it is supported by the DisplayName annotation. Looking at the unit test above this is hardly a problem. We can tick this check box.

Fast means that the unit test should bot take too much time. This should also not be a problem. The macro processing does not use any io, it is as fast as it can be.

Isolated means that the test runs fine even if there is some bug in some part of the code the test is not responsible for. If this test is isolated or not is debatable. It is not isolated as it uses the test support library of Jamal, which indeed uses the processor.

One can argue that when this test runs, the processor is not under test and has to be considered as trusted code the same way as the mock library is trusted. This is very much true when we test a macro library, which is independent of the core Jamal is tested. A bit less when we run macros that are part of the Jamal library. In this case, there can also be bugs in the processor affecting this test.

It is a rare case, and the simplicity provided by avoiding the mock setup balances the cost of the possible bug in the processor. It is how Jamal is designed and how it supports macro unit testing. Also, this is a general engineering compromise between isolation and simplicity and has nothing to do with the fact that the problem is to test a future version. Let’s discuss this further sometimes in the future in a different article.

Repeatable means that the test should run the same way no matter how we execute it. It is isolated from the environment.

Small means that the unit test is short. It is very much related to the descriptive and maintainable quality. If a unit test is large, contains many lines, it is hardly readable and usually not easy to maintain. In that case, it is also a code smell that there is something wrong. Either the unit test is wrong in some sense or the code itself needing some refactoring to be testable.

In this case, the unit test is small. It is three lines of code, or five adjusted to printing.

Self-validating means that there is no need to check the output of the test manually. There can be no debate if the test passed or failed. The output of the test is either green or red. If it is red, then it can still be failed test or an error, which is also a kind of failure needing attending.

Maintainable means that whenever the code changes and the code in the unit test becomes invalid, it is easy to change. Let’s assume no matter how absurd it is that the syntax of the macro define changes. This macro is used in the unit test. If the change is so, then the unit test will fail, but not related to the tested feature. It also shows that this unit test is not properly isolated, but I already discussed that and I will in detail in a further article. The test, however, is maintainable because it is extremely easy to follow the imagined change in the syntax.

Of course, the syntax of such a centerpiece macro like define will not change backward incompatible. That would be bad product management, but let’s not derail: it was only a hypothetical example.

Trustable means that the test either passes or fails all the time independent of external conditions. It does not matter if it is a hot summer, or cold winter, it is sunny or rainy, the operating system patched to the latest security patch, replaced by the marketing department from Linux to Windows: the test will pass or fail.

There are cases when tests sometimes pass, and sometimes fail. There is a popular extension of the JUnit framework in the JUnit Pioneer project that allows you to run the same test multiple times till it does not fail. This is a totally wrong approach, and instead of being okay with having a test, sometimes passing, the developer should thrive for trustable test.

Do not take it wrong. A test that sometimes fails and sometimes passes can prove that the code is ok. It depends on the code, the feature tested and the test itself. For example, you can have a method that returns prime number to the number of seconds in current time modulo ten. A test can check that the method returns 5 and repeats every half second till it gets it. It is highly questionable what it proves when it passes, but failure does not mean that the code is wrong. The test has to be improved, but if it cannot, then it may be better to have it.

This test, I believe, is trustable. Can you prove that it is trustable? Questionable. I will discuss this issue in this article.

Focused means that the test checks one feature. If the test fails, it proves that one feature is faulty. Beginners many times put multiple features in one test. This makes it more difficult to see what feature is faulty when a test fails.

Being focused is also expressed many times saying that one test should have one assertion only. This is misinterpreted many times as one test should have one line of assertion code only. This is not the meaning of the "one assertion" rule.

You can check in an assertion statement the length of a list, then in the next assertions the individual elements. Technically, these are several assertion statements, but they compose one complex assertions.

The above test is focused. Do not mistake the two conditions. One is not an assertion, rather a prerequisite of the test. Other than that the test checks if the feature type is deprecated and removed in the future version.

Thorough means that the set of unit tests cover all relevant cases. A single unit test cannot prove that the code functions as expected. It can only prove that the code does not function as expected. On the other hand, the full set of unit tests gives a fairly good approach and approximation of the correctness of the code. It does not prove strictly to speak.

Looking at this single unit test as a set, I can say that it covers all relevant cases.

5. Trustability

As we said, the trustability of the example test is questionable. So here we question and answer that.

Turstability is a tricky feature of tests. If you are a QA person, you know that nothing is tursted unless it is tested. Testing the test is a great idea, after all that is your bread and butter, that is what you earn your money. Unfortunately, or not so "un" fortunately that is also a cost of testing. Testing the test is recursive and something having a cost attached to it must not run away. Therefore, we usually stop there and do not test the tests with other tests.

What developers do…​

I do not know what developers do. I know what I do.

When I create a test for a feature, I run the test before developing the feature. Then I develop the feature and then the test passes. This is some basic form of TDD, and I am not always that disciplined.

If some strange way the feature is already there when I develop the test, then I just remove the feature to see that the test fails. This is a manual test of the test.

How can we do that in our case? We have a test that checks that the type parameter is removed and a precondition. The issue is that the precondition needs a future release, which is not there yet. It will be there as soon as the release will be created or when time travel is invented, whichever comes first. ,n of the Jamal library to a future version and run the test. And that is exactly what I did. And it failed as it should.

Hurray…​

6. Takeaway

We discussed a simple problem and a simple technique to solve it. It would not deserve much conclusion. A simple trick.

However, if I look at it as an example of some out-of-the-box thinking, we can learn from it.

Unit test is a tool. There are some rules on how to use it, but these rules are not strict. They are not the ten commandments. They are there to help us and must not be followed dogmatically. We should understand why a unit test has to be descriptive, fast, isolated, repeatable, small, and so on. If we understand the reasons we can judge better if our tests conform to the rule and also when to make an exception, when is it acceptable to break some of the rules.


Comments

Please leave your comments using Disqus, or just press one of the happy faces. If for any reason you do not want to leave a comment here, you can still create a Github ticket.