Acceptance tests are not enough!

Acceptance testing is a key method in Agile. One way of defining acceptance tests are Gojko Adzic‘s ”Specification by example” paradigm which has gained quite a bit of momentum lately. I personally found it to be both refreshing and nice when I heard him present it at Agile Testing Days 2009, and I also found his book Bridging the Communication Gap a nice read.

Photo of Gojko Adzic demonstrating communication gaps in his keynote presentation at Agile Testing Days 2009
Gojko Adzic demonstrating communication gaps at Agile Testing Days 2009

I’m sceptical of the concept of acceptance testing. Not because verification of agreed functionality is not a good thing, but because it tends to shift attention to verification instead of exploration.

This will shift attention from testing to problem prevention. Is that bad, you ask? Isn’t it better to prevent problems than to discover them?

Well, most people think ”why didn’t I prevent this from happening?” when problems do happen. Feelings of regret are natural in that situation and that feeling can lead you into thinking you should improve your problem prevention. And maybe you should, but more examples aren’t going to do it!

Real testing is still necessary.

To explain why, I’ll consult one of the great early 20’th century mathematical philosophers: Kurt Gödel. In particular his first incompleteness theorem. It says that no consistent system of axioms whose theorems can be listed by an “effective procedure” is capable of proving all facts about the natural numbers.

What does this mean to us?

It means that we will never be able to list all things that can be done with this particular set of data.

A specification is a kind of listing of ”valid things to do” with data, thus Gödel’s theorem teaches us that there are infinitely more things to a system than any long list of requirements. This also applies when the requirements are listed as examples.

If you’re in the business to deliver products of only ”agreed quality” to a customer, you can be all right only verifying things which are explicity agreed. If something goes wrong you can always claim: ”It wasn’t in the specification!”

But if you’re striving for quality in a broader sense, verifying that the system works according to specifications is never going to be enough.

Gojko has made a good contribution to agile. Examples can be useful and efficient communication tools, and if they are used correctly they can help making users and other interested parties better aware of what’s going on on the other side. His contribution can help bridge a communication gap. It can also produce excellent input for automated unit tests.

Just don’t let it consume your precious testing time. The real testing goes far beyond verification of documented requirements!

If you want to learn more about this, I recommend you sign up for one of the Rapid Software Testing courses offered by James Bach and Michael Bolton.

Photo from Rapid Software Testing course in London November 2010 as offered by Electromind
Michael Bolton with one of the many interesting and challenging test exercises at the RST course
Reklamer

Covering test coverage

Rolf Østergaard @rolfostergaard suggested on twitter when I posted my previous blog that instead of counting defects and tests we take a look on test coverage. Certainly!

Mathematically, coverage relates the size of an area fully contained in another area, relative to the size of that other area. We could calculate the water coverage of the Earth or even how much of a floor a carpet could cover. Coverage can be expressed as a percentage.

But coverage is also a qualitative term. For example a book can cover a subject, or a piece of clothing can give proper (or improper!) body coverage.

So what is test coverage? Well, the term is often used to somehow describe how much of a system’s functionality is covered by testing.

Numbers are powerful and popular with some people, so a quantified coverage number would be nice to have. One such number is code coverage, which is calculated by dividing the number of code lines which have been executed at least once by to the total number of code lines in a program.

Another measurement relies on business requirements for the system being registered and numbered, and tests mapped to the requirements which they test. A suite of tests can then be said to cover a certain amount of requirements.

Numbers can hint something interesting. E.g. if your unit tests exercise only 10% of the code and it tends to be the same 10% on all of them, the chances are that something important will be missing from the unit tests. Or you could even have a lot of dead legacy code. This would be similar if you found that you actually only tested functionality in a few of the documented business requirements: Could the not-covered requirements be just noise?

No matter what, a coverage number can only give hints. It cannot give certanity.

Let’s imagine we can make a drawing of the functionality of a system; like a map. Everything on the map would be intended functionality, everything outside would be unaccepted. Let’s make another simplification and imagine for the moment that the map is the system, not just an image of it. Here is an example of such a simple system:

Drawing of a system being tested. Some tests verify valid functionality of the system, other tests verify that there are not functions in the system which should not be there. But tests are points.

The blue area is the system. The red spots are checks carried out as part of testing. Some of the checks are within the system, others are outside it. The ones within are expected to pass, the ones outside are expected to fail.

Note that there is no way to calculate the test coverage of this imaginative system. Firstly, because the area outside the system is infinite and we can’t calculate the coverage of an infinite area. Secondly, because the checks don’t have an area – they are merely points – so any coverage calculation will be infinitesimal.

Ah, you may argue, my tests aren’t composed of points but are scripts: They are linear!

Actually, a script is not a linear entity, it’s just a connected sequence of verification points, but even if it was linear, it wouldnt’ have an area: Lines are one-dimensional.

But my system is not a continous entity, it is quantified and consists only of the features listed in the requirement document.

Well that’s an interesting point.

The problem is that considering only documented requirements will never consider all functionality. Think about the 2.2250738585072012e-308 problem in Java string to float conversion. I’m certain there are no requirement documents on systems implemented in Java, which actually listed this number as being a specifically valid (or invalid) entry in input fields or on external integrations. The documents probably just said the system should accept floats for certain fields. However a program which stops responding because it enters an infinite loop is obviously not acceptible.

A requirement document is always incomplete. It describes how you hope the system will work, yet there’s more to a system than can be explicitly described by a requirements document.

Thus any testing relying explicitly on documented requirements cannot be complete – or have a correctly calculated coverage.

My message to Rolf Østergaard is this: If a tester makes a coverage analysis of what he has done, remember that no matter how the coverage is measured, any quantified value will only give hints about the testing. And if he reports 100% coverage and looks satisfied, I strongly suggest you start looking into what kind of testing he has actually done. It will probably be flawed.

Intelligent testing assists those who are responsible for quality in finding out how a system is actually working, it doesn’t assure quality.

Thanks to Darren McMillan for helpful review of this post.