Speaking to Management: Coverage Reporting

Test coverage is important. In this post, I will reflect on some communication issues around test coverage.

The word coverage has a different meaning in testing than in daily language. In daily language, it refers to something that can be covered and hidden completely: if you hide under a cover, it usually means we can’t see you, and if you put a cover on something, the cover keeps things out.

Test coverage works more like a fishing net. Testing will catch bugs if used properly, but some (small) fish, water, plankton etc. will always pass through. Some nets have holes through which large fish can escape.

What’s so interesting about coverage?

When your manager asks you about test coverage, she probably does so because she seeks confidence that the software works sufficiently well to proceed to the next iteration or phase in the project.

Seeking confidence about something is a good project management principle. After all, if you’re confident about something, it’s because you don’t need to worry about it. Not having to worry about something means you don’t have to spend your time on it, and project managers always have a gazillion other things that need their attention.

The word is the bug

So if confidence comes out of test coverage, then why do managers so often misunderstand us when we talk about coverage?

Well, the word actually means something else in daily language than it does when we use it in testing. So the word causes a communication “bug” when it’s misunderstood or misused.

We need to fix that bug, but how? Should we teach project managers the “right” meaning of the word? We could send them to a testing conference, ask them to take a testing course, or give them books to read.

That might work, but it wouldn’t solve the fundamental communication problem. It would just move it higher up in the organisational hierarchy.

An educated manager will have the same problem: she won’t be able to make her peers and managers understand what “test coverage” means. After all, not everyone in the organisation can be a testing expert!

STOP mentioning coverage

A good rule of thumb in communication is: When your communication is likely to be misinterpreted, don’t communicate.

I, as a tester, know what test coverage means and, more importantly, what it does not mean, but I cannot expect others to understand it. Thus, if I use the word, I will probably be misunderstood. A simple solution is to stop using the word. So I won’t say sentences like: Our testing has covered some functionality.

What I can say is: We have carried out these tests, and this is what we found.

This will work well until someone asks you to relate your testing to the business critical functionality: Ok, then tell me, how much of this important functionality do your tests cover?

Uh oh!

Stay in the Testing Arena – or be careful

American circuses have enormous tents and two, three or even four arenas with different acts happening at the same time. A project is always going on in different arenas as well: For example we might have a product owner arena, a development arena, a test arena, and a business implementation arena.

Some people play in several arenas: I think most testers have at some point in their career made the mistake of telling a developer how to code. Likewise, we can probably all agree that there’s nothing more annoying than a developer telling a tester how to test.

Confidence belongs in the product owner arena, not in testing. This is because testing is about qualifying and identifying business risks, and since confidence does not equal absence of risks, it’s very hard for us to talk about confidence. And coverage.

This doesn’t mean you can’t move to another arena.

You can indeed look at things from the product owner’s perspective; that’s perfectly ok! Just make sure you know that you are doing it and why you are doing it: You are leaving your testing arena to help your product owner make a decision. Use safe language when you do.

Talk facts and feelings

Confidence is fundamentally a feeling, not a measurable artefact. It’s something that you can develop, but it can also be communicated: Look confident, express confidence, talk about the good stuff, and people around you will start feeling confident.

Look unconfident, express worry, talk about problems, and people around you will start feeling worried.

As testers, we always develop feelings about the product we’re testing, and we can communicate these feelings.

I know two basic strategies in any type of test result communication:

  • Suggest a conclusion first, then tell them what you’ve done
  • Give them all the dirty details first, then help your manager conclude

Which communication strategy you pick should depend on the context, e.g. your relationship with the manager. If everything looks pretty much as expected (whether that’s good or bad), your manager trusts you, and you have good knowledge of the business risks, then I wouldn’t worry too much about serving the conclusion first and offering the details later, mostly to make sure you and your manager don’t misunderstand each other. And that nobody will later be able to claim that you kept silent about something.

But if something is way off, or your manager doesn’t trust you (or you don’t trust her), or people’s lives may be at stake, or you just have no idea what’s happening, then stick to the details – do not conclude. And that, I think, implies not using the term “test coverage”.

An illustration of the resource vs coverage problem

The illustration below is taken from an old book I’m reading (*):

Illustration from: Holger Paaskesen: Vi lærer for livet?

Fig. 3 shows a desert land which will be cultivated by irrigation, i.e. the artificial application of water. Fig. 4 shows the amount of fertile soil available. Now, the farmer can decide to spread the soil all over the area, in which case the layer of fertile soil will be so thin that nothing will grow in any part of the land. That is not a good plan, and all the work involved will be fruitless.

But there’s an alternative: The soil can be spread over a section of the land, for example the area marked in fig. 3. This way the layer of soil will be thick enough to ensure that there will be exuberant growth and good utilization in the smaller area. This is obviously a much better plan.

This scenario not only applies to farming: It illustrates a problem we often face in testing, where the amount of functionality being developed is much larger than what we can cover in a decent way. It is my experience that it is always better to focus testing on sections of the system than to try to check everything: There will be areas of the system which are left untested, but what you test, you will cover well.

As a decision maker, I’d much rather have in-depth knowledge about parts of the system than know very little about everything. It gives me a much better foundation for making good business decisions.

*) The book is Holger Paaskesen: “Vi lærer for livet?” from 1968. Its English title would be “We learn for life?”, and it’s a book about school education.

Covering test coverage

Rolf Østergaard (@rolfostergaard) suggested on Twitter, when I posted my previous blog, that instead of counting defects and tests we take a look at test coverage. Certainly!

Mathematically, coverage relates the size of an area fully contained in another area to the size of that other area. We could calculate the water coverage of the Earth, or how much of a floor a carpet covers. Coverage can be expressed as a percentage.

But coverage is also a qualitative term. For example a book can cover a subject, or a piece of clothing can give proper (or improper!) body coverage.

So what is test coverage? Well, the term is often used to somehow describe how much of a system’s functionality is covered by testing.

Numbers are powerful and popular with some people, so a quantified coverage number would be nice to have. One such number is code coverage, which is calculated by dividing the number of code lines that have been executed at least once by the total number of code lines in the program.
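As a toy illustration (the program size and the executed line numbers here are invented), the arithmetic behind a line coverage percentage is just a ratio:

```java
import java.util.Set;

public class LineCoverage {
    // Percentage of executable lines that a test run actually executed.
    static double lineCoverage(Set<Integer> executedLines, int totalLines) {
        return 100.0 * executedLines.size() / totalLines;
    }

    public static void main(String[] args) {
        // Hypothetical run: a 10-line program in which the tests executed lines 1-7.
        Set<Integer> executed = Set.of(1, 2, 3, 4, 5, 6, 7);
        System.out.println(lineCoverage(executed, 10) + "%"); // 70.0%
    }
}
```

Real coverage tools collect the set of executed lines automatically during a test run, but the calculation itself is no deeper than this.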

Another measurement relies on the business requirements for the system being registered and numbered, and on tests being mapped to the requirements they test. A suite of tests can then be said to cover a certain number of requirements.
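A sketch of that calculation, with invented test names and requirement IDs: each test declares which requirements it exercises, and the coverage figure is the fraction of requirements touched by at least one test.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class RequirementCoverage {
    // The subset of known requirements that at least one test claims to exercise.
    static Set<String> covered(Map<String, List<String>> testsToReqs, Set<String> allReqs) {
        Set<String> hit = new HashSet<>();
        testsToReqs.values().forEach(hit::addAll);
        hit.retainAll(allReqs);
        return hit;
    }

    public static void main(String[] args) {
        // Hypothetical traceability data.
        Map<String, List<String>> testsToReqs = Map.of(
            "test_login_ok",      List.of("REQ-1"),
            "test_login_locked",  List.of("REQ-1", "REQ-3"),
            "test_report_totals", List.of("REQ-4"));
        Set<String> allReqs = Set.of("REQ-1", "REQ-2", "REQ-3", "REQ-4", "REQ-5");

        Set<String> hit = covered(testsToReqs, allReqs);
        System.out.printf("%d of %d requirements touched (%.0f%%)%n",
            hit.size(), allReqs.size(), 100.0 * hit.size() / allReqs.size());
    }
}
```

Note that the number says only that some test touched a requirement, not how well it was tested.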

Numbers can hint at something interesting. E.g. if your unit tests exercise only 10% of the code, and it tends to be the same 10% every time, chances are that something important is missing from the unit tests. Or you could even have a lot of dead legacy code. Similarly, if you found that you actually only tested functionality in a few of the documented business requirements: Could the uncovered requirements be just noise?

No matter what, a coverage number can only give hints. It cannot give certainty.

Let’s imagine we can make a drawing of the functionality of a system, like a map. Everything on the map would be intended functionality; everything outside would be unwanted. Let’s make another simplification and imagine for the moment that the map is the system, not just an image of it. Here is an example of such a simple system:

Drawing of a system being tested. Some tests verify valid functionality of the system; other tests verify that there are no functions in the system which should not be there. But tests are points.

The blue area is the system. The red spots are checks carried out as part of testing. Some of the checks are within the system, others are outside it. The ones within are expected to pass, the ones outside are expected to fail.

Note that there is no way to calculate the test coverage of this imaginary system. Firstly, because the area outside the system is infinite, and we can’t calculate the coverage of an infinite area. Secondly, because the checks don’t have an area – they are merely points – so any coverage calculation will come out infinitesimally small.
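The second point can be made painfully concrete with a toy calculation (the system “area” of 100 is of course made up): since a point has no extent, the covered fraction stays at zero no matter how many checks you add.

```java
public class PointCoverage {
    // Area "covered" by n point-checks of a region: each point has zero area,
    // so the product is zero for any n.
    static double coveredPercent(int checks, double systemArea) {
        double areaPerCheck = 0.0; // a point has no extent
        return 100.0 * checks * areaPerCheck / systemArea;
    }

    public static void main(String[] args) {
        for (int checks = 1; checks <= 1_000_000; checks *= 10) {
            System.out.println(checks + " checks cover "
                + coveredPercent(checks, 100.0) + "% of the area");
        }
    }
}
```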

Ah, you may argue, my tests aren’t composed of points but are scripts: They are linear!

Actually, a script is not a linear entity, it’s just a connected sequence of verification points. But even if it were linear, it wouldn’t have an area: Lines are one-dimensional.

But my system is not a continuous entity, it is discrete and consists only of the features listed in the requirements document.

Well that’s an interesting point.

The problem is that considering only documented requirements will never cover all functionality. Think about the 2.2250738585072012e-308 problem in Java string-to-float conversion. I’m certain there are no requirement documents for systems implemented in Java which actually listed this number as a specifically valid (or invalid) entry in input fields or on external integrations. The documents probably just said the system should accept floats in certain fields. However, a program which stops responding because it enters an infinite loop is obviously not acceptable.
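For the curious, this is the real bug referred to above: on JDKs released before the fix in early 2011, the following call never returned, because the decimal-to-binary conversion oscillated forever at the boundary between normal and subnormal doubles. On a patched JVM it simply parses.

```java
public class BoundaryParse {
    public static void main(String[] args) {
        // On unpatched JVMs this line hung the thread in an infinite loop;
        // on a fixed JVM it returns the nearest representable double.
        double d = Double.parseDouble("2.2250738585072012e-308");
        System.out.println("parsed: " + d);
    }
}
```

The killer input came from analysing the conversion algorithm itself, not from any requirements document.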

A requirement document is always incomplete. It describes how you hope the system will work, yet there’s more to a system than can be explicitly described by a requirements document.

Thus any testing relying explicitly on documented requirements cannot be complete – or have a correctly calculated coverage.

My message to Rolf Østergaard is this: If a tester makes a coverage analysis of what he has done, remember that no matter how the coverage is measured, any quantified value will only give hints about the testing. And if he reports 100% coverage and looks satisfied, I strongly suggest you start looking into what kind of testing he has actually done. It will probably be flawed.

Intelligent testing assists those who are responsible for quality in finding out how a system is actually working, it doesn’t assure quality.

Thanks to Darren McMillan for helpful review of this post.