What is code coverage and what does it mean? Is 100% code coverage all you need?
Code coverage is a measure of how much code has been executed during testing: the idea is that your customers should not be the first people to execute your code! Higher coverage suggests more comprehensive tests and a lower chance of bugs existing in the software. I use the word ‘suggests’ because, as we shall see, not all coverage is equal, and even complete coverage may not be all you think it is.
A mathematical definition of coverage is:
\left(\frac{\text{Number of entities executed}}{\text{Total number of entities}}\right)\times 100\%
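As a quick worked example (the numbers are purely illustrative): if our tests execute 180 of the 200 statements in a module, statement coverage is
\left(\frac{180}{200}\right)\times 100\% = 90\%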
The ‘entities’ are things whose coverage can be measured – for example statements, branches and function calls. These entities can be broadly classified as Structural or Architectural:
- Structural – code that can be executed, i.e. within a function
- Architectural – relationships between functions and shared data
An analogy is useful here: a publisher has given you an advance, you’ve written your first novel, and you want to make sure it makes sense:
- Checking the grammar, punctuation and spelling is like Static Analysis: we won’t cover that here
- Making sure sentences and paragraphs make sense is like Structural Coverage
- Making sure the story makes sense, the chapters are in the right order and the character names are consistent is like Architectural Coverage
We’ll revisit this analogy later when we have 100% coverage to see if this is enough to satisfy our publisher!
Why do we need Code Coverage?
Because our industry has standards (DO-178C, ISO 26262 and so on) and they say we need code coverage. But this is a poor answer! Really, code coverage helps measure how complete our testing is and helps improve software quality.
How can we measure how much of our code is used during testing? What code entities do we consider for code coverage? At the simplest level we can look at statements.
Statement Coverage
Have we executed each statement in our code at least once – this is statement coverage. But is this enough coverage? Consider the code:
if (condition)
    normalize(c);
foo(c);
If I get 100% statement coverage, I’m done, right?
We can write a single test case that will execute all three statements, and we would have 100% statement coverage: is our code fully tested? No, we haven’t tested the case where the condition is false (or at least we have no evidence that we have). Does the correct operation of the function foo depend on the function normalize being called?
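As a minimal sketch (the wrapper function, names and values here are assumed for illustration, not taken from the original snippet), a single test like the following executes all three statements, yet the path where the condition is false is never exercised:

#include <assert.h>

/* Hypothetical wrapper around the snippet above (names and values assumed) */
void normalize(int *c) { *c = *c % 100; }   /* bring the value into range   */
int  foo(int c)        { return c + 1; }    /* uses the (normalized?) value */

int process(int condition, int c)
{
    if (condition)
        normalize(&c);
    return foo(c);
}

int main(void)
{
    /* A single test with condition true executes all three statements,  */
    /* so a coverage tool reports 100% statement coverage...             */
    assert(process(1, 250) == 51);

    /* ...but the condition-false path (foo without normalize) is never  */
    /* tested: there is no call such as process(0, 250) here.            */
    return 0;
}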
We need to consider the program flow – we need to look at branch coverage.
Branch Coverage
Consider the following snippet:
if (a || b || c)
    normalize(c);
foo(c);
To get 100% branch coverage here we need two test cases – one where the decision is true and one where it is false – so that we execute both paths through the code.
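For example (the values are chosen purely for illustration), two tests are enough:
- a = 1, b = 0, c = 0 – the decision is true, so normalize(c) is called
- a = 0, b = 0, c = 0 – the decision is false, so normalize(c) is skipped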
If I get 100% branch coverage, I’m done, right?
Well, no, because in C and C++ there is short-circuit evaluation: in the above condition, if a is true then it doesn’t matter what b and c are – the overall condition is true, so they are not evaluated at all. Thus during testing we can cover both the true and the false branch of the if without condition c ever deciding the outcome: for example, one test where a is true (so b and c are never even evaluated) and one test where a, b and c are all false. But when the code is deployed it may be that a and b are false while c is true, so c alone decides the outcome – and now the customer is running a combination that has never been tested.
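A tiny illustration of short-circuit evaluation (the helper functions are invented purely for the demonstration): when a() returns true, b() and c() are never called at all, which is exactly why covering both outcomes of the decision says nothing about covering the individual sub-conditions.

#include <stdio.h>
#include <stdbool.h>

bool a(void) { puts("a evaluated"); return true;  }
bool b(void) { puts("b evaluated"); return false; }
bool c(void) { puts("c evaluated"); return false; }

int main(void)
{
    if (a() || b() || c())         /* prints only "a evaluated":        */
        puts("decision is true");  /* b() and c() are short-circuited   */
    return 0;
}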
We need to evaluate all of the sub-conditions in a condition – this is MC/DC coverage.
Modified Condition/Decision Coverage (MC/DC)
MC/DC coverage means we have to show that each sub-condition can independently affect the outcome of the decision. This means picking a sub-condition, keeping all the other sub-conditions the same, varying the chosen sub-condition between true and false, and observing the whole decision switch between true and false. So for each sub-condition we need two tests: one where it is false and one where it is true. But it turns out that a test case for one sub-condition can usually be reused as one of the pair for another sub-condition, so we do not need every combination that a truth table of the decision would contain – for a decision with N sub-conditions, N + 1 well-chosen test cases are typically enough.
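For the if (a || b || c) decision above, one minimal MC/DC test set (four tests for three sub-conditions, i.e. N + 1) could be:
- a = 0, b = 0, c = 0 – decision is false (the shared baseline)
- a = 1, b = 0, c = 0 – decision is true; only a changed, so a independently flips the outcome
- a = 0, b = 1, c = 0 – decision is true; only b changed, so b independently flips the outcome
- a = 0, b = 0, c = 1 – decision is true; only c changed, so c independently flips the outcome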
Note that 100% MC/DC coverage gives 100% branch coverage, which in turn implies 100% statement coverage. The highest safety integrity levels (for example DO-178C Level A) mandate this level of coverage.
If I get 100% MC/DC coverage, I’m done, right? I stop testing here?
No! We have executed all the code in all the functions, but we have no evidence that the right functions have been called, or that they have been called in the right order.
Function Coverage
A function has function coverage if it is called during test execution (so it is either 100% or 0%). Function call coverage measures whether a function has made all of the function calls within it. These types of coverage are architectural coverage – they measure coverage of the software design. As such, they are normally measured at system test, when all the software is present: you run a test of some piece of functionality, and the functions that are called should be the ones your design says will be called. It may be possible for the system to work correctly – i.e. produce the right outputs for given inputs – even if the expected functions aren’t called: this would be detected by function/function call coverage and indicates that the implementation does not correctly match the design.
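As a hedged sketch (the function names and the ‘design’ are invented for illustration): suppose the design says average() must call checked_sum() so that overflow is detected, but the implementation sums the values inline. Every functional test passes because the outputs are correct, yet function coverage of checked_sum() is 0% – exactly the design/implementation mismatch that architectural coverage reveals.

#include <assert.h>
#include <stddef.h>

/* Design intent (invented): average() shall call checked_sum(),         */
/* the overflow-aware summation routine.                                 */
long checked_sum(const int *v, size_t n)
{
    long s = 0;
    for (size_t i = 0; i < n; i++)
        s += v[i];                 /* a real version would detect overflow */
    return s;
}

/* Implementation drift: average() never calls checked_sum().            */
int average(const int *v, size_t n)
{
    long s = 0;
    for (size_t i = 0; i < n; i++)
        s += v[i];
    return (int)(s / (long)n);
}

int main(void)
{
    int v[] = { 2, 4, 6 };
    assert(average(v, 3) == 4);    /* correct output: functional test passes */
    /* ...but checked_sum() has 0% function coverage, so the code does    */
    /* not match the design – only architectural coverage shows this.     */
    return 0;
}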
OK, so now I’ve got 100% function and function call coverage, I’m done, right? Not yet…
Data and Control Coupling Coverage
Data and control coupling coverage measures how components interact with each other through shared data and function calls. It is another form of architectural coverage. A data couple is a data item that can be accessed by more than one component; a control couple is a function in one component that is called from another component. What counts as a ‘component’ is for the tester to define, but it typically matches the design. Coupling coverage is used to verify that the code has been implemented according to the design.
Coupling coverage also has benefits when designing modular software, as it helps ensure that modules are accessed only via the correct interfaces. For example, the OSI model has 7 layers and each layer should only communicate with the layers immediately above and below it: skipping a layer is not allowed. Most software is designed with similar layers, with a hardware layer at the bottom. Making sure the hardware isn’t accessed directly by higher layers – only through the lowest layer – makes it easier to port the code to new hardware.
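A minimal sketch of the idea (the layer names and ‘register’ are invented, and the register is simulated so the example runs anywhere): the application layer below goes through the driver layer, so the only control couple into the hardware layer comes from the driver. If the application called write_hw_register() directly, a coupling analysis against the design would flag the unexpected couple.

#include <stdint.h>
#include <stdio.h>

/* ---- hardware layer (simulated register so the sketch runs anywhere) --- */
static volatile uint32_t status_register;   /* stands in for a real register */

void write_hw_register(uint32_t value)
{
    status_register = value;
}

/* ---- driver layer: the designed control couple into the hardware layer - */
void driver_set_status(uint32_t status)
{
    write_hw_register(status);
}

/* ---- application layer: must only use the driver interface ------------- */
void app_report_ready(void)
{
    driver_set_status(1u);           /* OK: goes through the driver          */
    /* write_hw_register(1u);           a direct call = coupling violation   */
}

int main(void)
{
    app_report_ready();
    printf("status register = %u\n", (unsigned)status_register);
    return 0;
}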
I must have enough coverage now!
Right, I see: now I have 100% statement, branch, MC/DC, function, function call, data coupling and control coupling coverage – surely I’m done, there can’t be anything left in the code to cover!
Let’s return to our first novel analogy: we’ve checked the spelling and grammar (static analysis), we’ve checked the sentences and paragraphs are correct and make sense (structural coverage), we’ve checked that the chapters are in the right order and the story makes sense (architectural coverage). But the novel is a Victorian romance and our publisher was expecting a modern-day crime thriller. We have utterly failed! It doesn’t matter how good our novel is, how intricate the storyline, how engaging the characters, how the plot develops; it simply doesn’t meet the requirements.
Many tools can automatically generate tests that achieve structural code coverage by examining the code and choosing input values that cover all the paths (and some can also set expected values by executing the code and recording the actual results as the expected results). What does this prove?
- We have a tool that can work out input values to traverse code paths
- The code didn’t crash with these input values
- The code does what the code says, i.e. our compiler works
- And not much else!
So obtaining coverage on its own could be an automated task that doesn’t tell us whether our code is doing what it should.
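A hedged illustration of why (the function and the ‘generated’ test are invented): suppose the code below contains a bug – the intended behaviour is the mean of three values, but it divides by a hard-coded 2. A tool that derives the expected value from the actual result will happily generate a passing test and full coverage, and the bug survives.

#include <assert.h>

/* Buggy: the (invented) intent is the mean of three values, but the     */
/* code divides by 2 instead of 3.                                       */
int mean3(int a, int b, int c)
{
    return (a + b + c) / 2;
}

int main(void)
{
    /* "Auto-generated" test: the expected value 9 was captured from     */
    /* running the buggy code itself, so the test passes and we get      */
    /* 100% coverage of mean3() – yet the answer should have been 6.     */
    assert(mean3(3, 6, 9) == 9);
    return 0;
}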
Requirements Coverage
When test inputs and expected values are set based on the requirements, code coverage takes on real meaning: we are testing the functionality of the code and measuring how much code is executed while doing so. We can also measure how many of the requirements we have tested and so get requirements coverage: this ensures that the code does what it is supposed to do.
Structural and architectural coverage can show code that is never executed during testing, but they cannot detect absent or incomplete code (i.e. missing or incomplete features) – requirements coverage should catch this.
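A small sketch of the difference (the requirement, names and limits are invented): the clamp() implementation below forgot the lower bound. Its own code can still reach 100% structural coverage, but a test written from the requirement immediately exposes the missing behaviour.

#include <assert.h>

/* Invented requirement REQ-042: the output shall be limited to the      */
/* range 0..100 inclusive.                                               */
int clamp(int value)
{
    if (value > 100)              /* upper bound implemented...          */
        return 100;
    return value;                 /* ...lower bound forgotten entirely   */
}

int main(void)
{
    /* Structural coverage of clamp() is 100% with just these two tests. */
    assert(clamp(150) == 100);
    assert(clamp(50)  == 50);

    /* Requirements-based test for the lower bound: this FAILS, because  */
    /* clamp(-5) returns -5, revealing the missing code.                 */
    assert(clamp(-5) == 0);
    return 0;
}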
Measuring requirements coverage will:
- Tell us whether we have a test (or tests) for each requirement
- Prove traceability between software requirements, test cases, and code coverage
- Provide visibility that requirements are being tested, together with the results of those tests
- Satisfy the traceability requirements of regulated industry standards
Right, I now have 100% structural, architectural and requirements coverage, surely I’m done now?
Yes, you can release your code now!
However, coverage needs to be measured continually throughout the code’s lifecycle. Indeed, measuring it throughout development is useful, as it shows how the testing is progressing while the code is being written. If the coverage percentage is increasing, then the testing is catching up with development; but if it is decreasing, then new features are being added without being tested, which increases technical debt.
Video: Learn more about Code Coverage
00:39 What is Code Coverage?
02:59 Why do we need Code Coverage?
04:44 Sorts of Code Coverage – Statement Coverage
05:54 Branch Coverage
07:13 Modified Condition/Decision Coverage (MC/DC)
09:15 Function Coverage
10:54 Data and Control Coupling
12:22 How much Code Coverage is enough?
13:17 I have 100% Code Coverage, I’m done, right?