What is a regression test and what is regression testing?
A regression test is a test to ensure that you know your regression bugs.
Regression testing is a technique to identify the minimal set of regression tests that need to be executed to ensure that you know all your regression bugs.
What sounds quite simple is a rather complex task. This blog post walks you through it from start to finish. We start with some definitions, and then I’ll show you how it is done using a commercial MedTech project with more than 6000 clinical devices worldwide.
At the end, I’ve added a bonus paragraph about automated regression testing, which is the approach I highly recommend.
In the software testing world, there is a myriad of different definitions for the same “technique”. Always.
Most of them say the same thing and merely reflect a different view, e.g. is testing meant to find bugs, or is it a technique to prove that functionality works as expected? We can’t close this gap, but knowing that it exists is the first step in understanding how testing works and how people write about it. Let me give you my definition of a regression test:
Every test automatically becomes a regression test after its first execution.
There is nothing special about a regression test besides its timing. For now, we assume that testing, in general, is a technique used to prove that functionality works as specified. That means an SME/Tester/Test Engineer reads through a specification (or a set of specifications) and designs a test case based on it. The test case consists of multiple test steps, each with a stimulus, an expected result, and an actual result.
If the software is written 100% according to specification, the test case passes and the testing of this specification is finished. (Keep in mind that all of this is simplified. There are different design techniques for test cases, etc., but that is not the topic of this blog.)
We do this for all specifications, and if all test cases pass, the software provides all the specified functionality. Version 1.0 can be released to our customers.
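To make the structure of such a test case tangible, here is a minimal Python sketch. The names `Step`, `SpecCase`, and `record_run` are my own inventions for illustration and do not come from any test framework; the login example is made up as well. It only models what is described above: steps with stimulus, expected and actual result, and the idea that a test counts as a regression test once it has been executed.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Step:
    """One test step (hypothetical structure, for illustration only)."""
    stimulus: str      # what the tester does
    expected: str      # result demanded by the specification
    actual: str = ""   # result observed during the latest execution

    def passed(self) -> bool:
        return self.actual == self.expected


@dataclass
class SpecCase:
    """A test case designed from a specification."""
    name: str
    steps: List[Step]
    executions: int = 0

    def record_run(self, actual_results: List[str]) -> bool:
        """Store the observed results; return True if every step passed."""
        if len(actual_results) != len(self.steps):
            raise ValueError("one actual result per step is required")
        for step, actual in zip(self.steps, actual_results):
            step.actual = actual
        self.executions += 1
        return all(step.passed() for step in self.steps)

    @property
    def is_regression_test(self) -> bool:
        # Per the definition in this post: any test that has been
        # executed at least once is a regression test from then on.
        return self.executions >= 1


# Made-up example: a two-step login test case.
login = SpecCase(
    name="login",
    steps=[
        Step("enter valid credentials", "dashboard shown"),
        Step("click logout", "login page shown"),
    ],
)
first_run_passed = login.record_run(["dashboard shown", "login page shown"])
```

Nothing about the test case itself changes between the first and the second run; only the execution counter does, which is the whole point of the definition.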
If you ask yourself how this works for agile environments: Exactly the same way.
We just don’t call it Version 1.0, we call it product increment instead. Because people are not used to version numbers like 55461254.0, we use product increments in the agile context and at some point, the product owner defines which items of the backlog make a version 1.0. The product increment that includes all these items then becomes version 1.0.
What we do in a healthy agile environment is to release a new version with every check-in. That is what a CI/CD pipeline means.
No, the test cases do not need a change to become regression test cases. What has been a test case to prove that the functionality works as expected is now a regression test case. But this is where a lot of people start to argue that regression testing is more, e.g. that we need more test cases to cover edge scenarios or the “real user interaction”.
My answer to that: If all you did in the beginning was happy path testing, don’t blame the definition, blame your laziness.
With the foundation set straight, we have the following learnings already
Let me back this up with the definition of what I call the source of all regression testing, the regression bug:
A regression bug is a bug that causes a feature that worked correctly to stop working after a certain event (system upgrade, system patching, daylight saving time switch, etc.). … Regression bugs are an annoying and painful phenomenon in the software development process, requiring a great deal of effort to find. (Source: springerprofessional)
To make all of this more concrete, I have created this timeline:
You can replace “Implement feature 2” in the timeline above with “implement bug fix X”, “refactor ABC”, …
We have established what a regression test is. Now let’s look at what regression testing means.
In a naive world, regression testing means executing all test cases for every change. In a perfect situation, we would cover 100% of all code paths with test cases and would execute 100% of these test cases before every release.
On a real-world project, you are resource bound, i.e. you have a limited budget and you don’t have 100% coverage of all code paths. Therefore, good regression testing is extremely complex. I do have what I would call the perfect solution, but before we dive into it, let me give you an example of how regression testing works in a project I have been part of.
The first step is implementing the change and understanding the change. This includes being aware of all the refactoring work and small tweaks that were done in parts not directly affected by the primary code change.
While the developer doesn’t necessarily know the complete software and how it all works together (side note: I am an advocate of the theory that a developer has to fully understand the system and be able to use it), she/he will know which code areas were modified.
The first step, therefore, is to capture a description of the change in developer language, which means that other developers should be able to understand the change and its intention by looking at the code and reading the change description.
The second step is a potential impact analysis. The developer does not know the test cases and their coverage.
Therefore, the complete software is split into roughly 40 modules, with the corresponding sub-functionalities categorized in those modules. This allows the developer to list which of the 40 modules might be affected by this change. And yes, sometimes it is all of them.
This is where the testing kicks in. The tester knows the modules as well and knows which test cases cover which module. Here we have the first match: the developer knows which code maps to which module, and the tester knows which module maps to which test.
By this simple mapping, we have already reduced the candidate test cases to be executed as regression tests for this change from 100% to around 2.5% per affected module (assuming the test cases are equally distributed across the 40 modules). This brings us to 10% of all test cases if four modules are affected.
But that holds only if we use the modules alone. A module is tested by a myriad of test cases, so the tester starts exploratory tests based on the impact assessment given by the developer.
This is where the core competence of “open communication” is a must. The developer and the tester have an open, ongoing discussion about these changes. While the tester explains what she/he is planning to test, the developer provides input on which cases might be more relevant, as she/he comes to understand more of the test coverage.
And that, ladies and gentlemen, is how you identify the perfect regression test set: a team of developers and testers who are experts in their own domain but also understand the domain of the other role. This reduced regression testing to roughly 10% of all test cases for a regression testing cycle.
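The module mapping and the arithmetic behind it can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the tooling used in the project: the test case IDs, module names, the figure of 25 test cases per module, and the helper functions `build_index` and `select_candidates` are all hypothetical. Only the 40 modules and the even distribution come from the description above.

```python
from collections import defaultdict

MODULE_COUNT = 40        # from the project description
TESTS_PER_MODULE = 25    # hypothetical, chosen to model an even distribution


def build_index(test_to_module):
    """Tester's knowledge: invert 'test case -> module' into 'module -> test cases'."""
    index = defaultdict(set)
    for test_id, module in test_to_module.items():
        index[module].add(test_id)
    return index


def select_candidates(index, affected_modules):
    """Developer's impact analysis in, candidate regression tests out."""
    selected = set()
    for module in affected_modules:
        selected |= index.get(module, set())
    return selected


# Hypothetical test inventory: every module has the same number of test cases.
test_to_module = {
    f"TC-{m:02d}-{t:02d}": f"module-{m:02d}"
    for m in range(1, MODULE_COUNT + 1)
    for t in range(1, TESTS_PER_MODULE + 1)
}
index = build_index(test_to_module)

# The developer names four modules that the change might affect.
affected = ["module-03", "module-07", "module-12", "module-31"]
candidates = select_candidates(index, affected)

share = len(candidates) / len(test_to_module)
print(f"{len(candidates)} of {len(test_to_module)} test cases ({share:.1%})")
# prints "100 of 1000 test cases (10.0%)"
```

One affected module yields 25 of 1000 test cases, i.e. 1/40 = 2.5%; four affected modules yield 10%. In a real project the distribution is rarely this even, so the actual share will vary per module.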
How can you reach 100% regression tests? It sounds too good to be true, but it can be done with automation.
For the sake of completeness: Regression testing is not bound to be automated. The project I’ve described above is without ANY end-to-end automation.
I do miss TestResults.io in this project, nevertheless. Everything is manual here, so doing the impact assessment is required right now. Communication between developer and tester is required right now. And I love it, but I would love it even more if it were optional, like the cherry on top. We would still do it because it also brings us together, but we wouldn’t need to rely on it.
Automating test cases from the beginning, on a platform that keeps the automation maintainable and free of flakiness, is the perfect solution for effortless regression testing and therefore for higher quality.
While I told you how we narrow down the number of test cases that need to be executed, I would rather tell you that we run all test cases all the time. That would give us time to do more exploratory tests, and that is what your users do. They don’t read the specification; they start to explore your product.
Automation eases the part of making sure that your product works as specified. Michael Bolton refers to this as checking instead of testing. It frees up resources for real testing. Your users will love it.
Regression tests are normal tests and regression testing is a difficult challenge that you can either try to solve with a clever overlapping algorithm (remember code vs. test cases in modules) or really solve it by adding automation into the game.