Problem/Motivation
We have broadly three types of issues committed against core:
1. Features, introducing new functionality.
2. Tasks, changing or refactoring existing functionality.
3. Bug fixes, fixing a bug with existing functionality, usually but not always without significant changes to behaviour.
For features we always require test coverage, although in general the bigger the feature being added, the harder it can be to determine whether the test coverage is sufficient.
For tasks, the amount of new or changed test coverage depends on the kind of change being made. However, refactoring that doesn't break tests, or requires only minor changes to tests, is generally assumed to have coverage.
For bug fixes we assume that if there's a bug, it must always be possible to write a test that fails without the bug fix being applied, so more or less every bug fix requires an accompanying test - there are rare occasions where multiple issues are being worked on at once, and test coverage is added in dependent issues.
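As an illustration of that principle, here is a minimal sketch of such a "fails before the fix, passes after" test in PHPUnit - normalizePath() is a hypothetical stand-in for whatever production code is being fixed, not anything in core:

```php
<?php

use PHPUnit\Framework\TestCase;

// Hypothetical stand-in for the production code being fixed. With the bug
// present this used trim() instead of rtrim(), also stripping the leading '/'.
function normalizePath(string $path): string
{
    return rtrim($path, '/');
}

final class NormalizePathRegressionTest extends TestCase
{
    // Written so that it fails with the bug present and passes with the fix.
    public function testLeadingSlashIsPreserved(): void
    {
        $this->assertSame('/node/1', normalizePath('/node/1/'));
    }
}
```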
While it's not the intention, this means that the test requirements for fixing a bug are stricter than they are for refactoring or adding new features.
The types of bug we have are obviously very disparate, from single-character typos in exception message placeholders to upgrade path data loss and fatal errors.
A test written for a specific bug fix will ensure it doesn't regress over time, and if the code path was otherwise completely untested it may prevent other regressions in the future, but in general a bug fix is less likely to cause a regression than a task or new feature.
Bug fixes are sometimes complex issues where multiple code paths need testing, but sometimes they're straightforward developer mistakes that weren't caught in review, and we'd never have thought to add explicit test coverage if they'd been spotted during patch review instead of after commit. See the case studies below.
Bug fixes are also often the place where new or irregular core contributors interact with the core queue most, because they're contributing back fixes from client sites (where by definition they're manually testing the bugfixes they contribute, and will often have them applied via composer.patches.json). In general there is a high incentive to make improvements to the bug fix itself, but often neither the knowledge nor the incentive to write comprehensive test coverage - if you have the patch in composer.patches.json it's already fixed for you, and apart from having to re-roll every year or so, that might be the last time you think about it. Even if you do eight re-rolls of the same 1-5 line bugfix, it will take you less time than eight hours of writing and rewriting a 50-line test and responding to reviews - and you might get those reviews after you've already moved on from the project that needed the bug fix.
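For illustration, a typical composer.patches.json entry for the cweagans/composer-patches plugin looks something like this (the issue title and patch URL here are made up):

```json
{
    "patches": {
        "drupal/core": {
            "#1234567: Fix the bug this site ran into": "https://www.drupal.org/files/issues/example-fix.patch"
        }
    }
}
```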
On top of this, it can be very hard to know where to add test coverage in core, due to the sheer number of tests that we have and unclear responsibilities and boundaries for when to add new assertions to an existing test, new methods on an existing test class, new test classes altogether, etc. This represents a high barrier to contribution with an easy opt-out, and quite significant consequences for the rest of the community when those issues aren't fixed (possibly not for any specific issue, but in aggregate across hundreds of small bugfixes).
There is a further consequence: often the tests we require for small bugfixes aren't well scoped at all - they're one-off tests for very specific developer mistakes, or they simulate very specific contrib or custom code path interaction edge cases, rather than being well-structured generic tests of functionality (like the JSON:API/REST coverage) or high-impact coverage like upgrade path tests. By being extremely specific, they don't necessarily provide good coverage of the area they're testing, or they may be so tied to the implementation that they become a maintenance burden over time. This often leaves us with a choice between committing poor quality tests to get the bug fix applied, or ever-expanding the scope of a tiny issue to introduce better test coverage whose absence was never that issue's fault in the first place.
And with PHPStan, test coverage is sometimes not the best cure anyway - we can look for general classes of developer mistake across core and contrib, rather than testing one specific case of a mistake we already fixed. This is only just starting, but it gives us better options than we had in 2010.
Case study 1
#3223302: Changed Method label() in Taxonomy Terms results in Field/Name Changes is a still-open issue with a one-line bug fix and ~100 lines of test coverage. The bug was opened in July 2021, a merge request was opened in November 2021, and it was marked RTBC in January 2022. It was marked needs work in April 2022 for one change to the bugfix and to add test coverage; tests were added, and it was then marked needs work again for documentation issues with the test coverage.
The test added ensures that the specific developer mistake (calling ::label() in a getEntityLabelFieldValue() method) won't happen again for taxonomy terms. However, it does not add test coverage for other entity types in core - that would be out of scope for the issue, even though they're equally prone to a similar regression being introduced. The bug can't be reproduced with just core; you need contrib or custom code altering the term label to run into it.
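To make the pattern concrete, here is a simplified, self-contained sketch of the mistake in plain PHP - class and method names are illustrative, not the actual Term entity code:

```php
<?php

// Simplified illustration of the bug pattern: a getter for the label field
// that delegates to label(), so an override of label() leaks into stored data.
class Term
{
    public function __construct(protected string $name) {}

    public function label(): string
    {
        return $this->name;
    }

    // Buggy: returns whatever label() produces, not the stored field value.
    public function getNameBuggy(): string
    {
        return $this->label();
    }

    // Fixed: read the underlying field value directly.
    public function getNameFixed(): string
    {
        return $this->name;
    }
}

// Contrib or custom code decorating the label for display.
class DecoratedTerm extends Term
{
    public function label(): string
    {
        return '★ ' . parent::label();
    }
}

$term = new DecoratedTerm('Tags');
echo $term->getNameBuggy(), PHP_EOL; // "★ Tags" - written back to the name field on save.
echo $term->getNameFixed(), PHP_EOL; // "Tags"
```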
Further, it's not even clear that test coverage is the best way to stop this happening again: the most likely cause would be a new entity type (in contrib or core) copying and pasting this pattern and making the same mistake, and for that PHPStan seems more appropriate.
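As a sketch of what that could look like, here is a hypothetical custom PHPStan rule - the class name and the getter-name pattern are assumptions for illustration, not an existing core rule:

```php
<?php

use PhpParser\Node;
use PhpParser\Node\Expr\MethodCall;
use PHPStan\Analyser\Scope;
use PHPStan\Rules\Rule;
use PHPStan\Rules\RuleErrorBuilder;

// Hypothetical rule: flag $this->label() calls inside methods that look like
// label field getters, across every entity type rather than just one.
final class NoLabelCallInLabelFieldGetterRule implements Rule
{
    public function getNodeType(): string
    {
        return MethodCall::class;
    }

    public function processNode(Node $node, Scope $scope): array
    {
        // Only consider $this->label() calls.
        if (!$node->var instanceof Node\Expr\Variable
            || $node->var->name !== 'this'
            || !$node->name instanceof Node\Identifier
            || $node->name->toString() !== 'label'
        ) {
            return [];
        }

        // Only inside methods that look like label field getters.
        $method = $scope->getFunction();
        if ($method === null || !preg_match('/^get(Name|Title|Label)$/', $method->getName())) {
            return [];
        }

        return [
            RuleErrorBuilder::message(
                'Label field getters must read the field value directly instead of calling $this->label().'
            )->build(),
        ];
    }
}
```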
So to fix a one-line bug, we're adding a 100-line test that doesn't address the issue for other entity types or stop new entity types making the same mistake, and the bug still isn't fixed in any version of core.
My suggestion in this case would be to open a new issue ('Document and/or add test coverage/code style rules to ensure that entity label field getters don't call ::label()') and just commit the one-liner, but our current policy works against that kind of solution because it would mean committing the bugfix 'without tests'.
Case study 2
#2561639: Wrong placeholder for exception shown when new fields are created during module installation fixes a typo in an exception message that results in a placeholder not being replaced. It was opened in September 2015 with a patch and marked needs work for tests in November 2015. Tests were written in the same month. It has bounced between needs work, needs review, and RTBC for seven years due to various issues with the test coverage. It is still open, and we have neither the bug fix nor the tests in core seven years later - for a one-character bugfix to an error message.
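For illustration, here is a self-contained sketch of this class of bug - the message and placeholder names are made up, not the actual core code:

```php
<?php

// Simplified stand-in for Drupal's placeholder replacement.
function formatMessage(string $message, array $args): string
{
    return strtr($message, $args);
}

$args = ['%field_name' => 'field_image'];

// Buggy: "%field" doesn't match the "%field_name" key, so the user sees the
// raw placeholder: "The %field field cannot be created."
echo formatMessage('The %field field cannot be created.', $args), PHP_EOL;

// Fixed: a tiny change so the placeholder matches the argument key.
echo formatMessage('The %field_name field cannot be created.', $args), PHP_EOL;
```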
Heuristics
One risk of adopting this approach is that it increases the likelihood of discussions on issues about whether they need test coverage or not, which has the potential to delay commits and burn people out too. To try to prevent this, we can start with a reasonably strict set of heuristics to apply to issues: unless the answer to the majority of these is 'yes', or there is some other specific reason not to include test coverage, we'd require tests by default.
- Are there clear steps to reproduce on the issue?
- Is it easy to verify via manual testing that the bug is fixed?
- Is it a 'trivial' patch with small, easy-to-understand changes?
- Is the change in self-contained/@internal code that we don't expect contrib modules to interact with significantly (plugins, controllers, etc.)?
- If the bug were to regress, would the impact mainly fall on sites that had been applying the patch or working around the bug in some other way, and that would have removed those mitigations once the bug was fixed in core?
- Is the bug fix achieved without adding new, untested, code paths?
- To add test coverage, would it need an explicit 'regression' test that mainly prevents us from reverting the patch, rather than improving coverage in general?
- If the test coverage was deferred to a follow-up, would it be easy for someone who didn't work on the original bug report to pick it up?
- Does the issue expose a general lack of test coverage for the specific subsystem, and if so would it be better to add generic test coverage for that subsystem in a separate issue so that the test coverage can be added in a maintainable way, rather than a regression test?
Original report:
Writing tests is a good approach. It helps to reduce the number of bugs as we add new features. However, for patches that do not bring any new functionality but fix bugs in existing features, requiring tests is a destructive policy, because it slows down the process of fixing bugs in Drupal core.
It is a quite typical case: a patch for a bug is submitted and even reviewed, but then it gets declined by a core committer because of a lack of tests. After this the issue may get stuck for a very long time (months or years), and there is a great chance the patch will never be committed. Furthermore, it also happens that a fix that was already committed gets reverted because of this testing policy, which is kind of crazy, because reverting a commit that fixes a bug is like deliberately committing that bug.
Another point here is that writing tests may require a different level of skill and a different area of expertise than fixing a bug. The Drupal testing subsystem is rather complicated; at this moment it consists of 5 (?) different types of tests. On the other hand, the patches for many bugs are trivial and could be processed in novice issues.
Proposed resolution
Allow trivial bug fixes to be committed without test coverage when generic test coverage scoped at a wider level, or CS/Rector rules, would be more appropriate.
For non-trivial or low-level bugfixes, continue to require test coverage on the issue, since in these cases we need to be able to demonstrate the bug is fixed without manual testing.
Remaining tasks
Discuss and get it implemented.