Dealing with randomly failing tests from a team perspective

One of the things that can hurt team morale is random builds: builds where some tests sometimes fail for no apparent reason. Inspired by Martin Fowler’s article on Quarantine, in some of our projects we came up with a guideline for how to fix the problem as a team.

If a test fails randomly more than once, add it to quarantine (consult the list of existing failures first).
Never kick the build without taking some action (quarantining a test or fixing it).
If the build is red after your session of work, it’s your responsibility to fix it (feel free to ask for help if you have no time, but the initiative is yours). Whenever we say ‘you are responsible’, we mean that the whole team is responsible, but you are the tracker and you take the initiative. It’s not your fault, but we need someone to track the problem, and the person who last worked on the code seems to make the most sense.
Don’t push to the repo if you have no time to handle build problems.
Never leave a red build after your session of work.
If a build fails for no clear reason, find the reason and fix it.
Don’t push code if the build is red.
If you start your working session and the build is red, talk to the others and fix it first, then start your task.
If there’s really no other way to fix the build and no one to help, then at least kick the build.
