Notify Honeybadger about errors after few occurances
… and check why 5600+ Rails engineers read also this
Notify Honeybadger about errors after few occurances
In our systems, there are those special types of errors that are transient. For those kinds of errors more often than not the time heals the wounds. Especially when those events occur in Sidekiq job that can be easily retried. So basically, we shouldn’t worry about them too much. But… What if there’s an error reported in Honeybadger that prematurely disturbs the team? Additionally, causing a loss of focus, which is expensive to regain, and unnecessary stress.
We don’t want to distracted for no reason
You’re guessing it right that something like this happened recently in the project that I work on. So we started thinking about how we could solve it to get notified, on our Slack channel, only about exceptions that need our attention.
Luckily in that period, we were together at Arkency microcamp and the one only Mirosław (thanks again!) arrived on the white horse (or rather his Kawasaki) and told us about the solution they have implemented in their project.
Sidekiq DeathHandler to the rescue
The IgnoredError class is a wrapper for an error that has the potential to be transient. And hence it may heal itself in the next couple of occurrences.
class IgnoredError < StandardError
def message
cause.inspect
end
end
We also have to add the IgnoredError to honeybadger’s configuration, to make sure it’s not reported by default.
# honeybadger.yml
exceptions:
ignore:
- IgnoredError
Now, lets see how it would be used in production code
rescue BankAccountNotFound => exception
raise IgnoredError
end
The error might occur for very different reasons. One of them is that the events (in the Event-Driven system) appeared in a different order than would be expected. And that’s okay. We don’t need to worry about that if we can simply retry the job and handle the transient error. Hence we can transform this exception to IgnoredError
.
In the happy-path scenario, after a retry (or few) is performed in Sidekiq, the job is successfully finished and the error will disappear.
But what if the error is not transient?
In that case, the job will be retried until it reaches a possible retries threshold and then it’ll call the death handler. Death handlers are called when all retries for a job have been exhausted and the job dies. Once it gets there, there’s information for us that the error most probably won’t resolve itself and that manual intervention is required. In our case, we also want to be notified.
class IgnoredErrorReportingDeathHandler
def call(job, exception)
if exception.is_a?(IgnoredError)
ErrorNotifier.notify(
exception.cause,
context: {
context: {
tags: "death_handler"
},
parameters: job,
component: job["class"]
}
)
end
end
end
The last step is to simply register the IgnoredErrorReportingDeathHandler
in Sidekiq config
config.death_handlers << IgnoredErrorReportingDeathHandler.new
And you’re good to go! Less distractions, better focus.