It is impossible to troubleshoot problem generation and closing issues without resorting to developer methods:
- why was the problem created:
- why didn't the global correlation rule apply and close the problem;
- why didn't the recovery expression close the problem (it is not evaluated if the problem expression fires, e.g. when the problem and recovery expressions overlap, so a new problem is created instead; see (8) in
- why wasn't the problem closed:
- error in regular expression (e.g. regex compiles but doesn't match the intended text);
- error in a macro, user macro or LLD macro (not expanded, not supported in this context, quoting problems, etc.);
- why didn't trigger-level correlation close the problem;
- why didn't the global correlation kick in;
- error in formula;
When I encountered these errors with a trapper item that I had created for testing, I had to resort to debug-level logs, GDB and querying the database directly with SQL to find the root causes of the issues above. This is not practical in production and is unfriendly to users.
Hence, it is proposed to trace the decisions made for each event as it passes through Zabbix and to show that trace in the frontend.
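
To make the idea more concrete, here is a minimal sketch in Python of what such a decision trace might record. It is not Zabbix code: `TraceStep`, `EventTrace` and `evaluate_trigger` are hypothetical names invented for this illustration, and the evaluation logic is a simplified model that assumes only the behaviour described in the list above (the recovery expression is not evaluated when the problem expression fires).

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class TraceStep:
    """One decision taken while processing a value (hypothetical structure)."""
    stage: str    # which check was performed, e.g. "problem expression"
    outcome: str  # what was decided: "fired", "not fired" or "skipped"
    reason: str   # human-readable explanation that a frontend could display


@dataclass
class EventTrace:
    """Ordered list of decisions for a single event (hypothetical structure)."""
    steps: List[TraceStep] = field(default_factory=list)

    def record(self, stage: str, outcome: str, reason: str) -> None:
        self.steps.append(TraceStep(stage, outcome, reason))


def evaluate_trigger(
    value: float,
    problem_expr: Callable[[float], bool],
    recovery_expr: Callable[[float], bool],
    trace: EventTrace,
) -> str:
    """Simplified model: returns "PROBLEM", "OK" or "UNCHANGED" and logs why."""
    if problem_expr(value):
        trace.record("problem expression", "fired",
                     f"value {value!r} satisfied the problem expression")
        # Assumption taken from the text: recovery is not evaluated in this case.
        trace.record("recovery expression", "skipped",
                     "not evaluated because the problem expression fired")
        return "PROBLEM"

    trace.record("problem expression", "not fired",
                 f"value {value!r} did not satisfy the problem expression")
    if recovery_expr(value):
        trace.record("recovery expression", "fired",
                     f"value {value!r} satisfied the recovery expression")
        return "OK"

    trace.record("recovery expression", "not fired",
                 f"value {value!r} did not satisfy the recovery expression")
    return "UNCHANGED"


if __name__ == "__main__":
    # Overlapping expressions: the value 7 satisfies both of them.
    demo = EventTrace()
    state = evaluate_trigger(7, lambda v: v > 5, lambda v: v > 3, demo)
    print(state)
    for step in demo.steps:
        print(f"{step.stage}: {step.outcome} ({step.reason})")
```

With overlapping expressions like the ones in the demo, the trace would show the recovery expression as "skipped" together with the reason, which is the kind of answer that today requires debug logs or a debugger.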