Agent monitor / key for system interrupts like mpstat on Linux

    • Type: New Feature Request
    • Resolution: Unresolved
    • Priority: Minor
    • Affects Version/s: None
    • Component/s: Agent (G), Server (S)

      We have many very high-performance servers that run out of interrupt cycles on the single CPU that handles interrupts for a NIC. This does not show up in normal CPU stats, which average all CPUs together: on an 8-core machine we might see 12.5% CPU overall while CPU1 is at 100% on SI (soft interrupts) and the network dies.

      I see we can now monitor softirq in addition to user, idle, nice, and system, but more important is monitoring the maximum across all CPUs, so that if any one CPU has SI above 80%, we can trigger on that.

      Thus, extend the current system.cpu.util[<cpu>,<type>,<mode>] key with a <cpu> value of 'max' that returns the maximum value across all CPUs. From that I can get what I need.
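
      As an illustration of the requested 'max' behaviour (not existing agent code), here is a minimal Python sketch that takes two samples of /proc/stat on Linux and reports the CPU with the highest softirq share. The function names are hypothetical, and the 1-second sampling interval is an assumption.

```python
import time

SOFTIRQ = 6  # 0-based position of the softirq counter on a /proc/stat cpuN line


def parse_per_cpu(stat_text):
    """Return {cpu_name: [counters]} for per-CPU lines (cpu0, cpu1, ...)."""
    cpus = {}
    for line in stat_text.splitlines():
        fields = line.split()
        # Keep cpuN lines only; skip the aggregate "cpu" line and everything else.
        if fields and fields[0].startswith("cpu") and fields[0] != "cpu":
            cpus[fields[0]] = [int(v) for v in fields[1:]]
    return cpus


def softirq_percent(before, after):
    """Per-CPU softirq share (percent) between two parsed samples."""
    result = {}
    for name, new in after.items():
        old = before.get(name)
        if old is None:
            continue
        # Sum user..steal (first 8 counters); guest time is already part of user.
        total = sum(new[:8]) - sum(old[:8])
        if total > 0:
            result[name] = 100.0 * (new[SOFTIRQ] - old[SOFTIRQ]) / total
    return result


def max_softirq(before, after):
    """Return (cpu_name, percent) for the busiest CPU, or None if no data."""
    pct = softirq_percent(before, after)
    if not pct:
        return None
    name = max(pct, key=pct.get)
    return name, pct[name]


def read_stat(path="/proc/stat"):
    with open(path) as f:
        return f.read()


if __name__ == "__main__":
    first = parse_per_cpu(read_stat())
    time.sleep(1)  # assumed sampling interval
    second = parse_per_cpu(read_stat())
    print(max_softirq(first, second))
```

      A trigger on the second element of the returned pair (for example, above 80) would catch a single saturated CPU even when the all-CPU average stays low.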

      This is a critical feature for high-performance servers, including VMs/clouds, which have much less efficient NIC interrupt handling and often die at 10,000 packets/second or less.

      For now we may use system.cpu.util[,softirq,avg1] * system.cpu.num[]. We would also like to track total interrupts, but there is no way to do that (as in mpstat -I SUM 1).
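
      The workaround above amounts to a worst-case estimate: it assumes all softirq load sits on one CPU. The Python sketch below shows that arithmetic, plus one possible way to derive an interrupts-per-second figure from the "intr" line of /proc/stat, whose first number is the total interrupt count since boot. The function names are hypothetical, and the 100% cap is an added assumption.

```python
def worst_case_single_cpu(avg_percent, cpu_count):
    """Upper bound on one CPU's load if the whole average sat on a single CPU."""
    return min(100.0, avg_percent * cpu_count)


def total_interrupts(stat_text):
    """Return the total interrupt count from the 'intr' line, or None if absent."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "intr":
            return int(fields[1])  # first value is the sum over all interrupts
    return None


def interrupts_per_second(first_total, second_total, seconds):
    """Interrupt rate between two samples taken `seconds` apart."""
    return (second_total - first_total) / seconds


if __name__ == "__main__":
    # 12.5% average softirq on 8 cores could mean one core at 100%.
    print(worst_case_single_cpu(12.5, 8))
```

      The estimate cannot tell one saturated CPU apart from eight CPUs at 12.5% each, which is why a true per-CPU maximum is still needed.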

            Assignee:
            Alexei Vladishev
            Reporter:
            Steve Mushero
            Votes:
            0
            Watchers:
            1

              Created:
              Updated: