Add metrics for CouchDB processes #171

@witash

A common failure mode for the CHT is high CPU usage by CouchDB.

High CPU usage can be monitored in Watchdog using Node Exporter with Prometheus or any other infrastructure monitoring tool. However, without any additional information, it’s difficult to determine the root cause of performance issues. Currently, it’s often necessary to infer what CouchDB is doing from its logs, HAProxy logs, or other indirect sources, rather than directly observing the operations consuming CPU resources.

Tools such as etop or recon can inspect the Erlang process list and could be useful here. It may also be possible to export relevant metrics directly to Prometheus.
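As a rough illustration of the "export to Prometheus" idea, here is a minimal Python sketch that renders a per-process sample in the Prometheus text exposition format. Everything specific in it is an assumption for discussion: the input shape (one record per Erlang process with a `name` and a `reductions` count taken over a short sampling window, similar in spirit to what recon can report), the metric name `couchdb_erlang_process_reductions`, and the function names. It does not collect anything from CouchDB itself; how the sample is obtained is left open.

```python
from typing import Iterable, Mapping


def escape_label(value: str) -> str:
    """Escape backslash, double quote and newline in a label value."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_process_metrics(processes: Iterable[Mapping[str, object]], top_n: int = 5) -> str:
    """Render the top_n busiest processes as Prometheus text exposition lines.

    Assumed (hypothetical) input: mappings with a "name" and a numeric
    "reductions" value sampled over a short window.
    """
    # Keep only the busiest processes so label cardinality stays bounded.
    ranked = sorted(processes, key=lambda p: p["reductions"], reverse=True)[:top_n]
    lines = [
        "# HELP couchdb_erlang_process_reductions Reductions used by an Erlang process during the sample window.",
        "# TYPE couchdb_erlang_process_reductions gauge",
    ]
    for proc in ranked:
        name = escape_label(str(proc["name"]))
        value = proc["reductions"]
        lines.append(f'couchdb_erlang_process_reductions{{process="{name}"}} {value}')
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Illustrative sample only; the numbers are made up.
    sample = [
        {"name": "couch_index_server", "reductions": 120},
        {"name": "couch_server", "reductions": 950},
        {"name": "mem3_shards", "reductions": 40},
    ]
    print(render_process_metrics(sample, top_n=2))
```

Limiting the output to the top N processes is a deliberate choice in this sketch: exporting every Erlang process would create a very large number of time series, while the question Watchdog needs to answer is only about the heaviest consumers.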

Regardless of the implementation, CHT Watchdog should provide enough information to answer the common question:

"What is this CHT deployment doing that requires so much CPU?"
