Description
A common failure mode for the CHT is high CPU usage by CouchDB.
High CPU usage can be monitored in Watchdog using Node Exporter with Prometheus or any other infrastructure monitoring tool. However, without any additional information, it’s difficult to determine the root cause of performance issues. Currently, it’s often necessary to infer what CouchDB is doing from its logs, HAProxy logs, or other indirect sources, rather than directly observing the operations consuming CPU resources.
Tools that inspect the Erlang process list, such as etop or recon, could help identify the busiest processes inside CouchDB. It may also be possible to export relevant metrics directly to Prometheus.
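As one possible starting point for the "export to Prometheus" idea, here is a minimal sketch that turns the output of CouchDB's `/_active_tasks` endpoint into Prometheus text-format gauges. It is not the etop or recon approach: `/_active_tasks` only lists background work (indexing, compaction, replication), so it would cover only part of the picture. The metric name `couchdb_active_tasks`, the function names, and the `base_url` and `auth_header` parameters are assumptions made for illustration and are not part of CHT Watchdog.

```python
import json
import urllib.request
from collections import Counter


def fetch_active_tasks(base_url, auth_header=None):
    """Fetch the task list from CouchDB's /_active_tasks (admin credentials needed)."""
    request = urllib.request.Request(f"{base_url}/_active_tasks")
    if auth_header:
        # Hypothetical parameter: a ready-made Authorization header value.
        request.add_header("Authorization", auth_header)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def tasks_to_prometheus(tasks, metric="couchdb_active_tasks"):
    """Count tasks per (type, database) and render them as Prometheus gauges.

    The metric name is an assumption, not an existing Watchdog metric.
    Tasks without a "database" field (for example replications) get an empty label.
    """
    counts = Counter(
        (task.get("type", "unknown"), task.get("database", "")) for task in tasks
    )
    lines = [f"# TYPE {metric} gauge"]
    for (task_type, database), count in sorted(counts.items()):
        lines.append(
            f'{metric}{{type="{task_type}",database="{database}"}} {count}'
        )
    return "\n".join(lines) + "\n"
```

For example, `print(tasks_to_prometheus(fetch_active_tasks("http://localhost:5984", auth_header)))` would print one gauge line per task type and database, which a Prometheus-compatible collector could then scrape or ingest. Whether this signal is enough to explain a given CPU spike is an open question.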
Regardless of the implementation, CHT Watchdog should provide enough information to answer the common question:
"What is this CHT deployment doing that requires so much CPU?"