Description
Beats receivers currently use the same internal API for tracking metrics as standalone Beats. This means that when monitoring is enabled, each Beats receiver has its own metrics registry that is published and/or scanned independently. The overhead and potential fragmentation of this approach are an increasing concern as we migrate to a component model with a separate receiver for every input. While the broader plan is to migrate metrics tracking to fully OTel-native APIs, we need a bridge for legacy components in the meantime.
Libbeat should supplement the existing input metrics API with the ability to expose a process-global table of metrics from all active inputs across all Beats receivers. It should be possible to make the data available on the /inputs endpoint as is done currently, but with a single query returning all results. When the first receiver initializes its metrics via Libbeat, it should start a listener as it does today, but when additional receivers do the same they should be added to the existing listener instead of creating another one. (Sharing the listener in this way should be controlled by the receiver's monitoring config.) The data should be in-place compatible with existing Filebeat inputs, and should be compatible with other Beats, particularly Metricbeat, to the degree possible without conflicting with Filebeat.
The implementation should try to simplify future migration work:
- This change should be implemented by maintaining a list of registries for active Beats receivers and aggregating all of them into a single response when queried, rather than logically linking the registries themselves or exposing the distinction to the inputs. As much as possible, inputs should not need to know about the overall process environment to report their metrics.
- The more Libbeat can separate the process of acquiring an input metrics object from the structure and semantics of the legacy metrics API, the easier it will be to start migrating individual inputs to a replacement without disruptions.
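A minimal sketch of the aggregation approach described above, to make the shape of the proposal concrete. Every name here (`Aggregator`, `Registry`, `InputMetrics`, `StaticRegistry`) is hypothetical and not part of Libbeat's existing API; the sketch only assumes that each receiver's registry can produce a snapshot of its inputs. Registries are kept in a list keyed by receiver ID and merged at query time, so inputs never see the process-wide structure.

```go
package main

import (
	"encoding/json"
	"net/http"
	"sort"
	"sync"
)

// InputMetrics is a hypothetical flat snapshot of one input's metrics.
type InputMetrics map[string]any

// Registry is a hypothetical stand-in for one receiver's metrics registry.
type Registry interface {
	Snapshot() []InputMetrics
}

// StaticRegistry is a fixed-value Registry used only for illustration.
type StaticRegistry []InputMetrics

func (s StaticRegistry) Snapshot() []InputMetrics { return s }

// Aggregator tracks the registries of all active receivers in the process.
type Aggregator struct {
	mu         sync.RWMutex
	registries map[string]Registry
}

func NewAggregator() *Aggregator {
	return &Aggregator{registries: make(map[string]Registry)}
}

// Register adds a receiver's registry. It reports whether this is the first
// active registry, which is when a caller would start the shared listener.
func (a *Aggregator) Register(receiverID string, r Registry) (first bool) {
	a.mu.Lock()
	defer a.mu.Unlock()
	first = len(a.registries) == 0
	a.registries[receiverID] = r
	return first
}

// Unregister removes a receiver's registry. It reports whether no registries
// remain, which is when a caller could shut the shared listener down.
func (a *Aggregator) Unregister(receiverID string) (last bool) {
	a.mu.Lock()
	defer a.mu.Unlock()
	delete(a.registries, receiverID)
	return len(a.registries) == 0
}

// Collect merges the snapshots of every registry, ordered by receiver ID so
// that repeated queries return inputs in a stable order.
func (a *Aggregator) Collect() []InputMetrics {
	a.mu.RLock()
	defer a.mu.RUnlock()
	ids := make([]string, 0, len(a.registries))
	for id := range a.registries {
		ids = append(ids, id)
	}
	sort.Strings(ids)
	out := []InputMetrics{}
	for _, id := range ids {
		out = append(out, a.registries[id].Snapshot()...)
	}
	return out
}

// ServeHTTP answers one query with the inputs of all registered receivers.
func (a *Aggregator) ServeHTTP(w http.ResponseWriter, _ *http.Request) {
	w.Header().Set("Content-Type", "application/json")
	_ = json.NewEncoder(w).Encode(a.Collect())
}
```

In this sketch the aggregator would be mounted once, for example with `mux.Handle("/inputs", agg)`, and the boolean results of `Register` and `Unregister` stand in for the decision to start or stop the shared listener. Whether a receiver joins the shared aggregator at all would be decided by its monitoring config, which is not modeled here.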