diff --git a/docs/modules/nifi/pages/usage_guide/monitoring.adoc b/docs/modules/nifi/pages/usage_guide/monitoring.adoc
index 585df15a..3a093f87 100644
--- a/docs/modules/nifi/pages/usage_guide/monitoring.adoc
+++ b/docs/modules/nifi/pages/usage_guide/monitoring.adoc
@@ -2,8 +2,9 @@
 :description: The Stackable Operator for Apache NiFi automatically configures NiFi to export Prometheus metrics.
 :k8s-job: https://kubernetes.io/docs/concepts/workloads/controllers/job/
 :k8s-network-policies: https://kubernetes.io/docs/concepts/services-networking/network-policies/
+:prometheus-operator: https://prometheus-operator.dev/
 
-In November 2024, Apache NiFi released a new major version https://cwiki.apache.org/confluence/display/NIFI/Release+Notes#ReleaseNotes-Version2.0.0[`2.0.0`].
+In November 2024, Apache NiFi released a new major version https://cwiki.apache.org/confluence/display/NIFI/Release+Notes#ReleaseNotes-Version2.0.0[`2.0.0`,window=_blank].
 The NiFi `2.0.0` release changed the way of exposing Prometheus metrics significantly.
 The following steps explain on how to expose Metrics in NiFi versions `1.x.x` and `2.x.x`.
 
@@ -11,10 +12,10 @@ The following steps explain on how to expose Metrics in NiFi versions `1.x.x` an
 == Configure metrics in NiFi `1.x.x`
 
 For NiFi versions `1.x.x`, the operator automatically configures NiFi to export Prometheus metrics.
-This is done by creating a {k8s-job}[Job] that connects to NiFi and configures a https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-prometheus-nar/1.26.0/org.apache.nifi.reporting.prometheus.PrometheusReportingTask/index.html[Prometheus Reporting Task].
+This is done by creating a {k8s-job}[Job,window=_blank] that connects to NiFi and configures a https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-prometheus-nar/1.26.0/org.apache.nifi.reporting.prometheus.PrometheusReportingTask/index.html[Prometheus Reporting Task,window=_blank].
 
 IMPORTANT: Network access from the Job to NiFi is required.
-If you are running a Kubernetes with restrictive {k8s-network-policies}[NetworkPolicies], make sure to allow access from the Job to NiFi.
+If you are running a Kubernetes cluster with restrictive {k8s-network-policies}[NetworkPolicies,window=_blank], make sure to allow access from the Job to NiFi.
 
 See xref:operators:monitoring.adoc[] for more details.
 
@@ -34,111 +35,106 @@ spec:
 
 == Configure metrics in NiFi `2.x.x`
 
-The Prometheus Reporting Task was removed in NiFi `2.x.x` in https://issues.apache.org/jira/browse/NIFI-13507[NIFI-13507].
-Metrics are now always exposed and can be scraped using the NiFi Pod FQDN and the HTTP path `/nifi-api/flow/metrics/prometheus`.
+The Prometheus Reporting Task was removed in NiFi `2.x.x` in https://issues.apache.org/jira/browse/NIFI-13507[NIFI-13507,window=_blank].
+Metrics are now always exposed and can be scraped using the NiFi `metrics` Service and the HTTP path `/nifi-api/flow/metrics/prometheus`.
 
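+For a quick manual check you can query the endpoint with `curl`, for example from another Pod in the Kubernetes cluster.
+The snippet below is only a sketch: `<nifi-metrics-service-address>` is a placeholder for the address of the `metrics` Service (a concrete address is shown below), and `--insecure` is only acceptable for a quick test against a self-signed certificate.
+
+[source,bash]
+----
+# Quick manual check of the NiFi 2.x.x metrics endpoint (sketch with a placeholder address).
+# If NiFi is configured with user authentication, an unauthenticated request like this is rejected.
+curl --insecure https://<nifi-metrics-service-address>:8443/nifi-api/flow/metrics/prometheus
+----
+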
-For a deployed single node NiFi cluster called `simple-nifi`, containing a rolegroup called `default`, the metrics endpoint is reachable under:
+For a deployed NiFi cluster called `simple-nifi`, containing a rolegroup called `default`, the metrics endpoint is reachable under:
 
 ```
-https://simple-nifi-node-default-0.simple-nifi-node-default..svc.cluster.local:8443/nifi-api/flow/metrics/prometheus
+https://simple-nifi-node-default-metrics..svc.cluster.local:8443/nifi-api/flow/metrics/prometheus
 ```
 
-IMPORTANT: If NiFi is configured to do any user authentication, requests to the metric endpoint must be authenticated and authorized.
+NOTE: The above URL connects to one of the Pods reachable through the specified Service, and therefore only scrapes the metrics produced by that Pod.
+To scrape metrics from a particular Pod, use the FQDN of the Pod together with the `headless` Service, for example: `\https://simple-nifi-node-default-0.simple-nifi-node-default-headless..svc.cluster.local:8443/nifi-api/flow/metrics/prometheus`
 
-=== Authentication with NiFi `2.x.x`
-
-[IMPORTANT]
-===
-The NiFi metrics endpoints are behind a strong authentication mechanism which require credentials for each individual pod.
-===
+IMPORTANT: If NiFi is configured with any user authentication, requests to the metrics endpoint must be authenticated and authorized.
 
-To authenticate, you can use a bearer token created by your NiFi instance e.g.
+=== Authentication with NiFi `2.x.x`
 
-[source,bash]
-----
-curl -X POST https://simple-nifi-node-default-0.simple-nifi-node-default..svc.cluster.local:8443/nifi-api/access/token -d 'username=&password=' -k
-----
+To authenticate against the NiFi `2.x.x` API, you can configure mTLS between NiFi and the client calling NiFi. For more information about authentication between
+Kubernetes Pods, check out the xref:home:secret-operator:index.adoc[Secret Operator documentation].
 
-where `-k` equals `verify=false` to allow self-signed certificates. The reply is your bearer token.
+The following example illustrates the configuration of a Prometheus scraper for NiFi, using the aforementioned method of configuring mTLS
+and utilizing the internally available `tls` xref:home:secret-operator:secretclass.adoc[SecretClass].
 
-The following example shows how to configure the Prometheus scraper to use the bearer token to authenticate against a NiFi pod.
+To generate a client certificate signed by the `tls` SecretClass CA trusted in NiFi, add the following `volume` and `volumeMount`
+to the Prometheus Pod.
 
-[source,yaml]
-----
----
-authorization: <1>
-  type: Bearer
-  credentials: "" <2>
-tls_config:
-  insecure_skip_verify: true
-static_configs:
-  - targets:
-    - '..svc.cluster.local:8443' <3>
-metrics_path: '/nifi-api/flow/metrics/prometheus'
-scheme: https
-----
-<1> Use the `authorization` property instead if the `basic_auth`.
-<2> Add the previously obtained token here.
-<3> Static targets only scrapes one pod.
+IMPORTANT: If the {prometheus-operator}[Prometheus Operator,window=_blank] is used to deploy Prometheus, there is currently a known bug that prevents adding an additional Volume whose volumeClaimTemplate carries annotations. The bug is tracked in the https://github.com/prometheus-operator/prometheus-operator/issues/7709[Prometheus Operator issue tracker,window=_blank]. The annotations are necessary to configure the behavior of the Secret Operator.
+As a current workaround, until the issue is resolved, one could deploy an additional Pod that is only responsible for creating a TLS certificate as a Secret, which can then be used by the ServiceMonitor. This workaround is illustrated in the https://github.com/stackabletech/demos/blob/main/stacks/monitoring[`monitoring` Stack,window=_blank].
 
-or use it in a NiFi secret which should look like
 [source,yaml]
 ----
 ---
-apiVersion: v1
-kind: Secret
-metadata:
-  name: nifi-authorization-secret
-type: Opaque
-stringData:
-  nifi_token: ""
+prometheus: <1>
+  prometheusSpec:
+    volumes:
+      - name: tls
+        ephemeral:
+          volumeClaimTemplate:
+            metadata:
+              annotations:
+                secrets.stackable.tech/class: tls # <2>
+                secrets.stackable.tech/scope: pod,service=prometheus-kube-prometheus-prometheus # <3>
+            spec:
+              storageClassName: secrets.stackable.tech
+              accessModes:
+                - ReadWriteOnce
+              resources:
+                requests:
+                  storage: "1"
+    volumeMounts:
+      - name: tls
+        mountPath: /stackable/tls # <4>
 ----
+<1> This configuration is set in the {prometheus-operator}docs/api-reference/api/#monitoring.coreos.com/v1.Prometheus[Prometheus resource,window=_blank] of the {prometheus-operator}[Prometheus Operator,window=_blank].
+<2> The `tls` SecretClass created by the Secret Operator, which stores its CA in a Kubernetes Secret. Any other SecretClass can be used as well.
+<3> The `service=prometheus-kube-prometheus-prometheus` scope is added to include the Subject Alternative Name of the Prometheus Service in the generated TLS certificate. The Service name used here refers to the Prometheus Service deployed by the {prometheus-operator}[Prometheus Operator,window=_blank].
+<4> The path where the mTLS certificates are mounted inside the Prometheus Pod.
 
 If you want to use a `ServiceMonitor` you'd need to configure it as follows:
 
-// TODO: The ServiceMonitor should be switched to the -metrics service
-
 [source,yaml]
 ----
 ---
 apiVersion: monitoring.coreos.com/v1
 kind: ServiceMonitor
 metadata:
-  name: scrape-nifi2
+  name: scrape-nifi-2
   labels:
     stackable.tech/vendor: Stackable
     release: prometheus
 spec:
   endpoints:
-    - port: https
-      path: 'nifi-api/flow/metrics/prometheus'
+    - path: /nifi-api/flow/metrics/prometheus
+      port: https
       scheme: https
-      interval: 5s
-      tlsConfig:
-        insecureSkipVerify: true
-      authorization:
-        credentials: <1>
-          key: "nifi_token"
-          name: "nifi-authorization-secret"
-          optional: false
-        type: "Bearer"
-      relabelings: <2>
+      tlsConfig: # <1>
+        caFile: /stackable/tls/ca.crt
+        certFile: /stackable/tls/tls.crt
+        keyFile: /stackable/tls/tls.key
+      relabelings: # <2>
        - sourceLabels:
            - __meta_kubernetes_pod_name
            - __meta_kubernetes_service_name
            - __meta_kubernetes_namespace
            - __meta_kubernetes_pod_container_port_number
          targetLabel: __address__
-          replacement: ${1}.${2}.${3}.svc.cluster.local:${4}
-          regex: (.+);(.+?)(?:-headless)?;(.+);(.+)
+          replacement: ${1}.${2}-headless.${3}.svc.cluster.local:${4} # <3>
+          regex: (.+);(.+?)(?:-metrics)?;(.+);(.+)
   selector:
     matchLabels:
+      stackable.tech/vendor: Stackable
       prometheus.io/scrape: "true"
   namespaceSelector:
     any: true
   jobLabel: app.kubernetes.io/instance
 ----
-<1> Authorization via Bearer Token stored in a secret
-<2> Relabel \\__address__ to be a FQDN rather then the IP-Address of target pod
+<1> In the TLS configuration of the ServiceMonitor, specify the location of the CA, certificate, and key files mounted into the Prometheus Pod.
+<2> Relabel `__address__` to be an FQDN rather than the IP address of the target Pod. This is currently necessary to scrape NiFi, since it requires a DNS name to address the NiFi REST API.
+<3> Currently, the NiFi StatefulSet only exposes FQDNs for NiFi Pods through the `headless` Service, which is why the `headless` Service is used instead of the `metrics` Service to scrape NiFi metrics. The relabeling above rewrites, for example, the target for Pod `simple-nifi-node-default-0` to `simple-nifi-node-default-0.simple-nifi-node-default-headless..svc.cluster.local:8443`.
+
+NOTE: The SDP has exposed a dedicated `metrics` Service since the xref:listener-operator:listener.adoc[Listener integration].
 
-NOTE: As of xref:listener-operator:listener.adoc[Listener] integration, SDP exposes a Service with `-headless` thus we need to regex this suffix.
+The example described above is part of the https://github.com/stackabletech/demos/blob/main/stacks/monitoring/prometheus.yaml[Prometheus,window=_blank]
+and https://github.com/stackabletech/demos/blob/main/stacks/monitoring/prometheus-service-monitors.yaml[ServiceMonitor,window=_blank] manifests
+used in the https://github.com/stackabletech/demos/blob/main/stacks/monitoring[monitoring stack,window=_blank] of the https://github.com/stackabletech/demos[demos repository,window=_blank].
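+
+To check that the generated client certificate is actually accepted by NiFi, you can, for example, issue a single request from inside the Prometheus Pod with `curl`, using the certificate files mounted under `/stackable/tls` as shown above.
+This is only a sketch; `<namespace>` is a placeholder for the namespace NiFi is deployed in, and the Service name matches the `simple-nifi` example from this page.
+
+[source,bash]
+----
+# Sketch: manually verify the mTLS setup from inside the Prometheus Pod.
+# The certificate files come from the volumeMount shown above; <namespace> is a placeholder.
+curl --cacert /stackable/tls/ca.crt \
+  --cert /stackable/tls/tls.crt \
+  --key /stackable/tls/tls.key \
+  "https://simple-nifi-node-default-metrics.<namespace>.svc.cluster.local:8443/nifi-api/flow/metrics/prometheus"
+----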