Prevent worst-case exponential complexity in dependency evaluation #10523

julianbrost · 2025-07-28T15:01:21Z

So far, calling Checkable::IsReachable() traversed all possible paths to its parents. If a parent was reachable via multiple paths, all of its parents were evaluated multiple times, resulting in worst-case exponential complexity.

With this PR, the implementation keeps track of which checkables were already visited and reuses the already-computed reachability instead of repeating the computation, ensuring a worst-case runtime that is linear in the graph size.
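The caching idea can be illustrated with a minimal sketch. This is not the code from this PR: the Graph struct, the ReachabilityEvaluator class and the Evaluations() counter are hypothetical names, checkables are modeled as plain indices, the rule "reachable if every parent is up and itself reachable" is a simplification of the real dependency semantics, and the graph is assumed to be acyclic.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical model: node i depends on every node listed in parents[i],
// and up[i] says whether node i itself is in an OK state.
struct Graph {
    std::vector<std::vector<std::size_t>> parents;
    std::vector<bool> up;
};

class ReachabilityEvaluator {
public:
    explicit ReachabilityEvaluator(const Graph& graph) : m_Graph(graph) {}

    // Each node is evaluated at most once; later calls hit the cache.
    bool IsReachable(std::size_t node) {
        assert(node < m_Graph.parents.size());

        auto it = m_Cache.find(node);
        if (it != m_Cache.end())
            return it->second;

        ++m_Evaluations;

        bool reachable = true;
        for (std::size_t parent : m_Graph.parents[node]) {
            if (!m_Graph.up[parent] || !IsReachable(parent)) {
                reachable = false;
                break;
            }
        }

        m_Cache.emplace(node, reachable);
        return reachable;
    }

    // Number of nodes that were actually evaluated (cache misses).
    std::size_t Evaluations() const { return m_Evaluations; }

private:
    const Graph& m_Graph;
    std::unordered_map<std::size_t, bool> m_Cache;
    std::size_t m_Evaluations = 0;
};
```

Because every node is inserted into the cache after its first evaluation, the work is bounded by the number of nodes plus the number of edges, regardless of how many distinct paths lead to a parent.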

To implement this, two additional commits prepare for that change:

Add T::ConstPtr as a typedef for intrusive_ptr<const T>, together with the changes needed to support it (changing some functions to accept a const Object* instead of an Object*, and marking the reference counter attribute as mutable). This was done because Checkable::IsReachable() is const, i.e. its this pointer is of type const Checkable*, so a Checkable::Ptr can't be constructed from it, only the new Checkable::ConstPtr.
Move the actual implementation of Checkable::IsReachable() and DependencyGroup::GetState() to a new helper class (the original methods are kept and now transparently call the ones in the helper class), so that they can easily access common storage (i.e. a cache needed for implementing this) without having to explicitly pass the cache through a public interface or having to resort to thread_local.
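Why the reference counter has to be mutable can be seen in a small self-contained sketch. It does not use Boost and is not the code from this PR: IntrusivePtr, AddRef(), Release(), RefCount() and Self() are hypothetical stand-ins for intrusive_ptr and its hook functions, kept only as detailed as needed to show the const-correctness issue.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <utility>

class Object {
public:
    Object() = default;
    Object(const Object&) = delete;
    Object& operator=(const Object&) = delete;
    virtual ~Object() = default;

    std::size_t RefCount() const { return m_References.load(); }

private:
    // mutable: holding a reference to a const object still changes the counter.
    mutable std::atomic<std::size_t> m_References{0};

    friend void AddRef(const Object* object);
    friend void Release(const Object* object);
};

// Both hooks take const Object*, so they work for T and for const T.
void AddRef(const Object* object) { ++object->m_References; }

void Release(const Object* object) {
    assert(object->m_References != 0);
    if (--object->m_References == 0)
        delete object; // deleting through a pointer to const is allowed
}

// Minimal stand-in for an intrusive smart pointer (hypothetical).
template<typename T>
class IntrusivePtr {
public:
    IntrusivePtr(T* ptr = nullptr) : m_Ptr(ptr) { if (m_Ptr) AddRef(m_Ptr); }
    IntrusivePtr(const IntrusivePtr& other) : m_Ptr(other.m_Ptr) { if (m_Ptr) AddRef(m_Ptr); }
    IntrusivePtr& operator=(IntrusivePtr other) { std::swap(m_Ptr, other.m_Ptr); return *this; }
    ~IntrusivePtr() { if (m_Ptr) Release(m_Ptr); }

    T* get() const { return m_Ptr; }
    T* operator->() const { return m_Ptr; }

private:
    T* m_Ptr;
};

class Checkable : public Object {
public:
    using Ptr = IntrusivePtr<Checkable>;
    using ConstPtr = IntrusivePtr<const Checkable>;

    // In a const method, this is a const Checkable*, so only ConstPtr
    // can be constructed from it; Ptr(this) would not compile here.
    ConstPtr Self() const { return ConstPtr(this); }
};
```

Without the mutable qualifier and the const Object* parameters, AddRef() could not be called on a const Checkable*, and ConstPtr could not exist.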

Test

I've prepared a config file that creates the following dependency graph, which is pretty much the worst case for the old implementation. Note that this diagram is reduced to 4 levels; the actual config generates 25 levels, which makes the issue much more obvious when running it.

graph TD;
    2-->0;
    2-->1;
    3-->0;
    3-->1;
    4-->2;
    4-->3;
    5-->2;
    5-->3;
    6-->4;
    6-->5;
    7-->4;
    7-->5;
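To get a feeling for the numbers, the following sketch counts node evaluations for this layered graph in an idealized model where no evaluation short-circuits. Levels are counted from 0 at the bottom, every node above level 0 depends on both nodes of the level below, and the function names are hypothetical. The naive count follows the recurrence T(0) = 1, T(n) = 1 + 2·T(n-1), i.e. T(n) = 2^(n+1) - 1.

```cpp
#include <cassert>
#include <cstdint>

// Evaluations when every path is followed again: the node itself plus a
// full traversal below each of its two parents.
constexpr std::uint64_t NaiveEvaluations(unsigned level) {
    return level == 0 ? 1 : 1 + 2 * NaiveEvaluations(level - 1);
}

// Evaluations when each node is evaluated once: the node itself plus the
// two nodes on each of the levels below it.
constexpr std::uint64_t MemoizedEvaluations(unsigned level) {
    return 1 + 2 * static_cast<std::uint64_t>(level);
}

// 4 levels as in the diagram: the top level is level 3.
static_assert(NaiveEvaluations(3) == 15, "2^4 - 1");
static_assert(MemoizedEvaluations(3) == 7, "1 + 2*3");

// 25 levels as in the config: the top level is level 24.
static_assert(NaiveEvaluations(24) == 33554431, "2^25 - 1");
static_assert(MemoizedEvaluations(24) == 49, "1 + 2*24");
```

Under these assumptions, a single reachability check of a top-level service in the 25-level config goes from roughly 33.5 million evaluations to 49.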

bobby-little-dependencies.conf

include "/etc/icinga2/icinga2.conf"

var h = "bobby-little-dependencies"
object Host h {
	check_command = "dummy"
}

for (var i in range(25)) { // change this number to scale the depth of the dependency graph
	for (var j in range(2)) {
		var s = 2*i + j 
		object Service s use (h) {
			host_name = h
			check_command = "dummy"
			check_interval = 10s
			retry_interval = 10s
			vars.dummy_text = {{
				log(macro("checking $host.name$!$service.name$"))
			}}
		}
		if (i > 0) {
			for (var k in range(2)) {
				var d = 2*(i-1) + k
				log(String(s) + "-->" + String(d) + ";")
				object Dependency d use (s, d, h) {
					parent_host_name = h
					parent_service_name = d
					child_host_name = h
					child_service_name = s
				}
			}
		}
	}
}

This can easily be started in a container:

docker run --rm -it -v $(pwd)/bobby-little-dependencies.conf:/icinga2.conf:ro icinga/icinga2:dev icinga2 daemon -c /icinga2.conf

Before

Heavy CPU usage (fluctuating somewhere between 200% and 800% on my machine), checks aren't executed every 10s as configured.

After

Almost no CPU usage (mostly idle, peaks of around 2% on my machine), regular check execution.

ref/IP/59145
ref/IP/59671

jschmidt-icinga

I've tested this with the given config snippet and verified that this indeed fixes the exponential CPU-time use by the master-branch code. It's still not perfect, I now get ~2% usage vs. ~0.5% with about the same amount of objects (dependencies, hosts and services) in a flat hierarchy, but maybe that can't be avoided anyway, I don't know...

julianbrost · 2025-07-29T13:48:20Z

> now get ~2% usage vs. ~0.5% with about the same amount of objects (dependencies, hosts and services) in a flat hierarchy

What exactly do you mean by flat? I'd expect the most comparable result from a long chain of dependencies (i.e. 0 -> 1 -> 2 -> 3 -> ...) with the same number of checkables (i.e. length being twice the depth of the graph in my example), though that will probably still be a bit cheaper as you only have half the Dependency objects which are iterated over when determining the reachability.

jschmidt-icinga · 2025-07-29T14:32:45Z

> What exactly do you mean by flat?

Same number of objects total, but each service depends on a single host (i.e. the one host).

> I'd expect the most comparable result from a long chain of dependencies (i.e. 0 -> 1 -> 2 -> 3 -> ...) with the same number of checkables

That's probably it.

yhabteab

Before:

Bildschirmaufnahme.2025-07-31.um.10.15.07.mov

After:

Bildschirmaufnahme.2025-07-31.um.10.23.12.mov

Checkable::IsReachable() and DependencyGroup::GetState() call each other recursively. Moving them to a common helper class allows adding caching to them in a later commit without having to pass a cache between the functions (through a public interface) or resorting to thread_local variables.
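The structure described in this commit message might look roughly like the following sketch. It is a simplified model, not the code from this PR: ReachabilityHelper is a hypothetical class name, the structs are stripped down to the bare minimum, and the rule "a group is OK if at least one parent is up and reachable" is an assumption made for illustration.

```cpp
#include <cassert>
#include <unordered_map>
#include <vector>

struct Checkable;

struct DependencyGroup {
    std::vector<const Checkable*> parents;
    bool GetState() const; // public interface, signature unchanged
};

struct Checkable {
    bool up = true;
    std::vector<const DependencyGroup*> groups;
    bool IsReachable() const; // public interface, signature unchanged
};

// Hypothetical helper: owns the state shared by the two mutually
// recursive functions, so no cache has to be passed between them.
class ReachabilityHelper {
public:
    bool IsReachable(const Checkable& checkable) {
        auto it = m_Cache.find(&checkable);
        if (it != m_Cache.end())
            return it->second;

        bool reachable = true;
        for (const DependencyGroup* group : checkable.groups) {
            assert(group != nullptr);
            if (!GetState(*group)) {
                reachable = false;
                break;
            }
        }

        m_Cache.emplace(&checkable, reachable);
        return reachable;
    }

    // Assumed rule: a group is OK if it has no parents or at least one
    // parent is up and reachable.
    bool GetState(const DependencyGroup& group) {
        for (const Checkable* parent : group.parents) {
            assert(parent != nullptr);
            if (parent->up && IsReachable(*parent))
                return true;
        }
        return group.parents.empty();
    }

private:
    std::unordered_map<const Checkable*, bool> m_Cache;
};

// The original methods just forward to a fresh helper per top-level call,
// so the cache never outlives a single evaluation.
bool Checkable::IsReachable() const { return ReachabilityHelper().IsReachable(*this); }
bool DependencyGroup::GetState() const { return ReachabilityHelper().GetState(*this); }
```

Creating the helper per top-level call keeps the cache private to one evaluation, which avoids both a widened public signature and thread_local storage.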

So far, calling Checkable::IsReachable() traversed all possible paths to its parents. If a parent was reachable via multiple paths, all of its parents were evaluated multiple times, resulting in worst-case exponential complexity. With this commit, the implementation keeps track of which checkables were already visited and reuses the already-computed reachability instead of repeating the computation, ensuring a worst-case runtime that is linear in the graph size.

julianbrost added the bug, core/quality, ref/IP and area/runtime labels Jul 28, 2025

cla-bot bot added the cla/signed label Jul 28, 2025

julianbrost requested review from jschmidt-icinga and yhabteab July 29, 2025 08:10

julianbrost assigned jschmidt-icinga and yhabteab Jul 29, 2025

yhabteab reviewed Jul 29, 2025

jschmidt-icinga reviewed Jul 29, 2025

Allow intrusive_ptr<const T> for objects

a49ec10

This allows using ref-counted pointers to const objects. Adds a second typedef so that T::ConstPtr can be used similar to how T::Ptr currently is.

julianbrost force-pushed the dependency-eval-complexity branch 2 times, most recently from 05c92c8 to 0f6185a July 30, 2025 15:20

jschmidt-icinga previously approved these changes Jul 31, 2025

julianbrost requested a review from yhabteab July 31, 2025 08:14

yhabteab reviewed Jul 31, 2025

julianbrost added 3 commits August 4, 2025 10:42

Document current dependency recursion limit

9601468

julianbrost dismissed jschmidt-icinga’s stale review via 9601468 August 4, 2025 08:43

julianbrost force-pushed the dependency-eval-complexity branch from 0f6185a to 9601468 August 4, 2025 08:43

julianbrost requested review from yhabteab and jschmidt-icinga August 4, 2025 08:44

jschmidt-icinga approved these changes Aug 5, 2025

yhabteab approved these changes Aug 5, 2025

yhabteab merged commit 1f92ec6 into master Aug 5, 2025
29 checks passed

yhabteab deleted the dependency-eval-complexity branch August 5, 2025 09:57
