Skip to content

As/multi app impl exec mgr#125

Merged
alsala merged 15 commits intodevfrom
as/multi_app_impl_exec_mgr
Mar 25, 2026
Merged

As/multi app impl exec mgr#125
alsala merged 15 commits intodevfrom
as/multi_app_impl_exec_mgr

Conversation

@alsala
Copy link
Copy Markdown
Contributor

@alsala alsala commented Mar 23, 2026

Update Manager and Executor to support multiple applications.

See https://horizenlabs.atlassian.net/browse/HZN-2803

@alsala alsala marked this pull request as ready for review March 24, 2026 10:48
@alsala alsala changed the base branch from dev to st/HZN-2801-multi-app March 24, 2026 10:51
@alsala alsala changed the base branch from st/HZN-2801-multi-app to dev March 24, 2026 10:51
@alsala alsala requested review from andreanistico and saratnt March 24, 2026 10:51
EXECUTOR_PORT='4000'
EXECUTOR_FUEL_PRICE_PER_UNIT='1'
EXECUTOR_MIN_FEE_PER_REQUEST='10'
EXECUTOR_MAX_CACHED_MODULES='0'
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This var should be added in docker-compose.yml

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

m.log.Warn("Manager: State root mismatch for app %d, expected %x, got %x. Checking if it is a REORG.", appID, localStateRoot, stateRoot)

isReorg, err := m.checkIfReorg(stateRoot)
isReorg, err := m.checkIfReorg(appID, stateRoot)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the reorg check is at app-level instead of chain-level? It should be enough to have a single timer. In addition, we would need a way to reset all the expired timers, otherwise an app could get another reorg with an old expired timer and wrongly assume that the timer is related to the same reorg and so triggering a dangerous rollback

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right, fixed. I also added a unit test for this case.

@@ -203,6 +211,10 @@ func (r *WasmtimeRuntime) getOrLoadModule(ctx context.Context, appId common.Appl
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is safer to take here directly a write lock, instead of a read one. In any case the write lock must be taken after RUnlock and if you're unlucky the module is evicted before regaining the write lock again.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I modified this code so now it should be ok

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is still the read lock, that is useless in this case

}
for r.accessOrder.Len() > r.maxCachedModules {
back := r.accessOrder.Back()
if back == nil {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible that back is nil? r.accessOrder.Len() must be at least 2

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done


r.log.Info("Runtime: Setting maxCachedModules from %d to %d", r.maxCachedModules, max)
r.maxCachedModules = max
r.evictIfNeeded()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is probably safer to avoid calling evictIfNeeded here and let just the "loadmodule" path to take care of cleaning modules. This way we avoid the risk of having 2 go routines that can modify the list of modules at the same time

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, done.

@alsala alsala requested a review from saratnt March 24, 2026 17:24
return &adminMessage{Type: keyAttestationRequest, Target: "executor"}
}

func buildGetWasmCacheSize() (*adminMessage, error) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick: I have noticed that all the build* functions return an error, even when it is not needed, like in this case. The only exception is buildKeyAttestation. If it could be important that all these functions have the same signature, then buildKeyAttestation should be updated, otherwise the error can be removed from function where it is not needed.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I opted for keeping the same signature


Changes the WASM module LRU cache limit at runtime. If the new limit is lower
than the current number of cached modules, excess modules are evicted
immediately (least recently used first). Evicted modules are transparently
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is no longer true, excess modules are evicted next time a module is loaded

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

if cc, ok := e.runtime.(moduleCacheController); ok {
cc.SetMaxCachedModules(req.MaxCachedModules)
} else {
e.log.Warn("Executor: set_wasm_cache_size ignored (runtime does not support module caching)")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case, shouldn't it return an error? If not, at least it should return 0 instead of req.MaxCachedModules, otherwise is misleading

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, the same also applied to the get cmd

@@ -203,6 +211,10 @@ func (r *WasmtimeRuntime) getOrLoadModule(ctx context.Context, appId common.Appl
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is still the read lock, that is useless in this case

@alsala alsala requested a review from saratnt March 25, 2026 10:01
@alsala alsala merged commit 54178d0 into dev Mar 25, 2026
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants