Collaborator

@hubertp hubertp commented Sep 5, 2025

Pull Request Description

Initial implementation that replaces cache invalidation logic and static dataflow analysis with Reactive caches.

A lot of code is commented out, as either it needs to be re-done or is simply obsolete in the new implementation.
More changes are still to come, but the basic functionality of loading and executing projects and executing visualizations works.
Only the necessary nodes are invalidated: static dataflow analysis is replaced with runtime analysis that collects dependencies between UUIDs.
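
As a rough illustration of the idea (not the code in this PR), runtime dependency collection can be pictured as a graph from each expression's UUID to the UUIDs of the expressions that read its value. The class and method names below (`DependencyGraph`, `recordRead`, `toInvalidate`) are hypothetical and chosen only for this sketch.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.UUID;

/** Hypothetical sketch; these are not the classes introduced by the PR. */
final class DependencyGraph {
  // For each expression id, the ids of the expressions that read its value.
  private final Map<UUID, Set<UUID>> dependents = new HashMap<>();

  /** Records, during execution, that `reader` consumed the value of `source`. */
  void recordRead(UUID reader, UUID source) {
    dependents.computeIfAbsent(source, k -> new HashSet<>()).add(reader);
  }

  /** Returns `changed` plus every id that transitively depends on it. */
  Set<UUID> toInvalidate(UUID changed) {
    Set<UUID> result = new HashSet<>();
    Deque<UUID> queue = new ArrayDeque<>();
    queue.add(changed);
    while (!queue.isEmpty()) {
      UUID id = queue.remove();
      // Visit each id once, so cyclic dependencies cannot loop forever.
      if (result.add(id)) {
        queue.addAll(dependents.getOrDefault(id, Set.of()));
      }
    }
    return result;
  }
}
```

With `b` reading `a` and `c` reading `b`, an edit to `b` would invalidate only `b` and `c`; the cached value of `a` stays untouched.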

Depends on #13907. Addresses #10525, renders #13219 obsolete.

Important Notes

Checklist

Please ensure that the following checklist has been satisfied before submitting the PR:

  • The documentation has been updated, if necessary.
  • Screenshots/screencasts have been attached, if there are any visual changes. For interactive or animated visual changes, a screencast is preferred.
  • All code follows the Scala, Java, TypeScript, and Rust style guides. In case you are using a language not listed above, follow the Rust style guide.
  • Unit tests have been written where possible.
  • If meaningful changes were made to logic or tests affecting Enso Cloud integration in the libraries,
    or the Snowflake database integration, a run of the Extra Tests has been scheduled.
    • If applicable, it is suggested to paste a link to a successful run of the Extra Tests.

@hubertp hubertp added the "CI: Clean build required" label (CI runners will be cleaned before and after this PR is built) Sep 8, 2025

The remaining problems stem from the fact that we are caching all
expressions. That conflicts with the current `enterables` logic for
entering functions from a local call stack.
The latter will need a more involved rewrite.
Member

@JaroslavTulach JaroslavTulach left a comment

Inception Comments

I am glad reactive caches are shaping up. My biggest wish is to separate the Observable & co. into their own project, with a clear API, documentation, and tests in isolation.

To track dependencies between nodes even when no external `UUID` is
available, we had to allow two types of Observables. Similarly,
external and internal `UUID`s are now easily distinguishable thanks to
the `RuntimeID` wrapper.
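
One way such a wrapper could look is sketched below. `TrackedId` is a hypothetical stand-in for `RuntimeID`, whose actual shape in this PR may well differ; the only point is that the same raw `UUID` wrapped as external and as internal yields two distinct, non-equal identifiers.

```java
import java.util.UUID;

/** Hypothetical stand-in for the PR's RuntimeID wrapper. */
interface TrackedId {
  UUID uuid();

  /** Identifier assigned from outside the runtime, e.g. by the IDE. */
  record External(UUID uuid) implements TrackedId {}

  /** Identifier generated by the runtime when no external one exists. */
  record Internal(UUID uuid) implements TrackedId {}

  /** True only for identifiers that came from outside the runtime. */
  static boolean isExternal(TrackedId id) {
    return id instanceof External;
  }
}
```

Because record equality also compares the record type, `new TrackedId.External(u)` and `new TrackedId.Internal(u)` never collide as map keys, even when they wrap the same `u`.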

Currently some tracking logic is inside nodes themselves. That's
obviously wrong - it should all be in the instrumentation. A follow-up
change will move the logic. The current approach is mostly for
achieving a minimal viable proof that the approach will work.

Mostly cosmetic changes due to the fact that we now invalidate less.

But two prominent changes are included:
- Invalidation logic when a Recompute request comes in
- Fixed tracking of method calls/closure execution

The latter uses the enter/exit logic of `RuntimeAnalysis` in
`InvokeCallableNode`, which is correct but undesirable. The information
should be gathered in instrumentation instead. Explicit calls are easier
to experiment with and will be transferred to instrumentation eventually.

Had to ensure that dependencies are tracked correctly across function
call boundaries. This is problematic because local calls are made with
their own RuntimeCaches; any invalidation has to be capable of crossing
that boundary.
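
The following sketch shows one possible way to attribute reads across a call boundary with explicit enter/exit calls. It is an assumption-laden illustration, not the `RuntimeAnalysis` code: `CallTracker`, `enter`, `read`, `exit` and `readsOf` are hypothetical names, and plain strings stand in for identifiers.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of enter/exit dependency tracking. */
final class CallTracker {
  // Ids of the expressions currently being evaluated, innermost first.
  private final Deque<String> active = new ArrayDeque<>();
  // For each id, everything it read while it was being evaluated.
  private final Map<String, Set<String>> reads = new HashMap<>();

  void enter(String id) {
    active.push(id);
  }

  /** Attributes a read of `source` to the innermost active expression. */
  void read(String source) {
    String current = active.peek();
    if (current != null) {
      reads.computeIfAbsent(current, k -> new LinkedHashSet<>()).add(source);
    }
  }

  /** Leaves `id` and propagates its reads to the enclosing expression. */
  void exit(String id) {
    String finished = active.pop();
    if (!finished.equals(id)) {
      throw new IllegalStateException("unbalanced exit: " + id);
    }
    String caller = active.peek();
    if (caller != null) {
      Set<String> callerReads = reads.computeIfAbsent(caller, k -> new LinkedHashSet<>());
      // The caller depends on the callee and on whatever the callee read,
      // so an invalidation inside the callee also reaches the call site.
      callerReads.add(finished);
      callerReads.addAll(reads.getOrDefault(finished, Set.of()));
    }
  }

  Set<String> readsOf(String id) {
    return reads.getOrDefault(id, Set.of());
  }
}
```

In this model, a read of `x` performed inside a callee body ends up among the reads of the call site as well, which is the property needed for invalidation to cross the boundary.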

Additionally, had to ensure that the self argument is properly
tracked/invalidated; otherwise changes within functions would not be
translated into Truffle nodes.

The logic is still experimental and elements of RuntimeAnalysis will
need to move to instrumentation, for performance reasons.

The exception was being propagated, unintentionally it seems, and
showing up in logs.

Spurious method/self arg invalidation led to large and unnecessary
re-compilations.

Unresolved constructors are being resolved in synthetic constructs.
Sadly, those constructs, when inside closures, share original IDs and
therefore create false dependencies between nodes.
As we still need the original IDs for instrumentation and expression
updates, introduced a flag that disables runtime tracking for
expressions.

Not yet complete, but already found a few corner cases not covered in
the new design.

Extra diagnostics were lost when computation failed at an early
CompletionStage.

Panics aren't cached but visualizations can still be run on them.
Had to add another method to Observable to make it happen.

Have to keep track of arguments applied to the visualization and map
them to synthetic assignment constructs in visualization code to get
the right identifiers.

If a new external identifier is added within an expression that is
already cached, the latter should be invalidated.
Fixes a number of broken visualization tests.

When a text edit is performed on code that is being called in the
visualization, a full re-evaluation is needed in order to generate
updated Truffle nodes.

Decided not to support that case, as it is very unlikely to be needed in
the foreseeable future. A text change will invalidate the necessary
observers, as one would expect, but it will not trigger re-evaluation.
Instead, one has to explicitly make a `ModifyVisualization` request.
This seems like a good compromise as evaluations of visualization
expressions are very costly (also due to locking).

Workarounds for stack overflow issues + IdMap updates causing full
program invalidations. This change breaks 1 test in
RuntimeVisualizationTests that will be addressed separately.

With runtime tracking we are able to precisely discover which cached
values need to be invalidated. Unfortunately, excessive use of external
UUIDs (and therefore caching) by the GUI revealed a flaw in that logic
with respect to local variable access.

Whenever a local variable is accessed, a previously written value is
read from a frame. But if the assignment has been cached, then the
corresponding slot in the frame is not written during subsequent
executions. This means that new expressions that use previously cached
local variables would report uninitialized value errors.

This change works around this by **never** caching assignments.
Assignments should be fast if the RHS is already cached, and the GUI
appears to always assign external UUIDs to the RHS. That way any read
from a frame's slot will succeed.
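
The effect of the workaround can be modelled with a toy interpreter, sketched below under the assumption that the cache outlives individual executions while each execution gets a fresh frame. `CachingInterpreter` and its methods are hypothetical and unrelated to the actual Truffle nodes.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical toy model: cached RHS values, uncached assignments. */
final class CachingInterpreter {
  // Survives across executions; keyed by the (external) id of an expression.
  private final Map<String, Object> cache = new HashMap<>();
  // Counts how often an RHS was really evaluated rather than served from cache.
  int rhsEvaluations = 0;

  /** Evaluates an expression through the cache. */
  Object cached(String id, Supplier<Object> compute) {
    if (cache.containsKey(id)) {
      return cache.get(id);
    }
    rhsEvaluations++;
    Object value = compute.get();
    cache.put(id, value);
    return value;
  }

  /** The assignment itself is never cached, so the slot is always written. */
  void assign(Object[] frame, int slot, String rhsId, Supplier<Object> rhs) {
    frame[slot] = cached(rhsId, rhs);
  }

  /** Reading an unwritten slot models the uninitialized value error. */
  static Object readLocal(Object[] frame, int slot) {
    Object value = frame[slot];
    if (value == null) {
      throw new IllegalStateException("uninitialized local in slot " + slot);
    }
    return value;
  }
}
```

Running the same assignment against two fresh frames evaluates the RHS only once, yet the second frame's slot is still populated, so a newly added expression that reads the local does not fail.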

Also reverted to the old tree building in Changeset as it appeared to be
sufficient for our needs.