Skip to content

[NoSQL] Ensure index elements with non-null values#4018

Open
snazy wants to merge 5 commits intoapache:mainfrom
snazy:nosql-idx-null-safe
Open

[NoSQL] Ensure index elements with non-null values#4018
snazy wants to merge 5 commits intoapache:mainfrom
snazy:nosql-idx-null-safe

Conversation

@snazy
Copy link
Member

@snazy snazy commented Mar 18, 2026

Within the current implementation, it is possible that the iterators of the Index API return elements with null values. This may cause NPEs.

This change enforces that index elements returned by the Index API iterators can never yield null values. The actual fix is in IndexSpi using the new iterator base class. The only place where null values could "escape" from the index API was in the .(reverse)iterator() functions in IndexSpi. The iterator implementation has been updated to skip elements with null values.

While the mentione change alone would have been sufficient, it seemed better to make the null-safety guarantees visible at the API level.

Chosing Map.Entry as the type for index elements seemed to be logical at the time of writing. As both Map.Entry.getKey() and Map.Entry.getValue() are nullable, it makes it difficult to "see" potential non-nullable violations in the code base. To address this issue, the Map.Entry interface has been replaced in the API with a new Index.Element interface with @Nonnull annotated functions. The implementation internal and package-private interface IndexElement was renamed to InternalIndexElement, now implementing Index.Element.

Background information: When an "embedded" serialized index becomes too big, it is spilled out to separately stored index stripes. The embedded index then tracks the changes on top of the spilled out index stripes. Such a change can be an update (a different non-null value) or a remove. In case of a removal, the value becomes null. Such removed elements must be visible within the index implementation, but never be exposed via the API.

@snazy snazy requested a review from dimas-b March 18, 2026 10:08
@github-project-automation github-project-automation bot moved this to PRs In Progress in Basic Kanban Board Mar 18, 2026
@snazy snazy marked this pull request as draft March 18, 2026 10:43
Within the current implementation, it is possible that the iterators of the `Index` API return elements with `null` values. This may cause NPEs.

This change enforces that index elements returned by the `Index` API iterators can never yield `null` values. The actual fix is in `IndexSpi` using the new iterator base class.
The only place where `null` values could "escape" from the index API was in the `.(reverse)iterator()` functions in `IndexSpi`. The iterator implementation has been updated to skip elements with `null` values.

While the mentione change alone would have been sufficient, it seemed better to make the null-safety guarantees visible at the API level.

Chosing `Map.Entry` as the type for index elements seemed to be logical at the time of writing. As both `Map.Entry.getKey()` and `Map.Entry.getValue()` are nullable, it makes it difficult to "see" potential non-nullable violations in the code base.
To address this issue, the `Map.Entry` interface has been replaced in the API with a new `Index.Element` interface with `@Nonnull` annotated functions.
The implementation internal and package-private interface `IndexElement` was renamed to `InternalIndexElement`, now implementing `Index.Element`.

Background information: When an "embedded" serialized index becomes too big, it is spilled out to separately stored index stripes. The embedded index then tracks the changes on top of the spilled out index stripes. Such a change can be an update (a different non-null value) or a remove. In case of a removal, the value becomes `null`. Such removed elements must be visible within the index implementation, but never be exposed via the API.
@snazy snazy force-pushed the nosql-idx-null-safe branch from 2da3604 to 8565b48 Compare March 18, 2026 10:47
@snazy snazy marked this pull request as ready for review March 18, 2026 10:47
return key().equals(other.key()) && Objects.equals(valueNullable(), other.valueNullable());
}
if (o instanceof Index.Element<?> other) {
return key().equals(other.key()) && Objects.equals(valueNullable(), other.value());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the comparison is reversed and other happens to be an IndexElem, then it will use value() from this AbstractIndexElement... so it looks like comparison is not commutative 🤔

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we actually need polymorphic comparison for these classes?

Why not consider different sub-types as "not equal" all the time?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comparison is commutative. Yes, the "intentionally broken contract" by InternalIndexElement makes it a bit difficult to reason about.

Internally, the indexes implementation works only with instances of InternalIndexElement, aka AbstractIndexElement. You cannot put() an *Element instance to the index.

With this, call sites can get instances of the internal representation, but as Index.Element and never with a null value.

So from a call-site the comparison is commutative.

var u = embedded.getElement(key);
if (u != null) {
return u.getValue() != null;
return u.valueNullable() != null;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: maybe add u.hasValue() and avoid nullable getters?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, I'd like to avoid any unnecessary calls with in this performance sensitive part.

snazy and others added 4 commits March 19, 2026 18:57
…laris/persistence/nosql/impl/indexes/InternalIndexElement.java

Co-authored-by: Alexandre Dutra <adutra@apache.org>
@snazy snazy requested review from adutra and dimas-b March 19, 2026 18:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants