[FLINK-38464] Introduce OrderedMultiSetState #27071
/**
 * Remove the given element. If there are multiple instances of the same element, remove the
 * first one in insertion order.
I am curious: should we allow the user to choose LIFO or FIFO order for the removal?
While reviewing the potential usages of this data structure (listed in the FLIP document) I couldn't find any that would require removal of the last element.
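To make the FIFO/LIFO question concrete, here is a minimal sketch over a plain in-memory list. It is not the Flink implementation; the class and method names (`FifoRemovalSketch`, `removeFirstOccurrence`, `removeLastOccurrence`) are hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: contrasts FIFO and LIFO removal of a duplicated element. */
public class FifoRemovalSketch {

    /** FIFO: removes the earliest-inserted occurrence, as the Javadoc under review describes. */
    static <T> boolean removeFirstOccurrence(List<T> elements, T element) {
        int idx = elements.indexOf(element);
        if (idx < 0) {
            return false; // element not present
        }
        elements.remove(idx);
        return true;
    }

    /** LIFO: the hypothetical alternative raised in the review (not proposed in the PR). */
    static <T> boolean removeLastOccurrence(List<T> elements, T element) {
        int idx = elements.lastIndexOf(element);
        if (idx < 0) {
            return false; // element not present
        }
        elements.remove(idx);
        return true;
    }

    public static void main(String[] args) {
        List<String> fifo = new ArrayList<>(List.of("a", "b", "a", "c"));
        removeFirstOccurrence(fifo, "a");
        System.out.println(fifo); // [b, a, c]

        List<String> lifo = new ArrayList<>(List.of("a", "b", "a", "c"));
        removeLastOccurrence(lifo, "a");
        System.out.println(lifo); // [a, b, c]
    }
}
```

The two variants differ only in which duplicate survives, which is why the choice matters only if some caller depends on the position of the remaining element.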
.../main/java/org/apache/flink/table/runtime/sequencedmultisetstate/SequencedMultiSetState.java
 * An element was removed, it was not the most recently added, there are more elements. The
 * result will not contain any elements
 */
REMOVED_OTHER
I am curious why nothing is returned in this case. This seems inconsistent with REMOVED_LAST_ADDED, which will return the element added before it.
There are no use-cases requiring the element removed "from the middle" (or from the beginning) of the data structure.
I think generalizing the contract here to something like "always return a row except for NOTHING_REMOVED" might actually make it more confusing, because the semantics differ between cases: return the removed element in case of ALL_REMOVED, return the new last element in case of REMOVED_LAST_ADDED.
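The per-case payload semantics described in this thread can be sketched as follows. This is a simplified in-memory model, not the code from the PR: the enum constant names come from the discussion, while the type names (`RemovalResultType`, `RemovalResult`), the `remove` helper, and the assumption of non-null elements are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

/** Illustrative model of the removal contract discussed above; assumes non-null elements. */
public class RemovalResultSketch {

    enum RemovalResultType { NOTHING_REMOVED, REMOVED_OTHER, REMOVED_LAST_ADDED, ALL_REMOVED }

    /** Hypothetical result holder: a type plus an optional payload. */
    static final class RemovalResult<T> {
        final RemovalResultType type;
        final Optional<T> payload;

        RemovalResult(RemovalResultType type, Optional<T> payload) {
            this.type = type;
            this.payload = payload;
        }
    }

    static <T> RemovalResult<T> remove(List<T> elements, T element) {
        int idx = elements.indexOf(element); // first occurrence in insertion order
        if (idx < 0) {
            return new RemovalResult<>(RemovalResultType.NOTHING_REMOVED, Optional.empty());
        }
        boolean wasLastAdded = idx == elements.size() - 1;
        T removed = elements.remove(idx);
        if (elements.isEmpty()) {
            // Nothing is left: the payload is the removed element itself.
            return new RemovalResult<>(RemovalResultType.ALL_REMOVED, Optional.of(removed));
        }
        if (wasLastAdded) {
            // The payload is the element that has now become the last one.
            T newLast = elements.get(elements.size() - 1);
            return new RemovalResult<>(RemovalResultType.REMOVED_LAST_ADDED, Optional.of(newLast));
        }
        // Removal from the beginning or the middle: no payload.
        return new RemovalResult<>(RemovalResultType.REMOVED_OTHER, Optional.empty());
    }

    public static void main(String[] args) {
        List<String> state = new ArrayList<>(List.of("a", "b", "c"));
        System.out.println(remove(state, "a").type); // REMOVED_OTHER
        System.out.println(remove(state, "c").type); // REMOVED_LAST_ADDED
        System.out.println(remove(state, "b").type); // ALL_REMOVED
        System.out.println(remove(state, "b").type); // NOTHING_REMOVED
    }
}
```

In this model the payload means something different for each result type, which is the reason given above for not generalizing the contract to "always return a row".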
SizeChangeInfo append(T element, long timestamp) throws Exception;

/** Get iterator over all remaining elements and their timestamps, in order of insertion. */
Iterator<Tuple2<T, Long>> iterator() throws Exception;
If the multiset changes (i.e. there is a removal) under the iterator, what will happen? A unit test for this would be good. It would be useful to understand any locking that has been considered or is in place.
This depends on the implementation:
- in case of LinkedMultiSetState, the iteration is over a copy of the state (so a change has no impact)
- in case of ValueStateMultiSetState, it uses ArrayList.iterator(), which is fail-fast
But since the client code should not make any assumptions about the implementation, I don't think this should be part of the contract.
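The two behaviors mentioned in this reply can be reproduced with plain java.util collections. The sketch below does not use the Flink state classes; `IteratorBehaviorSketch` and its methods are hypothetical names, and the "copy" case only approximates what is described for the linked variant.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

/** Illustrative only: fail-fast iteration over a live list vs. iteration over a copy. */
public class IteratorBehaviorSketch {

    /** Returns true if modifying the list during iteration makes the iterator fail. */
    static boolean failsFastOnLiveList() {
        List<String> state = new ArrayList<>(List.of("a", "b", "c"));
        Iterator<String> it = state.iterator();
        it.next();
        state.remove("b"); // structural modification behind the iterator's back
        try {
            it.next();
            return false;
        } catch (ConcurrentModificationException e) {
            return true; // ArrayList's iterator detected the modification
        }
    }

    /** Returns the elements seen when iterating over a copy taken before the removal. */
    static List<String> iterateOverCopy() {
        List<String> state = new ArrayList<>(List.of("a", "b", "c"));
        Iterator<String> it = new ArrayList<>(state).iterator(); // snapshot of the state
        List<String> seen = new ArrayList<>();
        seen.add(it.next());
        state.remove("b"); // does not affect the snapshot
        while (it.hasNext()) {
            seen.add(it.next());
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(failsFastOnLiveList()); // true
        System.out.println(iterateOverCopy());     // [a, b, c]
    }
}
```

Because the outcome differs between the two cases, a caller that modifies the state while iterating would be relying on implementation details rather than on the interface contract.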
(Partial review; I haven't yet reviewed the linked variant and tests.)
Thanks for the update. As discussed offline, I've shared some pointers to test coverage that may still be missing.
Thanks for the update. LGTM % green build & assuming comments from @davidradl are also resolved
Thanks a lot for the reviews!
Introduce OrderedMultiSetState and 3 implementations (map, value, adaptive) to be used in SinkUpsertMaterializerV2.
Test coverage is currently provided on the operator level (#27070).
I'm planning to add lower-level unit tests to this PR later.