Skip to content

Conversation

belliottsmith
Copy link
Contributor

Thanks for sending a pull request! Here are some tips if you're new here:

  • Ensure you have added or run the appropriate tests for your PR.
  • Be sure to keep the PR description updated to reflect all changes.
  • Write your PR title to summarize what this PR proposes.
  • If possible, provide a concise example to reproduce the issue for a faster review.
  • Read our contributor guidelines
  • If you're making a documentation change, see our guide to documentation contribution

Commit messages should follow the following format:

<One sentence description, usually Jira title or CHANGES.txt summary>

<Optional lengthier description (context on patch)>

patch by <Authors>; reviewed by <Reviewers> for CASSANDRA-#####

Co-authored-by: Name1 <email1>
Co-authored-by: Name2 <email2>

The Cassandra Jira

@iamaleksey iamaleksey self-requested a review September 11, 2025 10:11
Copy link
Member

@iamaleksey iamaleksey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, with a few nits addressed in a feedback commit.

@belliottsmith belliottsmith force-pushed the lazy-vtables branch 4 times, most recently from 8c2c424 to 8357790 Compare September 11, 2025 12:53
@clohfink
Copy link
Contributor

Maybe a test or two? Looks like you're testing the accord table but not verifying the lazy vtable interface at all.

O bound = translate.apply((I)column.type.compose(expression.getIndexValue()));
switch (expression.operator())
{
default: throw new InvalidRequestException("Operator " + expression.operator() + " not supported for txn_id");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should be generic not explicitly erroring for accord table

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks, will address that

@belliottsmith
Copy link
Contributor Author

Maybe a test or two? Looks like you're testing the accord table but not verifying the lazy vtable interface at all.

I do not see additional value in dedicated testing, when it is exercised by the accord virtual table tests. Those tables are arguably much more in need of expanded test coverage.

Either way, while it's not an unreasonable request, I also don't have time. So if you want this, I will have to leave the patch unmerged for the foreseeable future, and use it only internally.

@clohfink
Copy link
Contributor

So if you want this, I will have to leave the patch unmerged for the foreseeable future, and use it only internally.

Ok I will be testing it out a little as was (trying) to migrate the PartitionKeyStatsTable to use it instead of manually

@belliottsmith
Copy link
Contributor Author

Let me know if you need help figuring out porting. I hope it should be pretty intuitive, and the patch has diffs showing my migration of the Accord tables to it that hopefully provides some examples, but if there's something that's unclear let me know.

public <V> ColumnsCollector add(String name, V input, Function<V, Object> f)
{
ColumnMetadata cm = columnLookup.get(name);
if (!columnFilter.fetches(cm))
Copy link
Contributor

@smiklosovic smiklosovic Sep 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hi @belliottsmith , I am looking at this patch and you have code like this in vtables:

 .lazyAdd(columns -> {
                        columns.add("ballot", toStringOrNull(txn.ballot()))

I just want to understand it better (havent looked into this in IDE, just here), so ballot is name (argument of add method here, and input is toStringOrNull(txn.ballot()).

While I understand that when ballot is not in column filter, it will return this, that still means that V input is "materialized", basically toStringOrNull(txn.ballot()) is called regardless of whether name is filtered or not.

Is this correct?

When it is, then could not be the materialisation itself postponed? I am asking because I can imagine that if there is some non-trivial transformation of value to that column, it might take quite a while / take some resources while we do not ultimately need it.

V input might be Callable<V> input and then

Object value = f.apply(input.call())
if (value == null)
    return this;

and called like columns.add("ballot", () -> toStringOrNull(txn.ballot()))

I do not know what is worse, to have callable object instantiated every time to postpone it or to call that method even it is useless.

Maybe this can be parameterized and this decision would be left for a developer to decide.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @smiklosovic. So, there's two ways to be lazy here, and we already support what you suggest. There's two add methods in ColumnsCollector, one that accepts a simple value, and another that accepts an input value and a Function that transforms it to the output value. So the equivalent here would be calling columns.add("ballot", txn.ballot(), AccordDebugKeyspace::toStringOrNull). I simply didn't think it was worth the extra ceremony at this call-site.

The second laziness applies at the clustering key level. If the clustering can be excluded (because for instance we have enough results and the new value comes after the last result we have collected) then we don't even reach this point - the whole row is discarded without considering any columns.

We could also fairly easily expand this early filtering to apply RowFilter expressions to clusterings before we materialise columns.

Copy link
Contributor

@smiklosovic smiklosovic Sep 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the equivalent here would be calling columns.add("ballot", txn.ballot(), AccordDebugKeyspace::toStringOrNull)

Aha right, I see ... basically the "callable" in my comment is conceptually the same as the transformation function itself. Yeah ...

While it might be theoretically expensive to even call txn.ballot() (replace with any other arbitrary object / method which needs to be passed / called), I can see that the current code is just fine for the common usage. These columns just hold simple objects / strings / numbers etc. so passing them is basically for free.

I leave your suggestion in the last sentence to your judgement.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can handle your case easily by providing txn as the input, and mapping it with t -> toStringOrNull(t.ballot()). The important thing is that this structure does not allocate a capturing lambda.

I mostly noted the additional filtering as something we could do in future fairly easily. I’ve implemented all of the functionality I expect to need for the moment, though perhaps I will discover something missing when I roll it out.

@iamaleksey iamaleksey self-requested a review September 12, 2025 09:39
ifesdjeen and others added 3 commits September 18, 2025 15:49
To support recovering a node that has lost some of its local transaction log, introduce rebootstrap and unsafe bootstrap modes, where Accord ensures no responses are produced for transactions the node cannot be certain it had not previously answered.

patch by Benedict and Alex Petrov for CASSANDRA-20908
Also Improve:
 - Searchable system_accord_debug.txn

patch by Benedict; reviewed by Aleksey Yeschenko for CASSANDRA-20899
 - Integrate txn_blocked_by deadline and depth filter with execution logic, to ensure we terminate promptly and get a best effort reply
@ifesdjeen
Copy link
Contributor

ifesdjeen commented Sep 18, 2025

I did not yet have a chance to fully review the patch, but since this patch mostly ensures that Accord tables are lazy, I think the amount of testing here is appropriate. I would personally not add extensive tests to, say, test whether or not we are seeing all coordinations, etc, since this will likely just lead to tightly coupling observability with internals (which we unfortunately have in too many places already.

I thought we were talking about remote vtables with @clohfink in MN, so I briefly looked at the other patch, too. And from what I could see, the tables that are remote, we do have test coverage.

I think testing for laziness itself is probably also a bit excessive, but it also does not seem that it was requested here, so just mentioning this for completeness. We never verify laziness property in iterators, either. Nor do we verify eagerness of the existing vtables. I think this is just an implementation detail. Main thing they work as expected.

tl;dr it does seem that we have enough tests to verify that tables that are implemented as lazy, which are AccordDebugKeyspace tables. Moreover, these tables are hidden behind a feature flag that disables them by default (even two feature flags, one is accord, and second one accord debug tables themselves). Maybe I am missing something, so if you could elaborate what kind of test is expected, I could try to maybe address this.

That said, if we want to have exhaustive coverage for both lazy and distributed tables machinery, I think we should also do that separately: we could generate different random virtual tables, populate them with different test data, and make sure same thing comes out from both ways. Which, I think, should probably have been done when virtual tables first introduced. But I would definitely not enforce it on either of these patches. Could be a fun, separate, introductory ticket to Cassandra.

@belliottsmith
Copy link
Contributor Author

@clohfink If you consider the integration tests inadequate for a general purpose facility, I will make this class Accord-only for now so that it is not a concern. If you or @smiklosovic want to use this facility for other use cases, you can decide between yourselves what API tests you consider appropriate.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants