Skip to content

Conversation

@wa0x6e
Copy link
Contributor

@wa0x6e wa0x6e commented Aug 20, 2025

Fix https://discord.com/channels/955773041898573854/1030772407226601522/1407286799570702387

Situation

We're using a few fields in the ORDER BY leaderbord query, to ensure that all results are always returned in a specific order, in case of duplicates on one or more order by field.

This ensure consistent and unique results while paginating.

Issue

While those fields ensure consistency and uniqueness, it's not optimized for tables with huge amount of rows, since mysql is only able to use a single index on a table.

This creates queries that takes a long time to resolve for spaces with huge amount of rows, like stargate:

image

It's reading 1.2M of rows on each query, and using filesort to sort all the results, resulting in queries taking 5-6 seconds

Fix

This PR will only use the strict minimum necessary columns to ensure uniqueness in returned result:

  • when searching by space, order by the field specified by the user + user (only one user per space)
  • when searching by user, order by the field specified by the user + space (each space is unique per user)
  • when no filter, extra order by will default to last_vote

This results in only using 2 columns in ORDER BY maximum, instead of 4, and speed up the queries.

Test

query {
  leaderboards(where: { space: "stgdao.eth"}, orderBy: "votes_count", skip: 20) {
    user
    votesCount
    proposalsCount
  }
}

This query will resolve in less than 1s, instead of 6s.

@wa0x6e wa0x6e requested a review from Copilot August 20, 2025 18:03

This comment was marked as outdated.

@wa0x6e wa0x6e force-pushed the perf-improve-leadearboard-mysql-query branch from d2ad5cd to 00ba835 Compare August 20, 2025 18:18
@wa0x6e wa0x6e requested a review from Copilot August 20, 2025 18:18
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR optimizes MySQL leaderboard queries by reducing the number of columns in the ORDER BY clause from 4 to 2. The optimization addresses performance issues in spaces with large datasets where queries were taking 5-6 seconds due to MySQL using filesort on 1.2M rows.

Key changes:

  • Implements context-aware ordering based on filter type (space, user, or no filter)
  • Reduces ORDER BY columns to ensure uniqueness while minimizing query complexity
  • Maintains deterministic ordering for pagination consistency

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

@wa0x6e wa0x6e force-pushed the perf-improve-leadearboard-mysql-query branch from 00ba835 to c098357 Compare August 20, 2025 18:20
@wa0x6e wa0x6e requested review from ChaituVR and bonustrack August 20, 2025 18:21
@ChaituVR
Copy link
Member

We're using a few fields in the ORDER BY leaderbord query, to ensure that all results are always returned in a specific order, in case of duplicates on one or more order by field.

Why are there duplicates? 🤔 We have a unique space-user list right?

Comment on lines +37 to +44
const orderFields = [orderBy];
if (where.space) {
orderFields.push('user');
} else if (where.user) {
orderFields.push('space');
} else {
orderFields.push('last_vote');
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What could go wrong if we do something like this?

Suggested change
const orderFields = [orderBy];
if (where.space) {
orderFields.push('user');
} else if (where.user) {
orderFields.push('space');
} else {
orderFields.push('last_vote');
}
const orderFields = [orderBy];

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potential duplicates on pagination, and random ordering when 2 rows have same value

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you have any example for leaderboard table?

with this approach, if where.space exists and orderBy is already user, we push user again, which will send only one order 🤔

Copy link
Contributor Author

@wa0x6e wa0x6e Aug 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes

this case is for showing the leaderboard of a specific space, and ordering by user.

Since user is unique by space, this ordering does not need additional order fields to ensure uniqueness

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But it is same on the other way right, space is also unique by user 🤔 leaderboard contain unique index ["user","space"]

Copy link
Member

@ChaituVR ChaituVR Aug 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe we just need additional order field only when order is not space/user? or when where is different from orderBy

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe we just need additional order field only when order is not space/user? or when where is different from orderBy

In all cases, we need 2 order by fields, in case the first one has same values for multiple rows, to avoid doublons on pagination, and have a consistent ordering.

  • On space dashboard: where.spaces exist, order by is the field specified by the ui, like vote_count. If multiple users have the same number of votes, there may be duplicate when loading next pages. By using additional user in the order by, we ensure this ordering is always unique, since user is a unique value by space.
  • Same principle, but in opposite orders for user dashboard
  • When neither where.space nor where.user exists, mean that this request is not coming from our UI, and coming from a direct custom graphql queries from a third party on the hub. In that case, we just last_vote as fallback ordering, as it's the one to have least chance of duplicates.

In case that in space dashboard, the UI also sort by user, the uniq functions will remove the duplicate order by, so we don't have order by user, user.

What kind of other scenario are you thinking about, not covered by this PR ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants