Skip to content

Conversation

@ptamarit
Copy link
Member

@ptamarit ptamarit commented Dec 9, 2024

❤️ Thank you for your contribution!

Description

  • Fixes Improve users search results #150 (see details in description)
  • Commits
    1. The first commits makes the test assertions more precise
    2. The second commit uses CompositeSuggestQueryParser to improve search results

Checklist

Ticks in all boxes and 🟢 on all GitHub actions status checks are required to merge:

Frontend

Reminder

By using GitHub, you have already agreed to the GitHub’s Terms of Service including that:

  1. You license your contribution under the same terms as the current repository’s license.
  2. You agree that you have the right to license your contribution under the current repository’s license.

@ptamarit ptamarit changed the title 150 user search improve search: improve with CompositeSuggestQueryParser Dec 9, 2024
@ptamarit ptamarit force-pushed the 150-user-search-improve branch from ee9e9e7 to eb1fcfa Compare December 9, 2024 08:22
"name": "profile.full_name",
},
type="most_fields", # https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-multi-match-query.html#multi-match-types
fuzziness="AUTO", # https://www.elastic.co/guide/en/elasticsearch/reference/current/common-options.html#fuzziness
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fuzziness is applied in CompositeSuggestQueryParser.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then the most_fields can also be removed, see PR

tree_transformer_cls=SearchFieldTransformer,
fields=["username^2", "email^2", "profile.full_name^3", "profile.affiliations"],
fields=[
"username^2",
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️
Searching for a username with a dash is still not working well.
For instance, searching from "one-two" seems to search for usernames starting with "one" and starting with "two", and therefore does not find anything.

Copy link
Contributor

@kpsherva kpsherva Dec 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is the issue: text field is split by - https://github.com/inveniosoftware/invenio-users-resources/blob/master/invenio_users_resources/records/mappings/os-v2/users/user-v3.0.0.json#L132
if you search by username.keyword, it should work

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, thanks a lot @kpsherva !
2 months later 😅, I confirm that this fixes the issue.

I modified the tests to include a username with a dash, and I added a test showing that usernames with dashes can now be searched properly.

Copy link
Member

@zzacharo zzacharo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a long standing PR. Can we maybe have a summary of the improvements or maybe a small demo?

@ptamarit ptamarit force-pushed the 150-user-search-improve branch from eb1fcfa to 87d182e Compare February 21, 2025 14:07
@ptamarit
Copy link
Member Author

This seems like a long standing PR. Can we maybe have a summary of the improvements or maybe a small demo?

@zzacharo Sorry for the long wait. I finally found time to get back to it, and it's now ready for prime time.

What is described in the linked issue is a good summary of the problems:

Searching for a person with a common given name and a common family name includes every user with either one of the names and the ranking does not help.
Searching for users with a dash in their usernames behaves the same way.

Happy to give a demo, but the tests are meant to show what changed in this pull request:

  1. The test modification in the 2nd commit shows that: before, searching for 2 words was including all the users matching at least 1 of the 2 words. Now, searching for 2 words only includes users matching both words. This means that if a user has a common name, adding more words should help restrict the results and find the proper result.
    image
  2. The test modification in the 4th commit shows that: before, searching for pub-res would not return the user with the username pub-res. Now, it works as expected.
    image

Copy link
Contributor

@sakshamarora1 sakshamarora1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a PR with some changes to the parser: inveniosoftware/invenio-records-resources#608
We should test if this works with the new changes. The rest looks good!

@ptamarit
Copy link
Member Author

There is a PR with some changes to the parser: inveniosoftware/invenio-records-resources#608 We should test if this works with the new changes. The rest looks good!

Thanks for the heads up!
I'll re-run CI when a new version of invenio-records-resources is released (currently v7.0.1), and merge if it is still green.

@ptamarit ptamarit assigned ptamarit and unassigned ptamarit Mar 10, 2025
@carlinmack
Copy link
Contributor

new invenio-records-resources was released (7.0.2) so maybe time to re-run tests? I don't have permission to

@utnapischtim utnapischtim moved this to 👀 In review in vNext Jul 14, 2025
@slint slint force-pushed the 150-user-search-improve branch from 8b81e10 to d882598 Compare July 15, 2025 12:14
@utnapischtim utnapischtim marked this pull request as draft July 15, 2025 12:25
@utnapischtim utnapischtim moved this from 👀 In review to 🏗 In progress in vNext Jul 15, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: 🏗 In progress

Development

Successfully merging this pull request may close these issues.

Improve users search results

6 participants