[BUG] Unexpected lenient query behavior for query_string queries? #18886

@peteralfonsi

Describe the bug

While debugging #18883, I saw some confusing behavior from the lenient option for query_string queries.

Based on the documentation, lenient queries are meant to suppress exceptions from data type mismatches (for example, a string search on a numeric field). You can provide a boolean lenient param on the query itself, and if that's not there, it should fall back to the index setting index.query_string.lenient, which defaults to false. If lenient queries are on, RuntimeExceptions from query parsing get swallowed (line 799 of QueryStringQueryParser.getRegexpQuerySingle), and the query turns into a MatchNoDocsQuery with a description of the original error (link).
From the end user's perspective, none of this is visible: it looks like the query succeeded, just with no hits.
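
To make the swallowing concrete, here is a minimal, self-contained Java sketch of the pattern described above. It is not the OpenSearch source: `LenientSwallowSketch`, `ParsedQuery`, and `parse` are hypothetical stand-ins, and only the overall shape (catch the RuntimeException, return a match-no-docs placeholder carrying the error text when lenient) follows the description.

```java
import java.util.function.Supplier;

/** Hypothetical stand-in for illustration; not an OpenSearch class. */
final class LenientSwallowSketch {

    /** Minimal stand-in for a parsed query: a kind plus a human-readable description. */
    static final class ParsedQuery {
        final String kind;
        final String description;

        ParsedQuery(String kind, String description) {
            this.kind = kind;
            this.description = description;
        }

        @Override
        public String toString() {
            return kind + " [" + description + "]";
        }
    }

    /**
     * Runs the given query builder. In lenient mode a RuntimeException is
     * swallowed and replaced by a "match no docs" placeholder that carries the
     * original error message; otherwise the exception propagates to the caller.
     */
    static ParsedQuery parse(Supplier<ParsedQuery> builder, boolean lenient) {
        try {
            return builder.get();
        } catch (RuntimeException e) {
            if (lenient) {
                return new ParsedQuery("match_no_docs", "failed to parse: " + e.getMessage());
            }
            throw e;
        }
    }

    public static void main(String[] args) {
        Supplier<ParsedQuery> failing = () -> {
            throw new IllegalArgumentException("regexp too complex");
        };
        // Lenient: the caller gets an ordinary query object that matches nothing.
        System.out.println(parse(failing, true));
        // Strict: the same failure surfaces as an exception.
        try {
            parse(failing, false);
        } catch (IllegalArgumentException e) {
            System.out.println("error surfaced: " + e.getMessage());
        }
    }
}
```

In the lenient case the error text survives only inside the placeholder's description, which is why a search response built from it looks like a successful query with zero hits.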

Reading this, I would expect that lenient behavior doesn't activate unless the user has explicitly enabled it in some way, either via the setting or the param. But this isn't what happens in QueryStringQueryBuilder.doToQuery().

First we have this.lenient, which is set from the query builder and is null by default. Then we calculate isLenient, which equals lenient if it is non-null, and otherwise context.queryStringLenient(), which pulls from the index setting.

Then there are 3 cases, depending on whether there is a default field, etc. But all of them check whether the fields are an all-fields wildcard in some way, for example QueryParserHelper.hasAllFieldsWildcard(defaultFields). If so, we default to lenient behavior when this.lenient is null; if not, we use the isLenient variable from before.

It seems that for a simple query string that starts with a field name followed by :, defaultFields is just ["*"], so the hasAllFieldsWildcard check is true, and we get a lenient query parser even though we never opted into lenient queries. The response then hides any failures, including things like TooComplexToDeterminizeException.
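
The branching described above can be modeled in a few lines. This is a rough sketch under stated assumptions, not the actual doToQuery() code: `LenientResolutionSketch` and `effectiveLenient` are hypothetical names, the three cases are collapsed into a single wildcard check, and `hasAllFieldsWildcard` here is a simplified stand-in that only looks for a literal "*".

```java
import java.util.List;

/** Hypothetical model of the branching described above; not the OpenSearch source. */
final class LenientResolutionSketch {

    /** Simplified stand-in for the all-fields wildcard check: true if any field is "*". */
    static boolean hasAllFieldsWildcard(List<String> fields) {
        return fields.contains("*");
    }

    /**
     * @param lenientParam        the query's own lenient value, or null if the user did not set it
     * @param indexSettingLenient value of index.query_string.lenient (false by default)
     * @param fields              fields the parser targets (the default fields when none are given)
     */
    static boolean effectiveLenient(Boolean lenientParam, boolean indexSettingLenient, List<String> fields) {
        // isLenient: the explicit param if present, else the index setting.
        boolean isLenient = lenientParam != null ? lenientParam : indexSettingLenient;
        if (hasAllFieldsWildcard(fields)) {
            // Wildcard branch: an unset param becomes true, bypassing the index setting.
            return lenientParam != null ? lenientParam : true;
        }
        return isLenient;
    }

    public static void main(String[] args) {
        // No opt-in anywhere, default fields are ["*"]: lenient anyway.
        System.out.println(effectiveLenient(null, false, List.of("*")));
        // No opt-in, concrete default field: falls back to the index setting.
        System.out.println(effectiveLenient(null, false, List.of("f1")));
        // Explicit lenient=false on the query is honored even with the wildcard.
        System.out.println(effectiveLenient(false, false, List.of("*")));
    }
}
```

If this model matches the real code, setting lenient to false explicitly on the query would be one way to get strict parsing back in the wildcard case.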

To me this seems like unintended behavior. I don't think the query should silently transform into MatchNoDocsQuery unless the user has explicitly opted into it. But I wanted to get feedback from others to see if I'm missing something here.

Related component

Search

To Reproduce

On a local cluster with the IntelliJ debugger attached, run a regex query_string query:

curl -XPUT "localhost:9200/text_regex" -H 'Content-Type: application/json' -d'
{
  "mappings": {
    "properties": {
      "f1": {
        "type": "keyword"
      }
    }
  }
}'
curl -XPUT "localhost:9200/text_regex/_doc/1" -H 'Content-Type: application/json' -d'
{
  "f1": "value"
}'
curl -XGET "localhost:9200/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "query": {
    "query_string": {
      "query": "f1:value"
    }
  }
}'

Put a breakpoint at line 906 of QueryStringQueryBuilder.java and see that we get a QueryStringQueryParser with lenient behavior.

Expected behavior

Lenient behavior should probably default to false as described in the docs.
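
As a sketch of what that would mean in practice, the expected precedence is simply "query param, else index setting", with no special case for wildcard fields. The names `ExpectedLenientSketch` and `expectedLenient` are hypothetical and this is not a proposed patch, only an illustration of the rule.

```java
/** Hypothetical sketch of the expected precedence; not the OpenSearch source. */
final class ExpectedLenientSketch {

    /**
     * @param lenientParam        the query's lenient value, or null when unset
     * @param indexSettingLenient value of index.query_string.lenient
     */
    static boolean expectedLenient(Boolean lenientParam, boolean indexSettingLenient) {
        // The explicit param wins; otherwise defer to the index setting (false by default).
        return lenientParam != null ? lenientParam : indexSettingLenient;
    }

    public static void main(String[] args) {
        System.out.println(expectedLenient(null, false)); // nothing opted in
        System.out.println(expectedLenient(null, true));  // index setting opts in
        System.out.println(expectedLenient(true, false)); // query param opts in
    }
}
```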
