-
Notifications
You must be signed in to change notification settings - Fork 2.4k
Description
Describe the bug
While debugging for #18883 I saw some confusing behavior for the lenient
option for query_string
queries.
Based on the documentation, lenient queries are meant to suppress exceptions from data type mismatches (for example string search on a numeric field). You can provide a boolean lenient
param on the query itself, and if that's not there, it should fall back to the index setting index.query_string.lenient
which defaults to false. If lenient queries are on, RuntimeExceptions from query parsing get swallowed (line 799 of QueryStringQueryParser.getRegexpQuerySingle), and the query turns into a MatchNoDocsQuery with a description of the original error (link).
From the end user perspective this is not clear at all and it looks like the query succeeded fine just with no hits.
Reading this, I would expect lenient behavior shouldn't activate unless the user has explicitly enabled it in some way, either via setting or param. But this isn't what happens in QueryStringQueryBuilder.doToQuery()
.
First we have this.lenient
which is set from the query builder. By default this will be null. Then we calculate isLenient
which equals lenient
, or context.queryStringLenient()
which pulls from the index setting if lenient==null
.
Then there are 3 cases depending if there is a default field, etc. But all of them check if all the fields are all wildcard in some way, for example QueryParserHelper.hasAllFieldsWildcard(defaultFields)
, and if so, we default to using lenient behavior if this.lenient
is null. If not, we use the isLenient
variable from before.
It seems on a simple query string where the string starts with a field name and then :
, defaultFields
is just ["*"]
, so the hasAllFieldsWildcard
check is true, and we get a lenient query parser even though we never opted into lenient queries. The response then hides any failures, including stuff like TooComplexToDeterminizeException
.
To me this seems like unintended behavior. I don't think the query should silently transform into MatchNoDocsQuery unless the user has explicitly opted into it. But I wanted to get feedback from others to see if I'm missing something here.
Related component
Search
To Reproduce
On a local cluster with IntelliJ debugger, run a regex query_string query:
curl -XPUT "localhost:9200/text_regex" -H 'Content-Type: application/json' -d'
{
"mappings": {
"properties": {
"f1": {
"type": "keyword"
}
}
}
}'
curl -XPUT "localhost:9200/text_regex/_doc/1" -H 'Content-Type: application/json' -d'
{
"f1": "value"
}'
curl -XGET "localhost:9200/_search?pretty" -H 'Content-Type: application/json' -d'
{
"query": {
"query_string": {
"query": "f1:value"
}
}
}'
Put a breakpoint at line 906 of QueryStringQueryBuilder.java and see we get a QueryStringQueryParser with lenient behavior.
Expected behavior
Lenient behavior should probably default to false as described in the docs.
Additional Details
Metadata
Metadata
Assignees
Labels
Type
Projects
Status