Commit b774eba

Adds support to exclude fields from _source
Why are these changes being introduced:

* We are starting to add very large fields into our OpenSearch _source, which we are concerned may cause performance issues

Relevant ticket(s):

* https://mitlibraries.atlassian.net/browse/USE-406

How does this address that need:

* Introduces an ENV variable to specify fields to exclude from _source

1 parent 546176c commit b774eba

3 files changed: 30 additions, 0 deletions

README.md

Lines changed: 2 additions & 0 deletions
@@ -181,6 +181,8 @@ locally.
 confused.
 ```
 
+- `OPENSEARCH_SOURCE_EXCLUDES` comma separated list of fields to exclude from the OpenSearch `_source` field. Leave unset to return all fields.
+  - recommended value: `embedding_full_record,fulltext`
 - `PLATFORM_NAME`: The value set is added to the header after the MIT Libraries logo. The logic and CSS for this comes from our theme gem.
 - `PREFERRED_DOMAIN` - set this to the domain you would like to to use. Any
   other requests that come to the app will redirect to the root of this domain.
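
To make the README entry concrete, here is a minimal sketch of the `_source` clause that the recommended value would produce in a query body. The `source_clause` helper and the `match_all` query are illustrative assumptions and do not come from this repository; only the `_source: { excludes: [...] }` shape mirrors what the commit sends to OpenSearch.

```ruby
require 'json'

# Hypothetical helper (not part of the commit): turns a comma separated
# string into the `_source` clause of an OpenSearch query body.
def source_clause(raw)
  { excludes: raw.split(',').map(&:strip) }
end

# Assumed, simplified query body; the real application builds a richer one.
query = {
  query: { match_all: {} },
  _source: source_clause('embedding_full_record,fulltext')
}

puts JSON.pretty_generate(query)
```

With this clause in place, the excluded fields remain indexed and searchable but are omitted from each hit's `_source` in the response.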

app/models/opensearch.rb

Lines changed: 10 additions & 0 deletions
@@ -41,6 +41,16 @@ def build_query(from)
       sort:
     }
 
+    # If ENV OPENSEARCH_SOURCE_EXCLUDES is set, use the values in its comma-separated list;
+    # otherwise leave out the _source attribute entirely (which will return all fields in _source)
+    # excludes are used to prevent large fields from being returned in the search results, which can cause performance issues
+    # these fields are still searchable, just not returned in the search results
+    if ENV['OPENSEARCH_SOURCE_EXCLUDES'].present?
+      query_hash[:_source] = {
+        excludes: ENV['OPENSEARCH_SOURCE_EXCLUDES'].split(',').map(&:strip)
+      }
+    end
+
     query_hash[:highlight] = highlight if @highlight
     query_hash.to_json
   end
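
The parsing step above relies on `present?` from ActiveSupport and keeps whatever `split(',')` returns, so a value such as `a,,b` would pass an empty string through as a field name. The following is a hedged, plain-Ruby sketch of a slightly stricter variant. The helper names `source_excludes` and `apply_source_excludes` are hypothetical and not part of the commit; this is one possible hardening, not the author's implementation.

```ruby
# Hypothetical helper: parse a comma separated list of field names.
# Returns nil when nothing usable is configured, so callers can skip
# the _source clause entirely and let OpenSearch return all fields.
def source_excludes(raw)
  return nil if raw.nil?

  fields = raw.split(',').map(&:strip).reject(&:empty?)
  fields.empty? ? nil : fields
end

# Hypothetical helper: add the _source clause only when fields are configured.
def apply_source_excludes(query_hash, raw = ENV['OPENSEARCH_SOURCE_EXCLUDES'])
  fields = source_excludes(raw)
  query_hash[:_source] = { excludes: fields } if fields
  query_hash
end

p source_excludes(' a , ,b,')
p apply_source_excludes({ size: 20 }, 'fulltext')
```

Returning `nil` rather than an empty array keeps the "unset means return everything" behavior described in the commit's comment.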

test/models/opensearch_test.rb

Lines changed: 18 additions & 0 deletions
@@ -387,4 +387,22 @@ class OpensearchTest < ActiveSupport::TestCase
     json = JSON.parse(os.build_query(0))
     assert_equal Opensearch::MAX_SIZE, json['size']
   end
+
+  test 'can exclude fields from _source' do
+    ClimateControl.modify(OPENSEARCH_SOURCE_EXCLUDES: 'field1,field2') do
+      os = Opensearch.new
+      os.instance_variable_set(:@params, {})
+      json = JSON.parse(os.build_query(0))
+      assert_equal %w[field1 field2], json['_source']['excludes']
+    end
+  end
+
+  test 'does not include _source if OPENSEARCH_SOURCE_EXCLUDES is not set' do
+    ClimateControl.modify(OPENSEARCH_SOURCE_EXCLUDES: nil) do
+      os = Opensearch.new
+      os.instance_variable_set(:@params, {})
+      json = JSON.parse(os.build_query(0))
+      refute json.key?('_source')
+    end
+  end
 end
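
Both tests depend on the environment variable being changed only for the duration of the block. As a rough illustration of that pattern, and not the ClimateControl gem's actual implementation, a plain-Ruby sketch might look like the following. The `with_env` helper and the `EXAMPLE_SOURCE_EXCLUDES` variable name are hypothetical.

```ruby
# Hypothetical helper: temporarily override environment variables and
# restore the previous values afterwards, even if the block raises.
def with_env(overrides)
  overrides = overrides.transform_keys(&:to_s)
  saved = overrides.keys.to_h { |key| [key, ENV[key]] }
  # Assigning nil to an ENV key deletes the variable.
  overrides.each { |key, value| ENV[key] = value }
  yield
ensure
  saved&.each { |key, value| ENV[key] = value }
end

with_env('EXAMPLE_SOURCE_EXCLUDES' => 'field1,field2') do
  p ENV['EXAMPLE_SOURCE_EXCLUDES'].split(',')
end
```

Restoring values in an `ensure` clause keeps one test's configuration from leaking into the next.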
