feat: Optional taxonomy directory, Umbraco compat, and critical bug fixes #425
Open
Shazwazza wants to merge 14 commits into release/4.0
- Add UseTaxonomyIndex property to LuceneIndexOptions (default: true for backwards compatibility)
- Update IDirectoryFactory.CreateTaxonomyDirectory to return nullable Directory
- Update DirectoryFactoryBase and FileSystemDirectoryFactory to check UseTaxonomyIndex option
- Update LuceneIndex to handle null TaxonomyWriter when taxonomy is disabled
- Add IsTaxonomyEnabled property to LuceneIndex for runtime checks
- Update IndexCommitter to handle null TaxonomyWriter
- SyncedFileSystemDirectoryFactory still requires taxonomy (throws if disabled)
- Update PublicAPI.Unshipped.txt with new API surface

BREAKING CHANGE: IDirectoryFactory.CreateTaxonomyDirectory now returns Directory? instead of Directory
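A minimal sketch of how a nullable taxonomy directory might flow from the options through a factory into an index, assuming stand-in types rather than Examine's real classes. Only the member names UseTaxonomyIndex, CreateTaxonomyDirectory, and IsTaxonomyEnabled come from the commit message; FakeDirectory, IndexOptionsSketch, DirectoryFactorySketch, and IndexSketch are hypothetical.

```csharp
#nullable enable
using System;

// Hypothetical stand-in for Lucene's Directory type.
public sealed class FakeDirectory
{
    public FakeDirectory(string name) => Name = name;
    public string Name { get; }
}

// Hypothetical stand-in for LuceneIndexOptions; the default of true
// mirrors the backwards-compatible default described in the commit.
public sealed class IndexOptionsSketch
{
    public bool UseTaxonomyIndex { get; set; } = true;
}

// Hypothetical factory: returns null instead of a directory when
// taxonomy is disabled.
public sealed class DirectoryFactorySketch
{
    public FakeDirectory CreateDirectory(string indexName)
        => new FakeDirectory(indexName);

    public FakeDirectory? CreateTaxonomyDirectory(string indexName, IndexOptionsSketch options)
        => options.UseTaxonomyIndex ? new FakeDirectory(indexName + "/taxonomy") : null;
}

// Hypothetical index: every taxonomy code path checks for null.
public sealed class IndexSketch
{
    private readonly FakeDirectory _directory;
    private readonly FakeDirectory? _taxonomyDirectory;

    public IndexSketch(string name, IndexOptionsSketch options, DirectoryFactorySketch factory)
    {
        _directory = factory.CreateDirectory(name);
        _taxonomyDirectory = factory.CreateTaxonomyDirectory(name, options);
    }

    // Runtime check that callers can use before touching taxonomy features.
    public bool IsTaxonomyEnabled => _taxonomyDirectory is not null;

    public string Describe()
        => _taxonomyDirectory is null
            ? _directory.Name
            : _directory.Name + " + " + _taxonomyDirectory.Name;
}
```

With default options the sketch reports taxonomy as enabled; setting UseTaxonomyIndex to false makes the factory return null and IsTaxonomyEnabled return false, which is the shape of change a custom factory would have to accommodate.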
- Document new UseTaxonomyIndex and IsTaxonomyEnabled properties
- Mark CreateTaxonomyDirectory methods as returning nullable Directory?
- Note that SyncedFileSystemDirectoryFactory requires taxonomy enabled
…y searcher
- Add LuceneNonTaxonomySearcher class to handle searches when taxonomy is disabled
- Update LuceneIndex.CreateSearcher() to return the appropriate searcher based on the UseTaxonomyIndex option
- Add _nrtReopenThreadNoTaxonomy for NRT support without taxonomy
- Update WaitForChanges() to use the correct NRT thread
- Update Dispose() to clean up the non-taxonomy NRT thread
- Add 4 new tests for SyncedFileSystemDirectoryFactory without taxonomy:
  - Given_NoTaxonomyDirectory_When_CreatingDirectory_Then_IndexCreatedSuccessfully
  - Given_NoTaxonomyDirectory_When_IndexingData_Then_SearchSucceeds
  - Given_CorruptMainIndex_And_HealthyLocalIndex_NoTaxonomy_When_CreatingDirectory_Then_LocalIndexSyncedToMain
  - Given_CorruptMainIndex_And_CorruptLocalIndex_NoTaxonomy_When_CreatingDirectory_Then_NewIndexesCreatedAndUsable
When taxonomy is disabled, the non-taxonomy overload of FacetsConfig.Build(doc) must still be called to process SortedSetDocValuesFacetField entries into proper SortedSetDocValuesField entries. Without this, documents containing facet fields throw ArgumentException during indexing, causing silent failures where no items are indexed and IndexCommitted events never fire.

This restores the behavior from v4.0.0-beta.1 where FacetsConfig.Build(doc) was always called regardless of taxonomy configuration.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
When a SearchContext is created during application startup before any documents have been indexed, SearchableFields reads from the empty index reader and caches an empty array. After documents are indexed and the NRT reader is refreshed, IsSearcherCurrent() returns true (the refreshed reader IS current), so the SearchContext is reused with its stale empty _searchableFields cache. This causes ManagedQuery/Search(string) to generate queries with no fields, returning zero results even though documents exist in the index.

Fix: only cache SearchableFields when the result is non-empty. An empty index has nothing to search anyway, and re-reading on each call has negligible cost. Once documents are indexed and fields exist, the non-empty result is cached normally.

Applied to both SearchContext (non-taxonomy) and TaxonomySearchContext.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
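The caching rule in this commit can be sketched in isolation. This is a simplified model, not the code in SearchContext: the class SearchContextSketch and the delegate that stands in for reading field names from the index reader are hypothetical, and only the names SearchableFields and _searchableFields come from the commit message.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a search context that caches field names.
public sealed class SearchContextSketch
{
    // Stand-in for reading field names from the current index reader.
    private readonly Func<IReadOnlyList<string>> _readFieldsFromIndex;

    // Null means "nothing cached yet".
    private string[]? _searchableFields;

    public SearchContextSketch(Func<IReadOnlyList<string>> readFieldsFromIndex)
        => _readFieldsFromIndex = readFieldsFromIndex;

    public string[] SearchableFields
    {
        get
        {
            if (_searchableFields is not null)
            {
                return _searchableFields;
            }

            var fields = new List<string>(_readFieldsFromIndex()).ToArray();

            // Only cache a non-empty result. An empty result is returned
            // as-is and re-read on the next call, so a context created
            // against an empty index cannot get stuck with no fields.
            if (fields.Length > 0)
            {
                _searchableFields = fields;
            }

            return fields;
        }
    }
}
```

In this model, a context created before indexing returns an empty array on the first read, picks up the fields once they exist, and from then on serves the cached array without consulting the reader again.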
Move WaitForChanges() into CommitNow() before the Committed event fires, and remove the redundant call from TimerRelease().

Previously, in the async commit path (timer-based), Committed fired before WaitForChanges() completed, creating a race condition where consumers reacting to the Committed/IndexCommitted event could search with a stale NRT reader that hadn't yet been refreshed to include the just-committed changes.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
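The ordering this commit establishes can be illustrated with a small single-threaded model. It is a sketch under stated assumptions, not the IndexCommitter implementation: CommitterSketch and its generation counters are hypothetical, and the private WaitForChanges() here merely simulates a reader refresh rather than blocking on an NRT reopen thread.

```csharp
#nullable enable
using System;

// Hypothetical model of a committer that refreshes the reader
// before notifying subscribers.
public sealed class CommitterSketch
{
    private int _committedGeneration; // changes written to the index
    private int _visibleGeneration;   // changes visible to searchers

    public event EventHandler? Committed;

    // What a search issued right now would be able to see.
    public int VisibleGeneration => _visibleGeneration;

    // Stand-in for WaitForChanges(): afterwards the reader reflects
    // everything that has been committed.
    private void WaitForChanges() => _visibleGeneration = _committedGeneration;

    public void CommitNow()
    {
        _committedGeneration++;

        // Refresh first, then raise the event, so handlers that search
        // inside Committed never observe a stale reader.
        WaitForChanges();
        Committed?.Invoke(this, EventArgs.Empty);
    }
}
```

A handler attached to Committed that reads VisibleGeneration sees the generation that was just committed; swapping the last two statements of CommitNow() would reproduce the stale read described above.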
CommitNow() now calls WaitForChanges() internally, so the explicit calls after CommitNow() in the !RunAsync paths of PerformIndexItemsInternal and PerformDeleteFromIndexInternal are redundant no-ops. Remove them and update comments for clarity.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Summary
This PR makes the taxonomy directory optional in Examine v4, ensures backward compatibility with Umbraco CMS and Umbraco.Cms.Search, and fixes three critical bugs discovered during compatibility testing.
Features
Optional Taxonomy Directory
Makes the taxonomy index an opt-in/opt-out feature, enabling lighter-weight index configurations when faceted taxonomy search is not needed.
- Add UseTaxonomyIndex property to LuceneIndexOptions (default: true for backward compat)
- Add IsTaxonomyEnabled property to LuceneIndex for runtime checks
- ITaxonomyDirectoryFactory interface separated from IDirectoryFactory for cleaner abstraction
- IDirectoryFactory.CreateTaxonomyDirectory now returns Directory? instead of Directory
- DirectoryFactoryBase marked as [Obsolete]; it exists only for compatibility and adds no value
- FileSystemDirectoryFactory implements ITaxonomyDirectoryFactory with type-check at usage sites
- SyncedFileSystemDirectoryFactory updated to handle optional taxonomy dir
- Add LuceneNonTaxonomySearcher for efficient searching when taxonomy is disabled

Umbraco API Compatibility
Bug Fixes
1. FacetsConfig.Build not called for non-taxonomy indexes
Commit: 384550ee
When taxonomy was disabled, FacetsConfig.Build(doc) was not being called. This is required even for non-taxonomy indexes to process SortedSetDocValuesFacetField entries into proper SortedSetDocValuesField entries. Without it, documents with facet fields threw ArgumentException during indexing, causing silent failures where no items were indexed and IndexCommitted events never fired.

2. SearchableFields caching empty results from initially empty indexes (fixes #426)
Commit: 7e672a87
When a SearchContext was created during application startup before any documents were indexed, SearchableFields read from the empty index reader and cached an empty array. After documents were indexed and the NRT reader was refreshed, IsSearcherCurrent() returned true (the refreshed reader IS current), so the SearchContext was reused with its stale empty _searchableFields cache. This caused ManagedQuery/Search(string) to generate queries with no fields, returning zero results even though documents existed in the index.
Fix: Only cache SearchableFields when the result is non-empty. Applied to both SearchContext and TaxonomySearchContext.

3. NRT reader not refreshed before Committed event fires (fixes #427)
Commits: d606a41d, 6f040af3
In the async commit path (timer-based), the Committed event fired before WaitForChanges() completed, creating a race condition where consumers reacting to the Committed/IndexCommitted event could search with a stale NRT reader. Consumers would get zero or incomplete results even though the commit had completed.
Fix: Move WaitForChanges() into CommitNow() before the Committed event fires. Also removed the now-redundant WaitForChanges() calls in the synchronous !RunAsync paths.

Breaking Changes
- IDirectoryFactory.CreateTaxonomyDirectory returns Directory?. IDirectoryFactory implementations can return null to disable taxonomy, or keep returning a directory.

Tests
- Tests covering SyncedFileSystemDirectoryFactory without taxonomy

Files Changed (22 files, +1741/-372)
Core Changes
- src/Examine.Lucene/LuceneIndexOptions.cs: UseTaxonomyIndex option
- src/Examine.Lucene/Providers/LuceneIndex.cs: Null taxonomy handling, non-taxonomy NRT, FacetsConfig.Build fix, redundant WaitForChanges cleanup
- src/Examine.Lucene/Providers/IndexCommitter.cs: Race condition fix, null taxonomy writer
- src/Examine.Lucene/Providers/LuceneNonTaxonomySearcher.cs: New non-taxonomy searcher
- src/Examine.Lucene/Search/SearchContext.cs: Empty cache fix
- src/Examine.Lucene/Search/TaxonomySearchContext.cs: Empty cache fix

Directory Infrastructure
- src/Examine.Lucene/Directories/IDirectoryFactory.cs: Nullable return type
- src/Examine.Lucene/Directories/ITaxonomyDirectoryFactory.cs: New interface
- src/Examine.Lucene/Directories/DirectoryFactory.cs: Updated impl
- src/Examine.Lucene/Directories/DirectoryFactoryBase.cs: Marked obsolete
- src/Examine.Lucene/Directories/FileSystemDirectoryFactory.cs: ITaxonomyDirectoryFactory
- src/Examine.Lucene/Directories/SyncedFileSystemDirectoryFactory.cs: Optional taxonomy support

Public API
- src/Examine.Lucene/PublicAPI.Unshipped.txt: New API surface entries

Related Issues
Backport
Bugs #426 and #427 have been backported to support/3.x in PR #428.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>