
Refactor validation service #1154

Merged
laritakr merged 21 commits into bulkrax-v2-importer from refactor-validation-service on Mar 27, 2026
@laritakr laritakr commented Mar 26, 2026

Summary

This PR integrates the upstream refactor/validation-lambdas branch into bulkrax-v2-importer, eliminating the ~3,200-line CsvValidationService namespace that re-implemented CSV parsing, column resolution, field mapping, and row categorization already handled by CsvParser and CsvEntry. Validation and template generation are now concerns on CsvParser itself, backed by focused service classes and callable validator modules.

Changes

Architecture

  • CsvParser::CsvValidation concern — all validation logic lives here, using CsvParser.validate_csv as the single entry point. Uses CsvEntry.read_data (not CSV.read) to preserve blank-row filtering identical to a real import.
  • CsvParser::CsvTemplateGeneration concern — template generation extracted cleanly, with a nested TemplateContext class wiring the CsvTemplate::* components
  • Bulkrax::CsvRow::* validators — 4 callable modules in app/validators/ (DuplicateIdentifier, ParentReference, RequiredValues, ControlledVocabulary), each implementing self.call(record, row_number, context)
  • Bulkrax::CsvTemplate::* — template sub-components re-namespaced from CsvValidationService:: to CsvTemplate::, with no logic changes
  • Bulkrax.config — validators now configurable via csv_row_validators / register_csv_row_validator
  • GuidedImportsController — replaces the old GuidedImport controller concern; routes updated accordingly
  • docs/CSV_SERVICE_ARCHITECTURE.md rewritten, STEPPER_IMPLEMENTATION.md updated to reflect current architecture
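A host-app validator following the `self.call(record, row_number, context)` contract described above might look roughly like the sketch below. The only thing taken from this PR is the method signature. The module name `MyApp::TitleLengthValidator`, the shape of `record` (a Hash of column name to cell value), and the shape of `context` (a Hash with an `:errors` Array) are assumptions made for illustration, not the actual Bulkrax data structures.

```ruby
# Hypothetical host-app validator; not part of Bulkrax.
module MyApp
  module TitleLengthValidator
    MAX_LENGTH = 255

    # record:     ASSUMED to be a Hash of column name => raw cell value
    # row_number: position of the row in the CSV, used for error reporting
    # context:    ASSUMED to be a Hash exposing an :errors Array to append to
    def self.call(record, row_number, context)
      title = record['title'].to_s
      return if title.length <= MAX_LENGTH

      context[:errors] << {
        row: row_number,
        column: 'title',
        message: "title exceeds #{MAX_LENGTH} characters"
      }
    end
  end
end

# Usage: one over-long title and one acceptable title.
context = { errors: [] }
MyApp::TitleLengthValidator.call({ 'title' => 'x' * 300 }, 2, context)
MyApp::TitleLengthValidator.call({ 'title' => 'Short title' }, 3, context)
```

Because each validator is just an object responding to `call`, a lambda or a class with a `call` class method would presumably satisfy the same contract.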

Deleted

  • CsvValidationService facade and all duplicative sub-components (CsvParser, ColumnResolver, ItemExtractor, Validator, RowValidatorService + 5 sub-validators, MappingManager)
  • row_validator_service config accessor (replaced by csv_row_validators)
  • All corresponding specs under spec/services/bulkrax/csv_validation_service/

Preserved

  • Validation result hash shape (guided import UI compatibility — no frontend changes required)
  • Template generation functionality (logic untouched, namespace moved)
  • I18n keys for all error and validation messages
  • Extensibility — host apps configure validators via:
    Bulkrax.config do |c|
      c.register_csv_row_validator(MyApp::CustomValidator)
    end
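To illustrate the registration pattern, here is a minimal stand-in for a validator registry and a per-row runner. It is not Bulkrax's implementation: `SketchConfig`, `run_row_validators`, the duplicate-registration guard, the header-row offset, and the `context[:errors]` shape are all assumptions. Only the names `csv_row_validators` and `register_csv_row_validator` and the three-argument `call` signature come from this PR.

```ruby
# Stand-in registry for illustration only; the real accessors live in lib/bulkrax.rb.
module SketchConfig
  class << self
    # Ordered list of registered callables.
    def csv_row_validators
      @csv_row_validators ||= []
    end

    # Appends a callable; skipping duplicates is an ASSUMPTION of this sketch.
    def register_csv_row_validator(callable)
      csv_row_validators << callable unless csv_row_validators.include?(callable)
      callable
    end
  end
end

# Hypothetical runner: applies every registered validator to every row.
def run_row_validators(rows, context)
  rows.each_with_index do |record, index|
    row_number = index + 2 # ASSUMES row 1 of the CSV is the header
    SketchConfig.csv_row_validators.each do |validator|
      validator.call(record, row_number, context)
    end
  end
  context
end

# A lambda satisfies the same call(record, row_number, context) contract.
required_identifier = lambda do |record, row_number, context|
  next unless record['source_identifier'].to_s.strip.empty?

  context[:errors] << { row: row_number, message: 'missing source_identifier' }
end

SketchConfig.register_csv_row_validator(required_identifier)
SketchConfig.register_csv_row_validator(required_identifier) # ignored as a duplicate

rows = [
  { 'source_identifier' => 'w1', 'title' => 'First' },
  { 'source_identifier' => '',   'title' => 'Second' }
]
result = run_row_validators(rows, { errors: [] })
```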

Notes

Built extensively on prior PR #1152

We used the upstream PR code directly for:

  • All 12 CsvTemplate::* service classes
  • CsvParser::CsvTemplateGeneration concern
  • CsvParser::CsvValidation concern
  • All 4 CsvRow::* validator modules
  • Bulkrax.csv_row_validators config
  • Controller updates (GuidedImportsController, ImportersController)
  • All 12 csv_template/ specs
  • The 3 CsvRow validator specs (we initially wrote our own, then replaced them with the upstream versions)
  • model_stubbing_helpers.rb

We diverged from upstream in cases where it missed something or was wrong:

  • CsvEntry.read_data instead of CSV.read — upstream used CSV.read, which would have broken blank-row filtering. We caught this from the existing e2e specs.
  • csv_parser_template_spec.rb — upstream had this spec but it wasn't in the PR; we wrote it matching the upstream version once we identified it was missing
  • model_loader_spec.rb — missing entirely from upstream; we wrote it and also fixed the underlying bug in load_models
  • controlled_vocabulary_spec.rb — not in the upstream PR at all; we wrote it ourselves
  • docs/ — upstream left the docs referencing the old architecture; we rewrote them to reflect the current state
  • ModelLoader#safe_constantize — bug fix we added that upstream didn't have
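The blank-row divergence noted above can be illustrated with Ruby's standard `csv` library alone. The sketch below does not reproduce `CsvEntry.read_data`; the `blank_row?` helper and the sample data are hypothetical. It only shows why reading a file without blank-row filtering can yield a different row count than an import path that drops rows whose cells are all empty.

```ruby
require 'csv'

# Sample data with a row that contains only a delimiter (all cells empty).
data = "source_identifier,title\nw1,First\n,\nw2,Second\n"

# Plain parsing keeps the delimiter-only row as a row of nil values.
raw_rows = CSV.parse(data, headers: true).map(&:to_h)

# Hypothetical helper: a row is blank when every cell is nil or whitespace.
def blank_row?(row)
  row.values.all? { |value| value.to_s.strip.empty? }
end

# Filtering blank rows gives the set of rows a validator should see
# if it is meant to match what an import would process.
filtered_rows = raw_rows.reject { |row| blank_row?(row) }

puts "unfiltered: #{raw_rows.size}, filtered: #{filtered_rows.size}"
```

If validation ran on the unfiltered rows, a required-value check would presumably flag the empty row even though an import would never process it.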

laritakr added 19 commits March 26, 2026 14:01
Add CsvTemplate:: namespace as re-namespaced copy of template-generation components

Copies all template-generation service objects from CsvValidationService::*
into a new CsvTemplate:: module under app/services/bulkrax/csv_template/.
No behaviour changes; the originals remain in place. This is the first step
in moving template generation into a CsvParser concern (upstream integration
of refactor/validation-lambdas).

Introduces app/parsers/concerns/bulkrax/csv_parser/csv_template_generation.rb
with a TemplateContext that drives the CsvTemplate:: components. Includes the
concern in CsvParser and updates CsvValidationService.generate_template to
delegate to CsvParser.generate_template; the service is now a thin shim for
template generation.

Introduces app/parsers/concerns/bulkrax/csv_parser/csv_validation.rb with
CsvParser.validate_csv as a class method. Uses CsvEntry.read_data (not
CSV.read) to preserve blank-row filtering and header normalisation identical
to a real import. Adds four callable validator modules under
app/validators/bulkrax/csv_row/: DuplicateIdentifier, ParentReference,
RequiredValues, and ControlledVocabulary. Not yet wired in, so no behaviour
change.

Includes CsvParser::CsvValidation in CsvParser, making CsvParser.validate_csv
available. Adds Bulkrax.csv_row_validators (defaulting to the four CsvRow::
modules) and Bulkrax.register_csv_row_validator to lib/bulkrax.rb as the new
extensibility point for per-row validation. CsvValidationService.validate still
drives its own path; delegation happens in Step 7.

Updates the rights_statement spec stubs in csv_validation_service_spec and
csv_validation_service_end_to_end_spec to target CsvTemplate::FieldAnalyzer
directly, since validation now runs through CsvParser.validate_csv. Removes
the dead additional_validators: parameter from CsvValidationService.validate
and initialize; use Bulkrax.register_csv_row_validator instead.

Switches CsvValidationService instance to use CsvTemplate:: classes
(MappingManager, FieldAnalyzer, ModelLoader, CsvBuilder, FileValidator,
FilePathGenerator, ColumnBuilder). Deletes the 12 duplicate files from the
old csv_validation_service/ namespace and migrates their specs to
spec/services/bulkrax/csv_template/. Updates all remaining spec stubs and
doubles that referenced the old namespace.

Deletes CsvValidationService and all subclasses; validation was never
in production. Controllers now call CsvParser.validate_csv and
CsvParser.generate_template directly. Removes row_validator_service
config from lib/bulkrax.rb. Deletes all associated specs; the
CsvParser concerns and CsvTemplate:: specs are the new home.

@laritakr laritakr added the ignore-for-release label (ignore this for release notes) Mar 26, 2026
@laritakr laritakr merged commit 637652e into bulkrax-v2-importer Mar 27, 2026
9 checks passed
@laritakr laritakr deleted the refactor-validation-service branch March 27, 2026 00:40