Skip to content

Refactor: Replace duplicative CsvValidationService with CsvParser concerns and lambda validators#1152

Closed
orangewolf wants to merge 4 commits intobulkrax-v2-importerfrom
refactor/validation-lambdas
Closed

Refactor: Replace duplicative CsvValidationService with CsvParser concerns and lambda validators#1152
orangewolf wants to merge 4 commits intobulkrax-v2-importerfrom
refactor/validation-lambdas

Conversation

@orangewolf
Copy link
Copy Markdown
Member

Summary

This PR eliminates the ~3,200-line CsvValidationService namespace that re-implemented CSV parsing, column resolution, field mapping, and row categorization already handled by CsvParser and CsvEntry.

Changes

Architecture

  • CsvParser::CsvValidation concern — all validation logic lives here, using CsvParser.validate_csv as the single entry point
  • CsvParser::CsvTemplateGeneration concern — template generation extracted cleanly
  • Bulkrax::CsvRow::* validators — 4 lambda-style callables in app/validators/ (DuplicateIdentifier, ParentReference, RequiredValues, ControlledVocabulary)
  • Bulkrax::CsvTemplate::* — template sub-components renamed from CsvValidationService::
  • Bulkrax.config — validators now configurable via csv_row_validators / register_csv_row_validator

Deleted

  • CsvValidationService facade and all duplicative sub-components (CsvParser, ColumnResolver, ItemExtractor, Validator, RowValidatorService + 5 sub-validators, MappingManager)
  • row_validator_service config (replaced by csv_row_validators)

Preserved

  • Validation result hash shape (guided import UI compatibility)
  • Template generation functionality (untouched logic, just moved namespace)
  • I18n keys for error messages
  • Extensibility — host apps configure validators via:
    Bulkrax.config do |c|
      c.register_csv_row_validator(MyApp::CustomValidator)
    end

Stats

  • Net reduction: ~3,200+ lines removed
  • Validation now runs through the existing CsvParser records flow instead of a parallel parsing pipeline

Motivation

The validation service duplicated the entire CSV parsing pipeline. This created maintenance burden, divergence risk, and unnecessary complexity. Validators are now simple lambdas called during either file-level or row-level parsing, branching between validation and CsvEntry creation based on the operation mode.

orangewolf and others added 4 commits March 26, 2026 11:01
…Parser

The ~3,200-line CsvValidationService namespace re-implemented CSV parsing,
column resolution, field mapping, and row categorisation that CsvParser and
CsvEntry already handle.  This refactor eliminates that duplication by:

* Adding CsvParser.validate_csv — a standalone class method that reads the
  CSV directly, resolves column names via MappingManager, builds field
  metadata through FieldAnalyzer, runs registered row validators, and
  returns the same result hash the guided-import UI expects.

* Introducing Bulkrax::CsvRowValidators — four lambda-style callables
  (DuplicateIdentifier, RequiredValues, ControlledVocabulary, ParentReference)
  registered on CsvParser via register_csv_row_validator.  Each callable
  receives (record, row_index, context) and mutates context[:errors].

* Slimming CsvValidationService to a template-generation service whose
  validate class method is now a one-line facade over CsvParser.validate_csv.

* Changing Bulkrax.row_validator_service default from RowValidatorService to
  nil; the config attribute is preserved for host apps with custom validators.

* Deleting the 10 duplicative files (CsvValidationService::{CsvParser,
  ColumnResolver, ItemExtractor, Validator, RowValidatorService + 5 sub-files})
  and their specs; updating csv_validation_service_spec.rb accordingly.

Template generation (CsvBuilder, RowBuilder, ExplanationBuilder, ColumnBuilder,
ColumnDescriptor, FilePathGenerator, ModelLoader, FieldAnalyzer, SchemaAnalyzer,
MappingManager, ValueDeterminer, SplitFormatter) is untouched.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…pace

Relocates the four CSV row validators from app/services/bulkrax/csv_row_validators/
to app/validators/bulkrax/csv_row/, updating the module namespace from
Bulkrax::CsvRowValidators::* to Bulkrax::CsvRow::*. Specs move from
spec/services/ to spec/validators/ accordingly. Registration calls in
csv_parser.rb updated to use the new class names. Callable interface
(.call(record, row_index, context)) is unchanged.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…Template:: namespace

- Extract validation logic from CsvParser into CsvParser::CsvValidation concern
- Extract template generation into CsvParser::CsvTemplateGeneration concern with TemplateContext inner class
- Rename all CsvValidationService:: components to CsvTemplate:: namespace (13 files)
- Delete CsvValidationService facade and its directory
- Update callers: controllers now call CsvParser.validate_csv / CsvParser.generate_template
- Move and update all specs to spec/services/bulkrax/csv_template/

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Row validators are now configured via Bulkrax.config rather than being
registered as class-level state on CsvParser.  The four built-in validators
(DuplicateIdentifier, ParentReference, RequiredValues, ControlledVocabulary)
are the lazy default for Configuration#csv_row_validators so host apps can
extend or replace them with the standard config block:

  Bulkrax.config do |c|
    c.register_csv_row_validator(MyApp::CustomValidator)
    # or replace entirely:
    c.csv_row_validators = [MyApp::OnlyThisValidator]
  end

Changes:
- Configuration gains csv_row_validators (lazy default), csv_row_validators=,
  and register_csv_row_validator; these are forwarded via def_delegators
- Remove row_validator_service and its delegators (superseded)
- CsvValidation concern: remove class_attribute / register_csv_row_validator;
  validate_csv now reads Bulkrax.csv_row_validators instead; drop legacy
  row_validator_service fallback
- Remove the bottom-of-file registration block from csv_parser.rb

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@laritakr
Copy link
Copy Markdown
Contributor

moved to #1154

@laritakr laritakr closed this Mar 26, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants