Refactor: Replace duplicative CsvValidationService with CsvParser concerns and lambda validators #1152
Closed
orangewolf wants to merge 4 commits into bulkrax-v2-importer
…Parser
The ~3,200-line CsvValidationService namespace re-implemented CSV parsing,
column resolution, field mapping, and row categorisation that CsvParser and
CsvEntry already handle. This refactor eliminates that duplication by:
* Adding CsvParser.validate_csv — a standalone class method that reads the
CSV directly, resolves column names via MappingManager, builds field
metadata through FieldAnalyzer, runs registered row validators, and
returns the same result hash the guided-import UI expects.
* Introducing Bulkrax::CsvRowValidators — four lambda-style callables
(DuplicateIdentifier, RequiredValues, ControlledVocabulary, ParentReference)
registered on CsvParser via register_csv_row_validator. Each callable
receives (record, row_index, context) and mutates context[:errors].
* Slimming CsvValidationService to a template-generation service whose
validate class method is now a one-line facade over CsvParser.validate_csv.
* Changing Bulkrax.row_validator_service default from RowValidatorService to
nil; the config attribute is preserved for host apps with custom validators.
* Deleting the 10 duplicative files (CsvValidationService::{CsvParser,
ColumnResolver, ItemExtractor, Validator, RowValidatorService + 5 sub-files})
and their specs; updating csv_validation_service_spec.rb accordingly.
Template generation (CsvBuilder, RowBuilder, ExplanationBuilder, ColumnBuilder,
ColumnDescriptor, FilePathGenerator, ModelLoader, FieldAnalyzer, SchemaAnalyzer,
MappingManager, ValueDeterminer, SplitFormatter) is untouched.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
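The callable interface described in this commit can be illustrated with a minimal sketch. It is not the PR's code: the module name, the `:source_identifier` and `:seen_ids` keys, and the shape of each error hash are assumptions made for illustration. Only the `(record, row_index, context)` signature and the mutation of `context[:errors]` come from the commit message.

```ruby
# Hypothetical duplicate-identifier validator using the
# (record, row_index, context) interface. Key names and the error hash
# shape are assumptions, not Bulkrax's actual ones.
module DuplicateIdentifierSketch
  def self.call(record, row_index, context)
    id = record[:source_identifier]
    return if id.nil? || id.to_s.strip.empty?

    # Remember the first row on which each identifier was seen.
    seen = (context[:seen_ids] ||= {})
    if seen.key?(id)
      context[:errors] << {
        row: row_index,
        message: "Duplicate identifier '#{id}' (first seen on row #{seen[id]})"
      }
    else
      seen[id] = row_index
    end
  end
end

# Usage: run the validator over three rows, the third repeating the first.
context = { errors: [] }
rows = [
  { source_identifier: 'work-1' },
  { source_identifier: 'work-2' },
  { source_identifier: 'work-1' }
]
rows.each_with_index do |record, index|
  DuplicateIdentifierSketch.call(record, index + 1, context)
end
```

Because the validator only needs to respond to `call`, a plain lambda with the same three arguments would fit the same slot.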
…pace
Relocates the four CSV row validators from app/services/bulkrax/csv_row_validators/ to app/validators/bulkrax/csv_row/, updating the module namespace from Bulkrax::CsvRowValidators::* to Bulkrax::CsvRow::*. Specs move from spec/services/ to spec/validators/ accordingly.
Registration calls in csv_parser.rb updated to use the new class names. Callable interface (.call(record, row_index, context)) is unchanged.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…Template:: namespace
- Extract validation logic from CsvParser into CsvParser::CsvValidation concern
- Extract template generation into CsvParser::CsvTemplateGeneration concern with TemplateContext inner class
- Rename all CsvValidationService:: components to CsvTemplate:: namespace (13 files)
- Delete CsvValidationService facade and its directory
- Update callers: controllers now call CsvParser.validate_csv / CsvParser.generate_template
- Move and update all specs to spec/services/bulkrax/csv_template/
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
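The split into two concerns that each contribute a class-level entry point might look roughly like the sketch below. It uses a plain-Ruby `included` hook as a stand-in for `ActiveSupport::Concern` so that it runs without Rails. All `*Sketch` names, the method signatures, and the return shapes are assumptions; the PR's actual `validate_csv` and `generate_template` may differ.

```ruby
# Plain-Ruby stand-in for a concern: including the module extends the
# host class with the module's ClassMethods.
module CsvValidationSketch
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    # Hypothetical signature and result hash.
    def validate_csv(rows, validators: [])
      context = { errors: [] }
      rows.each_with_index do |record, index|
        validators.each { |validator| validator.call(record, index + 1, context) }
      end
      { is_valid: context[:errors].empty?, errors: context[:errors] }
    end
  end
end

module CsvTemplateGenerationSketch
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    # Hypothetical: builds only a header line from column names.
    def generate_template(columns)
      columns.join(',')
    end
  end
end

class CsvParserSketch
  include CsvValidationSketch
  include CsvTemplateGenerationSketch
end

# Usage: callers go through the parser class for both operations.
require_title = lambda do |record, row_index, context|
  if record[:title].to_s.empty?
    context[:errors] << { row: row_index, message: 'title is required' }
  end
end
result = CsvParserSketch.validate_csv([{ title: 'A' }, { title: '' }],
                                      validators: [require_title])
header = CsvParserSketch.generate_template(%w[title creator])
```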
Row validators are now configured via Bulkrax.config rather than being
registered as class-level state on CsvParser. The four built-in validators
(DuplicateIdentifier, ParentReference, RequiredValues, ControlledVocabulary)
are the lazy default for Configuration#csv_row_validators so host apps can
extend or replace them with the standard config block:
    Bulkrax.config do |c|
      c.register_csv_row_validator(MyApp::CustomValidator)
      # or replace entirely:
      c.csv_row_validators = [MyApp::OnlyThisValidator]
    end
Changes:
- Configuration gains csv_row_validators (lazy default), csv_row_validators=,
and register_csv_row_validator; these are forwarded via def_delegators
- Remove row_validator_service and its delegators (superseded)
- CsvValidation concern: remove class_attribute / register_csv_row_validator;
validate_csv now reads Bulkrax.csv_row_validators instead; drop legacy
row_validator_service fallback
- Remove the bottom-of-file registration block from csv_parser.rb
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
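A lazy default of this kind can be sketched as follows. This is a stand-in class, not Bulkrax's `Configuration`: the built-in validators are replaced by no-op lambdas, and only the three method names (`csv_row_validators`, `csv_row_validators=`, `register_csv_row_validator`) are taken from the commit message.

```ruby
# Sketch only: a stand-in configuration object, assuming the default list
# is built on first read and that registration appends to it.
class ConfigurationSketch
  # Names of the four built-in validators listed in the commit message.
  BUILT_IN_NAMES = %i[
    duplicate_identifier parent_reference required_values controlled_vocabulary
  ].freeze

  # Replaces the whole list (the "replace entirely" case).
  attr_writer :csv_row_validators

  # Lazy default: nothing is built until the list is first read.
  def csv_row_validators
    @csv_row_validators ||= BUILT_IN_NAMES.map do |_name|
      ->(_record, _row_index, _context) {} # hypothetical no-op stand-in
    end
  end

  # Appends to the current list, building the default first if needed.
  def register_csv_row_validator(callable)
    csv_row_validators << callable
  end
end

# Usage: extend the defaults on one config, replace them on another.
custom = ->(_record, _row_index, _context) {}

extended = ConfigurationSketch.new
extended.register_csv_row_validator(custom)

replaced = ConfigurationSketch.new
replaced.csv_row_validators = [custom]
```

One consequence of building the default lazily is that a host app that assigns its own list before the first read never instantiates the built-ins at all.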
moved to #1154
Summary
This PR eliminates the ~3,200-line CsvValidationService namespace that re-implemented CSV parsing, column resolution, field mapping, and row categorization already handled by CsvParser and CsvEntry.
Changes
Architecture
- CsvParser::CsvValidation concern: all validation logic lives here, using CsvParser.validate_csv as the single entry point
- CsvParser::CsvTemplateGeneration concern: template generation extracted cleanly
- Bulkrax::CsvRow::* validators: 4 lambda-style callables in app/validators/ (DuplicateIdentifier, ParentReference, RequiredValues, ControlledVocabulary)
- Bulkrax::CsvTemplate::*: template sub-components renamed from CsvValidationService::
- Bulkrax.config: validators now configurable via csv_row_validators / register_csv_row_validator
Deleted
- CsvValidationService facade and all duplicative sub-components (CsvParser, ColumnResolver, ItemExtractor, Validator, RowValidatorService + 5 sub-validators, MappingManager)
- row_validator_service config (replaced by csv_row_validators)
Preserved
Stats
CsvParser records flow instead of a parallel parsing pipeline
Motivation
The validation service duplicated the entire CSV parsing pipeline. This created maintenance burden, divergence risk, and unnecessary complexity. Validators are now simple lambdas called during either file-level or row-level parsing, branching between validation and CsvEntry creation based on the operation mode.
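One way to picture the branching described above is a single row loop that either runs validators or builds entries, depending on a mode flag. The sketch below is an assumption about the shape of that flow, not the PR's implementation: `process_rows`, the `:validate` / `:import` mode names, and `EntrySketch` are all hypothetical.

```ruby
# Hypothetical stand-in for a CsvEntry record.
EntrySketch = Struct.new(:row_index, :record)

# Walks the rows once; the mode decides what happens to each record.
def process_rows(rows, mode:, validators: [])
  context = { errors: [], entries: [] }
  rows.each_with_index do |record, index|
    row_index = index + 1
    case mode
    when :validate
      # Validation mode: validators append to context[:errors].
      validators.each { |validator| validator.call(record, row_index, context) }
    when :import
      # Import mode: build an entry instead of validating.
      context[:entries] << EntrySketch.new(row_index, record)
    else
      raise ArgumentError, "unknown mode: #{mode.inspect}"
    end
  end
  context
end

# Usage: the same rows go through either branch.
missing_title = lambda do |record, row_index, context|
  context[:errors] << row_index if record[:title].nil?
end
rows = [{ title: 'A' }, {}]
validated = process_rows(rows, mode: :validate, validators: [missing_title])
imported = process_rows(rows, mode: :import)
```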