Phase 5: API code quality - ModelInspector, trainer mixins, registry modernization by w4nderlust · Pull Request #4091 · ludwig-ai/ludwig

w4nderlust · 2026-04-05T08:06:12Z

Summary

Phase 5 of the Ludwig modernization: break up god objects and improve code quality.

1. ModelInspector

Extracts model introspection from the 2400-line LudwigModel into a focused class:

from ludwig.model_inspector import ModelInspector

inspector = ModelInspector(model, config, metadata)
# Collect weights by parameter name
weights = inspector.collect_weights(['linear1.weight'])
# Param counts, layer types, and model size
summary = inspector.model_summary()
# Feature importance estimation
importance = inspector.feature_importance_proxy()

Provides: weight collection, model summary (param counts, layer types, model size), and feature importance estimation.
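
To make the extraction concrete, here is a minimal sketch of how an inspector class of this kind could compute a summary. It is not the ModelInspector implementation from this PR: `ParamInfo`, `InspectorSketch`, the summary keys, and the float32 size assumption are all hypothetical, and plain shape tuples stand in for real tensors so the example runs without a deep learning framework.

```python
from dataclasses import dataclass
from math import prod


@dataclass
class ParamInfo:
    """Hypothetical stand-in for a tensor: owning layer type, shape, element size."""
    layer_type: str
    shape: tuple
    bytes_per_element: int = 4  # assumption: float32 parameters


class InspectorSketch:
    """Illustrative only; not the actual ModelInspector."""

    def __init__(self, params):
        # params maps a parameter name to its ParamInfo.
        self._params = params

    def collect_weights(self, names=None):
        # Return only the requested entries, or everything when names is None.
        if names is None:
            return dict(self._params)
        return {name: self._params[name] for name in names}

    def model_summary(self):
        infos = list(self._params.values())
        return {
            "num_parameters": sum(prod(p.shape) for p in infos),
            "layer_types": sorted({p.layer_type for p in infos}),
            "size_bytes": sum(prod(p.shape) * p.bytes_per_element for p in infos),
        }


params = {
    "linear1.weight": ParamInfo("Linear", (4, 3)),
    "linear1.bias": ParamInfo("Linear", (4,)),
    "embed.weight": ParamInfo("Embedding", (10, 2)),
}
inspector = InspectorSketch(params)
summary = inspector.model_summary()
selected = inspector.collect_weights(["linear1.weight"])
```

The point of the design is that the inspector only needs read access to the model's parameters, so it can live outside the large model class and be tested in isolation.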

2. Trainer Mixins

Composable mixins for cross-cutting training concerns:

CheckpointMixin: checkpoint save/restore decision logic
EarlyStoppingMixin: early stopping based on validation metrics
MetricsMixin: metric formatting and logging
BatchSizeTuningMixin: automatic batch size search
ProfilingMixin: wall-clock timing for training operations
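
As an illustration of the composition pattern, the sketch below combines two small mixins in a toy training loop. The method names `should_checkpoint` and `should_early_stop` come from this PR, but their parameters, the decision rules, and the `SketchTrainer` class are assumptions made for the example and may differ from the real mixins.

```python
class CheckpointMixin:
    def should_checkpoint(self, step, steps_per_checkpoint):
        # Assumed rule: checkpoint every N steps; N <= 0 disables checkpointing.
        return steps_per_checkpoint > 0 and step % steps_per_checkpoint == 0


class EarlyStoppingMixin:
    def should_early_stop(self, rounds_without_improvement, patience):
        # Assumed rule: stop once the metric has stalled for `patience` rounds;
        # patience <= 0 disables early stopping.
        return patience > 0 and rounds_without_improvement >= patience


class SketchTrainer(CheckpointMixin, EarlyStoppingMixin):
    """Hypothetical trainer that delegates decisions to the mixins."""

    def __init__(self, steps_per_checkpoint, patience):
        self.steps_per_checkpoint = steps_per_checkpoint
        self.patience = patience

    def run(self, validation_losses):
        best = float("inf")
        stale = 0
        checkpoints = []
        for step, loss in enumerate(validation_losses, start=1):
            if loss < best:
                best, stale = loss, 0
            else:
                stale += 1
            if self.should_checkpoint(step, self.steps_per_checkpoint):
                checkpoints.append(step)
            if self.should_early_stop(stale, self.patience):
                return step, checkpoints
        return len(validation_losses), checkpoints


trainer = SketchTrainer(steps_per_checkpoint=2, patience=2)
stopped_at, checkpoints = trainer.run([1.0, 0.8, 0.9, 0.85, 0.7])
```

Because each mixin holds only decision logic and no trainer state, it can be unit-tested without constructing a full trainer.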

3. Registry Modernization

Added to the existing Registry class:

unregister(name): remove registered items (useful for testing)
get_default(): get the default-registered item
list_registered(): list all names excluding default key aliases
Improved docstrings and type annotations
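
The three new methods could behave roughly as in the sketch below. `RegistrySketch`, the `register` decorator signature, and the `DEFAULT_KEYS` values are hypothetical stand-ins rather than the actual Ludwig Registry; only the method names `unregister`, `get_default`, and `list_registered` are taken from this PR.

```python
from typing import Callable, Dict, Generic, List, Optional, TypeVar

T = TypeVar("T")


class RegistrySketch(Generic[T]):
    """Illustrative only; not the actual ludwig Registry."""

    # Assumption: the default item is stored under alias keys like these.
    DEFAULT_KEYS = ("default", "none")

    def __init__(self) -> None:
        self._items: Dict[str, T] = {}

    def register(self, name: str, default: bool = False) -> Callable[[T], T]:
        def wrap(obj: T) -> T:
            self._items[name] = obj
            if default:
                for key in self.DEFAULT_KEYS:
                    self._items[key] = obj
            return obj

        return wrap

    def unregister(self, name: str) -> None:
        # Raises KeyError for unknown names; also drops aliases to the removed item.
        removed = self._items.pop(name)
        for key in self.DEFAULT_KEYS:
            if self._items.get(key) is removed:
                del self._items[key]

    def get_default(self) -> Optional[T]:
        return self._items.get(self.DEFAULT_KEYS[0])

    def list_registered(self) -> List[str]:
        return sorted(k for k in self._items if k not in self.DEFAULT_KEYS)


optimizers: RegistrySketch[type] = RegistrySketch()


@optimizers.register("adam", default=True)
class Adam:
    pass


@optimizers.register("sgd")
class SGD:
    pass


names_before = optimizers.list_registered()
optimizers.unregister("sgd")
names_after = optimizers.list_registered()
```

In tests, `unregister` lets a fixture add a temporary entry and remove it afterwards, so one test's registrations do not leak into the next.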

Test plan

14 new tests for ModelInspector and Registry
1155 existing tests pass (0 regressions)
Pre-commit all clean
CI

ModelInspector: extracts weight collection, model summary, and feature importance estimation from LudwigModel god object.
Trainer mixins: CheckpointMixin, EarlyStoppingMixin, MetricsMixin, BatchSizeTuningMixin, ProfilingMixin for composable training behavior.
Registry: add unregister(), get_default(), list_registered() methods and improved docstrings for type-safe, testable registries.

- ludwig inspect: CLI command to view model summary, weights, and approximate feature importance from a saved model
- Tests for all 5 trainer mixins (checkpoint, early stopping, metrics, batch size tuning, profiling)
- Tests for training report generation and model card generation

Remove __iter__ from TrainingStats (was never used for unpacking). Add deprecation warnings to __iter__/__getitem__ on TrainingResults and PreprocessedDataset -- existing code keeps working but emits DeprecationWarning. Internal code updated to use attribute access.
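
The deprecation pattern described here can be sketched with the standard `warnings` module. `ResultsSketch` and its two fields are hypothetical and chosen only for illustration; the real TrainingResults and PreprocessedDataset classes may have different fields and messages.

```python
import warnings
from dataclasses import dataclass


@dataclass
class ResultsSketch:
    """Hypothetical result container; field names are assumptions."""
    train_stats: dict
    output_directory: str

    def _as_tuple(self):
        return (self.train_stats, self.output_directory)

    def __iter__(self):
        # Tuple unpacking still works, but callers are nudged to attribute access.
        warnings.warn(
            "Unpacking ResultsSketch is deprecated; use attribute access instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        return iter(self._as_tuple())

    def __getitem__(self, index):
        warnings.warn(
            "Indexing ResultsSketch is deprecated; use attribute access instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        return self._as_tuple()[index]


results = ResultsSketch(train_stats={"loss": 0.5}, output_directory="results/run_0")

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    stats, out_dir = results   # legacy unpacking: works, emits one warning
    first = results[0]         # legacy indexing: works, emits one warning
    preferred = results.train_stats  # attribute access: no warning
```

Passing `stacklevel=2` makes the warning point at the calling line rather than at the container class, which helps users find the code they need to update.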

Trainer now inherits from CheckpointMixin, EarlyStoppingMixin, MetricsMixin, and ProfilingMixin. This makes the mixin utility methods (should_checkpoint, should_early_stop, format_metrics, start_timer/stop_timer) available on the Trainer instance for gradual refactoring of inline logic.
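
A rough sketch of what the timing and formatting helpers could look like once mixed into a trainer. The names `start_timer`, `stop_timer`, and `format_metrics` are from this PR; their parameters, the lazily created timer state, the output format, and `SketchTrainer` are assumptions for illustration.

```python
import time


class ProfilingMixin:
    def _timer_state(self):
        # Create state lazily so the mixin needs no __init__ cooperation.
        return self.__dict__.setdefault("_timers", {"starts": {}, "totals": {}})

    def start_timer(self, name):
        self._timer_state()["starts"][name] = time.perf_counter()

    def stop_timer(self, name):
        state = self._timer_state()
        elapsed = time.perf_counter() - state["starts"].pop(name)
        state["totals"][name] = state["totals"].get(name, 0.0) + elapsed
        return elapsed


class MetricsMixin:
    def format_metrics(self, metrics, precision=4):
        # Sort by name so log lines are stable across runs.
        return ", ".join(
            f"{name}={value:.{precision}f}" for name, value in sorted(metrics.items())
        )


class SketchTrainer(ProfilingMixin, MetricsMixin):
    """Hypothetical trainer using the mixin helpers inside a step."""

    def train_step(self):
        self.start_timer("step")
        metrics = {"loss": 0.5, "accuracy": 0.75}  # placeholder for real work
        elapsed = self.stop_timer("step")
        return self.format_metrics(metrics), elapsed


trainer = SketchTrainer()
line, elapsed = trainer.train_step()
```

Keeping timer state inside the mixin means inline timing code in the trainer can be replaced one call site at a time, which fits the gradual refactoring described above.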

w4nderlust force-pushed the api-code-quality branch 3 times, most recently from f82345c to 106a30b · April 6, 2026 07:17

w4nderlust and others added 6 commits April 6, 2026 19:45

[pre-commit.ci] auto fixes from pre-commit.com hooks

b31c70b

for more information, see https://pre-commit.ci

fix: remove unused torch import in test_model_inspector

8332a91

w4nderlust force-pushed the api-code-quality branch from 106a30b to 7e4467a · April 7, 2026 02:52

w4nderlust wants to merge 6 commits into data-pipeline-hyperopt-modernization from api-code-quality