Cleanup: dependency groups, FT-Transformer as default combiner for 3+ features #4093
Open
w4nderlust wants to merge 6 commits into main
- Add [dependency-groups] for dev and docs in pyproject.toml
- Default combiner to ft_transformer when 3+ input features (wins on both Adult Census AUC and California Housing RMSE)
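For reviewers, the default-combiner rule above can be sketched as follows. This is a minimal illustration, not the code in this PR: the function name `default_combiner_type` is hypothetical, and the `concat` fallback for fewer than 3 features is an assumption.

```python
def default_combiner_type(num_input_features: int) -> str:
    """Pick a default combiner name from the number of input features.

    Hypothetical helper for illustration only.
    """
    # 3 or more input features: default to the FT-Transformer combiner.
    if num_input_features >= 3:
        return "ft_transformer"
    # Assumption: fewer than 3 features keep a plain concat combiner.
    return "concat"
```

An explicit `combiner.type` in the user config would presumably still take precedence over this default.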
- CI: migrate all workflows from pip to uv (10-100x faster installs)
- Distributed: remove DDPStrategy, FSDPStrategy, DeepSpeedStrategy, DeepSpeedBackend. All distributed training now uses AccelerateStrategy. Legacy strategy names (ddp, fsdp, deepspeed) alias to accelerate.
- LLM: simplify DictWrapper docstring (DeepSpeed workaround no longer needed)
- Serving: add optional request body logging (LUDWIG_LOG_REQUEST_BODY=1) with body truncation and request reconstruction for downstream handlers
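The legacy-name aliasing described in the Distributed item could look roughly like the sketch below. The mapping and the function `resolve_strategy_name` are hypothetical names chosen for illustration; only the strategy strings (ddp, fsdp, deepspeed, accelerate) come from the commit message.

```python
# Hypothetical alias table: legacy strategy names all resolve to "accelerate".
LEGACY_STRATEGY_ALIASES = {
    "ddp": "accelerate",
    "fsdp": "accelerate",
    "deepspeed": "accelerate",
}


def resolve_strategy_name(name: str) -> str:
    """Map a legacy strategy name to its replacement; pass other names through."""
    key = name.lower()
    return LEGACY_STRATEGY_ALIASES.get(key, key)
```

With a table like this, existing configs that say `ddp` or `deepspeed` keep loading without changes while only one strategy implementation remains.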
…ults

Split Ray worker checkpoint saves:
- train_fn: model weights saved via SafeTensors (secure, no pickle), metadata (validation_field, validation_metric) saved as JSON, remaining results via torch.save
- eval_fn: eval results saved as JSON where possible, falls back to torch.save for complex objects
- Both load paths support legacy torch.save format for backward compat

Eliminates pickle-based serialization for model weights in Ray workers.
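The "JSON where possible, otherwise fall back" behaviour for eval results can be sketched with the standard library alone. This is an illustration under stated assumptions, not the PR's code: `save_results` and its `fallback_save` parameter are hypothetical, and the caller would pass something like `torch.save` as the fallback.

```python
import json
from pathlib import Path
from typing import Any, Callable


def save_results(
    results: Any,
    path_stem: Path,
    fallback_save: Callable[[Any, Path], None],
) -> Path:
    """Write results as JSON if they serialize cleanly, else use the fallback.

    Hypothetical helper; returns the path that was actually written.
    """
    try:
        # json.dumps raises TypeError for unsupported objects and
        # ValueError for circular references.
        payload = json.dumps(results)
    except (TypeError, ValueError):
        target = path_stem.with_suffix(".pt")
        fallback_save(results, target)  # e.g. torch.save (assumption)
        return target
    target = path_stem.with_suffix(".json")
    target.write_text(payload)
    return target
```

Serializing to a string before touching the filesystem avoids leaving a half-written JSON file behind when the fallback path is taken.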
…nfig helper

Rename LudwigSchemaField -> SchemaField (cleaner name, no marshmallow reference).

Rename all MarshmallowField subclasses to ConfigField equivalents:
- DefaultMarshmallowField -> DefaultConfigField
- SchedulerMarshmallowField -> SchedulerConfigField
- SearchAlgorithmMarshmallowField -> SearchAlgorithmConfigField
- ExecutorMarshmallowField -> ExecutorConfigField
- GradientClippingMarshmallowField -> GradientClippingConfigField
- ProfilingMarshmallowField -> ProfilingConfigField
- LRSchedulerMarshmallowField -> LRSchedulerConfigField
- AugmentationContainerMarshmallowField -> AugmentationContainerConfigField
- PreprocessingMarshmallowField -> PreprocessingConfigField

Add SchemaField.deserialize_config() helper that centralizes the common isinstance(value, dict) dispatch pattern used by all 13 subclasses.

Backward compat alias LudwigSchemaField = SchemaField kept.
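To make the centralized dispatch concrete, here is a minimal sketch of what such a helper might look like. The class below is a simplified stand-in, and the signature of `deserialize_config` (including the `from_dict` and `field_name` parameters) is an assumption for illustration; the real helper in this PR may differ.

```python
from typing import Any, Callable


class SchemaField:
    """Simplified stand-in for the renamed base field class."""

    @staticmethod
    def deserialize_config(
        value: Any,
        from_dict: Callable[[dict], Any],
        field_name: str = "field",
    ) -> Any:
        """Shared dispatch: None passes through, dicts are built, the rest is rejected."""
        if value is None:
            return None
        if isinstance(value, dict):
            # Each subclass supplies its own dict -> config constructor.
            return from_dict(value)
        raise ValueError(
            f"{field_name} should be None or a dict, got {type(value).__name__}"
        )


# Backward compat alias, as described in the commit message.
LudwigSchemaField = SchemaField
```

Each subclass would then call the helper with its own constructor instead of repeating the `isinstance(value, dict)` check.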
… 3.12)

uv is stricter about building from source, and GPy 1.9.9 (transitive dep from HEBO) fails to compile on Python 3.12 (removed longintrepr.h). Keep uv for the torch install (binary-only, 10x faster) but use pip for the test extras, which include GPy via HEBO.
comet_ml internally imports the imp module which was removed in Python 3.12. Skip the test until comet_ml releases a compatible version.
Summary
Cleanup PR addressing all remaining skipped items from Phases 0-3.
Phase 0: Build System
Phase 1: ECD
Phase 2: Training
Phase 3: Serving
Remaining torch.save/load (justified)
All model weight serialization now uses SafeTensors.
Test plan