
Remove old DDP, FSDP, and DeepSpeed distributed strategies #4097

Merged
w4nderlust merged 4 commits into main from remove-old-dist-strategies on Apr 12, 2026

@w4nderlust
Collaborator

Summary

  • Removes the legacy DDPStrategy, FSDPStrategy, and DeepSpeedStrategy classes along with the DeepSpeedBackend and DataParallelBackend, replacing them with the unified AccelerateStrategy as the sole distributed training backend
  • Updates the default strategy from "ddp" to "accelerate" and updates all config validation, tests, and comments accordingly
  • HuggingFace Accelerate handles DDP, FSDP, and DeepSpeed via configuration (e.g. fsdp_config, deepspeed_config params) rather than separate class hierarchies
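
To illustrate what "via configuration rather than separate class hierarchies" can mean in practice, here is a minimal sketch. `StrategyConfig` and `select_mode` are hypothetical names invented for this example and are not Ludwig's or Accelerate's API; only the `fsdp_config` and `deepspeed_config` parameter names come from the summary above. It also assumes plain DDP is the fallback when neither parameter is set, which this PR does not state.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class StrategyConfig:
    """Hypothetical settings object for a single unified strategy."""

    fsdp_config: Optional[dict[str, Any]] = None
    deepspeed_config: Optional[dict[str, Any]] = None


def select_mode(cfg: StrategyConfig) -> str:
    """Pick the distributed mode from configuration alone.

    Assumption for this sketch: with neither config set, fall back to DDP.
    """
    if cfg.fsdp_config is not None and cfg.deepspeed_config is not None:
        raise ValueError("Set at most one of fsdp_config and deepspeed_config")
    if cfg.deepspeed_config is not None:
        return "deepspeed"
    if cfg.fsdp_config is not None:
        return "fsdp"
    return "ddp"
```

The point of the sketch is that one code path replaces three strategy classes: the mode is a function of the config values, so adding or removing a mode does not require a new class or registry entry.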

Files deleted

  • ludwig/distributed/ddp.py
  • ludwig/distributed/fsdp.py
  • ludwig/distributed/deepspeed.py
  • ludwig/backend/deepspeed.py

Test plan

  • Verify local training still works with default (accelerate) strategy
  • Verify Ray distributed training works with strategy: accelerate
  • Verify LLM finetuning config validation accepts accelerate strategy
  • Run integration tests for distributed training

@github-actions

github-actions bot commented Apr 7, 2026

Test Results

   10 files +1       10 suites +1    1h 47m 2s ⏱️ +1m 9s
3 618 tests +7    3 588 ✅ +5    30 💤 +3    0 ❌ -1
3 706 runs  +6    3 664 ✅ +5    42 💤 +2    0 ❌ -1

Results for commit 5c83ef7. ± Comparison against base commit 4a10993.

This pull request removes 2 and adds 9 tests. Note that renamed tests count towards both.
Removed:
tests.integration_tests.test_experiment ‑ test_experiment_model_resume_distributed_gpu[deepspeed]
tests.ludwig.backend.test_ray ‑ test_get_trainer_kwargs[ddp]
Added:
tests.ludwig.backend.test_ray ‑ test_get_trainer_kwargs[accelerate]
tests.regression_tests.benchmark.test_model_performance ‑ test_performance[adult_census_income.ecd.yaml]
tests.regression_tests.benchmark.test_model_performance ‑ test_performance[ames_housing.ecd.yaml]
tests.regression_tests.benchmark.test_model_performance ‑ test_performance[mercedes_benz_greener.ecd.yaml]
tests.regression_tests.benchmark.test_model_performance ‑ test_performance[sarcos.ecd.yaml]
tests.regression_tests.model.test_old_models ‑ test_model_loaded_from_old_config_prediction_works
tests.regression_tests.model.test_old_models ‑ test_predict_deprecated_model[respiratory]
tests.regression_tests.model.test_old_models ‑ test_predict_deprecated_model[titanic]
tests.regression_tests.model.test_old_models ‑ test_predict_deprecated_model[twitter_bots]
This pull request removes 1 skipped test and adds 4 skipped tests. Note that renamed tests count towards both.
Removed:
tests.integration_tests.test_experiment ‑ test_experiment_model_resume_distributed_gpu[deepspeed]
Added:
tests.regression_tests.benchmark.test_model_performance ‑ test_performance[adult_census_income.ecd.yaml]
tests.regression_tests.benchmark.test_model_performance ‑ test_performance[ames_housing.ecd.yaml]
tests.regression_tests.benchmark.test_model_performance ‑ test_performance[mercedes_benz_greener.ecd.yaml]
tests.regression_tests.benchmark.test_model_performance ‑ test_performance[sarcos.ecd.yaml]

♻️ This comment has been updated with latest results.

Replace the legacy DDPStrategy, FSDPStrategy, and DeepSpeedStrategy classes
with the unified AccelerateStrategy as the primary distributed training
backend. HuggingFace Accelerate provides a single abstraction that handles
DDP, FSDP, and DeepSpeed via configuration rather than separate class
hierarchies.

- Delete ludwig/distributed/ddp.py, fsdp.py, deepspeed.py
- Delete ludwig/backend/deepspeed.py and DataParallelBackend base class
- Remove registry entries for "ddp", "fsdp", "deepspeed" strategies
- Remove "deepspeed" backend type and its factory
- Update default strategy from "ddp" to "accelerate"
- Update config validation for LLM finetuning to require "accelerate"
- Update all test references from "ddp"/"deepspeed" to "accelerate"
- Clean up DeepSpeed-specific comments throughout codebase
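
Because the "ddp", "fsdp", and "deepspeed" registry entries are removed, configs that still name them will need updating. The following is a user-side sketch only: `migrate_strategy` is a hypothetical helper that is not part of this PR, and the `backend.trainer.strategy` key layout is an assumption about where the strategy name lives in a config. It rewrites only plain string values and does not translate engine-specific options into `fsdp_config` or `deepspeed_config`.

```python
import copy

# Strategy names whose registry entries this PR removes.
REMOVED_STRATEGIES = {"ddp", "fsdp", "deepspeed"}


def migrate_strategy(config: dict) -> dict:
    """Return a copy of `config` with a removed strategy name replaced.

    Hypothetical helper; assumes the name is stored as a string at
    config["backend"]["trainer"]["strategy"]. Other shapes are left untouched.
    """
    new_config = copy.deepcopy(config)
    backend = new_config.get("backend")
    if not isinstance(backend, dict):
        return new_config
    trainer = backend.get("trainer")
    if not isinstance(trainer, dict):
        return new_config
    strategy = trainer.get("strategy")
    if isinstance(strategy, str) and strategy in REMOVED_STRATEGIES:
        trainer["strategy"] = "accelerate"
    return new_config
```

The helper works on a deep copy so the caller's original config is left unchanged and can be compared against the migrated one.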
w4nderlust force-pushed the remove-old-dist-strategies branch from 2da1cd4 to 1f1f985 on April 11, 2026 22:55
w4nderlust merged commit ca26093 into main on Apr 12, 2026 (12 checks passed)
w4nderlust deleted the remove-old-dist-strategies branch on April 12, 2026 00:44