add extensive CREPA scheduling and thresholding options #2429
cc @kabachuha I added a recoverable mode in case the similarity drops over long finetunes; could be useful. It uses an EMA so that we don't get a flickering effect from it recovering too quickly.
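One way to picture the recoverable mode described in the comment is the minimal sketch below. It is not the PR's implementation: the class name `SimilarityGate`, its method, and the assumption that CREPA is switched off once the smoothed similarity reaches the threshold (and switched back on in recoverable mode when it drops below) are all hypothetical, inferred only from the wording above.

```python
class SimilarityGate:
    """Hypothetical gate: disables CREPA once smoothed similarity reaches a threshold."""

    def __init__(self, threshold, ema_decay=0.99, mode="permanent"):
        self.threshold = threshold
        self.ema_decay = ema_decay
        self.mode = mode          # "permanent" or "recoverable" (assumed semantics)
        self.ema = None           # exponential moving average of similarity
        self.disabled = False

    def update(self, similarity):
        """Feed one similarity reading; return True if CREPA should stay active."""
        if self.ema is None:
            self.ema = similarity
        else:
            self.ema = self.ema_decay * self.ema + (1.0 - self.ema_decay) * similarity

        if self.ema >= self.threshold:
            self.disabled = True
        elif self.mode == "recoverable":
            # Only the smoothed value can re-enable CREPA, so a single low
            # reading does not flip the state back immediately.
            self.disabled = False
        return not self.disabled


gate = SimilarityGate(threshold=0.9, ema_decay=0.99, mode="recoverable")
first = gate.update(0.95)   # EMA starts at 0.95, above the threshold
second = gate.update(0.10)  # one low reading barely moves the EMA
```

With a decay of 0.99, the single low reading leaves the average at about 0.94, so the gate stays disabled; a sustained drop would be needed to bring CREPA back, which is the anti-flicker behavior the comment describes.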
Pull request overview
This pull request adds comprehensive CREPA (Cross-frame Representation Alignment) coefficient scheduling capabilities for video model fine-tuning. The changes enable dynamic adjustment of CREPA regularization strength during training through warmup, decay schedules, step-based cutoffs, and similarity-based automatic disabling to prevent artifacts in text2video scenarios.
Changes:
- Introduced `CrepaScheduler` class with support for constant, linear, cosine, and polynomial decay schedules
- Added warmup, decay, and cutoff mechanisms with both step-based and similarity-threshold triggers
- Extended documentation across 6 languages (English, Spanish, Portuguese, Japanese, Hindi, Chinese) with detailed configuration examples
- Comprehensive test suite covering all scheduling functionality
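
To make the schedule types listed above concrete, here is a rough sketch of how warmup, decay, and a step cutoff could combine into a single weight. The function `crepa_weight` and its formulas are assumptions based on common scheduler conventions (linear warmup from zero, decay measured from the end of warmup); the actual `CrepaScheduler` may differ in any of these details.

```python
import math


def crepa_weight(step, base=0.5, end=0.0, warmup_steps=0, decay_steps=1000,
                 scheduler="constant", power=1.0, cutoff_step=0):
    """Hypothetical CREPA coefficient at a given step (not the PR's code)."""
    # Step-based cutoff: the weight is zero from cutoff_step onward.
    if cutoff_step > 0 and step >= cutoff_step:
        return 0.0
    # Linear warmup from 0 up to the base weight.
    if warmup_steps > 0 and step < warmup_steps:
        return base * step / warmup_steps
    if scheduler == "constant":
        return base
    # Fraction of the decay phase completed, clamped to [0, 1].
    progress = min(1.0, max(0.0, (step - warmup_steps) / max(1, decay_steps)))
    if scheduler == "linear":
        return base + (end - base) * progress
    if scheduler == "cosine":
        return end + (base - end) * 0.5 * (1.0 + math.cos(math.pi * progress))
    if scheduler == "polynomial":
        return end + (base - end) * (1.0 - progress) ** power
    raise ValueError(f"unknown scheduler: {scheduler}")


halfway_linear = crepa_weight(500, scheduler="linear", decay_steps=1000)
during_warmup = crepa_weight(50, warmup_steps=100)
after_cutoff = crepa_weight(2000, cutoff_step=1500)
```

Under these assumptions the weight ramps up during warmup, follows the chosen curve from `base` toward `end`, and drops to zero once the cutoff step is reached.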
Reviewed changes
Copilot reviewed 16 out of 16 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| simpletuner/helpers/training/crepa.py | Core scheduler implementation with warmup, decay, and cutoff logic |
| tests/test_crepa.py | Comprehensive test suite for all scheduler behaviors |
| simpletuner/simpletuner_sdk/server/services/field_registry/sections/loss.py | UI field definitions for all new scheduling parameters |
| simpletuner/helpers/models/common.py | Integration of scheduler into model foundation with step tracking |
| documentation/OPTIONS.md (+ 5 translations) | Detailed parameter documentation with examples |
| documentation/experimental/VIDEO_CREPA.md (+ 5 translations) | Extended usage guide with recommended configurations |

    @@ -1,9 +1,10 @@
    import math

Copilot AI, Jan 16, 2026
The `math` import at line 1 is added but the existing `CrepaRegularizer` class already imports `math` in `crepa.py` at line 2. While this is fine for the test file (since it uses `math` indirectly through the scheduler tests), consider whether any test code directly uses the `math` module. If not, this import may be unnecessary since all math operations are handled within the `CrepaScheduler` class itself.

    import math


    self.scheduler_type = str(getattr(config, "crepa_scheduler", "constant") or "constant").lower()
    self.base_weight = float(getattr(config, "crepa_lambda", 0.5) or 0.0)
    self.warmup_steps = int(getattr(config, "crepa_warmup_steps", 0) or 0)
    raw_decay_steps = getattr(config, "crepa_decay_steps", 0) or 0
    self.decay_steps = int(raw_decay_steps) if int(raw_decay_steps) > 0 else max_train_steps
    self.lambda_end = float(getattr(config, "crepa_lambda_end", 0.0) or 0.0)
    self.cutoff_step = int(getattr(config, "crepa_cutoff_step", 0) or 0)
    self.similarity_threshold = getattr(config, "crepa_similarity_threshold", None)
    if self.similarity_threshold is not None:
        self.similarity_threshold = float(self.similarity_threshold)
    raw_ema_decay = getattr(config, "crepa_similarity_ema_decay", None)
    self.similarity_ema_decay = float(raw_ema_decay) if raw_ema_decay is not None else 0.99
    self.threshold_mode = str(getattr(config, "crepa_threshold_mode", "permanent") or "permanent").lower()
    self.power = float(getattr(config, "crepa_power", 1.0) or 1.0)

Copilot AI, Jan 16, 2026
Lines 19-23 use a pattern of `getattr(config, key, default) or default`, which is redundant. If the attribute doesn't exist, `getattr` already returns the default. The `or default` pattern only helps if the attribute exists but is None/0/empty. Consider simplifying to just use the `getattr` default, or clarify the intent if empty string/0 values need special handling.
Suggested change:

    - self.scheduler_type = str(getattr(config, "crepa_scheduler", "constant") or "constant").lower()
    - self.base_weight = float(getattr(config, "crepa_lambda", 0.5) or 0.0)
    - self.warmup_steps = int(getattr(config, "crepa_warmup_steps", 0) or 0)
    - raw_decay_steps = getattr(config, "crepa_decay_steps", 0) or 0
    - self.decay_steps = int(raw_decay_steps) if int(raw_decay_steps) > 0 else max_train_steps
    - self.lambda_end = float(getattr(config, "crepa_lambda_end", 0.0) or 0.0)
    - self.cutoff_step = int(getattr(config, "crepa_cutoff_step", 0) or 0)
    - self.similarity_threshold = getattr(config, "crepa_similarity_threshold", None)
    - if self.similarity_threshold is not None:
    -     self.similarity_threshold = float(self.similarity_threshold)
    - raw_ema_decay = getattr(config, "crepa_similarity_ema_decay", None)
    - self.similarity_ema_decay = float(raw_ema_decay) if raw_ema_decay is not None else 0.99
    - self.threshold_mode = str(getattr(config, "crepa_threshold_mode", "permanent") or "permanent").lower()
    - self.power = float(getattr(config, "crepa_power", 1.0) or 1.0)
    + self.scheduler_type = str(getattr(config, "crepa_scheduler", "constant")).lower()
    + self.base_weight = float(getattr(config, "crepa_lambda", 0.5))
    + self.warmup_steps = int(getattr(config, "crepa_warmup_steps", 0))
    + raw_decay_steps = getattr(config, "crepa_decay_steps", 0)
    + self.decay_steps = int(raw_decay_steps) if int(raw_decay_steps) > 0 else max_train_steps
    + self.lambda_end = float(getattr(config, "crepa_lambda_end", 0.0))
    + self.cutoff_step = int(getattr(config, "crepa_cutoff_step", 0))
    + self.similarity_threshold = getattr(config, "crepa_similarity_threshold", None)
    + if self.similarity_threshold is not None:
    +     self.similarity_threshold = float(self.similarity_threshold)
    + raw_ema_decay = getattr(config, "crepa_similarity_ema_decay", None)
    + self.similarity_ema_decay = float(raw_ema_decay) if raw_ema_decay is not None else 0.99
    + self.threshold_mode = str(getattr(config, "crepa_threshold_mode", "permanent")).lower()
    + self.power = float(getattr(config, "crepa_power", 1.0))

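
The two patterns are not strictly equivalent, which is worth seeing before accepting the simplification. The sketch below uses a hypothetical `SimpleNamespace` as a stand-in for the real config object (an assumption; the project's config class is not shown here) and sets `crepa_lambda` to `None`, the case the comment itself flags.

```python
from types import SimpleNamespace

# Stand-in config where the attribute exists but was left unset (None).
config = SimpleNamespace(crepa_lambda=None)

# Pattern in the PR: `or` supplies a fallback for None/0/empty values.
with_or = float(getattr(config, "crepa_lambda", 0.5) or 0.0)

# Simplified pattern: the getattr default applies only if the attribute is missing.
try:
    without_or = float(getattr(config, "crepa_lambda", 0.5))
except TypeError:
    without_or = None  # float(None) raises TypeError

# A genuinely missing attribute behaves the same under both patterns.
missing = float(getattr(config, "crepa_lambda_end", 0.0))
```

If the config can carry explicit `None` values, the `or` fallback is doing real work and dropping it would turn a silent default into an exception.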

    # Early exit if weight is zero (cutoff active or decayed to zero)
    if scheduled_weight == 0:

Copilot AI, Jan 16, 2026
Using exact equality comparison with floating point (line 304) could be problematic due to floating point precision. Consider using a small epsilon for comparison (e.g., `if scheduled_weight < 1e-8:` or `if abs(scheduled_weight) < 1e-8:`) to handle potential numerical precision issues, especially since weights are computed through mathematical operations like cosine and polynomial functions.
Suggested change:

    - # Early exit if weight is zero (cutoff active or decayed to zero)
    - if scheduled_weight == 0:
    + # Early exit if weight is effectively zero (cutoff active or decayed to ~0)
    + if abs(scheduled_weight) < 1e-8:

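
As a generic illustration of the precision concern, the sketch below accumulates decay progress in increments of 0.1, which leaves a tiny residue instead of an exact zero. This is a hypothetical calculation, not the scheduler's own arithmetic; whether the real `CrepaScheduler` can produce such residue depends on how it computes progress.

```python
# Hypothetical linear decay whose progress is accumulated step by step.
base_weight = 0.5
progress = 0.0
for _ in range(10):
    progress += 0.1  # ten increments of 0.1 do not sum to exactly 1.0

scheduled_weight = base_weight * (1.0 - progress)

exact_zero = scheduled_weight == 0                 # misses the residue
effectively_zero = abs(scheduled_weight) < 1e-8    # catches it
```

Here the exact comparison is false while the epsilon check is true, so only the second form would trigger the early exit.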
Closes #2409
This pull request expands and improves the documentation for CREPA coefficient scheduling options across all supported languages. It introduces detailed descriptions and configuration examples for new CREPA scheduling features, including warmup, decay, cutoff, and similarity-based disabling. These updates help users understand and configure CREPA regularization more flexibly and effectively for various fine-tuning scenarios.
CREPA Scheduling Documentation Updates:
- Added detailed descriptions for new CREPA scheduling options (`crepa_scheduler`, `crepa_warmup_steps`, `crepa_decay_steps`, `crepa_lambda_end`, `crepa_power`, `crepa_cutoff_step`, `crepa_similarity_threshold`, `crepa_similarity_ema_decay`, `crepa_threshold_mode`) to the main options documentation in English, Spanish, Portuguese, Japanese, Hindi, and Chinese.
- Updated example configuration sections in all languages to demonstrate how to use the new scheduling options in `toml` format.

Advanced Usage and Recommendations:
- experimental/VIDEO_CREPA.es.md) with a new section on coefficient scheduling, including JSON configuration examples, explanations of scheduler types, and recommended setups for different use cases (i2v, t2v, solid backgrounds).

These changes ensure that users in all supported languages have access to clear, comprehensive guidance on advanced CREPA scheduling features.