Conversation

bghira (Owner) commented Jan 16, 2026

Closes #2409

This pull request expands and improves the documentation for CREPA coefficient scheduling options across all supported languages. It introduces detailed descriptions and configuration examples for new CREPA scheduling features, including warmup, decay, cutoff, and similarity-based disabling. These updates help users understand and configure CREPA regularization more flexibly and effectively for various fine-tuning scenarios.

CREPA Scheduling Documentation Updates:

  • Added detailed descriptions for new CREPA scheduling options (crepa_scheduler, crepa_warmup_steps, crepa_decay_steps, crepa_lambda_end, crepa_power, crepa_cutoff_step, crepa_similarity_threshold, crepa_similarity_ema_decay, crepa_threshold_mode) to the main options documentation in English, Spanish, Portuguese, Japanese, Hindi, and Chinese.

  • Updated example configuration sections in all languages to demonstrate how to use the new scheduling options in TOML format.

Advanced Usage and Recommendations:

  • Expanded the Spanish experimental guide (experimental/VIDEO_CREPA.es.md) with a new section on coefficient scheduling, including JSON configuration examples, explanations of scheduler types, and recommended setups for different use cases (i2v, t2v, solid backgrounds).

These changes ensure that users in all supported languages have access to clear, comprehensive guidance on advanced CREPA scheduling features.

bghira (Owner, Author) commented Jan 16, 2026

cc @kabachuha i added a recoverable mode in case the similarity drops over long finetunes. could be useful. it uses an ema so that we don't have a flickering effect of it recovering too quickly.
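A minimal sketch of how a recoverable threshold with EMA smoothing might behave, for readers following along. The class name SimilarityGate, its method names, and the rule "disable once the smoothed similarity reaches the threshold" are assumptions made for illustration; they are not taken from the merged code.

```python
class SimilarityGate:
    """Hypothetical helper: decides whether CREPA is disabled based on smoothed similarity."""

    def __init__(self, threshold, ema_decay=0.99, mode="permanent"):
        self.threshold = threshold
        self.ema_decay = ema_decay
        self.mode = mode  # "permanent" or "recoverable"
        self.ema = None
        self.disabled = False

    def update(self, similarity):
        # Seed the EMA with the first observation, then blend in later ones.
        if self.ema is None:
            self.ema = similarity
        else:
            self.ema = self.ema_decay * self.ema + (1.0 - self.ema_decay) * similarity

        reached = self.ema >= self.threshold
        if self.mode == "permanent":
            # Once tripped, the gate stays tripped.
            self.disabled = self.disabled or reached
        else:
            # Recoverable: re-enable when the smoothed similarity falls back below the threshold.
            self.disabled = reached
        return self.disabled


gate = SimilarityGate(threshold=0.9, ema_decay=0.5, mode="recoverable")
print(gate.update(1.0))  # smoothed similarity is at the threshold, so CREPA is disabled
print(gate.update(0.0))  # smoothed similarity drops to 0.5, so CREPA is re-enabled
```

A decay close to 1 makes the smoothed value move slowly, which is what keeps a single noisy batch from toggling the regularizer on and off.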

Copilot AI (Contributor) left a comment

Pull request overview

This pull request adds comprehensive CREPA (Cross-frame Representation Alignment) coefficient scheduling capabilities for video model fine-tuning. The changes enable dynamic adjustment of CREPA regularization strength during training through warmup, decay schedules, step-based cutoffs, and similarity-based automatic disabling to prevent artifacts in text2video scenarios.

Changes:

  • Introduced CrepaScheduler class with support for constant, linear, cosine, and polynomial decay schedules
  • Added warmup, decay, and cutoff mechanisms with both step-based and similarity-threshold triggers
  • Extended documentation across 6 languages (English, Spanish, Portuguese, Japanese, Hindi, Chinese) with detailed configuration examples
  • Comprehensive test suite covering all scheduling functionality
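As a rough illustration of how warmup, decay, and a step cutoff could combine into one weight, here is a hedged sketch. The function crepa_weight and its parameter names are hypothetical; in particular, the choices to ramp warmup linearly from zero and to count decay progress from the end of warmup are assumptions, not a description of CrepaScheduler.

```python
import math


def crepa_weight(step, base_weight=0.5, scheduler="constant", warmup_steps=0,
                 decay_steps=1000, lambda_end=0.0, power=1.0, cutoff_step=0):
    """Hypothetical schedule: returns the CREPA coefficient for a given step."""
    # Step-based cutoff: the regularizer is off from cutoff_step onward.
    if cutoff_step > 0 and step >= cutoff_step:
        return 0.0
    # Linear warmup from 0 up to base_weight (assumed shape).
    if warmup_steps > 0 and step < warmup_steps:
        return base_weight * step / warmup_steps
    if scheduler == "constant":
        return base_weight

    # Fraction of the decay phase completed, clamped to [0, 1].
    progress = (step - warmup_steps) / max(decay_steps, 1)
    progress = min(max(progress, 0.0), 1.0)

    if scheduler == "linear":
        factor = 1.0 - progress
    elif scheduler == "cosine":
        factor = 0.5 * (1.0 + math.cos(math.pi * progress))
    elif scheduler == "polynomial":
        factor = (1.0 - progress) ** power
    else:
        raise ValueError(f"unknown scheduler: {scheduler}")

    # Interpolate between the end value and the base weight.
    return lambda_end + (base_weight - lambda_end) * factor


for s in (0, 250, 500, 1000):
    print(s, crepa_weight(s, scheduler="linear", decay_steps=1000))
```

With power=1.0 the polynomial schedule in this sketch coincides with the linear one; larger powers front-load the decay.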

Reviewed changes

Copilot reviewed 16 out of 16 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| simpletuner/helpers/training/crepa.py | Core scheduler implementation with warmup, decay, and cutoff logic |
| tests/test_crepa.py | Comprehensive test suite for all scheduler behaviors |
| simpletuner/simpletuner_sdk/server/services/field_registry/sections/loss.py | UI field definitions for all new scheduling parameters |
| simpletuner/helpers/models/common.py | Integration of scheduler into model foundation with step tracking |
| documentation/OPTIONS.md (+ 5 translations) | Detailed parameter documentation with examples |
| documentation/experimental/VIDEO_CREPA.md (+ 5 translations) | Extended usage guide with recommended configurations |


@@ -1,9 +1,10 @@
+import math

Copilot AI Jan 16, 2026

This adds import math at line 1 of the test file, while crepa.py already imports math at line 2 for its own use. An import in one module is not shared with another, so the test file needs its own import only if test code calls the math module directly. If no test does, this import is unnecessary, since all math operations are handled within the CrepaScheduler class itself.

Suggested change
-import math

Copilot uses AI. Check for mistakes.
Comment on lines +19 to +32
self.scheduler_type = str(getattr(config, "crepa_scheduler", "constant") or "constant").lower()
self.base_weight = float(getattr(config, "crepa_lambda", 0.5) or 0.0)
self.warmup_steps = int(getattr(config, "crepa_warmup_steps", 0) or 0)
raw_decay_steps = getattr(config, "crepa_decay_steps", 0) or 0
self.decay_steps = int(raw_decay_steps) if int(raw_decay_steps) > 0 else max_train_steps
self.lambda_end = float(getattr(config, "crepa_lambda_end", 0.0) or 0.0)
self.cutoff_step = int(getattr(config, "crepa_cutoff_step", 0) or 0)
self.similarity_threshold = getattr(config, "crepa_similarity_threshold", None)
if self.similarity_threshold is not None:
    self.similarity_threshold = float(self.similarity_threshold)
raw_ema_decay = getattr(config, "crepa_similarity_ema_decay", None)
self.similarity_ema_decay = float(raw_ema_decay) if raw_ema_decay is not None else 0.99
self.threshold_mode = str(getattr(config, "crepa_threshold_mode", "permanent") or "permanent").lower()
self.power = float(getattr(config, "crepa_power", 1.0) or 1.0)

Copilot AI Jan 16, 2026

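Lines 19-32 repeatedly use the pattern getattr(config, key, default) or default, which is partly redundant. If the attribute doesn't exist, getattr already returns the default; the trailing or default only matters when the attribute exists but is falsy (None, 0, or an empty string). Note also that crepa_lambda defaults to 0.5 when missing but falls back to 0.0 when falsy. Consider simplifying to just the getattr default, or clarify the intent if None/0/empty values need special handling. Be aware that without the fallback, an attribute explicitly set to None would be passed straight to float() or int(), which raises a TypeError.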

Suggested change
-self.scheduler_type = str(getattr(config, "crepa_scheduler", "constant") or "constant").lower()
-self.base_weight = float(getattr(config, "crepa_lambda", 0.5) or 0.0)
-self.warmup_steps = int(getattr(config, "crepa_warmup_steps", 0) or 0)
-raw_decay_steps = getattr(config, "crepa_decay_steps", 0) or 0
-self.decay_steps = int(raw_decay_steps) if int(raw_decay_steps) > 0 else max_train_steps
-self.lambda_end = float(getattr(config, "crepa_lambda_end", 0.0) or 0.0)
-self.cutoff_step = int(getattr(config, "crepa_cutoff_step", 0) or 0)
-self.similarity_threshold = getattr(config, "crepa_similarity_threshold", None)
-if self.similarity_threshold is not None:
-    self.similarity_threshold = float(self.similarity_threshold)
-raw_ema_decay = getattr(config, "crepa_similarity_ema_decay", None)
-self.similarity_ema_decay = float(raw_ema_decay) if raw_ema_decay is not None else 0.99
-self.threshold_mode = str(getattr(config, "crepa_threshold_mode", "permanent") or "permanent").lower()
-self.power = float(getattr(config, "crepa_power", 1.0) or 1.0)
+self.scheduler_type = str(getattr(config, "crepa_scheduler", "constant")).lower()
+self.base_weight = float(getattr(config, "crepa_lambda", 0.5))
+self.warmup_steps = int(getattr(config, "crepa_warmup_steps", 0))
+raw_decay_steps = getattr(config, "crepa_decay_steps", 0)
+self.decay_steps = int(raw_decay_steps) if int(raw_decay_steps) > 0 else max_train_steps
+self.lambda_end = float(getattr(config, "crepa_lambda_end", 0.0))
+self.cutoff_step = int(getattr(config, "crepa_cutoff_step", 0))
+self.similarity_threshold = getattr(config, "crepa_similarity_threshold", None)
+if self.similarity_threshold is not None:
+    self.similarity_threshold = float(self.similarity_threshold)
+raw_ema_decay = getattr(config, "crepa_similarity_ema_decay", None)
+self.similarity_ema_decay = float(raw_ema_decay) if raw_ema_decay is not None else 0.99
+self.threshold_mode = str(getattr(config, "crepa_threshold_mode", "permanent")).lower()
+self.power = float(getattr(config, "crepa_power", 1.0))
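To make the trade-off in this suggestion concrete, the following sketch compares the two forms on a config where the attribute is missing and on one where it is explicitly None. The helper read_lambda and the use of SimpleNamespace as a stand-in config object are assumptions for illustration; whether the real config can ever hold None is not stated in this thread.

```python
from types import SimpleNamespace


def read_lambda(config, use_or_fallback):
    """Hypothetical helper mirroring the two ways of reading crepa_lambda."""
    raw = getattr(config, "crepa_lambda", 0.5)
    if use_or_fallback:
        # Original form: any falsy value (None, 0, "") becomes 0.0.
        raw = raw or 0.0
    return float(raw)


missing = SimpleNamespace()                          # attribute absent
explicit_none = SimpleNamespace(crepa_lambda=None)   # attribute present but None

print(read_lambda(missing, True))        # getattr default applies: 0.5
print(read_lambda(explicit_none, True))  # "or" fallback applies: 0.0

try:
    read_lambda(explicit_none, False)    # simplified form: float(None)
except TypeError as exc:
    print("unguarded None fails:", exc)
```

If None can reach the scheduler, the simplified form needs an explicit None check instead of the or fallback.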

Comment on lines +303 to +304
# Early exit if weight is zero (cutoff active or decayed to zero)
if scheduled_weight == 0:

Copilot AI Jan 16, 2026

Using an exact equality comparison on a floating-point value (line 304) could be problematic because of floating-point precision. Consider comparing against a small epsilon instead (e.g., if scheduled_weight < 1e-8: or if abs(scheduled_weight) < 1e-8:) to handle potential numerical precision issues, especially since weights are computed through mathematical operations such as cosine and polynomial functions.

Suggested change
-# Early exit if weight is zero (cutoff active or decayed to zero)
-if scheduled_weight == 0:
+# Early exit if weight is effectively zero (cutoff active or decayed to ~0)
+if abs(scheduled_weight) < 1e-8:
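A small sketch of the difference between the two checks. The helper is_effectively_zero and the 1e-8 tolerance follow the suggestion above but are otherwise illustrative; the arithmetic residue used here is a generic floating-point example, not a value produced by the scheduler.

```python
def is_effectively_zero(weight, eps=1e-8):
    """Hypothetical helper: treat tiny magnitudes as zero."""
    return abs(weight) < eps


# A classic rounding residue: mathematically zero, but not exactly zero in binary floats.
residue = 0.1 + 0.2 - 0.3

print(residue == 0)                  # exact comparison misses it
print(is_effectively_zero(residue))  # tolerance-based comparison catches it
print(is_effectively_zero(0.0))      # a hard cutoff weight of 0.0 still passes
```

Whether the scheduler's own cosine or polynomial endpoints ever land on such a residue depends on how progress is computed, so the tolerance is a defensive choice rather than a confirmed bug fix.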

bghira merged commit a49bb25 into main on Jan 17, 2026
3 checks passed
bghira deleted the feature/crepa-scheduling branch on January 17, 2026 01:17
Development

Successfully merging this pull request may close these issues.

[feature] CREPA coefficient decay
