- MI Weight (Mutual Information): Guides splits toward informative features
- Entropy Calculation: Measures information gain for minority classes
- Adaptive Weighting: Auto-adjusts based on imbalance severity
- Implementation: `OptimizedShannonLoss` in `src/loss.rs`
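As a rough illustration of how an entropy term can guide splits, the sketch below computes Shannon entropy and information gain for a binary split and mixes the gain into a split score. The function names and the additive `combined_score` formula are assumptions made for illustration; they are not taken from `src/loss.rs`.

```rust
/// Shannon entropy, in bits, of a binary label distribution given class counts.
fn binary_entropy(pos: f64, neg: f64) -> f64 {
    let total = pos + neg;
    if total <= 0.0 {
        return 0.0;
    }
    [pos / total, neg / total]
        .iter()
        .filter(|&&p| p > 0.0) // treat 0 * log2(0) as 0
        .map(|&p| -p * p.log2())
        .sum()
}

/// Information gain of splitting a node into two children.
/// Each child is given as (positive_count, negative_count).
fn information_gain(left: (f64, f64), right: (f64, f64)) -> f64 {
    let n_left = left.0 + left.1;
    let n_right = right.0 + right.1;
    let n = n_left + n_right;
    if n <= 0.0 {
        return 0.0;
    }
    let parent = binary_entropy(left.0 + right.0, left.1 + right.1);
    parent
        - (n_left / n) * binary_entropy(left.0, left.1)
        - (n_right / n) * binary_entropy(right.0, right.1)
}

/// Hypothetical split score: gradient-based gain plus an MI-weighted entropy term.
fn combined_score(gradient_gain: f64, info_gain: f64, mi_weight: f64) -> f64 {
    gradient_gain + mi_weight * info_gain
}
```

With an MI weight in the documented 0.1-0.3 range, a split that cleanly isolates the minority class gets a modest bonus on top of its gradient gain.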
- Newton-Raphson Method: Second-order optimization
- Gradient & Hessian: Computed per sample for accurate updates
- Learning Rate: Adaptive shrinkage (0.045-0.05)
- Additive Trees: Sequential ensemble building
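The second-order update can be sketched for the standard logistic loss as follows. This is a generic Newton-Raphson boosting step under that assumption, not PKBoost's exact loss; the helper names are illustrative.

```rust
fn sigmoid(z: f64) -> f64 {
    1.0 / (1.0 + (-z).exp())
}

/// Per-sample gradient and Hessian of the logistic loss
/// with respect to the raw (pre-sigmoid) score.
fn grad_hess(raw_score: f64, label: f64) -> (f64, f64) {
    let p = sigmoid(raw_score);
    (p - label, p * (1.0 - p))
}

/// Newton-Raphson leaf weight: -G / (H + lambda), where G and H are the
/// summed gradients and Hessians of the samples that fall in the leaf.
fn leaf_weight(grads: &[f64], hess: &[f64], reg_lambda: f64) -> f64 {
    let g: f64 = grads.iter().sum();
    let h: f64 = hess.iter().sum();
    -g / (h + reg_lambda)
}

/// Additive update: the new tree's output is shrunk by the learning rate
/// before being added to the running score.
fn boosted_score(previous: f64, leaf_weight: f64, learning_rate: f64) -> f64 {
    previous + learning_rate * leaf_weight
}
```

A small learning rate such as 0.05 means each tree moves the score only a fraction of the full Newton step, which is why many trees are added sequentially.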
- Quantile Binning: Up to 32 bins per feature
- Fast Splitting: O(bins × features) instead of O(samples × features)
- Missing Value Handling: Median imputation
- Implementation: `OptimizedHistogramBuilder` in `src/histogram_builder.rs`
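The idea behind histogram-based splitting can be sketched in a few functions: compute quantile cut points once, map each value to a bin, and accumulate gradient sums per bin so that split search scans bins rather than samples. The code below is a simplified illustration with hypothetical names; it assumes missing values have already been imputed.

```rust
/// Up to `max_bins - 1` quantile cut points for one feature column.
fn quantile_edges(values: &[f64], max_bins: usize) -> Vec<f64> {
    let mut sorted: Vec<f64> = values.iter().copied().filter(|v| !v.is_nan()).collect();
    sorted.sort_by(|a, b| a.total_cmp(b));
    let mut edges = Vec::new();
    if sorted.is_empty() || max_bins < 2 {
        return edges;
    }
    for i in 1..max_bins {
        let edge = sorted[i * (sorted.len() - 1) / max_bins];
        // Skip duplicate cut points produced by repeated values.
        if edges.last() != Some(&edge) {
            edges.push(edge);
        }
    }
    edges
}

/// Bin index of a value: the number of leading edges strictly below it.
fn bin_index(value: f64, edges: &[f64]) -> usize {
    edges.iter().take_while(|&&e| value > e).count()
}

/// Sum of gradients per bin for one feature.
fn build_histogram(values: &[f64], grads: &[f64], edges: &[f64]) -> Vec<f64> {
    let mut hist = vec![0.0; edges.len() + 1];
    for (&v, &g) in values.iter().zip(grads) {
        hist[bin_index(v, edges)] += g;
    }
    hist
}
```

Once the histogram exists, evaluating every candidate split for a feature costs one pass over at most 32 bins, independent of the number of samples.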
- Dataset Profiling: Analyzes size, imbalance, complexity
- Principled Tuning: `auto_tune_principled()` in `src/auto_tuner.rs`
- Parameters Auto-Set:
  - Learning rate (0.045-0.05)
  - Max depth (4-6)
  - Min child weight (1.5-20.0)
  - Regularization (lambda, gamma)
  - MI weight (0.1-0.3)
  - Early stopping rounds (45-92)
- Automatic Class Weighting: `scale_pos_weight` computed from data
- Severity Classification: High (< 10%), Moderate (10-20%), Balanced (> 20%)
- Adaptive Parameters: Depth, learning rate, min_child_weight adjust to imbalance
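The severity bands above can be expressed directly in code. In the sketch below, the thresholds come from the list, while the `negatives / positives` formula for `scale_pos_weight` is the common convention and is only assumed to match what PKBoost computes; all names are illustrative.

```rust
#[derive(Debug, PartialEq)]
enum Severity {
    High,     // minority class below 10%
    Moderate, // minority class between 10% and 20%
    Balanced, // minority class above 20%
}

/// Fraction of samples labelled positive (label > 0.5).
fn positive_ratio(labels: &[f64]) -> f64 {
    if labels.is_empty() {
        return 0.0;
    }
    let positives = labels.iter().filter(|&&y| y > 0.5).count();
    positives as f64 / labels.len() as f64
}

fn classify_severity(pos_ratio: f64) -> Severity {
    if pos_ratio < 0.10 {
        Severity::High
    } else if pos_ratio <= 0.20 {
        Severity::Moderate
    } else {
        Severity::Balanced
    }
}

/// Assumed convention: weight positives by negatives / positives.
fn scale_pos_weight(labels: &[f64]) -> f64 {
    let positives = labels.iter().filter(|&&y| y > 0.5).count();
    let negatives = labels.len() - positives;
    if positives == 0 {
        1.0
    } else {
        negatives as f64 / positives as f64
    }
}
```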
- Reg Lambda: Penalizes large leaf weights (0.2-0.45)
- Prevents Overfitting: Especially on minority classes
- Gamma: Minimum gain required for split (0.1)
- Tree Pruning: Removes low-gain splits
- Row Subsampling: 80% of samples per tree
- Column Subsampling: 70-80% of features per tree
- Reduces Variance: Improves generalization
- Rayon-Based: Multi-threaded tree building
- Adaptive Thresholds: Auto-detects CPU cores and RAM
- Smart Parallelism: Only parallelizes when beneficial (n_features > 20 or n_samples > 5000)
- Implementation: `src/adaptive_parallel.rs`
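The "only parallelize when beneficial" rule can be sketched with the thresholds quoted above. The function names are hypothetical, and the sketch uses only the standard library's core detection rather than Rayon.

```rust
use std::thread;

/// Thresholds from the list above: parallelism pays off only for
/// wide (> 20 features) or tall (> 5000 samples) datasets.
fn should_parallelize(n_samples: usize, n_features: usize) -> bool {
    n_features > 20 || n_samples > 5000
}

/// Number of worker threads to use: all detected cores when the
/// dataset is large enough, otherwise a single thread.
fn worker_count(n_samples: usize, n_features: usize) -> usize {
    if should_parallelize(n_samples, n_features) {
        thread::available_parallelism().map(|n| n.get()).unwrap_or(1)
    } else {
        1
    }
}
```

Small datasets stay single-threaded because the cost of scheduling work across threads can exceed the time saved.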
- Transposed Data: Column-major for cache efficiency
- Cached Histograms: Reused across splits
- Loop Unrolling: 4x unroll in histogram building
- SIMD-Ready: Prepared for vectorization
- Validation-Based: Monitors PR-AUC on validation set
- Smoothed Metrics: 3-iteration moving average
- Patience: Stops after N rounds without improvement
- Saves Time: Avoids unnecessary iterations
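A minimal sketch of smoothed early stopping, assuming a higher-is-better metric such as PR-AUC, a moving-average window, and a patience counter. `EarlyStopper` is a hypothetical type, not the library's internal implementation.

```rust
struct EarlyStopper {
    window: usize,
    patience: usize,
    history: Vec<f64>,
    best: f64,
    rounds_since_best: usize,
}

impl EarlyStopper {
    fn new(window: usize, patience: usize) -> Self {
        Self {
            window: window.max(1),
            patience,
            history: Vec::new(),
            best: f64::NEG_INFINITY,
            rounds_since_best: 0,
        }
    }

    /// Record a validation score; returns true when training should stop.
    fn update(&mut self, score: f64) -> bool {
        self.history.push(score);
        // Moving average over the most recent `window` scores.
        let start = self.history.len().saturating_sub(self.window);
        let recent = &self.history[start..];
        let smoothed = recent.iter().sum::<f64>() / recent.len() as f64;
        if smoothed > self.best {
            self.best = smoothed;
            self.rounds_since_best = 0;
        } else {
            self.rounds_since_best += 1;
        }
        self.rounds_since_best >= self.patience
    }
}
```

Smoothing over three iterations keeps a single noisy validation score from ending training prematurely.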
- Binary Decomposition: N binary classifiers for N classes
- Parallel Training: All classifiers train simultaneously
- Softmax Normalization: Calibrated probability outputs
- Implementation: `src/multiclass.rs`
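To illustrate the normalization step, the sketch below applies a numerically stable softmax to the raw scores of N one-vs-rest classifiers for a single sample and picks the most probable class. Whether PKBoost feeds raw scores or probabilities into its softmax is not stated here, so treat the input convention as an assumption.

```rust
/// Numerically stable softmax over per-class scores for one sample.
fn softmax(scores: &[f64]) -> Vec<f64> {
    let max = scores.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
    let exps: Vec<f64> = scores.iter().map(|s| (s - max).exp()).collect();
    let sum: f64 = exps.iter().sum();
    exps.iter().map(|e| e / sum).collect()
}

/// Index of the largest probability (0 for an empty slice).
fn argmax(probs: &[f64]) -> usize {
    probs
        .iter()
        .enumerate()
        .max_by(|a, b| a.1.total_cmp(b.1))
        .map(|(i, _)| i)
        .unwrap_or(0)
}
```

Subtracting the maximum score before exponentiating avoids overflow without changing the result.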
- Independent Tuning: Each binary task auto-configures
- Local Imbalance: Adapts to class-specific ratios
- Optimal Parameters: Different depth/LR per class
- Shallow Trees: Depth 4-5 (vs 6+ in XGBoost)
- High Regularization: Lambda 0.4-0.45
- Prevents Overfitting: To training distribution
- Distribution-Adaptive: Bins adjust to feature scales
- Robust to Shifts: Handles covariate drift
- Median Imputation: Stable missing value handling
- Shannon Guidance: Prioritizes informative, stable features
- Noise Resistance: MI weight filters out noisy features
- K-Means Clustering: Divides data into partitions
- Specialist Models: One PKBoost per partition
- SimSIMD Integration: SIMD-accelerated distance calculations
- Implementation: `src/partitioned_classifier.rs`
- Per-Partition Tracking: Error monitoring via EMA
- Automatic Detection: 30% error increase threshold
- Selective Metamorphosis: Retrain only drifted partitions
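Per-partition drift detection can be sketched as an exponential moving average (EMA) of each partition's error, flagged once it rises 30% above its baseline. `PartitionMonitor`, the smoothing factor `alpha`, and the baseline handling are illustrative assumptions rather than the library's internals.

```rust
struct PartitionMonitor {
    baseline_error: f64,
    ema_error: f64,
    alpha: f64, // EMA smoothing factor in (0, 1]
}

impl PartitionMonitor {
    fn new(baseline_error: f64, alpha: f64) -> Self {
        Self { baseline_error, ema_error: baseline_error, alpha }
    }

    /// Fold a new batch error into the EMA; returns true when the smoothed
    /// error exceeds the baseline by more than 30%.
    fn observe(&mut self, batch_error: f64) -> bool {
        self.ema_error = self.alpha * batch_error + (1.0 - self.alpha) * self.ema_error;
        self.ema_error > self.baseline_error * 1.30
    }
}

/// Indices of partitions whose monitors report drift for this batch.
fn drifted_partitions(monitors: &mut [PartitionMonitor], batch_errors: &[f64]) -> Vec<usize> {
    monitors
        .iter_mut()
        .zip(batch_errors)
        .enumerate()
        .filter_map(|(i, (m, &e))| if m.observe(e) { Some(i) } else { None })
        .collect()
}
```

Only the returned partition indices would then need retraining, which is what makes selective updates cheaper than a full rebuild.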
- 165x Faster: Than full retraining
- Incremental Updates: Only affected partitions
- Production-Ready: Real-time drift response
- PR-AUC: Precision-Recall Area Under Curve
- ROC-AUC: Receiver Operating Characteristic
- F1 Score: Harmonic mean of precision/recall
- Macro-F1: Unweighted average across classes
- Weighted-F1: Sample-weighted average
- Implementation: `src/metrics.rs`
- Optimal Cutoff: Finds best classification threshold
- F1 Maximization: Balances precision and recall
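One simple way to find an F1-maximizing cutoff is to try every predicted probability as a candidate threshold. The sketch below does exactly that; it is a brute-force illustration with hypothetical names, not necessarily the search used in `src/metrics.rs`.

```rust
/// F1 score when predicting positive for every probability >= threshold.
fn f1_at(probs: &[f64], labels: &[u8], threshold: f64) -> f64 {
    let (mut tp, mut fp, mut fneg): (f64, f64, f64) = (0.0, 0.0, 0.0);
    for (&p, &y) in probs.iter().zip(labels) {
        match (p >= threshold, y == 1) {
            (true, true) => tp += 1.0,
            (true, false) => fp += 1.0,
            (false, true) => fneg += 1.0,
            (false, false) => {}
        }
    }
    let denom = 2.0 * tp + fp + fneg;
    if denom == 0.0 { 0.0 } else { 2.0 * tp / denom }
}

/// Returns (threshold, f1) for the best candidate threshold.
/// Falls back to (0.5, 0.0) when no candidate achieves a positive F1.
fn best_f1_threshold(probs: &[f64], labels: &[u8]) -> (f64, f64) {
    let mut best = (0.5, 0.0);
    for &t in probs {
        let f1 = f1_at(probs, labels, t);
        if f1 > best.1 {
            best = (t, f1);
        }
    }
    best
}
```

This scan is quadratic in the number of samples; sorting the probabilities once and sweeping would be the usual optimization for large validation sets.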
- Continuous Targets: Supports regression tasks
- Huber Loss: Robust to outliers
- MSE/MAE: Standard regression losses
- Implementation: `src/regression.rs`
- MAD-Based: Median Absolute Deviation
- Automatic Flagging: Identifies anomalous samples
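MAD-based flagging can be sketched as a robust z-score test. The 1.4826 scale factor (which makes MAD comparable to a standard deviation for normally distributed data) and the threshold passed by the caller are conventional choices assumed here, not values confirmed by the library.

```rust
/// Median of a slice (NaN for an empty slice).
fn median(values: &[f64]) -> f64 {
    let mut sorted = values.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    let n = sorted.len();
    if n == 0 {
        return f64::NAN;
    }
    if n % 2 == 1 {
        sorted[n / 2]
    } else {
        (sorted[n / 2 - 1] + sorted[n / 2]) / 2.0
    }
}

/// Indices of samples whose robust z-score exceeds `threshold`.
fn mad_outliers(values: &[f64], threshold: f64) -> Vec<usize> {
    let med = median(values);
    let deviations: Vec<f64> = values.iter().map(|v| (v - med).abs()).collect();
    let mad = median(&deviations);
    // A zero (or undefined) MAD means the spread cannot be estimated.
    if !(mad > 0.0) {
        return Vec::new();
    }
    deviations
        .iter()
        .enumerate()
        .filter(|(_, d)| **d / (1.4826 * mad) > threshold)
        .map(|(i, _)| i)
        .collect()
}
```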
- Robust Regression: Combines MSE and MAE
- Outlier Resistance: Less sensitive than squared loss
- Tunable Delta: Adjustable transition point
- Implementation: `src/huber_loss.rs`
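For reference, the standard Huber loss and its gradient can be written as below, with `residual = prediction - target` and `delta` as the transition point between the quadratic and linear regimes. This is the textbook definition; the function names are illustrative and `delta` is assumed to be positive.

```rust
/// Huber loss: quadratic for |residual| <= delta, linear beyond it.
fn huber_loss(residual: f64, delta: f64) -> f64 {
    let abs = residual.abs();
    if abs <= delta {
        0.5 * residual * residual
    } else {
        delta * (abs - 0.5 * delta)
    }
}

/// Gradient with respect to the prediction: the residual, clipped to
/// [-delta, delta]. Requires delta > 0.
fn huber_gradient(residual: f64, delta: f64) -> f64 {
    residual.clamp(-delta, delta)
}
```

Because the gradient is capped at `delta`, a single extreme target cannot dominate a boosting round the way it would under squared loss.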
- Adaptive Regression: Continuous learning under drift
- Gradient Clipping: Prevents gradient explosion
- Heteroscedasticity Detection: Variance change monitoring
- Residual Autocorrelation: Temporal drift detection
- System State Tracking: Model health monitoring
- Implementation: `src/living_regressor.rs`
- Centralized Configuration: Magic numbers in one place
- Maintainability: Easy parameter tuning
- Implementation: `src/constants.rs`
- Batched Processing: Handles large datasets
- Streaming Predictions: `predict_proba_batch()`
- Configurable Batch Size: Memory vs speed trade-off
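The memory versus speed trade-off comes from scoring rows in fixed-size chunks. A generic sketch of the pattern, using a caller-supplied `predict` closure as a stand-in for the model (the helper name is hypothetical):

```rust
/// Run `predict` over `rows` in chunks of `batch_size` and concatenate
/// the results. A batch size of 0 is treated as 1.
fn predict_in_batches<F>(rows: &[Vec<f64>], batch_size: usize, mut predict: F) -> Vec<f64>
where
    F: FnMut(&[Vec<f64>]) -> Vec<f64>,
{
    let mut out = Vec::with_capacity(rows.len());
    for chunk in rows.chunks(batch_size.max(1)) {
        out.extend(predict(chunk));
    }
    out
}
```

Smaller batches bound the size of any intermediate buffers the model allocates per call, while larger batches amortize per-call overhead.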
- Usage Tracking: Monitors feature importance
- Dead Feature Detection: Identifies unused features
- Tree Pruning: Removes trees dependent on dead features
- Implementation: `src/metabolism.rs`
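A rough sketch of dead-feature detection and a pruning rule follows. It assumes usage is tracked as a per-feature split count and that a tree is pruned when the fraction of its splits on dead features reaches a threshold; both the data layout and the threshold semantics are assumptions, since the source does not spell them out.

```rust
/// Indices of features that were never used in a split.
fn dead_features(usage: &[usize]) -> Vec<usize> {
    usage
        .iter()
        .enumerate()
        .filter(|(_, count)| **count == 0)
        .map(|(i, _)| i)
        .collect()
}

/// Hypothetical pruning rule: drop a tree when at least `threshold`
/// of its splits use a dead feature. `split_features` lists the
/// feature index used by each split in the tree.
fn should_prune(split_features: &[usize], dead: &[usize], threshold: f64) -> bool {
    if split_features.is_empty() {
        return false;
    }
    let dead_splits = split_features.iter().filter(|f| dead.contains(f)).count();
    (dead_splits as f64) / (split_features.len() as f64) >= threshold
}
```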
- Vulnerability Scoring: Detects model weaknesses
- Adaptive Retraining: Triggers metamorphosis
- Implementation: `src/adversarial.rs`
- Dynamic Adaptation: Continuous learning
- System State Monitoring: Tracks model health
- Automatic Recovery: Self-healing mechanisms
- Implementation: `src/living_booster.rs`
- Header Detection: Automatic column parsing
- Missing Values: NaN handling with median imputation
- Type Inference: Automatic feature/label separation
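Median imputation of missing values can be sketched for a single feature column as follows, treating `NaN` as the missing marker. The helper name is illustrative.

```rust
/// Replace every NaN in `column` with the median of the non-missing values.
/// Leaves the column unchanged if every value is missing.
fn impute_median(column: &mut [f64]) {
    let mut present: Vec<f64> = column.iter().copied().filter(|v| !v.is_nan()).collect();
    if present.is_empty() {
        return;
    }
    present.sort_by(|a, b| a.total_cmp(b));
    let mid = present.len() / 2;
    let median = if present.len() % 2 == 0 {
        (present[mid - 1] + present[mid]) / 2.0
    } else {
        present[mid]
    };
    for v in column.iter_mut() {
        if v.is_nan() {
            *v = median;
        }
    }
}
```

The median is preferred over the mean here because a few extreme values in the column do not shift it.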
- `Vec<Vec<f64>>`: Standard Rust vectors
- No External Dependencies: Pure Rust implementation
- Memory Efficient: Streaming-friendly design
- PKBoostBuilder: Chainable configuration
- Optional Parameters: Sensible defaults
- Type Safety: Compile-time validation
- Zero Configuration: `PKBoost::auto(&x, &y)`
- One-Line Training: Minimal code required
- Production Ready: No tuning needed
- Optional Validation: Early stopping support
- Stratified Sampling: Maintains class balance
- Cross-Validation Ready: Easy integration
- Training Progress: Iteration-by-iteration metrics
- Diagnostic Info: Gradient norms, prediction ranges
- Debug Mode: Detailed internal state
- Rust Structs: Native serialization support
- Lightweight: Minimal storage overhead
- Fast Loading: Quick model deployment
- XGBoost Comparison: Side-by-side evaluation
- LightGBM Comparison: Performance metrics
- Multiple Datasets: Credit Card, Iris, Dry Bean, etc.
- Implementation: `src/bin/benchmark*.rs`
- Synthetic Drift: Controlled noise injection
- Real-World Drift: Temporal distribution shifts
- Degradation Analysis: Performance tracking
- README.md: Quick start guide
- BENCHMARKS.md: Detailed results
- MULTICLASS.md: Multi-class usage
- SHANNON_ANALYSIS.md: Entropy impact
- DRYBEAN_DRIFT_RESULTS.md: Drift resilience
- Binary Classification: Credit card fraud
- Multi-Class: Iris, Dry Bean
- Regression: Continuous targets
- Drift Handling: Adaptive retraining
- PyO3 Support: Rust-Python bridge
- NumPy Compatible: Array interface
- Scikit-Learn Style: Familiar API
- Implementation: `src/python_bindings.rs`
- Error Entropy: Measures drift severity via Shannon entropy
- Temporal Patterns: Residual autocorrelation detection
- Variance Changes: Heteroscedasticity scoring
- Drift Classification: Systemic vs Localized vs Feature Shift
- Combined Drift Scoring: Multi-factor drift assessment
- Implementation: `DriftDiagnostics` in `src/living_regressor.rs`
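Two of the diagnostics listed above have simple textbook forms: lag-1 autocorrelation of residuals (temporal patterns) and a variance ratio between a recent and a reference window (heteroscedasticity). The sketch below shows those forms under hypothetical names; how `DriftDiagnostics` combines them into a drift score is not shown here.

```rust
fn mean(xs: &[f64]) -> f64 {
    xs.iter().sum::<f64>() / xs.len() as f64
}

/// Population variance (0.0 for an empty slice).
fn variance(xs: &[f64]) -> f64 {
    if xs.is_empty() {
        return 0.0;
    }
    let m = mean(xs);
    xs.iter().map(|x| (x - m).powi(2)).sum::<f64>() / xs.len() as f64
}

/// Lag-1 autocorrelation of residuals; values near 0 suggest no
/// temporal structure, values near +/-1 suggest strong structure.
fn lag1_autocorrelation(residuals: &[f64]) -> f64 {
    if residuals.len() < 2 {
        return 0.0;
    }
    let m = mean(residuals);
    let denom: f64 = residuals.iter().map(|r| (r - m).powi(2)).sum();
    if denom == 0.0 {
        return 0.0;
    }
    let num: f64 = residuals.windows(2).map(|w| (w[0] - m) * (w[1] - m)).sum();
    num / denom
}

/// Ratio of recent residual variance to reference variance; values far
/// from 1.0 indicate a change in error spread.
fn variance_ratio(recent: &[f64], reference: &[f64]) -> f64 {
    let base = variance(reference);
    if base == 0.0 { f64::INFINITY } else { variance(recent) / base }
}
```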
- Conservative: Minimal changes, high stability
- DataAware: Adapts to concept drift patterns
- FeatureAware: Responds to feature distribution shifts
- Auto-Selection: Chooses strategy based on drift analysis
- Implementation: `MetamorphosisStrategy` in `src/living_booster.rs`
- Ensemble Variance: Measures prediction confidence
- Uncertainty Estimation: `predict_with_uncertainty()`
- Confidence Intervals: Standard deviation of predictions
- Implementation: `PKBoostRegressor::predict_with_uncertainty()`
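One common way to turn ensemble variance into an uncertainty estimate is to collect the predictions of several ensemble members for a sample and report their mean and standard deviation. The sketch below assumes that setup; how PKBoost forms its ensemble members is not described here, and `mean_and_std` is a hypothetical helper.

```rust
/// Mean and population standard deviation of the predictions that
/// individual ensemble members make for one sample.
/// Returns None when there are no predictions.
fn mean_and_std(member_predictions: &[f64]) -> Option<(f64, f64)> {
    if member_predictions.is_empty() {
        return None;
    }
    let n = member_predictions.len() as f64;
    let mean = member_predictions.iter().sum::<f64>() / n;
    let var = member_predictions
        .iter()
        .map(|p| (p - mean).powi(2))
        .sum::<f64>()
        / n;
    Some((mean, var.sqrt()))
}
```

A wide spread across members signals that the model is extrapolating or that the region is sparsely covered by training data.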
// Automatic handling - zero configuration
let mut model = OptimizedPKBoostShannon::auto(&x_train, &y_train);
model.fit(&x_train, &y_train, None, true)?;
- Auto Mode: Automatically detects imbalance and adjusts parameters
- Shannon Entropy: MI weight prioritizes minority class (0.1-0.3)
- Class Weighting: `scale_pos_weight` computed from data
- Conservative Depth: Prevents overfitting to majority class
// Real-time adaptation with HAB
let mut hab = PartitionedClassifier::new(PartitionConfig::default());
hab.fit(&x_train, &y_train, None, true)?;
// Detect and adapt
let drifted = hab.observe_batch(&new_data, &new_labels);
if !drifted.is_empty() {
hab.metamorph_partitions(&drifted, &buffer_x, &buffer_y, true)?;
}
- HAB: 165x faster retraining than full model
- EMA Tracking: Automatic drift detection (30% error threshold)
- Selective Retraining: Only affected partitions updated
- Conservative Architecture: 2-17x more drift-resilient
// Optimized for large datasets
let probs = model.predict_proba_batch(&x_test, 1000)?; // Batch size 1000
- Histogram Binning: O(bins × features) vs O(samples × features)
- Adaptive Parallelization: Auto-detects hardware (cores, RAM)
- SIMD-Ready: SimSIMD integration for distance calculations
- Loop Unrolling: 4x unroll in histogram building
// Feature importance and pruning
let usage = model.get_feature_usage();
let pruned = model.prune_trees(&dead_features, 0.5);
- Feature Usage: Tracks which features are used
- Tree Pruning: Removes trees dependent on dead features
- Feature Metabolism: Monitors feature importance over time
- Verbose Logging: Detailed training diagnostics
// One-vs-Rest with per-class tuning
let mut model = MultiClassPKBoost::new(n_classes);
model.fit(&x_train, &y_train, None, true)?;
let probs = model.predict_proba(&x_test)?; // [n_samples, n_classes]
- OvR Strategy: N binary classifiers for N classes
- Parallel Training: All classifiers train simultaneously
- Per-Class Auto-Tuning: Each binary task optimized independently
- Softmax Normalization: Calibrated probability outputs
// Robust regression with outlier handling
let mut model = PKBoostRegressor::auto(&x_train, &y_train);
model.fit(&x_train, &y_train, Some((&x_val, &y_val)), true)?;
- Huber Loss: Robust to outliers
- Living Regressor: Adaptive learning under drift
- Gradient Clipping: Prevents gradient explosion
- Heteroscedasticity Detection: Variance change monitoring
- Total Features: 45
- Core Modules: 20+
- Binary Examples: 15+
- Supported Tasks: Binary, Multi-Class, Regression
- Datasets Tested: 10+
- Lines of Code: ~5,000+
- Performance: 10-40% better than XGBoost/LightGBM on imbalanced data
- Drift Resilience: 2-17x better than competitors
┌─────────────────────────────────────────────────────┐
│                  What's your task?                  │
└──────────────────────────┬──────────────────────────┘
                           │
      ┌─────────────┬──────┴──────┬─────────────┐
      │             │             │             │
   Binary      Multi-Class   Regression     Streaming
      │             │             │             │
      ▼             ▼             ▼             ▼
  ┌───────┐     ┌───────┐     ┌───────┐     ┌───────┐
  │ Auto  │     │ Multi │     │ Auto  │     │Living │
  │ Mode  │     │ Class │     │  Reg  │     │ Boost │
  └───┬───┘     └───┬───┘     └───┬───┘     └───┬───┘
      │             │             │             │
      └──────┬──────┴──────┬──────┴──────┬──────┘
             │             │             │
       Expect drift?       │        High drift?
             │             │             │
         Yes │ No          │         Yes │ No
             ▼             │             ▼
         ┌───────┐         │       ┌───────────┐
         │  HAB  │         │       │Living Reg │
         └───┬───┘         │       └─────┬─────┘
             └─────────────┼─────────────┘
                           │
                      Production
                        Ready!
| Model | Training Time | Prediction Time | Memory |
|---|---|---|---|
| PKBoost | 92.7s | 0.15s | 145 MB |
| XGBoost | 12.0s | 0.18s | 210 MB |
| LightGBM | 11.2s | 0.12s | 190 MB |

| Model | PR-AUC | F1 Score | Recall@90% Precision |
|---|---|---|---|
| PKBoost | 0.878 | 0.874 | 0.812 |
| XGBoost | 0.745 | 0.798 | 0.612 |
| LightGBM | 0.793 | 0.713 | 0.645 |

| Model | Baseline Acc | Drifted Acc | Degradation |
|---|---|---|---|
| PKBoost | 92.54% | 92.14% | -0.43% |
| XGBoost | 92.25% | 91.41% | -0.91% |
| LightGBM | 92.36% | 91.85% | -0.55% |
- Full Retraining: 92.7s
- HAB Metamorphosis: 0.56s (165x faster)
- Selective Partition Update: 0.08s per partition
Solution: Increase regularization
let model = PKBoostBuilder::new()
.reg_lambda(2.0) // Default: 1.0
.gamma(0.2) // Default: 0.1
.subsample(0.7) // Default: 0.8
.build_with_data(&x, &y);
Solution: Reduce complexity or use batching
let model = PKBoostBuilder::new()
.n_estimators(500) // Default: 1000
.max_depth(4) // Default: 6
.histogram_bins(16) // Default: 32
.build_with_data(&x, &y);
// Or use batched prediction
let probs = model.predict_proba_batch(&x_test, 1000)?;
Solution: Use auto mode (already optimized) or increase MI weight
// Auto mode handles this automatically
let model = OptimizedPKBoostShannon::auto(&x, &y);
// Or manually tune
let model = PKBoostBuilder::new()
.mi_weight(0.4) // Higher entropy guidance
.scale_pos_weight(10.0) // Upweight minority class
.build_with_data(&x, &y);
Solution: Already handled. Gradient clipping is automatic
// Living regressor automatically clips gradients > 5000
// See living_regressor.rs - automatic protection
// Monitor via verbose logging if needed
Solution: Check buffer size and validation set
// Ensure buffer has enough samples
let mut booster = AdversarialLivingBooster::new(&x_train, &y_train);
// Need at least 1000 samples in buffer for retraining
// Need at least 100 samples for validation
// Automatic rollback if metamorphosis degrades performance
// Static model
OptimizedPKBoostShannon::auto(&x, &y)
.fit(x, y, eval_set, verbose) -> Result
.predict_proba(x) -> Result<Vec<f64>, String>
.predict_proba_batch(x, batch_size) -> Result<Vec<f64>, String>
.prune_trees(features, threshold) -> usize
.get_feature_usage() -> Vec<usize>
// Adaptive model (binary classification)
AdversarialLivingBooster::new(&x, &y)
.fit_initial(x, y, eval_set, verbose) -> Result
.observe_batch(x, y, verbose) -> Result
.get_state() -> SystemState
.get_metamorphosis_count() -> usize
.get_vulnerability_score() -> f64
// Partitioned model (165x faster adaptation)
PartitionedClassifier::new(config)
.partition_data(x, y, verbose)
.train_specialists(x, y, verbose) -> Result
.observe_batch(x, y) -> Vec<usize> // Returns drifted partitions
.metamorph_partitions(partition_ids, x, y, verbose) -> Result
// Regression
PKBoostRegressor::auto(&x, &y)
.fit(x, y, eval_set, verbose) -> Result
.predict(x) -> Result<Vec<f64>, String>
.predict_with_uncertainty(x) -> Result<(Vec<f64>, Vec<f64>), String>
// Multi-class
MultiClassPKBoost::new(n_classes)
.fit(x, y, eval_set, verbose) -> Result
.predict_proba(x) -> Result<Vec<Vec<f64>>, String>
.predict(x) -> Result<Vec<usize>, String>
// Builder pattern
PKBoostBuilder::new()
.n_estimators(1000)
.learning_rate(0.05)
.max_depth(6)
.mi_weight(0.3)
.reg_lambda(1.0)
.gamma(0.1)
.subsample(0.8)
.colsample_bytree(0.8)
.build_with_data(&x, &y)
SystemState { Normal, Alert { checks_in_alert }, Metamorphosis }
MetamorphosisStrategy { Conservative, DataAware, FeatureAware }
DriftType { Systemic, Localized, FeatureShift }
RegressionLossType { MSE, Huber { delta } }
- SHAP-like values for prediction explanation
- Tree visualization (Graphviz export)
- Feature importance plots
- Partial dependence plots
- Kolmogorov-Smirnov test for distribution shift
- Population Stability Index (PSI)
- Per-feature drift monitoring
- Automatic drift severity classification
- Bagging of PKBoost models
- Stacking with PKBoost as base learner
- Out-of-bag error estimation
- Ensemble diversity metrics
- Platt scaling
- Isotonic regression
- Reliability diagrams
- Expected Calibration Error (ECE)
- Comprehensive error types (`PKBoostError` enum)
- Input validation methods
- Graceful degradation
- Error recovery mechanisms
- Unit tests for all modules
- Integration tests
- Property-based testing
- Benchmark regression tests
- Serde support for JSON/bincode
- Model versioning
- Backward compatibility
- Compression support
- Scikit-learn compatible `get_params()`/`set_params()`
- `feature_importances_` property
- `apply()` method for leaf indices
- GridSearchCV compatibility
| Feature | PKBoost | XGBoost | LightGBM |
|---|---|---|---|
| Shannon Entropy | ✅ | ❌ | ❌ |
| Auto-Tuning | ✅ | ❌ | ❌ |
| Imbalance Detection | ✅ | ❌ | ❌ |
| Multi-Class OvR | ✅ | ✅ | ✅ |
| Drift Detection | ✅ | ❌ | ❌ |
| Adaptive Retraining | ✅ | ❌ | ❌ |
| Histogram Binning | ✅ | ✅ | ✅ |
| Parallel Training | ✅ | ✅ | ✅ |
| Early Stopping | ✅ | ✅ | ✅ |
| Built-in Metrics | ✅ | ❌ | ❌ |
| Zero Configuration | ✅ | ❌ | ❌ |
| Rust Native | ✅ | ❌ | ❌ |
PKBoost: The most feature-complete gradient boosting library for imbalanced data under drift.