feat: iterative LLM feature engineering with fit_selective() #32

Merged
RobertoCorti merged 8 commits into main from feat/iterative-llm-feng on Mar 9, 2026
@RobertoCorti
Owner

Summary

  • Add generate_engineered_features_iterative() to LLMInterface: generates feature ideas informed by feedback context from the previous selection round
  • Add SELECTION_FEEDBACK_PROMPT to guide the LLM with feedback on which prior features were selected/rejected
  • Add fit_selective() to LLMFeatureEngineer — runs multiple generate → evaluate → select → feedback rounds automatically, with an optional verbose flag
  • Adopt sklearn conventions for fitted state: rename public attribute to generated_features_ideas_ (trailing underscore) and remove the redundant generated_features intermediate attribute
  • Add tests for fit_selective, _run_selector, _build_feedback_context, and generate_engineered_features_iterative
  • Update the tutorial notebook with a fit_selective() example
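
For readers skimming the summary, the round structure behind fit_selective() can be sketched roughly as below. This is a minimal illustration, not the code in this PR: `fit_selective_sketch`, `fake_generate`, and `fake_select` are hypothetical names, the evaluate step is folded into the `select` callable, and the feedback is a plain string rather than the real prompt template.

```python
from typing import Callable, Dict, List, Tuple


def fit_selective_sketch(
    generate: Callable[[str], List[str]],
    select: Callable[[List[str]], Dict[str, bool]],
    n_rounds: int = 3,
) -> Tuple[List[str], str]:
    """Hypothetical loop: generate -> select -> feedback, repeated n_rounds times."""
    kept: List[str] = []
    feedback = ""  # empty on the first round: there is no prior selection yet
    for _ in range(n_rounds):
        ideas = generate(feedback)   # stands in for the LLM call
        decisions = select(ideas)    # stands in for evaluate + select
        selected = [name for name in ideas if decisions[name]]
        rejected = [name for name in ideas if not decisions[name]]
        kept.extend(name for name in selected if name not in kept)
        # The next round sees which ideas survived and which did not.
        feedback = f"selected: {selected}; rejected: {rejected}"
    return kept, feedback


# Toy stand-ins so the sketch runs without an LLM or a dataset.
batches = iter([["a_times_b", "a_plus_b"], ["log_a", "a_minus_b"]])


def fake_generate(feedback: str) -> List[str]:
    return next(batches)


def fake_select(ideas: List[str]) -> Dict[str, bool]:
    return {name: name in {"a_times_b", "log_a"} for name in ideas}


kept, last_feedback = fit_selective_sketch(fake_generate, fake_select, n_rounds=2)
print(kept)
print(last_feedback)
```

Passing the generator and selector as callables keeps the loop itself independent of any particular LLM client or selector, which is the property the summary describes.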

Notes

  • The bin discretization transformation is removed from docs/examples (it was never fully supported)
  • generated_features_ideas_ is now the single source of truth for fitted ideas, replacing the old generated_features filtered list

…gineering

Adds the feedback prompt template used to close the generate→select→refine
loop in the upcoming fit_selective() method. The prompt is selector-agnostic
and uses raw scores in markdown tables.
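
As an illustration of "raw scores in markdown tables", a feedback block could be rendered along these lines. This is a sketch under assumptions: `build_feedback_table` and its column layout are hypothetical and are not the actual SELECTION_FEEDBACK_PROMPT format.

```python
from typing import Dict, Set


def build_feedback_table(scores: Dict[str, float], selected: Set[str]) -> str:
    """Render per-feature raw scores and the selection outcome as a markdown table."""
    lines = ["| feature | score | selected |", "| --- | --- | --- |"]
    # Highest score first, so the LLM sees the strongest ideas at the top.
    for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
        status = "yes" if name in selected else "no"
        lines.append(f"| {name} | {score:.4f} | {status} |")
    return "\n".join(lines)


table = build_feedback_table(
    {"a_times_b": 0.8123, "a_plus_b": 0.0311},
    selected={"a_times_b"},
)
print(table)
```

Because the table only needs a score per feature and a yes/no outcome, the same rendering would work for any selector that exposes those two things.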

Adds multi-turn conversation support to LLMInterface. The new
generate_engineered_features_iterative() method accepts a prompt context,
conversation history, and optional feedback context to drive the
generate→select→refine loop for fit_selective().
…feedback

Adds fit_selective(), _run_selector(), and _build_feedback_context() to
LLMFeatureEngineer. The selector is fit on all features (original + new)
and a sub-mask is extracted for new features only, so selection reflects
each feature's value in full context.
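
The sub-mask idea can be sketched as follows. `TopKSelector` is a hypothetical stand-in for an sklearn-style selector (it ranks precomputed scores instead of fitting on data), and the column names and scores are made up for illustration; only the slicing of the support mask reflects the description above.

```python
from typing import Dict, List


class TopKSelector:
    """Hypothetical stand-in for an sklearn-style selector: keeps the k best columns."""

    def __init__(self, k: int) -> None:
        self.k = k

    def fit(self, scores: Dict[str, float]) -> "TopKSelector":
        ranked = sorted(scores, key=scores.get, reverse=True)
        top = set(ranked[: self.k])
        # Boolean mask aligned with the column order the selector was fit on.
        self.support_ = [col in top for col in scores]
        return self

    def get_support(self) -> List[bool]:
        return list(self.support_)


original_cols = ["a", "b"]
new_cols = ["a_times_b", "a_plus_b"]
scores = {"a": 0.40, "b": 0.10, "a_times_b": 0.75, "a_plus_b": 0.05}

# Fit on original + new columns together, originals first.
all_cols = original_cols + new_cols
selector = TopKSelector(k=2).fit({col: scores[col] for col in all_cols})
full_mask = selector.get_support()

# New features occupy the tail of the mask, so slice past the originals.
new_mask = full_mask[len(original_cols):]
selected_new = [col for col, keep in zip(new_cols, new_mask) if keep]
print(selected_new)
```

Here `a_plus_b` loses its slot to the original column `a`, which is the point of fitting on everything: a new feature is kept only if it earns its place next to the features that already exist.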
…text, and generate_engineered_features_iterative
… fit_selective example

- Add verbose=0/1/2 to LLMFeatureEngineer for fit_selective() progress output
- Add sub-step messages in _run_selector (applying transformations, running selector)
- Remove BinTransformation (binning.py, imports, tests, docs references)
- Add fit_selective() section to tutorial notebook
- Rename generated_features_ideas to generated_features_ideas_ (trailing underscore marks fitted attributes per sklearn convention)
- Add skfeaturellm/exceptions.py with custom NotFittedError (inherits ValueError + AttributeError like sklearn)
- Add skfeaturellm/utils/validation.py with check_is_fitted(), replacing ad-hoc hasattr checks
- Extract _format_dataset_statistics and generate_prompt_context from LLMInterface into skfeaturellm/prompts/utils module
- Move feature_evaluator from instance attribute to local variable in evaluate_features
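
A rough sketch of the fitted-state pattern from the last commit, assuming a simple attribute-based check. The signature of `check_is_fitted` and the `TinyEngineer` class are hypothetical; the real code lives in skfeaturellm/exceptions.py and skfeaturellm/utils/validation.py and may differ.

```python
from typing import Iterable, List


class NotFittedError(ValueError, AttributeError):
    """Raised when a fitted attribute is used before fitting.

    Inheriting from both bases lets callers catch either one, as in sklearn.
    """


def check_is_fitted(estimator: object, attributes: Iterable[str]) -> None:
    """Raise NotFittedError if any of the named fitted attributes is missing."""
    missing = [name for name in attributes if not hasattr(estimator, name)]
    if missing:
        raise NotFittedError(
            f"{type(estimator).__name__} is not fitted yet; "
            f"missing: {', '.join(missing)}"
        )


class TinyEngineer:
    """Hypothetical estimator that sets a trailing-underscore attribute in fit()."""

    def fit(self) -> "TinyEngineer":
        self.generated_features_ideas_ = ["a_times_b"]
        return self

    def ideas(self) -> List[str]:
        check_is_fitted(self, ["generated_features_ideas_"])
        return self.generated_features_ideas_


try:
    TinyEngineer().ideas()
except ValueError as exc:  # also catchable as AttributeError
    print(f"caught: {exc}")

print(TinyEngineer().fit().ideas())
```

Centralizing the check in one helper means every public method fails with the same exception type and message shape, instead of each method doing its own hasattr test.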
RobertoCorti merged commit 5c58f47 into main on Mar 9, 2026
8 checks passed