GH-16779: update estimators to support sklearn 1.6+ #16780

Open
zazulam wants to merge 1 commit into h2oai:master from zazulam:sklearn-update

@zazulam zazulam commented Mar 10, 2026

Fixes #16779

Summary

The h2o.sklearn wrappers need compatibility updates for scikit-learn 1.6+, mainly around estimator type semantics and tag behavior. scikit-learn 1.6.0 (December 2024) reworked estimator tags and introduced __sklearn_tags__, which returns a Tags object, as the preferred API. The change affects third-party estimator wrappers and type dispatch (is_classifier, is_regressor, and the clone/check tooling).

For generic H2O sklearn wrappers (e.g. H2OGradientBoostingEstimator(estimator_type='classifier')), the configured estimator type can be lost or changed along sklearn integration paths (notably clone followed by tag/type checks) unless the wrapper params and _estimator_type are propagated consistently.
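To make the failure mode concrete, here is a minimal pure-Python sketch. It assumes only that clone rebuilds an estimator from get_params(deep=False); simplified_clone is a stand-in for sklearn.base.clone, and both wrapper classes are hypothetical, not the code in wrapper.py.

```python
class LeakyWrapper:
    """Hypothetical wrapper that forgets to report estimator_type."""

    def __init__(self, estimator_type="regressor", **h2o_params):
        self.estimator_type = estimator_type
        self.h2o_params = h2o_params

    def get_params(self, deep=True):
        # estimator_type is missing here, so a clone cannot see it.
        return dict(self.h2o_params)


class ConsistentWrapper(LeakyWrapper):
    """Hypothetical wrapper that reports estimator_type as a parameter."""

    def get_params(self, deep=True):
        return {"estimator_type": self.estimator_type, **self.h2o_params}


def simplified_clone(estimator):
    # Stand-in for sklearn.base.clone: rebuild from the reported params only.
    return type(estimator)(**estimator.get_params(deep=False))


leaky = simplified_clone(LeakyWrapper(estimator_type="classifier", ntrees=5))
consistent = simplified_clone(ConsistentWrapper(estimator_type="classifier", ntrees=5))

print(leaky.estimator_type)       # regressor (falls back to the default)
print(consistent.estimator_type)  # classifier
```

In the leaky case the clone silently reverts to the constructor default, which is the kind of drift that later type checks would pick up.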

Proposed / implemented fix

In h2o-py/h2o/sklearn/wrapper.py:

  • Keep explicit _estimator_type precedence in classifier/regressor checks.
  • Include estimator_type and init_connection_args in sklearn parameter flow (get_params/set_params) so clone semantics are preserved.
  • Keep these params reserved from forwarding to underlying H2O estimators.
  • Implement __sklearn_tags__ handling for newer sklearn tags API when available.
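The four points above could look roughly like the following sketch. It is not the code in wrapper.py: WrapperSketch, RESERVED_PARAMS, backend_params, and reported_type are hypothetical names, ntrees stands in for any H2O parameter, and the scikit-learn import is guarded so the snippet also runs where scikit-learn is absent or older than 1.6.

```python
try:
    from sklearn.base import BaseEstimator
except ImportError:  # keep the sketch importable without scikit-learn
    BaseEstimator = object

# Hypothetical constant: wrapper-level params that must not reach H2O.
RESERVED_PARAMS = ("estimator_type", "init_connection_args")


class WrapperSketch(BaseEstimator):
    def __init__(self, estimator_type="regressor", init_connection_args=None, ntrees=50):
        self.estimator_type = estimator_type
        self.init_connection_args = init_connection_args
        self.ntrees = ntrees

    def get_params(self, deep=True):
        # Report wrapper-level params so that clone can round-trip them.
        return {
            "estimator_type": self.estimator_type,
            "init_connection_args": self.init_connection_args,
            "ntrees": self.ntrees,
        }

    def backend_params(self):
        # Only non-reserved params would be forwarded to the H2O estimator.
        params = self.get_params(deep=False)
        return {k: v for k, v in params.items() if k not in RESERVED_PARAMS}

    @property
    def _estimator_type(self):
        # Legacy attribute read by type checks before scikit-learn 1.6.
        return self.estimator_type

    def __sklearn_tags__(self):
        # Only usable on scikit-learn >= 1.6, where the parent defines it.
        tags = super().__sklearn_tags__()
        tags.estimator_type = self.estimator_type
        return tags


HAS_TAGS_API = hasattr(BaseEstimator, "__sklearn_tags__")


def reported_type(estimator):
    # Prefer the tags API when available, else the legacy attribute.
    if HAS_TAGS_API:
        return estimator.__sklearn_tags__().estimator_type
    return estimator._estimator_type


clf = WrapperSketch(estimator_type="classifier", ntrees=10)
print(reported_type(clf))    # classifier
print(clf.backend_params())  # {'ntrees': 10}
```

A fuller implementation would probably also populate the classifier- or regressor-specific tag fields; the sketch sets only estimator_type to keep the idea visible.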

In h2o-py/tests/testdir_sklearn/pyunit_sklearn_api.py:

  • Add regression tests for:
    • classifier/regressor identification
    • clone preserving estimator_type semantics
    • sklearn tags estimator_type behavior (when available)

Validation run

  • testdir_sklearn/pyunit_sklearn_api.py → PASS
  • testdir_sklearn/pyunit_sklearn_params.py → PASS

Dependency notes

  • No H2O package dependency changes are required for this fix itself.
  • Compatibility was validated against scikit-learn 1.6.1, the latest 1.x release available in the test environment.
  • Local environment note: running older sklearn wheels with incompatible NumPy can cause ABI errors; this was an environment concern, not a required project dependency change.

Signed-off-by: zazulam <m.zazula@gmail.com>

Development

Successfully merging this pull request may close these issues.

Python sklearn wrappers: fix estimator_type/clone/tag compatibility for scikit-learn 1.6+
