@trivialfis trivialfis commented Jan 19, 2026

  • Return meta info with shape.
  • Use cupy when device is CUDA.

ref #9043

  • Track deprecated methods.


Copilot AI left a comment

Pull request overview

This PR adds a new getter method for matrix information that returns data with proper shape information using the array interface protocol. The implementation automatically uses CuPy arrays when the device is CUDA, improving GPU interoperability.

Changes:

  • Added new MetaField enum and MapMetaField function to standardize field name handling
  • Implemented new GetInfo overload that returns array interface strings for device-aware data access
  • Added XGDMatrixGetArrayInfo C API function to expose the new functionality
  • Refactored Python getters (get_label, get_weight, get_base_margin) to use the new _get_info method
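To make the "array interface string" idea concrete, here is a minimal sketch of how a consumer might decode such a string on the host. It assumes the string is JSON that follows the NumPy `__array_interface__` layout (`data`, `shape`, `typestr`, `version`); the helper name `read_float32_vector` is hypothetical and not part of this PR. It only handles contiguous 1-D `float32` data in host memory. A CUDA device pointer cannot be dereferenced this way, which is why the Python side hands device buffers to CuPy instead.

```python
import ctypes
import json


def read_float32_vector(array_interface: str) -> list:
    """Decode a contiguous 1-D float32 host buffer described by an
    array-interface JSON string (hypothetical helper for illustration)."""
    aif = json.loads(array_interface)

    # In the NumPy array interface, `data` is a (pointer, read_only) pair.
    ptr, _read_only = aif["data"]
    shape = tuple(aif["shape"])

    if aif["typestr"] != "<f4" or len(shape) != 1:
        raise ValueError("this sketch only handles 1-D little-endian float32")
    strides = aif.get("strides")
    if strides is not None and tuple(strides) != (4,):
        raise ValueError("this sketch only handles contiguous data")

    n = shape[0]
    if n == 0:
        return []
    # View the memory at `ptr` as n floats; the owner of the buffer must
    # stay alive for the duration of this call.
    view = (ctypes.c_float * n).from_address(ptr)
    return list(view)


# Stand-in for a buffer owned by the library (assumption for the demo).
labels = (ctypes.c_float * 3)(0.0, 1.0, 0.5)
description = json.dumps(
    {
        "data": [ctypes.addressof(labels), True],
        "shape": [3],
        "typestr": "<f4",
        "version": 3,
    }
)
print(read_float32_vector(description))  # [0.0, 1.0, 0.5]
```

Because the shape travels with the pointer, the caller does not need a separate length argument, which is the main difference from the older float/uint getters.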

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| src/data/data.cc | Added MetaField enum, MapMetaField function, and new GetInfo overload for array interface; refactored existing GetInfo to use switch statements |
| src/c_api/c_api.cc | Added XGDMatrixGetArrayInfo function to expose new functionality through C API |
| python-package/xgboost/core.py | Added _get_info method and updated get_label, get_weight, get_base_margin to use it |
| include/xgboost/data.h | Added declaration for new GetInfo overload |

Comment on lines 1253 to 1264
 def get_weight(self) -> np.ndarray:
-    """Get the weight of the DMatrix.
-    Returns
-    -------
-    weight : array
-    """
-    return self.get_float_info("weight")
+    """Get the weight of the DMatrix."""
+    return self._get_info("weight")

 def get_base_margin(self) -> np.ndarray:
-    """Get the base margin of the DMatrix.
-    Returns
-    -------
-    base_margin
-    """
-    return self.get_float_info("base_margin")
+    """Get the base margin of the DMatrix."""
+    return self._get_info("base_margin")
Copilot AI Jan 19, 2026

The new behavior of get_weight and get_base_margin, which now return CuPy arrays on CUDA devices, lacks test coverage. Add tests similar to the existing test_metainfo in test_device_quantile_dmatrix.py that verify these methods return CuPy arrays when the DMatrix is on a CUDA device.

Copilot uses AI. Check for mistakes.

case MetaField::kGroupPtr: {
  aif = linalg::ArrayInterfaceStr(linalg::MakeVec(this->group_ptr_));
  break;
}
Copilot AI Jan 19, 2026

The switch statement is missing a default case to handle unknown field names. This means if an invalid field is passed to GetInfo, it will silently return an empty string instead of providing a clear error message. Add a default case that logs a fatal error, similar to the switch statement at lines 655-683.

Suggested change
-  }
+  }
+  default: {
+    LOG(FATAL) << "Unknown field name: " << key;
+    break;
+  }
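
The same fail-loudly idea can be sketched in Python rather than the PR's C++, purely as an illustration of why an unknown field should raise instead of yielding an empty result. The names below (`MetaField`, `map_meta_field`, and the `"group_ptr"` key) mirror the PR's vocabulary but are assumptions for this sketch, not the actual implementation.

```python
from enum import Enum, auto


class MetaField(Enum):
    """Illustrative subset of meta info fields (assumed, not exhaustive)."""

    LABEL = auto()
    WEIGHT = auto()
    BASE_MARGIN = auto()
    GROUP_PTR = auto()


# Field-name strings are assumptions for this sketch.
_FIELD_NAMES = {
    "label": MetaField.LABEL,
    "weight": MetaField.WEIGHT,
    "base_margin": MetaField.BASE_MARGIN,
    "group_ptr": MetaField.GROUP_PTR,
}


def map_meta_field(key: str) -> MetaField:
    """Translate a field name into an enum value, rejecting unknown names."""
    try:
        return _FIELD_NAMES[key]
    except KeyError:
        # Equivalent in spirit to a `default:` branch that logs a fatal error.
        raise ValueError(f"Unknown field name: {key}") from None


print(map_meta_field("weight"))  # MetaField.WEIGHT
```

Validating the name once at the mapping step means every later dispatch only sees known enum values, so a missing branch shows up as an explicit error rather than an empty string.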

Comment on lines 881 to 894
XGB_DLL int XGDMatrixGetArrayInfo(DMatrixHandle handle, char const *field, char const **out_array) {
  API_BEGIN();
  CHECK_HANDLE();
  xgboost_CHECK_C_ARG_PTR(field);

  auto p_fmat = CastDMatrixHandle(handle);
  MetaInfo const &info = p_fmat->Info();

  auto &res = p_fmat->GetThreadLocal().ret_str;
  info.GetInfo(p_fmat->Ctx(), StringView{field}, &res);

  *out_array = res.c_str();
  API_END();
}
Copilot AI Jan 19, 2026

The new C API function XGDMatrixGetArrayInfo is implemented but not declared in the public C API header file (include/xgboost/c_api.h). Add a function declaration with proper documentation following the same pattern as XGDMatrixGetFloatInfo and XGDMatrixGetUIntInfo.
