Conversation

ysiraichi
Collaborator

This PR refactors our error handling by replacing GetValueOrThrow with proper status propagation using absl::StatusOr<T> and XLA_ASSIGN_OR_RETURN macros.

Key Changes:

  • ReleaseGilAndTransferData Function:

    • Updated the function signature to return absl::StatusOr<std::vector<xla::Literal>>.
    • Replaced GetComputationClientOrDie() with GetComputationClient().
    • Utilized XLA_ASSIGN_OR_RETURN for client acquisition and TransferFromDevice calls.
    • Updated callers in tensor_util.cpp and xla_graph_executor.cpp to handle the new StatusOr<T> return type.
  • XlaDataToTensors Function:

    • Modified the function signature to return absl::StatusOr<std::vector<at::Tensor>>.
    • Replaced GetValueOrThrow with XLA_ASSIGN_OR_RETURN for the ReleaseGilAndTransferData call.
    • Updated all callers (including XLATensor::ToTensor, test_xla_sharding.cpp, init_python_bindings.cpp, and xla_backend_impl.cpp) to correctly handle the StatusOr<T> return type.
    • Added necessary status.h includes to xla_backend_impl.cpp and test_xla_sharding.cpp.

These modifications align with existing status propagation patterns in the codebase, as seen in pjrt_registry.cpp, and maintain API-level backward compatibility while improving internal error handling within the tensor conversion pipeline.
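The propagation pattern described above can be sketched in a few lines. This is a self-contained illustration only, not the code from this PR: `Status` and `StatusOr` below are toy stand-ins for `absl::Status` and `absl::StatusOr<T>`, `SKETCH_ASSIGN_OR_RETURN` is a hypothetical macro that mimics the assign-or-return idea (it is not the actual definition of `XLA_ASSIGN_OR_RETURN`), and `TransferFromDevice` / `DataToTensors` are simplified placeholders that work on plain vectors rather than `xla::Literal` or `at::Tensor`.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for absl::Status (assumption: NOT the real Abseil type).
struct Status {
  bool ok_flag = true;
  std::string message;
  bool ok() const { return ok_flag; }
};

// Toy stand-in for absl::StatusOr<T>: holds either a value or an error status.
template <typename T>
class StatusOr {
 public:
  StatusOr(T value) : value_(std::move(value)) {}
  StatusOr(Status status) : status_(std::move(status)) {}
  bool ok() const { return value_.has_value(); }
  const Status& status() const { return status_; }
  T& value() { return *value_; }

 private:
  Status status_;
  std::optional<T> value_;
};

// Hypothetical assign-or-return macro: evaluate `expr`, return its status to
// the caller on failure, otherwise move the value into `lhs`.
#define SKETCH_CONCAT_INNER(a, b) a##b
#define SKETCH_CONCAT(a, b) SKETCH_CONCAT_INNER(a, b)
#define SKETCH_ASSIGN_OR_RETURN_IMPL(tmp, lhs, expr) \
  auto tmp = (expr);                                 \
  if (!tmp.ok()) return tmp.status();                \
  lhs = std::move(tmp.value())
#define SKETCH_ASSIGN_OR_RETURN(lhs, expr) \
  SKETCH_ASSIGN_OR_RETURN_IMPL(SKETCH_CONCAT(statusor_, __LINE__), lhs, expr)

// Placeholder for a device transfer that can fail (for example, on OOM).
// The budget of 4 elements is made up for the illustration.
StatusOr<std::vector<int>> TransferFromDevice(int num_elements) {
  if (num_elements > 4) {
    return Status{false, "RESOURCE_EXHAUSTED: out of memory"};
  }
  return std::vector<int>(static_cast<std::size_t>(num_elements), 1);
}

// Placeholder for a conversion step: instead of throwing when the transfer
// fails, it forwards the error status to its own caller.
StatusOr<std::vector<double>> DataToTensors(int num_elements) {
  SKETCH_ASSIGN_OR_RETURN(std::vector<int> literals,
                          TransferFromDevice(num_elements));
  return std::vector<double>(literals.begin(), literals.end());
}
```

In this sketch a caller checks `ok()` on the result of `DataToTensors` and decides at the boundary whether to log the error, convert it, or throw. That decision is roughly what the updated call sites listed above now have to make.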

@ysiraichi

This comment was marked as outdated.

@ysiraichi ysiraichi force-pushed the ysiraichi/propagate-status-for-oom branch from 9d505e7 to 5d4742b Compare July 1, 2025 16:44
@ysiraichi ysiraichi force-pushed the ysiraichi/status-for-oom-errors branch from 247fdf5 to b390a61 Compare July 1, 2025 18:11
@ysiraichi ysiraichi force-pushed the ysiraichi/propagate-status-for-oom branch from 5d4742b to 40a75d7 Compare July 1, 2025 18:11
@ysiraichi ysiraichi force-pushed the ysiraichi/status-for-oom-errors branch from b390a61 to 821c384 Compare July 1, 2025 18:15
@ysiraichi ysiraichi force-pushed the ysiraichi/propagate-status-for-oom branch 2 times, most recently from b0e25da to 97ef4c1 Compare July 3, 2025 14:41
@ysiraichi ysiraichi force-pushed the ysiraichi/status-for-oom-errors branch from 821c384 to 08c5ecd Compare July 3, 2025 14:41
@ysiraichi ysiraichi force-pushed the ysiraichi/propagate-status-for-oom branch from 97ef4c1 to de09876 Compare July 3, 2025 15:42
@ysiraichi ysiraichi force-pushed the ysiraichi/status-for-oom-errors branch 3 times, most recently from 34abde6 to 103cd0f Compare July 24, 2025 12:40
@ysiraichi ysiraichi force-pushed the ysiraichi/propagate-status-for-oom branch from de09876 to bfeb70a Compare July 25, 2025 14:51
@ysiraichi ysiraichi changed the base branch from ysiraichi/status-for-oom-errors to master July 25, 2025 15:11
@ysiraichi ysiraichi marked this pull request as ready for review July 28, 2025 12:46
@ysiraichi ysiraichi force-pushed the ysiraichi/propagate-status-for-oom branch from bfeb70a to f64ae9e Compare July 28, 2025 12:48
Collaborator

@zhanyong-wan zhanyong-wan left a comment

Thanks!

@ysiraichi ysiraichi merged commit 1ed6b46 into master Jul 29, 2025
23 of 24 checks passed
3 participants