Conversation

nstarman
Collaborator

@nstarman nstarman commented Jul 1, 2025

Requires #32. I'll rebase when that's in.

Now preceding #32.

@nstarman
Collaborator Author

nstarman commented Jul 1, 2025

@jorenham @NeilGirdhar @lucascolley this doesn't resolve the current discussion re a DTypeT typevar or a DType protocol, but does mean Arrays can now understand any typevar or protocol we want to add.

Member

@jorenham jorenham left a comment

I like the Has* protocols. Optype will cover the CanArray* ones I think. There might be some that can't be expressed because Self can't be passed as a generic type argument, but I suppose we can deal with it when needed.

One nit is that ... are not needed when there's a docstring, which already counts as an expression or statement or something.
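
To make the nit concrete, here is a minimal sketch showing that a docstring alone is a valid member body, and that adding `...` changes nothing at runtime. The names `HasShapeDocOnly`, `HasShapeWithEllipsis`, and `Toy` are hypothetical and not taken from this PR.

```python
from typing import Protocol, Tuple, runtime_checkable


@runtime_checkable
class HasShapeDocOnly(Protocol):
    """Hypothetical protocol whose member body is just a docstring."""

    @property
    def shape(self) -> Tuple[int, ...]:
        """Shape of the array."""


@runtime_checkable
class HasShapeWithEllipsis(Protocol):
    """Same protocol, with an explicit ``...`` marking the body as empty."""

    @property
    def shape(self) -> Tuple[int, ...]:
        """Shape of the array."""
        ...


class Toy:
    """Stand-in object with a ``shape`` attribute (illustration only)."""

    shape = (2, 3)


# Both spellings accept the same objects at runtime.
assert isinstance(Toy(), HasShapeDocOnly)
assert isinstance(Toy(), HasShapeWithEllipsis)
```

Whether to keep the `...` is then purely a readability and linter question, which is what the rest of this exchange is about.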

@nstarman
Collaborator Author

nstarman commented Jul 2, 2025

> I like the Has* protocols. Optype will cover the CanArray* ones I think. There might be some that can't be expressed because Self can't be passed as a generic type argument, but I suppose we can deal with it when needed.

What do we do about docstrings?

> One nit is that ... are not needed when there's a docstring, which already counts as an expression or statement or something.

Yes, my pylint is complaining. But IMO empty methods should have ... to distinguish them from ones that aren't empty. Helps in understanding inheritance with a Protocol.

@jorenham
Member

jorenham commented Jul 2, 2025

> What do we do about docstrings?

Hmm good question. Maybe a numpy-esque add_docstring function?
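
One way such a helper could look is sketched below. This is only an illustration of the idea: `add_docstring` here is a hypothetical function written for this example, not NumPy's helper and not an API from this PR. It relies on property docstrings being writable, which has been the case since Python 3.5.

```python
from typing import Any, Protocol, Tuple


def add_docstring(obj: Any, doc: str) -> None:
    """Attach ``doc`` to ``obj`` (hypothetical helper, not an existing API)."""
    obj.__doc__ = doc


class HasShape(Protocol):
    # The body stays empty; the docstring is attached separately below.
    @property
    def shape(self) -> Tuple[int, ...]: ...


# Docstrings live apart from the (empty) protocol bodies.
add_docstring(HasShape.shape, "Shape of the array.")
```

The trade-off is that the documentation is no longer next to the signature, so readers of the source have to look in two places.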

@jorenham
Member

jorenham commented Jul 2, 2025

> Yes, my pylint is complaining. But IMO empty methods should have ... to distinguish them from ones that aren't empty. Helps in understanding inheritance with a Protocol.

Yea I guess there's something to be said for that. It just looks a bit weird to me to see a ... occupy a line on its own (but that might have something to do with me spending most of my breathing time looking at stubs).

@nstarman nstarman force-pushed the has_x_attributes branch 2 times, most recently from 8096aff to 3793c1f July 21, 2025 15:56
@nstarman nstarman force-pushed the has_x_attributes branch 5 times, most recently from cb3dbb3 to dec1842 July 23, 2025 21:05
@nstarman nstarman marked this pull request as ready for review July 23, 2025 21:10
@nstarman nstarman requested a review from jorenham July 23, 2025 21:10
@nstarman
Collaborator Author

@jorenham I added tests and made the protocols public.

@nstarman
Collaborator Author

@jorenham This should cover all the array attributes.

@nstarman
Collaborator Author

nstarman commented Aug 1, 2025

@jorenham it might be easier to merge this before doing numpy type compat stuff from #32.

@nstarman nstarman added this to the v2021-12-0.0 milestone Aug 1, 2025
@nstarman nstarman added ✨ feature Introduce new features. ✅ tests Add, update, or pass tests. labels Aug 1, 2025
...


class HasSize(Protocol):
Member

What's the use-case of this one? Is there anything this can help with, that HasShape can't?
Put differently; should we make this public API or not?
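
For context on this question, the sketch below shows how an element count can be derived from `shape` alone, so a function annotated with a shape protocol does not strictly need a separate size protocol. The `HasShape` definition, the helper `size_from_shape`, and the `Toy` class are assumptions made for this example; they are not the definitions from the PR.

```python
import math
from typing import Optional, Protocol, Tuple


class HasShape(Protocol):
    @property
    def shape(self) -> Tuple[Optional[int], ...]: ...


def size_from_shape(x: HasShape) -> Optional[int]:
    """Number of elements, or None if any dimension is unknown."""
    dims = x.shape
    if any(d is None for d in dims):
        return None
    # An empty shape (a 0-d array) gives a product of 1.
    return math.prod(d for d in dims if d is not None)


class Toy:
    """Stand-in array-like object (illustration only)."""

    def __init__(self, shape):
        self.shape = tuple(shape)
```

With this helper, `size_from_shape(Toy((2, 3)))` gives 6 and `size_from_shape(Toy((2, None)))` gives `None`.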

Collaborator Author

Shouldn't we have Protocols for all attributes in the Array API?

Member

I don't see why we should write a protocol if there's no use for it.

Collaborator Author

def is_sized(obj: Any, /) -> TypeGuard[HasSize]: ...

Member

ok, and what would you use that for in a real-world scenario?

Collaborator Author

@nstarman nstarman Aug 22, 2025

IDK personally, but the Array API is pretty sparing.
I could see using something like:

def check_min_size(obj: object) -> bool:
    if not is_sized(obj) or obj.size is None:
        raise Exception
    return obj.size > 10

Member

Hm yea I guess, but that's probably pretty niche, right?

I guess I just feel like we should first start with fixing the big typing problems that the array-api libs are facing, and I don't consider this to be one of those. That's why I opened #22, so that we can get a better understanding of what those problems actually are. And right now it just feels like we're trying to mimic the runtime API, and blindly hope that it will somehow be helpful, without knowing how exactly.

Collaborator Author

@nstarman nstarman Aug 22, 2025

Not disagreeing that there are other big issues to solve, but people also want and need the more straightforward static Array API. I've had many people ask me about doing straightforward things like

def square[T=Array](x: T) -> T: ...

So yeah, I do think there's a lot of value in starting to fill out the static equivalent to the runtime API.
We know it will be useful. And how.


Yes, for square specifically they should annotate this with some form of CanMul. But as we've been discussing in #32 writing this type is non-trivial and array_api_typing should probably be the space where we vendor something like

NumArrayCanMul = CanMulSelf[..., ...]  # fills in the complex types
BoolArrayCanMul = CanMulSelf[..., ...]  # The bool considerations we discussed

so that higher-level functions that require many parts of the Array API can do

def main():
    x = xp.array(...)
    ...
    ... square(...)
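
As a rough illustration of the `CanMul`-style annotation mentioned for `square`, the sketch below uses a small local protocol. `SupportsMulSelf` is a hypothetical stand-in written for this example; it is not optype's `CanMulSelf` and does not attempt the complex or bool handling discussed in #32.

```python
from typing import Protocol, TypeVar

T = TypeVar("T", bound="SupportsMulSelf")


class SupportsMulSelf(Protocol):
    """Hypothetical protocol: multiplying by the same type returns that type."""

    def __mul__(self: T, other: T, /) -> T: ...


def square(x: T) -> T:
    """Multiply ``x`` by itself, preserving its static type."""
    return x * x
```

Here `square(3)` is typed as `int` and `square(1.5)` as `float`. An array type would be accepted in the same way if its `__mul__` is annotated to return the array type.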

HasNDim,
HasShape,
HasSize,
HasTranspose,
Member

This HasTranspose restricts this to 2d arrays

Collaborator Author

@nstarman nstarman Aug 18, 2025

Does it? I don't know of any array object that doesn't have this method, regardless of the value's dimensionality.

Member

.transpose() is only defined for 2d arrays, so we should expect array libs to follow that, and we should therefore expect there to be annotations like

class SomeArray[ShapeT, DTypeT]:
    # --snip--
    def transpose(self: SomeArray[Rank2, DTypeT]) -> SomeArray[Rank2, DTypeT]: ...

So a SomeArray[Rank1] won't be assignable to xpt.Array.

Collaborator Author

@nstarman nstarman Aug 22, 2025

That sounds great.
It sounds like there are two interpretations of how this should be represented statically given the API spec: "The array instance must be two-dimensional. If the array instance is not two-dimensional, an error should be raised."

  1. That it statically determines the shape. This focuses on the 1st sentence.
@property
def T(self: SomeArray[Rank2, DTypeT]) -> SomeArray[Rank2, DTypeT]: ...
  2. That it shadows the runtime. This focuses on the 2nd sentence.
@overload
@property
def T(self: SomeArray[Rank0 | Rank1, DTypeT]) -> Never: ...
@overload
@property
def T(self: SomeArray[Rank2, DTypeT]) -> SomeArray[Rank2, DTypeT]: ...
@overload
@property
def T(self: SomeArray[RankGE3, DTypeT]) -> Never: ...

(or if we don't have shape typing)

def T(self: SomeArray) -> SomeArray | Never: ...  # yes, the Never is ignored.

This was written more with the 2nd interpretation in mind, but I'm very happy to accept the 1st interpretation and remove T from the default Array object, so long as we still include some form of HasTranspose Protocol for people to use.
I do think there is then interesting discussion about that Protocol ....
Should we vendor two forms, one for each interpretation? That will help people when building their own intersection types.
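
If two forms were vendored, they might look roughly like the sketch below. The names `HasTranspose2D`, `HasTransposeAnyRank`, and `ToyArray` are hypothetical, and the plain `Tuple[int, int]` shape is only a stand-in for whatever shape typing the project ends up with.

```python
from typing import Any, Protocol, Tuple


class HasTranspose2D(Protocol):
    """Interpretation 1: only rank-2 arrays statically have ``T``."""

    @property
    def shape(self) -> Tuple[int, int]: ...

    @property
    def T(self) -> "HasTranspose2D": ...


class HasTransposeAnyRank(Protocol):
    """Interpretation 2: ``T`` exists on every array and may raise at runtime."""

    @property
    def T(self) -> Any: ...


class ToyArray:
    """Stand-in array that follows the spec's runtime rule for ``T``."""

    def __init__(self, shape):
        self.shape = tuple(shape)

    @property
    def T(self):
        if len(self.shape) != 2:
            raise ValueError("T requires a two-dimensional array")
        return ToyArray(self.shape[::-1])
```

With real shape typing, a function annotated with `HasTranspose2D` could reject a rank-1 array at check time, while `HasTransposeAnyRank` would accept it and leave the failure to runtime.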

...


class HasNDim(Protocol):
Member

Is there a situation where you'd wanna use HasNDim over e.g. HasShape? Otherwise we probably should keep this private, given that

> There should be one-- and preferably only one --obvious way to do it.
> Although that way may not be obvious at first unless you're Dutch.

Collaborator Author

@nstarman nstarman Aug 18, 2025

If you're checking for the attribute ndim.
Not sure I'm understanding this comment. Ndim is in the Array API spec...

Member

Well, my point is that I think we should only provide users with protocols that help them annotate their array-api code. Otherwise, it'll just be confusing for the users, and a waste of time for us.

Collaborator Author

Oh. As the Array API is the intersection of the array libraries, IMO pretty much everything is useful.

Member

@jorenham jorenham Aug 22, 2025

Yea of course, everything in the array-api is designed for a reason. But all of those reasons apply in the runtime world. And our goal is to provide an API for the static-typing world, which is but a shadow of the runtime one, where only a subset of the API has practical use.

Collaborator Author

@nstarman nstarman Aug 22, 2025

That's true, I'm just not sure we could really know which of the specific methods are/aren't useful for people in every situation. I agree we get more control over what goes into Array, but shouldn't we provide a Protocol for each individual method and attribute?

Member

> but shouldn't we provide a Protocol for each individual method and attribute?

We're not a stub library, so no, I don't think we should.

Collaborator Author

We do need a stub library... Maybe that's the easier thing to do first.

Member

@jorenham jorenham Aug 22, 2025

> We do need a stub library... Maybe that's the easier thing to do first.

Stub libraries only apply to individual libraries. Stub libraries also can't be overridden or something, so libraries that have other methods besides the ones from the array-api won't be able to use it. So I don't see how that would work (even though I'm all about writing stubs, as you know)

Collaborator Author

@nstarman nstarman Aug 22, 2025

True. I guess I mean a direct method-by-method set of Protocols for the array API, like the one in this PR.
I know this is what many people want, as a baseline at the very least. IMO this is the place for it.

Signed-off-by: nstarman <[email protected]>
Signed-off-by: Nathaniel Starkman <[email protected]>
Signed-off-by: nstarman <[email protected]>
@nstarman
Collaborator Author

nstarman commented Aug 22, 2025

@jorenham. I can see your points about some of the dangers of building the Array intersection protocol from all these method-by-method protocols before we've figured out shape typing, etc.
If we can agree to have method-by-method Protocols for https://data-apis.org/array-api/2021.12/API_specification/index.html#api-specification-index--page-root then we can pause on making Array while we more quickly build out the underlying protocols.
I imagine there will be some revisions to the individual Protocols necessary when stitching them together into various Array-like protocols that we didn't see when writing them individually, but I think these are separable steps and concerns and we should fill out the namespace ASAP.
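
The two steps described here, small per-attribute protocols first and stitching later, can be sketched as follows. The protocol bodies and the composite name `ArrayLike` are assumptions for illustration; the PR's actual definitions and any eventual `Array` protocol may differ.

```python
from typing import Optional, Protocol, Tuple, runtime_checkable


@runtime_checkable
class HasShape(Protocol):
    @property
    def shape(self) -> Tuple[Optional[int], ...]: ...


@runtime_checkable
class HasNDim(Protocol):
    @property
    def ndim(self) -> int: ...


@runtime_checkable
class HasSize(Protocol):
    @property
    def size(self) -> Optional[int]: ...


@runtime_checkable
class ArrayLike(HasShape, HasNDim, HasSize, Protocol):
    """Hypothetical composite built by inheriting the small protocols."""


class Toy:
    """Stand-in object providing all three attributes (illustration only)."""

    shape = (2, 3)
    ndim = 2
    size = 6


class ShapeOnly:
    """Stand-in object providing only ``shape``."""

    shape = (4,)
```

An object like `ShapeOnly` satisfies `HasShape` but not the composite, which is why the individual protocols can be useful on their own before a full `Array` type is settled.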

@nstarman nstarman mentioned this pull request Aug 22, 2025
@nstarman nstarman changed the title from "feat: HasX attributes" to "✨: HasX attributes" Aug 22, 2025
Labels
✨ feature Introduce new features. ✅ tests Add, update, or pass tests.

2 participants