@t4c1 t4c1 commented May 19, 2025

Enables iterating the prefetch atom to cover the prefetch tile. In other words, this relaxes the requirement that the prefetch tile size match the prefetch atom size.

This is done by routing prefetch through the same path the NVIDIA code uses, i.e. through the copy implementation.
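
As a rough illustration of what "iterating the atom to cover the tile" means, here is a minimal standalone sketch. It does not use the real CuTe/CUTLASS types or the actual copy path from this PR; `Extent2D`, `Origin2D`, and `atom_origins` are hypothetical names introduced only for this example, and the row-major stepping order is an assumption.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical 2D extent (rows x cols); not a CuTe/CUTLASS type.
struct Extent2D {
    std::size_t rows;
    std::size_t cols;
};

// Hypothetical top-left coordinate of one atom-sized prefetch within the tile.
struct Origin2D {
    std::size_t row;
    std::size_t col;
};

// Returns the origin of every atom-sized block needed to cover `tile`.
// Stepping by the atom extent means a tile that is a multiple of the atom
// is covered exactly; a trailing partial block still gets one origin.
inline std::vector<Origin2D> atom_origins(Extent2D tile, Extent2D atom) {
    std::vector<Origin2D> origins;
    if (atom.rows == 0 || atom.cols == 0) {
        return origins;  // Degenerate atom: nothing can be issued.
    }
    for (std::size_t r = 0; r < tile.rows; r += atom.rows) {
        for (std::size_t c = 0; c < tile.cols; c += atom.cols) {
            origins.push_back({r, c});
        }
    }
    return origins;
}
```

With a tile equal to the atom this yields a single origin, which corresponds to the previous "tile must match atom" case; a larger tile simply yields more origins, one per atom invocation.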

This PR also:

  • includes a lot of bugfixes for prefetch atom layouts
  • adds some missing copy traits
  • removes some duplicated prefetch implementations (_V atoms duplicating what is already in _N)

@joeatodd joeatodd left a comment

LGTM 👍

@rolandschulz

There is no need to finish this, given the move to the rearchitected atoms.

6 participants