Conversation

fmassa (Contributor) commented on Sep 29, 2025

Taken from #3 and #29. Decomposing softmax_backward leads to prims.fma, which doesn't have a sharding rule, so we end up with Replicate as the only possible sharding.

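For context on where the fma comes from: the softmax_backward decomposition computes grad_input = new_grad - output * new_grad.sum(dim, keepdim=True) (with new_grad = grad_output * output), and the a - b * c pattern can be fused into prims.fma. A minimal sketch of the kind of fix the branch name (softmax_nodecomp) suggests, assuming the tracing pipeline pulls decompositions from torch._decomp.core_aten_decompositions(); the exact table and hook point are assumptions, not this PR's actual diff:

```python
# Hypothetical sketch, not the actual change in this PR: keep
# aten._softmax_backward_data un-decomposed so sharding propagation sees a
# composite op with a known rule instead of the prims.fma its
# decomposition can produce.
import torch
from torch._decomp import core_aten_decompositions

decomps = core_aten_decompositions()
# Assumption: dropping the entry makes the tracer keep the composite op.
decomps.pop(torch.ops.aten._softmax_backward_data.default, None)
```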
fmassa requested review from ezyang and zpcore on September 29, 2025 at 15:31
meta-cla bot added the CLA Signed label on Sep 29, 2025
eellison left a comment

Can we reuse a pointwise sharding rule if the Pointwise tag is present on the operator?

fmassa (Contributor, Author) commented on Sep 29, 2025

> Can we reuse a pointwise sharding rule if the Pointwise tag is present on the operator?

Yeah, I think we should definitely be able to do that! cc @zpcore, as this would make a lot of things better.
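
A rough sketch of the dispatch being discussed, assuming a rule registry keyed by OpOverload; torch.Tag.pointwise is real PyTorch API, but pick_sharding_rule and the generic-rule lookup are hypothetical names, not DTensor's actual interface:

```python
# Hypothetical sketch of the tag-based fallback discussed above. Only
# torch.Tag.pointwise is real PyTorch API; the registry shape is assumed.
import torch

def pick_sharding_rule(op: torch._ops.OpOverload, rules: dict):
    if op in rules:  # a hand-written sharding rule exists for this op
        return rules[op]
    if torch.Tag.pointwise in op.tags:  # tag declared in native_functions.yaml
        return rules["generic_pointwise"]  # assumed shared pointwise rule
    return None  # no rule: caller falls back to Replicate
```

With this in place, prims.fma (which has no hand-written rule) would route through the generic pointwise rule once it carries the tag, instead of forcing Replicate.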

eellison commented on Sep 29, 2025

PR here to add the tag: pytorch/pytorch#164149

Separately, I'm trying to land a PR that adds tags for reductions: pytorch/pytorch#153342

fmassa merged commit 6cd2133 into main on Oct 1, 2025 (6 checks passed)
fmassa deleted the fmassa/softmax_nodecomp branch on October 1, 2025