
Conversation

williambdean
Contributor

@williambdean williambdean commented Apr 11, 2025

Description

Getting the ball rolling on #1350

Related Issue

  • Closes #
  • Related to #

Checklist

Type of change

  • New feature / enhancement
  • Bug fix
  • Documentation
  • Maintenance
  • Other (please specify):

📚 Documentation preview 📚: https://pytensor--1365.org.readthedocs.build/en/1365/

@williambdean williambdean marked this pull request as draft April 11, 2025 15:43
@williambdean williambdean marked this pull request as ready for review April 11, 2025 17:54

codecov bot commented Apr 11, 2025

Codecov Report

❌ Patch coverage is 76.23188% with 164 lines in your changes missing coverage. Please review.
✅ Project coverage is 81.58%. Comparing base (5227759) to head (e484ba4).
⚠️ Report is 15 commits behind head on main.

Files with missing lines Patch % Lines
pytensor/link/mlx/dispatch/core.py 54.76% 60 Missing and 16 partials ⚠️
pytensor/link/mlx/dispatch/elemwise.py 77.08% 55 Missing ⚠️
pytensor/link/mlx/dispatch/math.py 72.91% 11 Missing and 2 partials ⚠️
pytensor/link/mlx/dispatch/blockwise.py 66.66% 4 Missing and 3 partials ⚠️
pytensor/link/mlx/dispatch/basic.py 92.98% 3 Missing and 1 partial ⚠️
pytensor/link/mlx/dispatch/subtensor.py 93.84% 0 Missing and 4 partials ⚠️
pytensor/link/mlx/dispatch/signal/conv.py 87.50% 2 Missing and 1 partial ⚠️
pytensor/link/mlx/dispatch/shape.py 93.10% 1 Missing and 1 partial ⚠️

❌ Your patch status has failed because the patch coverage (76.23%) is below the target coverage (100.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files


@@            Coverage Diff             @@
##             main    #1365      +/-   ##
==========================================
- Coverage   81.65%   81.58%   -0.07%     
==========================================
  Files         232      242      +10     
  Lines       53081    53771     +690     
  Branches     9403     9468      +65     
==========================================
+ Hits        43342    43870     +528     
- Misses       7286     7421     +135     
- Partials     2453     2480      +27     
Files with missing lines Coverage Δ
pytensor/compile/mode.py 85.00% <100.00%> (+0.13%) ⬆️
pytensor/link/mlx/dispatch/__init__.py 100.00% <100.00%> (ø)
pytensor/link/mlx/linker.py 100.00% <100.00%> (ø)
pytensor/link/pytorch/linker.py 100.00% <ø> (ø)
pytensor/link/mlx/dispatch/shape.py 93.10% <93.10%> (ø)
pytensor/link/mlx/dispatch/signal/conv.py 87.50% <87.50%> (ø)
pytensor/link/mlx/dispatch/basic.py 92.98% <92.98%> (ø)
pytensor/link/mlx/dispatch/subtensor.py 93.84% <93.84%> (ø)
pytensor/link/mlx/dispatch/blockwise.py 66.66% <66.66%> (ø)
pytensor/link/mlx/dispatch/math.py 72.91% <72.91%> (ø)
... and 2 more

... and 1 file with indirect coverage changes


@ricardoV94
Member

I suggest basing yourself on the numba linker; torch has a lot of hacks we hopefully don't need here.

@williambdean
Contributor Author

williambdean commented Apr 12, 2025

Thanks for the pointer. I simplified the one method. Do you think that gen_functors can be removed as well? The only commonality with pytorch then is that no input can be a numpy array.

@ricardoV94
Member

ricardoV94 commented Apr 13, 2025

Yeah, you shouldn't need that; you just need a call to typify on the runtime inputs as well.
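
For illustration, a minimal sketch of what such an mlx_typify dispatch could look like, mirroring jax_typify/pytorch_typify (the name and the exact registration set here are assumptions, not the final API):

from functools import singledispatch

import mlx.core as mx
import numpy as np


@singledispatch
def mlx_typify(data, **kwargs):
    # Fallback: let mx.array decide whether it can handle the value
    return mx.array(data)


@mlx_typify.register(np.ndarray)
def mlx_typify_ndarray(data, **kwargs):
    # Runtime NumPy inputs are converted once, on the way into the compiled function
    return mx.array(data)


@mlx_typify.register(mx.array)
def mlx_typify_passthrough(data, **kwargs):
    # Already an MLX array, nothing to do
    return data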

@williambdean
Contributor Author

Still need to get this to run:

import pytensor

pytensor.config.mode = "MLX"
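
For context, the target usage would look roughly like the following once the MLX mode is registered (a hypothetical smoke test, not something that ran at the time of this comment):

import numpy as np

import pytensor
import pytensor.tensor as pt

pytensor.config.mode = "MLX"

x = pt.vector("x")
y = pt.exp(x) + 1

# Compiling with the default mode ("MLX" as set above) should route through the MLX linker
f = pytensor.function([x], y)
print(f(np.arange(3, dtype="float32")))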

@cetagostini
Contributor

Hey, big thanks to @jessegrabowski and @ricardoV94 for helping with this PR!

I feel the PR is huge enough already. Should we do a first merge and start iterating in the next versions, cleaning things up and making everything more consistent with the other backends?

Thanks to @williambdean for opening the PR!

Comment on lines 67 to 101
@mlx_funcify.register(Assert)
@mlx_funcify.register(CheckAndRaise)
def mlx_funcify_CheckAndRaise(op, **kwargs):
    warnings.warn(
        f"""Skipping `CheckAndRaise` Op (assertion: {op.msg}) as MLX tracing would remove it.""",
        stacklevel=2,
    )

    def assert_fn(x, *inputs):
        return x

    return assert_fn
Member

Is this true, or just copy/pasta from JAX?

Contributor

I need to check more here!

Member

Note this has been changed in JAX to raise when the condition is known to be False. We should do the same (or just implement an assert, if MLX allows that; I suspect it does not).

Contributor

So, this was true. I'm moving to an approach similar to JAX's.
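
A sketch of that JAX-like behaviour, raising eagerly when a condition is a constant known to be False and otherwise warning that the check gets dropped (it reuses the names from the snippet above and assumes Constant from pytensor.graph.basic is imported; whether MLX can support a true runtime assert is still open):

@mlx_funcify.register(Assert)
@mlx_funcify.register(CheckAndRaise)
def mlx_funcify_CheckAndRaise(op, node, **kwargs):
    # Inputs are (value, *conditions)
    conds = node.inputs[1:]
    if any(isinstance(cond, Constant) and not bool(cond.data) for cond in conds):
        # The assertion is statically known to fail: raise at compile time, like JAX does now
        raise op.exc_type(op.msg)

    warnings.warn(
        f"Skipping `CheckAndRaise` Op (assertion: {op.msg}) as MLX tracing would remove it.",
        stacklevel=2,
    )

    def assert_fn(x, *inputs):
        return x

    return assert_fn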

from pytensor.tensor.signal.conv import Conv1d


def blockwise_conv1d(op, node, **kwargs):
Member

Not needed anymore since they fixed it upstream, right?

Contributor

I think we still need it. Where do you see that it's fixed? We are using this blockwise conv1d.

Member

Sure, but Blockwise will call vmap on the core op, so we only need to dispatch the core Conv1D to MLX Conv1D; then the blockwise variant will work automatically.

Member

Can you confirm we don't need this specialized implementation anymore?

Also note that Convolve1d changed in main: now mode is not a property of the Op but a runtime value, 1 when full and 0 when valid. Check how the JAX version is implemented now.

Contributor

Okay, the MLX path now matches the upstream design: blockwise simply vmaps the core op, so no special MLX-only wrapper is needed anymore. We are effectively using the same contract JAX uses, except JAX has a static mode, while MLX now works for dynamic modes as well :)
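
For reference, a minimal core dispatch along those lines might look like the following sketch; Convolve1d and its runtime full/valid flag come from the discussion above, while the import locations and the exact runtime signature are assumptions:

import mlx.core as mx

from pytensor.link.mlx.dispatch.basic import mlx_funcify
from pytensor.tensor.signal.conv import Convolve1d


@mlx_funcify.register(Convolve1d)
def mlx_funcify_Convolve1d(op, **kwargs):
    def conv1d(data, kernel, mode):
        # `mode` is a runtime scalar: truthy -> "full", falsy -> "valid"
        return mx.convolve(data, kernel, mode="full" if mode else "valid")

    return conv1d

Blockwise would then batch this core implementation through its vmap machinery, so no MLX-specific blockwise_conv1d wrapper is needed.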

Comment on lines +21 to +29
# Convert scalar to array if needed
if isinstance(x, int | float) or (
    isinstance(x, np.number) and not isinstance(x, np.ndarray)
):
    x = mx.array(x)
Member

Should not be needed

Contributor

MLX’s mx.transpose rejects plain Python or NumPy scalars (TypeError: transpose(): incompatible function arguments), so without the conversion we would crash whenever a 0-d value reaches a dimshuffle.

Member

PyTensor will not send Python or NumPy scalars to dimshuffle; that means you did something wrong, probably when converting constants.



@mlx_funcify.register(Elemwise)
def mlx_funcify_Elemwise(op, **kwargs):
Member

Like CAReduce, it should have a second-level dispatch. Also, we need to enforce the runtime_broadcastable checks (same in Alloc). And we should have a default implementation for that second-level dispatch that tries to use getattr(MLX, "func_name"), similar to how JAX does it already.
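
A rough sketch of that getattr-based default for the second-level dispatch (the use of nfunc_spec as the name source and the exact layering are assumptions, following how the JAX backend falls back to its numpy-like namespace; mlx_funcify and Elemwise are assumed imported as in the excerpt above):

from functools import singledispatch

import mlx.core as mx


@singledispatch
def mlx_funcify_scalar_op(op, **kwargs):
    # Default: look for a same-named function in mlx.core (e.g. "add", "exp", "sin")
    name = op.nfunc_spec[0] if getattr(op, "nfunc_spec", None) else getattr(op, "name", None)
    mlx_func = getattr(mx, name, None) if name else None
    if mlx_func is None:
        raise NotImplementedError(f"No MLX implementation for scalar op {op}")
    return mlx_func


@mlx_funcify.register(Elemwise)
def mlx_funcify_Elemwise(op, **kwargs):
    # First level dispatches on Elemwise, second level on the wrapped scalar op
    scalar_fn = mlx_funcify_scalar_op(op.scalar_op, **kwargs)

    def elemwise(*inputs):
        return scalar_fn(*inputs)

    return elemwise

Specific scalar Ops (e.g. Softplus or Cast) can then register their own overrides on mlx_funcify_scalar_op instead of each getting a top-level Elemwise special case.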

Contributor

Done!

return softmax_grad


@mlx_funcify.register(Softplus)
Member

Delete this? You have one in elemwise already

Contributor

Done!

Member

Still here

Contributor

Where? Can you point?

Member

You have mlx_funcify.register(Softplus) in link/mlx/dispatch/elemwise.py, but you also have it in .../math.py (where it should be). Same for Cast below

@williambdean
Contributor Author

Were we going to split this PR up into core functionality and op implementations? What are the next steps here?

@ricardoV94
Member

ricardoV94 commented Jun 3, 2025

This PR seems to be in an okay state; we can merge it as is once the comments are addressed.

@cetagostini
Contributor

@ricardoV94 I think it's all applied.



@pytest.mark.xfail(reason="Reshape Op is not supported yet")
def test_mlx_Reshape_various_shapes():
Member

Isn't this similar to test_mlx_Reshape_concrete below? Combine with that?

Contributor

I mean, it doesn't seem appropriate in this case, as they're testing different scenarios of the reshape operation:

  1. test_mlx_Reshape_various_shapes focuses on testing different dimensional transformations with static/constant shapes.
  2. test_mlx_Reshape_concrete_shape focuses on testing computed/dynamic shapes, where the shape is derived from the input tensor's properties.

Maybe they can be renamed? But I feel they are two different things! A sketch of both flavours follows below.
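
For concreteness, the two flavours might look roughly like this (compare_mlx_and_py is the helper used elsewhere in this test suite; the import location and the exact graphs here are assumptions):

import numpy as np

import pytensor.tensor as pt
from tests.link.mlx.test_basic import compare_mlx_and_py  # assumed helper location


def test_mlx_Reshape_various_shapes():
    # Static/constant target shape
    x = pt.vector("x")
    out = pt.reshape(x, (2, 3))
    compare_mlx_and_py([x], [out], [np.arange(6, dtype="float32")])


def test_mlx_Reshape_concrete_shape():
    # Target shape derived from the input tensor's own properties
    x = pt.vector("x")
    out = pt.reshape(x, (x.shape[0] // 2, 2))
    compare_mlx_and_py([x], [out], [np.arange(6, dtype="float32")])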

@cetagostini
Contributor

cetagostini commented Jun 9, 2025

The current implementation allows sampling a simple pymc-marketing model with the MLX backend, on both GPU and CPU. Nevertheless, complex models still have issues.

[image: screenshot of the sampling output]


compare_mlx_and_py([], [out_pt], [])


@pytest.mark.xfail(reason="Inplace operations not yet supported in MLX mode")
Member

@ricardoV94 ricardoV94 Oct 8, 2025

Is inplace optimization something MLX does by itself (like JAX)? In that case we don't need to worry about or test them.

Contributor

I could not find any explicit statement in the MLX documentation or codebase that MLX performs in-place optimization / mutation merging automatically in the same way that JAX sometimes does.

Member

So you're supposed to write x[idx] = 0 for inplacing? That's valid MLX syntax?

Contributor

@cetagostini cetagostini Oct 9, 2025

Yes, that should be possible (I'll write code and double-check). I'll start opening issues around MLX.
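
A quick check of that syntax could look like the snippet below; whether MLX's item assignment gives the in-place semantics PyTensor's destructive Ops expect is exactly what needs verifying:

import mlx.core as mx

x = mx.zeros((5,))
x[1:3] = 1.0  # MLX arrays accept NumPy-style item assignment
print(x)      # expected: array([0, 1, 1, 0, 0], dtype=float32)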

mx = pytest.importorskip("mlx.core")


def test_mlx_Subtensor_basic():
Member

do you have any test with symbolic indices? Is that supported by MLX?

Contributor

No, MLX does not currently support general symbolic indexing or boolean-masked indexing (i.e. indexing where the selection is determined by a mask) in its graph mode. It does support index arrays.

Member

That's even more restrictive than JAX. JAX indices can be symbolic as long as they determine the output shape from the input shape. So if input indices have static shape (3,), that's fine, as it knows the output will also have shape=(3,), regardless of the values.

Contributor

That's my understanding.
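
For reference, a probe for the JAX-style case described above (symbolic integer index arrays with static shapes) might look like this; whether MLX accepts it is exactly what this thread questions, so treat it as a hypothetical test, with compare_mlx_and_py assumed in scope:

import numpy as np

import pytensor.tensor as pt


def test_mlx_Subtensor_symbolic_indices():
    x = pt.vector("x")
    idx = pt.lvector("idx")  # values unknown until runtime, ndim known
    out = x[idx]
    compare_mlx_and_py(
        [x, idx],
        [out],
        [np.arange(5, dtype="float32"), np.array([0, 2, 4])],
    )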

Member

@ricardoV94 ricardoV94 left a comment

I did another pass; this is looking like it's 99% there. I left another round of comments and some change requests before we merge.

@jessegrabowski
Member

One thing I don't like about this PR is that it's basically impossible to predict where function dispatches are. Elemwise was in math.py, but CAReduce was in elemwise.py. basic.py seems to be for shared tools, but it also has FunctionGraph, CheckAndRaise, and DeepCopyOp. core.py seems to have tensor constructors?

IMO we should aim for each linker directory to follow the pytensor.tensor module 1:1, with the dispatch for each Op in a file/module of the same name.

@cetagostini
Contributor

cetagostini commented Oct 9, 2025

@jessegrabowski @ricardoV94 I resolved all your comments.

I replied to the other remaining comments. My take: if we are 99% there with the PR, can the other 1% be addressed in another, cleaner PR? If you tell me what you'd like to modify, I can open issues and we can kick off the work to improve MLX and apply those changes.

@cetagostini
Contributor

One thing I don't like about this PR is that it's basically impossible to predict where function dispatches are. Elemwise was in math.py, but CAReduce was in elemwise.py. basic.py seems to be for shared tools, but it also has FunctionGraph, CheckAndRaise, and DeepCopyOp. core.py seems to have tensor constructors?

IMO we should aim for each linker directory to follow the pytensor.tensor module 1:1, with the dispatch for each Op in a file/module of the same name.

I like this idea, by the way @jessegrabowski. Definitely for another PR? But I could do this.

@ricardoV94 ricardoV94 changed the title MLX backend POC Add MLX backend Oct 9, 2025
@ricardoV94 ricardoV94 merged commit 934306f into pymc-devs:main Oct 9, 2025
57 of 58 checks passed
@ricardoV94
Member

Let the party begin :D

