Fix jax custom gradle (#21105) #21877

Mayankvlog · 2025-11-29T07:35:37Z

This PR addresses the failure in JAX when using @custom_grad.
It isolates only the gradient logic fix so it can be reviewed independently.

…eam#21647)

- torch-xla is not available for Windows platform - Manually installed tensorflow-cpu, torch, jax, and flax - Fixed protobuf version conflicts (downgraded to <6.0.0) - Tests now run successfully without ModuleNotFoundError

…ng errors - Fixed custom_gradient in JAX backend to extract Variable values automatically - Improved code structure by moving helper function outside wrapper - Fixed EfficientNetV2B2 import to use direct module import - Fixed all Ruff linting errors (line length, unused imports/variables) - Tests now pass without requiring manual .value access on Variables

- Changed input size from 64x64 to 224x224 (minimum supported by EfficientNetV2) - Fixed EfficientNetV2B0 import to use direct module path - Resolves ValueError: Input size must be at least 32x32 - Resolves ImportError for EfficientNetV2B0

…input_shape validation This commit addresses three issues that were causing CI failures: 1. Fixed JAX Backend custom_gradient with Variables (Issue keras-team#21105) - Problem: Variables passed to custom_gradient in JAX backend caused 'TypeError: NoneType object is not callable' - Root cause: JAX copies Variables during tracing, causing both _value and _initializer to become None - Solution: * Modified custom_gradient wrapper to properly convert Variables to values * Added fallback in __jax_array__ to handle uninitialized Variables - Added test: test_custom_gradient_with_variable in keras/src/ops/core_test.py 2. Fixed obtain_input_shape validation for channels_first format - Problem: Confusing error when users provide input_shape in wrong format (e.g., (224,224,3) when (3,224,224) expected for channels_first) - Solution: Added validation to detect format mismatch with clear error message - Updated efficientnet_v2_jit_test.py to use correct channels_first format 3. Code format fixes - Fixed line length violations - Fixed import ordering - Removed unused imports Files modified: - keras/src/backend/jax/core.py - keras/src/ops/core_test.py - keras/src/applications/imagenet_utils.py - keras/src/applications/efficientnet_v2_jit_test.py - test_custom_gradient_jax_variable.py All tests passing with JAX backend.

… add test

- Changed get_shapes_dict to only exclude 'mask' parameter, not all *_mask - Allows custom layers to use parameters like attention_mask, padding_mask - Added comprehensive tests for _mask parameter handling - Maintains backward compatibility with Keras masking Fixes keras-team#21154

gemini-code-assist · 2025-11-29T07:36:02Z

Summary of Changes

Hello @Mayankvlog, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request primarily addresses a critical bug in the JAX backend related to custom_gradient and Keras Variable objects, ensuring seamless gradient computation. It also expands Keras's capabilities by introducing a new perceptual loss function (LPIPS), refines layer input handling, and improves input shape validation for image models. Additionally, it provides important documentation regarding JIT compilation limitations in the Torch backend.

Highlights

JAX Custom Gradient Fix: Resolved a TypeError in the JAX backend where custom_gradient functions failed when Keras Variable objects were passed as arguments. The fix automatically extracts the underlying tensor value from Variable objects before passing them to JAX's custom_gradient.
JAX Variable Tracing Improvement: Enhanced the __jax_array__ method for Keras Variable objects in the JAX backend to correctly handle cases where _value and _initializer become None during JAX tracing, preventing potential errors.
LPIPS Loss Implementation: Introduced a new backend-agnostic LPIPS (Learned Perceptual Image Patch Similarity) loss function, which uses a VGG16-based feature extractor to compute perceptual distances between images.
Layer Mask Parameter Handling: Refined the logic for handling mask parameters in Keras layers, ensuring that only the mask parameter (and not all parameters ending with _mask) is excluded from the shapes_dict during compute_output_shape calculations.
Input Shape Validation for Image Applications: Improved input shape validation in imagenet_utils.py to detect and raise a ValueError when channels_last formatted input shapes are accidentally provided to models configured for channels_first.
Torch Backend JIT Compile Limitations Documentation: Added documentation outlining known limitations and workarounds for jit_compile=True with the Torch backend, particularly concerning EfficientNetV2 models and torch.compile's interaction with tree operations.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature	Command	Description
Code Review	`/gemini review`	Performs a code review for the current pull request in its current state.
Pull Request Summary	`/gemini summary`	Provides a summary of the current pull request in its current state.
Comment	@gemini-code-assist	Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help	`/gemini help`	Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩

gemini-code-assist

Code Review

This pull request correctly addresses the issue with @custom_grad in the JAX backend. The fix in keras/src/backend/jax/core.py is clean and the accompanying test is well-written.

However, this PR includes many unrelated changes, which contradicts the description's claim that it 'isolates only the gradient logic fix'. Specifically, it introduces a new LPIPS loss, fixes an issue with layer mask argument handling, improves input validation in imagenet_utils, and adds tests for unrelated issues.

Most critically, this PR deletes the root .gitignore and README.md files, which seems to be a mistake.

I strongly recommend splitting this PR into several smaller, focused pull requests. This will make each change easier to review and merge. For this review, I have focused on the core JAX fix and a few other significant issues I noticed.

hertschuh

@Mayankvlog

Thank you for the PR!

This appears to be combining 4 unrelated things together:

the fix for #21105 (JAX custom gradient)
the fix for #21647 (EfficientNetV2 on torch)
the fix for #21154 (mask layer parameters)
adding the LPIP loss

Please split these into 4 separate PRs. Thank you.

Mayankvlog and others added 16 commits October 25, 2025 17:41

Document torch.compile limitations with EfficientNetV2 (Issue keras-t…

564e155

…eam#21647)

fix: resolve TensorFlow import error on Windows

0a4ec49

- torch-xla is not available for Windows platform - Manually installed tensorflow-cpu, torch, jax, and flax - Fixed protobuf version conflicts (downgraded to <6.0.0) - Tests now run successfully without ModuleNotFoundError

Merge branch 'master' into improve-docs-loss-functions

8f86aef

Fix test: use proper initializer object instead of string

9b439c4

feat: add LPIPS perceptual loss

4b15f7d

Merge branch 'keras-team:master' into improve-docs-loss-functions

22a3bf1

fix(tf-remat): avoid passing kwargs to custom_gradient in graph mode;…

5c925c0

… add test

Fix code formatting: line length under 80 chars

ae4348c

chore: trigger CI re-run

99b8b64

Merge branch 'master' into fix-mask-parameter-issue-21154

fcb9b85

fix: resolve all line length formatting issues (E501)

27c0daa

chore: trigger CI re-run for format check

3d79a4d

google-ml-butler bot added the size:XL label Nov 29, 2025

google-ml-butler bot assigned gbaned Nov 29, 2025

gemini-code-assist bot reviewed Nov 29, 2025

View reviewed changes

hertschuh requested changes Dec 1, 2025

View reviewed changes

hertschuh added the stat:awaiting response from contributor label Dec 1, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Fix jax custom gradle (#21105) #21877

Fix jax custom gradle (#21105) #21877

Mayankvlog commented Nov 29, 2025

Uh oh!

gemini-code-assist bot commented Nov 29, 2025

Uh oh!

gemini-code-assist bot left a comment

Uh oh!

hertschuh left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Fix jax custom gradle (#21105) #21877

Are you sure you want to change the base?

Fix jax custom gradle (#21105) #21877

Conversation

Mayankvlog commented Nov 29, 2025

Uh oh!

gemini-code-assist bot commented Nov 29, 2025

Summary of Changes

Highlights

Footnotes

Uh oh!

gemini-code-assist bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

hertschuh left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants