Revert "Fixed issue with dot_product_attention when using TPU." #21329
Conversation
Revert "Fixed issue with dot_product_attention when using TPU. (keras-team#21254)". This reverts commit d8f3f70.
Codecov Report

Additional details and impacted files:

@@ Coverage Diff @@
## master #21329 +/- ##
==========================================
+ Coverage 82.58% 82.64% +0.06%
==========================================
Files 565 565
Lines 54819 54771 -48
Branches 8516 8504 -12
==========================================
- Hits 45273 45267 -6
+ Misses 7454 7415 -39
+ Partials 2092 2089 -3
Flags with carried forward coverage won't be shown.
What's the issue?
Reapply "Fixed issue with dot_product_attention when using TPU. (keras-team#21254)" (keras-team#21329). This reverts commit 81821e0.
Reapply "Fixed issue with dot_product_attention when using TPU." after addressing cuDNN/FlashAttention API updates (#21333)

* Update nn.py
* Update nn.py
* Update nn.py
* Update nn.py
* Update nn.py: corrected indentation in the docstring
* Update nn.py
* Update random_grayscale.py: fixed issue with passing a single image without a batch dimension (see the rank-handling sketch further below).
* Update keras/src/layers/preprocessing/image_preprocessing/random_grayscale.py (Co-authored-by: Jyotinder Singh <[email protected]>)
* Update random_grayscale_test.py: test case for unbatched inputs
* Code reformat
* Update random_grayscale_test.py: test case checking both unbatched and batched single-image inputs.
* Changed compute_output_spec: there was a bug causing a cycle in the graph.
* Update random_grayscale.py: removed the use of tree.map_structure
* Reapply "Fixed issue with dot_product_attention when using TPU. (#21254)" (#21329). This reverts commit 81821e0.
* Improve error handling in _can_use_flash_attention for better debugging. Enhanced the _can_use_flash_attention function to provide more detailed error messages when flash attention compatibility checks fail. Changes:
  - Replace generic exception catching with specific error propagation
  - When raise_error=True, directly re-raise the original exceptions from the check_layout() and check_is_flash_attention() functions
  - Preserve detailed error context from JAX's internal validation functions
  - Maintain the existing behavior when raise_error=False (returns False)
  This improves the debugging experience by surfacing specific technical details about tensor layout incompatibilities, cuDNN version requirements, and other flash attention compatibility issues. Relates to keras-hub PR #2257 and addresses flash attention debugging needs.
* Revert "Improve error handling in _can_use_flash_attention for better debugging". This reverts commit 7a0c547.
* Fix JAX API compatibility and improve error handling in `_can_use_flash_attention`. Changes:
  - Add the missing q_offsets=None and kv_offsets=None parameters to the check_layout() call to match the updated JAX function signature
  - Replace the bare `except:` with `except Exception as e:` and `raise e` to preserve detailed error messages from the JAX validation functions
  - Maintain the existing fallback behavior when raise_error=False
  This resolves compatibility issues with newer JAX versions and improves debugging by surfacing specific details about flash attention compatibility failures (a sketch of the resulting pattern follows this list).
* Updated `dot_product_attention`: simplified the check for `flash_attention` by removing redundant checks that are already done in `_can_use_flash_attention` (see the dispatch sketch at the end of the page).
* Update nn.py
* Update nn.py

Co-authored-by: Jyotinder Singh <[email protected]>
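The "Fix JAX API compatibility" commit above describes two concrete changes: passing the new q_offsets/kv_offsets keyword arguments to check_layout(), and propagating the validator's own exception instead of swallowing it behind a bare `except:`. A minimal sketch of that pattern; the validator here is a hypothetical NumPy stand-in, since the real check_layout/check_is_flash_attention are JAX internals with more parameters:

```python
import numpy as np

def check_layout(query, key, value, bias, q_offsets=None, kv_offsets=None):
    # Hypothetical stand-in for the JAX-internal validator named in the
    # commit log. Newer JAX versions added q_offsets/kv_offsets to the
    # real signature; passing None preserves the previous behavior.
    if query.ndim != 4:
        raise ValueError(
            f"flash attention expects 4D tensors, got query.ndim={query.ndim}"
        )

def _can_use_flash_attention(query, key, value, bias=None, raise_error=False):
    try:
        check_layout(query, key, value, bias, q_offsets=None, kv_offsets=None)
        return True
    except Exception as e:
        if raise_error:
            # Re-raise the specific validation error instead of the old
            # bare `except:` that silently returned False, so callers see
            # the actual layout / cuDNN incompatibility message.
            raise e
        return False

q = k = v = np.zeros((2, 8, 4, 16), dtype="float32")
print(_can_use_flash_attention(q, k, v))     # True
print(_can_use_flash_attention(q[0], k, v))  # False: error swallowed
# _can_use_flash_attention(q[0], k, v, raise_error=True)  # would raise
```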
A follow-up squashed log repeats the commits above and adds:

* Update image.py
* Update keras/src/backend/tensorflow/image.py (Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>)
* Revert "Update keras/src/backend/tensorflow/image.py". This reverts commit cb7e955.
* Update image.py
* Update image.py

Co-authored-by: Jyotinder Singh <[email protected]>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
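The random_grayscale commits in the logs above add support for a single image without a batch dimension. A minimal, hypothetical sketch of that rank handling, assuming NumPy inputs and BT.601 luma weights; the real layer operates on Keras tensors and has its own grayscale and `factor` semantics:

```python
import numpy as np

def random_grayscale(images, factor=1.0):
    # Sketch of the unbatched-input handling described in the commit log:
    # a single HWC image temporarily gets a batch dimension so the batched
    # code path is reused, then the extra dimension is squeezed back out.
    unbatched = images.ndim == 3
    if unbatched:
        images = images[np.newaxis, ...]          # HWC -> 1HWC
    # BT.601 luma weights; illustrative only.
    weights = np.asarray([0.299, 0.587, 0.114], dtype=images.dtype)
    gray = (images @ weights)[..., np.newaxis]    # NHWC -> NHW1
    gray = np.repeat(gray, images.shape[-1], axis=-1)
    out = (1.0 - factor) * images + factor * gray
    return out[0] if unbatched else out

single = np.random.rand(32, 32, 3).astype("float32")
batch = np.random.rand(4, 32, 32, 3).astype("float32")
print(random_grayscale(single).shape)  # (32, 32, 3)
print(random_grayscale(batch).shape)   # (4, 32, 32, 3)
```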
Reverts #21254
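For context on the change being reverted and reapplied: the "Updated dot_product_attention" commit above delegates all compatibility checks to _can_use_flash_attention, so the public function only decides when to consult it. A sketch of that dispatch pattern, with a stub check and a plain softmax(QK^T / sqrt(d)) V reference path standing in for the fused kernel:

```python
import numpy as np

def _can_use_flash_attention(query, key, value, bias=None, raise_error=False):
    # Stub standing in for the compatibility check sketched earlier; the
    # real function runs the JAX layout / cuDNN validators.
    if raise_error:
        raise RuntimeError("flash attention unsupported in this sketch")
    return False

def dot_product_attention(query, key, value, bias=None, flash_attention=None):
    # Simplified dispatch described in the log: every compatibility rule
    # lives in _can_use_flash_attention, so nothing is re-checked here.
    if flash_attention is None:
        # Auto mode: quietly fall back when flash attention is unavailable.
        flash_attention = _can_use_flash_attention(query, key, value, bias)
    elif flash_attention:
        # Explicit opt-in: re-run the check with raise_error=True so the
        # user sees the specific reason instead of a silent fallback.
        _can_use_flash_attention(query, key, value, bias, raise_error=True)
    # Fused-kernel path omitted; reference softmax(QK^T / sqrt(d)) V:
    d = query.shape[-1]
    scores = query @ key.swapaxes(-1, -2) / np.sqrt(d)
    if bias is not None:
        scores = scores + bias
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ value

q = k = v = np.random.rand(2, 4, 8, 16).astype("float32")
print(dot_product_attention(q, k, v).shape)  # (2, 4, 8, 16)
```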