Fixed issue with dot_product_attention when using TPU. #21254
Conversation
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #21254 +/- ##
==========================================
- Coverage 82.59% 82.53% -0.07%
==========================================
Files 564 564
Lines 54594 54642 +48
Branches 8483 8495 +12
==========================================
+ Hits 45092 45098 +6
- Misses 7415 7454 +39
- Partials 2087 2090 +3
Thanks for the PR!
keras/src/backend/jax/nn.py (Outdated)

    strides: a sequence of `N` integers, representing the inter-window
    strides (default: `(1, ..., 1)`).
Please add indent
keras/src/backend/jax/nn.py (Outdated)

    "Sharding along sequence dimension not allowed in tpu kernel "
    "attention"
    "Sharding along sequence dimension not allowed"
    " in tpu kernel attention"
TPU (capitalize "tpu" in the error message).
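For context, a minimal, hypothetical sketch of the kind of guard that would raise this message: before dispatching to the TPU flash-attention kernel, inputs whose sequence axis is sharded across devices are rejected. The helper name and the way the sharding spec is inspected are illustrative assumptions, not the PR's actual code.

```python
def _check_no_sequence_sharding(query, seq_axis=1):
    # Hypothetical guard (not the PR's code): `query` has shape
    # [batch, time, heads, depth_k], so `seq_axis=1` is the sequence axis.
    sharding = getattr(query, "sharding", None)   # present on committed jax.Array
    spec = getattr(sharding, "spec", None)        # PartitionSpec for NamedSharding
    if spec is None:
        return
    spec_tuple = tuple(spec)
    if len(spec_tuple) > seq_axis and spec_tuple[seq_axis] is not None:
        raise ValueError(
            "Sharding along sequence dimension not allowed"
            " in TPU kernel attention"
        )
```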
keras/src/backend/jax/nn.py (Outdated)

    Args:
    query: Queries with shape `[batch, time, heads,
    depth_k]`.
Please use 4-space indent
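To illustrate the requested style, a hedged sketch of a 4-space-indented `Args:` section; the function name and signature are abbreviated placeholders, and only the indentation pattern is the point.

```python
def dot_product_attention(query, key, value):
    """Computes dot-product attention.

    Args:
        query: Queries with shape `[batch, time, heads,
            depth_k]`.
    """
```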
Corrected indentation in doc string
LGTM, thank you
…-team#21254)" This reverts commit d8f3f70.
…s-team#21254)" (keras-team#21329) This reverts commit 81821e0.
…after addressing cuDNN/FlashAttention API updates (#21333)

* Update nn.py
* Update nn.py
* Update nn.py
* Update nn.py
* Update nn.py
  Corrected indentation in doc string
* Update nn.py
* Update random_grayscale.py
  Fixed issue with passing a single image without batch dimension.
* Update keras/src/layers/preprocessing/image_preprocessing/random_grayscale.py
  Co-authored-by: Jyotinder Singh <[email protected]>
* Update random_grayscale_test.py
  Test case for unbatched inputs
* code reformat
* Update random_grayscale_test.py
  Test case for checking both unbatched and batched single-image inputs.
* changed compute_output_spec
  There was a bug, and it was causing a cycle in the graph.
* Update random_grayscale.py
  Removed the use of tree.map_structure
* Reapply "Fixed issue with dot_product_attention when using TPU. (#21254)" (#21329)
  This reverts commit 81821e0.
* Improve error handling in _can_use_flash_attention for better debugging
  Enhanced the _can_use_flash_attention function to provide more detailed error messages when flash attention compatibility checks fail.
  Changes:
  - Replace generic exception catching with specific error propagation
  - When raise_error=True, directly re-raise original exceptions from the check_layout() and check_is_flash_attention() functions
  - Preserve detailed error context from JAX internal validation functions
  - Maintain existing behavior when raise_error=False (returns False)
  This improves the debugging experience by surfacing specific technical details about tensor layout incompatibilities, cuDNN version requirements, and other flash attention compatibility issues.
  Relates to keras-hub PR #2257 and addresses flash attention debugging needs.
* Revert "Improve error handling in _can_use_flash_attention for better debugging"
  This reverts commit 7a0c547.
* Fix JAX API compatibility and improve error handling in `_can_use_flash_attention`
  Changes:
  - Add missing q_offsets=None and kv_offsets=None parameters to the check_layout() call to match the updated JAX function signature
  - Replace bare `except:` with `except Exception as e:` and `raise e` to preserve detailed error messages from JAX validation functions
  - Maintain existing fallback behavior when raise_error=False
  This resolves compatibility issues with newer JAX versions and improves the debugging experience by surfacing specific technical details about flash attention compatibility failures.
* Updated `dot_product_attention`
  Simplified the check for `flash_attention` by removing redundant checks that are already done in `_can_use_flash_attention`.
* Update nn.py
* Update nn.py

---------

Co-authored-by: Jyotinder Singh <[email protected]>
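A minimal sketch of the error-propagation pattern that squash history describes, not the PR's actual code: `check_layout` and `check_is_flash_attention` below are stand-ins for the JAX internal validators mentioned above (real import paths and signatures vary across JAX versions), and the q_offsets/kv_offsets keywords mirror the ones the commit message says were added.

```python
def check_layout(query, key, value, bias, q_offsets=None, kv_offsets=None):
    # Stand-in for JAX's internal layout validator; the real one raises
    # with a detailed message when tensor layouts are incompatible.
    pass


def check_is_flash_attention(query, key):
    # Stand-in for JAX's cuDNN/flash-attention requirements check.
    pass


def _can_use_flash_attention(query, key, value, bias, raise_error=False):
    try:
        check_layout(query, key, value, bias, q_offsets=None, kv_offsets=None)
        check_is_flash_attention(query, key)
        return True
    except Exception as e:
        if raise_error:
            raise e  # Surface the validator's detailed error message.
        return False  # Silent fallback to the non-flash attention path.
```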
(Squash history of the follow-up commit; it repeats the nn.py, random_grayscale, and flash-attention entries listed above, then adds:)

* Update image.py
* Update keras/src/backend/tensorflow/image.py
  Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
* Revert "Update keras/src/backend/tensorflow/image.py"
  This reverts commit cb7e955.
* Update image.py
* Update image.py

---------

Co-authored-by: Jyotinder Singh <[email protected]>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
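The random_grayscale entries in the squash history above describe handling a single image passed without a batch dimension. A hedged sketch of that general pattern using `keras.ops` follows; the helper name is made up for illustration and this is not the layer's actual implementation.

```python
from keras import ops


def apply_to_maybe_unbatched(transform_fn, images):
    # If a single (H, W, C) image is passed, add a temporary batch axis,
    # run the batched transform, then squeeze the axis back out.
    unbatched = len(images.shape) == 3
    if unbatched:
        images = ops.expand_dims(images, axis=0)
    outputs = transform_fn(images)
    if unbatched:
        outputs = ops.squeeze(outputs, axis=0)
    return outputs
```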
No description provided.