Add env based api for DeviceScan::ExclusiveSum/Scan #5767
📖 Doc Preview CI: 🚀 Preview URL: https://NVIDIA.github.io/cccl/pr-preview/pr-5767/ (the preview will be available once the GitHub Pages deployment completes).
I guess this PR fixes #5606?
// Equivalent to `cuexec::require(cuexec::determinism::run_to_run)` and
// `cuexec::require(cuexec::determinism::not_guaranteed)`
auto env = stdexec::env{cuda::execution::require(determinism_t{}), // determinism
                        allowed_kernels(kernels), // allowed kernels for the given determinism
Could this not be dangerous with multiple translation units?
@fbusato could you please expand on it? I don't fully understand what the problem might be.
@srinivasyadav18 Is `not_guaranteed` in the set of determinism requirements supported by `ExclusiveScan`? I don't see it anywhere in the header source or the underlying implementation.
Potential problem (I could be wrong): kernel functions are passed as pointers. Different translation units could have different pointer values for the same instantiation.
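To make the concern concrete, here is a minimal host-only sketch of an allow-list that identifies "kernels" purely by function-pointer value. Everything in it (`kernel_fn`, `scan_kernel_stub`, `is_allowed`) is hypothetical and is not the PR's `allowed_kernels` implementation; ordinary C++ functions stand in for kernels. Within a single translation unit the lookup behaves as expected. Whether kernel handles taken in different translation units compare equal is exactly the open question above, and this sketch does not settle it.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>

// Hypothetical stand-in for a kernel handle: a plain function pointer.
using kernel_fn = void (*)(const int*, int*, int);

// Hypothetical stub; each instantiation is a distinct function with its own address.
template <int BlockSize>
void scan_kernel_stub(const int*, int* out, int n)
{
  if (n > 0)
  {
    out[0] = BlockSize; // different body per instantiation
  }
}

// Address-based membership test: a candidate is "allowed" only if its pointer
// value matches one stored in the list. If the same instantiation ever had
// two different addresses, this check would reject a legitimate kernel.
template <std::size_t N>
bool is_allowed(const std::array<kernel_fn, N>& allowed, kernel_fn candidate)
{
  return std::find(allowed.begin(), allowed.end(), candidate) != allowed.end();
}
```

If the pointer values did turn out to differ, one possible alternative (again only an assumption) would be to key the check on something derived from the kernel's type rather than its address.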
- add the [[nodiscard]] attribute to the new env-based DeviceScan overloads
- reject not_guaranteed determinism with a static_assert, as it's not implemented
- store and check the error in tests
- remove if constexpr in kernel tests
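The first two items can be illustrated with a small host-only sketch, assuming hypothetical tag types and a sequential stand-in for the scan. The names `run_to_run_t`, `not_guaranteed_t`, `status`, and `exclusive_sum_sketch` are invented for illustration and are not the CCCL API; the point is only the pattern of a `[[nodiscard]]` status return combined with a compile-time rejection of an unimplemented determinism tag.

```cpp
#include <type_traits>

// Hypothetical determinism tags; not the cuda::execution types.
struct run_to_run_t {};
struct not_guaranteed_t {};

// Hypothetical status code standing in for an error return.
enum class status { success, invalid_argument };

// Sequential host stand-in for an exclusive sum.
// [[nodiscard]] makes the compiler warn if a caller ignores the returned status.
template <class Determinism>
[[nodiscard]] status exclusive_sum_sketch(const int* in, int* out, int n, Determinism)
{
  // Reject the unimplemented requirement at compile time instead of at run time.
  static_assert(!std::is_same_v<Determinism, not_guaranteed_t>,
                "not_guaranteed determinism is not implemented in this sketch");

  if (n < 0 || (n > 0 && (in == nullptr || out == nullptr)))
  {
    return status::invalid_argument;
  }

  int running = 0;
  for (int i = 0; i < n; ++i)
  {
    const int value = in[i]; // read first so in-place use (in == out) also works
    out[i]          = running;
    running += value;
  }
  return status::success;
}
```

Calling `exclusive_sum_sketch(in, out, n, not_guaranteed_t{})` fails to compile with the message above, and a test would store the returned `status` and check it rather than discard it.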
🥳 CI Workflow Results: 🟩 Finished in 6h 32m. Pass: 100%/185 | Total: 6d 03h | Max: 3h 38m | Hits: 76%/186975
Description
closes #5657
TODO
device_memory_resource to cuda/std/__device_memory_resource.h