better check for mxfp8 cuda kernel presence #2933
Merged
Summary:
Short-term fix for #2932. If torchao was built without CUDA 10.0 support (as in our CI), this ensures that:
a. only callsites which actually use the mxfp8 dim1 kernel see the error message; using NVFP4 no longer hits this error.
b. the error message points to the GitHub issue for more info on the workaround (for now, build from source).
A rough sketch of the pattern is shown below.
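The idea is to defer the availability check to the callsite that actually needs the kernel. A minimal sketch of that pattern follows; the module and function names (`_mxfp8_cuda`, `mxfp8_dim1_cast`) are illustrative placeholders, not the actual torchao identifiers:

```python
# Hedged sketch of the "fail only at the callsite" pattern described above.
# The extension module name below is hypothetical.

def _load_mxfp8_dim1_kernel():
    """Return the compiled kernel module if the CUDA extension was built, else None."""
    try:
        # Only present when torchao was built with the required CUDA support.
        from torchao.prototype import _mxfp8_cuda  # hypothetical extension module
        return _mxfp8_cuda
    except ImportError:
        return None

def mxfp8_dim1_cast(x):
    kernel = _load_mxfp8_dim1_kernel()
    if kernel is None:
        # Only callers of the mxfp8 dim1 kernel reach this branch;
        # NVFP4 code paths never import or touch the kernel.
        raise RuntimeError(
            "mxfp8 dim1 CUDA kernel was not built into this torchao install. "
            "See issue #2932 for details and a workaround "
            "(for now, build torchao from source)."
        )
    return kernel.quantize_dim1(x)
```

With this structure, the import failure is surfaced lazily with an actionable message instead of failing eagerly at module import time for all prototype users.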
Test Plan:
ao/setup.py (line 641, commit 8555713): torchao/prototype does not have any .so files.

pytest test/prototype/mx_formats/test_nvfp4_tensor.py -s -x
pytest test/prototype/mx_formats/test_mx_linear.py -s -x -k test_linear_eager_vs_hp
pytest test/prototype/mx_formats/ -s -x
Reviewers:
Subscribers:
Tasks:
Tags: