Add CUDA version compatibility check #1412
f"but the installed driver only supports CUDA {runtime_major}.{runtime_minor}. "
f"Some features may not work correctly. Consider updating your NVIDIA driver. "
This statement is nerve-racking considering that we support CUDA Minor Version Compatibility. In a few cases, we check the minor version in cuda-core to ensure that what we need is available at run time, but only when we actually need it. I think we should rephrase this warning.
This warning is only issued in the case of a MAJOR version mismatch.
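For reference, a major-only comparison like the one described here can be sketched as follows. CUDA encodes versions as 1000 * major + 10 * minor (so 12040 means 12.4), both in the `CUDA_VERSION` macro and in the value returned by `cuDriverGetVersion`. The helper names `split_cuda_version` and `is_major_mismatch` are hypothetical and are not taken from this PR.

```python
def split_cuda_version(version: int) -> tuple[int, int]:
    """Decode CUDA's integer encoding (1000 * major + 10 * minor)."""
    major = version // 1000
    minor = (version % 1000) // 10
    return major, minor


def is_major_mismatch(compiled_version: int, driver_version: int) -> bool:
    """True only when the bindings target a newer major version than the driver."""
    compiled_major, _ = split_cuda_version(compiled_version)
    driver_major, _ = split_cuda_version(driver_version)
    return compiled_major > driver_major
```

Under this sketch, bindings built for 12.9 running on a 12.2 driver do not trigger anything, which is consistent with minor version compatibility.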
Warn when cuda-bindings was compiled against a newer CUDA major version than the installed driver supports. This helps users understand why certain features may not work correctly. The check runs once after cuInit and can be suppressed via CUDA_PYTHON_DISABLE_VERSION_CHECK=1.
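The run-once and opt-out behavior described in this commit message could look roughly like the sketch below. It is an illustration only: the flag `_checked`, the function `check_once`, and the exact warning text are assumptions, as is the choice to mark the check as done before consulting the environment variable.

```python
import os
import warnings

_checked = False  # hypothetical module-level "already ran" flag


def check_once(compiled_major: int, driver_major: int) -> None:
    """Warn at most once per process; skipped when the env var is "1"."""
    global _checked
    if _checked:
        return
    _checked = True
    if os.environ.get("CUDA_PYTHON_DISABLE_VERSION_CHECK") == "1":
        return
    if compiled_major > driver_major:
        warnings.warn(
            f"cuda-bindings was built for CUDA {compiled_major}, "
            f"but the installed driver only supports CUDA {driver_major}.",
            UserWarning,
            stacklevel=2,
        )
```

Calling `check_once(13, 12)` repeatedly emits a single warning, and setting `CUDA_PYTHON_DISABLE_VERSION_CHECK=1` before the first call emits none.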
# Get runtime driver version
err, runtime_version = driver.cuDriverGetVersion()
if err != driver.CUresult.CUDA_SUCCESS:
    return  # Can't check, skip silently
Failing to query the driver version is worth a warning to the user rather than being silently swallowed.
)


def _reset_version_compatibility_check():
Is there a real use case where production code would call this function? If it’s only used by tests, then it seems reasonable to rely implicitly on the global and move the function into the test code instead.
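One way to keep the reset logic entirely on the test side is to patch the module-level flag for the duration of a test. The sketch below uses a stand-in module so that it runs without CUDA. The flag name `_version_checked` and the helper `run_with_fresh_state` are hypothetical, since the actual global is not shown here.

```python
import types
from unittest import mock

# Stand-in for the real module; the flag name `_version_checked` is hypothetical.
version_check = types.ModuleType("version_check")
version_check._version_checked = True  # pretend an earlier test ran the check


def run_with_fresh_state(test_body):
    """Run test_body with the flag cleared, restoring the old value afterwards."""
    with mock.patch.object(version_check, "_version_checked", False):
        return test_body()
```

In a pytest suite, `monkeypatch.setattr` on the real module would give the same restore-on-exit behavior without any helper in production code.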
rparolin left a comment
We should issue a warning to the user if we can't fetch the driver version.
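A minimal sketch of what that could look like, assuming the query returns an `(error_code, version)` pair in the style of `cuDriverGetVersion` and that 0 means success. The function name `get_driver_version_or_warn`, the injected `query` callable, and the warning text are illustrative, not part of the PR.

```python
import warnings
from typing import Callable, Optional, Tuple


def get_driver_version_or_warn(
    query: Callable[[], Tuple[int, int]],
) -> Optional[int]:
    """Return the driver version, or warn and return None if the query fails.

    `query` is assumed to return an (error_code, version) pair,
    with an error code of 0 meaning success.
    """
    err, version = query()
    if err != 0:
        warnings.warn(
            f"Unable to query the CUDA driver version (error {err}); "
            "skipping the version compatibility check.",
            UserWarning,
            stacklevel=2,
        )
        return None
    return version
```

Passing the query in as a callable keeps the sketch testable on machines without a driver; the real check would call the driver binding directly.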
Summary
Adds a version compatibility check that warns users when cuda-bindings was compiled against a newer CUDA major version than the installed driver supports.
Changes
cuda-bindings
- check_cuda_version_compatibility() function in cuda/bindings/utils/_version_check.py
- CUDA_VERSION vs runtime cuDriverGetVersion()
- cuda.bindings.utils
- tests/test_version_check.py
cuda-core
- Device.__new__ calls check_cuda_version_compatibility() after cuInit succeeds
- cuda.bindings.utils
Rationale
When cuda-bindings is built against CUDA 13 headers but the user's driver only supports CUDA 12, many features will silently fail or behave unexpectedly. This check provides early, clear feedback:
Design
- cuda.bindings.utils since it checks cuda-bindings' compile-time version
- Device first triggers CUDA initialization
- CUDA_PYTHON_DISABLE_VERSION_CHECK=1 to disable
Future Work
We could not find a suitable place to invoke the version check automatically within cuda-bindings itself (e.g., hooking into cuInit), so the check is currently triggered by cuda-core. This may be revisited in the future.
Test Coverage
7 tests in cuda-bindings covering:
Related Work