https://github.com/halide/Halide
Revision f47c5c99deac86c6d1f16cfcb1743a0e9e79317d authored by Zalman Stern on 02 December 2020, 22:40:21 UTC, committed by GitHub on 02 December 2020, 22:40:21 UTC
This PR adds a consistent GPU compiled-kernel cache across the
CUDA, Direct3D, OpenCL, and Metal runtimes. The cache is robust
to kernels being used across multiple contexts and threads, and the
runtimes share common code via a template. OpenGL and OpenGLCompute
are not addressed due to issues in their implementation; there
should be no regressions for those runtimes, however.
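
As an illustration only, the idea of one templated cache shared by several GPU runtimes and keyed by context could be sketched as below. This is not Halide's runtime code: `CompiledKernelCache`, `lookup_or_compile`, `release_hold`, `release_context`, and the `kernel_id` key are hypothetical names chosen for this sketch, and compiling while holding the lock is a simplification.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <mutex>
#include <utility>

// Hypothetical sketch, not Halide's actual implementation.
// ContextT: the GPU API's context handle type.
// ModuleT:  the GPU API's compiled-module handle type.
template<typename ContextT, typename ModuleT>
class CompiledKernelCache {
public:
    // Returns the module cached for (context, kernel_id).
    // On a miss, calls compile(context) once and stores the result.
    template<typename CompileFn>
    ModuleT lookup_or_compile(ContextT context, uint64_t kernel_id, CompileFn compile) {
        std::lock_guard<std::mutex> lock(mutex_);
        const Key key{context, kernel_id};
        auto it = entries_.find(key);
        if (it == entries_.end()) {
            it = entries_.emplace(key, Entry{compile(context), 0}).first;
        }
        it->second.use_count++;  // the caller now holds a use of this module
        return it->second.module;
    }

    // Gives back a use previously taken by lookup_or_compile().
    void release_hold(ContextT context, uint64_t kernel_id) {
        std::lock_guard<std::mutex> lock(mutex_);
        auto it = entries_.find(Key{context, kernel_id});
        assert(it != entries_.end() && it->second.use_count > 0);
        it->second.use_count--;
    }

    // Drops every entry that belongs to `context`, calling release(module)
    // on each so the caller can free it with the API-specific call.
    template<typename ReleaseFn>
    void release_context(ContextT context, ReleaseFn release) {
        std::lock_guard<std::mutex> lock(mutex_);
        for (auto it = entries_.begin(); it != entries_.end();) {
            if (it->first.first == context) {
                release(it->second.module);
                it = entries_.erase(it);
            } else {
                ++it;
            }
        }
    }

    // Number of cached modules across all contexts.
    std::size_t size() {
        std::lock_guard<std::mutex> lock(mutex_);
        return entries_.size();
    }

private:
    using Key = std::pair<ContextT, uint64_t>;
    struct Entry {
        ModuleT module;
        int use_count;
    };
    std::mutex mutex_;  // guards entries_ so threads can share the cache
    std::map<Key, Entry> entries_;
};
```

Keying on the (context, kernel) pair means the same kernel used in two contexts is compiled once per context, and a single mutex keeps concurrent lookups from different threads consistent.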

Adds tests for many GPU kernels and kernels across contexts and
threads.

Fixes a bug in CUDA runtime where some error message text in
cuda_do_multidimensional_copy was not initialized.

Fixes a bug in CUDA runtime where device release code did not run if
CUDA libraries were directly linked into the executable. (This would
have caused crashes due to the device allocation caching, among other
issues.)

1 parent 073b8e4
Make context handling in GPU runtimes more consistent and robust. (#5474)
File Mode Size
.github
apps
cmake
dependencies
doc
packaging
python_bindings
src
test
tools
tutorial
util
.clang-format -rw-r--r-- 1.5 KB
.clang-format-ignore -rw-r--r-- 265 bytes
.clang-tidy -rw-r--r-- 1.8 KB
.gitattributes -rw-r--r-- 342 bytes
.gitignore -rw-r--r-- 1.1 KB
.gitmodules -rw-r--r-- 0 bytes
CMakeLists.txt -rw-r--r-- 4.2 KB
CMakePresets.json -rw-r--r-- 2.4 KB
CODE_OF_CONDUCT.md -rw-r--r-- 3.5 KB
LICENSE.txt -rw-r--r-- 3.2 KB
Makefile -rw-r--r-- 96.5 KB
README.md -rw-r--r-- 20.6 KB
README_cmake.md -rw-r--r-- 68.6 KB
README_rungen.md -rw-r--r-- 12.1 KB
README_webassembly.md -rw-r--r-- 7.5 KB
run-clang-format.sh -rwxr-xr-x 1.1 KB
run-clang-tidy.sh -rwxr-xr-x 2.8 KB

