https://github.com/halide/Halide

Revision Message Commit Date
d0249a6 Merge branch 'abadams/fix_round' into srj/x-rounding 22 September 2022, 19:07:37 UTC
3e7851f Merge branch 'main' into xtensa-codegen 22 September 2022, 19:02:57 UTC
7286ec3 Fix PyExt error handling (#7042) The current PythonExtensionGen code attempts to provide verbose error (exception) messages by overriding halide_error and saving the message in a thread_local. This isn't safe or correct, however, and (in general) is wrong for any Halide code using multiple threads. #6994 proposes ways to mitigate this (and there are experiments in place to implement it), but unless/until those enhancements land, we can't leave the code in its current state. So: - Don't try to save the text at all. - Optionally log the text to stderr. - Just throw an exception with the numeric error code. This is suboptimal, but better than the existing usually-incorrect-message behavior. - Bonus: wrap both the error and print overloads with `PyGILState_Ensure()`, as we are supposed to, to ensure we don't die. 22 September 2022, 17:07:32 UTC
8d930a0 d3d12 doesn't like double input/output buffers 22 September 2022, 01:54:13 UTC
b4b27b2 Revert "Temporarily disable testing for apps/fft (#7033)" (#7040) Revert "Temporarily disable testing for apps/fft (#7033) (#7035)" This reverts commit 48d56d8066f322e016d60d486613837e3670dd00. 21 September 2022, 23:20:06 UTC
7ef8a53 Merge branch 'main' into xtensa-codegen 21 September 2022, 22:28:48 UTC
33f6a0f Handle widen_right_* intrinsics in bounds inference (#7039) 21 September 2022, 22:01:10 UTC
7ec2a4e Add stack-size-canary test to apps/fft's CMake file (#7034) * Add stack-size-canary test to apps/fft's CMake file This was apparently meant as a canary for stack size usage, but the necessary setting only happened in the Makefile, not the CMake file. Also, drive-by fix in Makefile to ignore warnings about `-ObjC++` being ignored, which apparently can be the case with current AppleClang configs. * Update CMakeLists.txt 21 September 2022, 17:20:03 UTC
3e3944c Merge branch 'abadams/fix_round' of https://github.com/halide/Halide into abadams/fix_round 21 September 2022, 17:11:33 UTC
c426703 revert change to mangling 21 September 2022, 17:09:15 UTC
464e47b Merge branch 'main' into abadams/fix_round 21 September 2022, 01:21:04 UTC
0d02a0b Fix Wasm BulkMemory Codegen + Minor fixes to apps/HelloWasm (#7026) * Minor fixes to apps/HelloWasm * trigger buildbots * Fix Codegen * Update CMakeLists.txt 21 September 2022, 00:55:56 UTC
7351070 Codegen_C for user_context (#7031) Codegen_C fixes 20 September 2022, 22:52:26 UTC
ba53b93 Add reinterpret simplifications (#7029) Co-authored-by: Steven Johnson <srj@google.com> 20 September 2022, 22:00:14 UTC
c353b00 Merge branch 'main' into abadams/fix_round 20 September 2022, 20:23:07 UTC
48d56d8 Temporarily disable testing for apps/fft (#7033) (#7035) Want to avoid reporting this known bug while fix is investigated 20 September 2022, 19:08:29 UTC
a542e07 Merge branch 'main' into abadams/fix_round 19 September 2022, 23:32:22 UTC
36d601f Don't use `-g` for EMCC (#7025) * Don't use `-g` for EMCC Combining `-g` with other default EMCC flags will now emit warning/error messages regarding binaryen optimization, so, don't use that flag in the default settings. * Update Error.cpp 19 September 2022, 23:18:19 UTC
de33792 Use nearbyint for wasm instead of rint 19 September 2022, 21:03:56 UTC
48a78bd Add vector versions of rint for wasm 19 September 2022, 19:45:57 UTC
469b0da Add missing return 19 September 2022, 17:52:42 UTC
e35c1a7 More parens 19 September 2022, 16:29:58 UTC
fdc4759 metal doesn't support doubles 19 September 2022, 16:28:32 UTC
8429b17 scatter, undef, and require aren't pure 19 September 2022, 01:54:34 UTC
3907e98 Handle PureIntrinsics of const args in bounds 18 September 2022, 22:05:09 UTC
aa29466 Teach the mullapudi cost model about round 18 September 2022, 21:42:58 UTC
b6b4ced Bounds of Call::round 18 September 2022, 21:08:23 UTC
063a208 d3d12 fix 18 September 2022, 21:08:08 UTC
8cf0eb8 Constant-fold round in simplifier 18 September 2022, 00:50:19 UTC
7e0963f Make round an intrinsic 17 September 2022, 23:23:45 UTC
3a5941d Appease Python linter (#7022) Apparently it's "more Pythonic" to use "not in" vs "not ... in", etc. 16 September 2022, 23:38:19 UTC
6d6de97 Merge branch 'main' into xtensa-codegen 16 September 2022, 21:06:39 UTC
ef10a42 Define a Generator framework in Python (#6764) * Define a Generator framework in Python 16 September 2022, 21:06:17 UTC
c615490 Use rint on metal for Halide::round 16 September 2022, 20:58:48 UTC
7da057b Add vectorizable lowering for round on platforms without roundeven 16 September 2022, 20:57:43 UTC
e31746d wasm doesn't support float16 15 September 2022, 19:33:22 UTC
5ff15bb Fix SpecificExpr canonicalization (#7016) fix SpecificExpr canonicalization 15 September 2022, 01:54:29 UTC
8f0e387 Don't try to emit roundeven on wasm 14 September 2022, 23:53:06 UTC
e429084 Work around hexagon issue 14 September 2022, 23:43:49 UTC
c7e9ab1 Merge branch 'main' into xtensa-codegen 14 September 2022, 21:55:45 UTC
18b06f0 Revert "[HVX] Simplify constant factor before distributing" (#7013) Revert "[HVX] Simplify constant factor before distributing (#7009)" This reverts commit 69b50af793d6eb850eea3dccb16426479edbff9d. 14 September 2022, 20:49:20 UTC
5d8e7d7 Don't test opencl with doubles if CLDoubles is not enabled 14 September 2022, 16:54:32 UTC
cb436b8 Merge branch 'main' into xtensa-codegen 14 September 2022, 16:54:17 UTC
ace8028 Add minimum GitHub token permissions for workflow (#7011) Signed-off-by: Varun Sharma <varunsh@stepsecurity.io> Signed-off-by: Varun Sharma <varunsh@stepsecurity.io> 14 September 2022, 14:04:29 UTC
4195d5f Fix rounding in opencl 14 September 2022, 02:48:29 UTC
655211e Rework Python Extension C++ code (again) (#7010) * Rework Python Extension C++ code (again) My previous effort was too clever for itself: while it worked for Halide's build systems, some other build systems (e.g. Blaze) are much more finicky about the C++ files you build Python extensions from, and re-using the same C++ files with different preprocessor settings turns out to be too problematic there, for reasons that aren't important here. Anyway, the important part here was to rework so that (1) All the C++ source files needed are compiled exactly once (2) All the C++ source files needed can be compiled with the same set of preprocessor definitions To that end, I have extended GenGen's `-r` flag to allow using `-e python_extension`; this emits the bare module-registration code by itself. So now, we generate the Python Extension code as before, but define HALIDE_PYTHON_EXTENSION_OMIT_MODULE_DEFINITION to defeat the standalone module registration for each one, then also compile in the new 'standalone' registration, with HALIDE_PYTHON_EXTENSION_MODULE and HALIDE_PYTHON_EXTENSION_FUNCTIONS defined to fill in the blanks. Also, a little drive-by cleanup in CodeGen_C to make extern "C" blocks more findable, and some restructuring in PyExtGen. * Update user_context_generator.cpp * Dummy source file * IF NOT EXISTS before file(WRITE) 14 September 2022, 02:05:02 UTC
083e651 Add missing include to C output 13 September 2022, 23:51:06 UTC
e6e2d84 the nvidia libdevice is buggy for doubles See https://reviews.llvm.org/D85236 13 September 2022, 23:45:18 UTC
b3b3685 round to even on win32 13 September 2022, 23:11:52 UTC
053002a Use rint on ptx, which is documented to round to even 13 September 2022, 23:11:46 UTC
6f3b7d4 Explicitly set the rounding mode in the C backend 13 September 2022, 23:11:37 UTC
41663c3 Make Halide::round round to even as documented 13 September 2022, 23:03:34 UTC
23b54ae Improve comment on Halide::round 13 September 2022, 23:03:16 UTC
749cddd Clean up some pointless code 13 September 2022, 23:03:06 UTC
0642d2d Fix mismatched name 13 September 2022, 20:06:05 UTC
d3b95e2 Handle one of the widen_right_mul intrinsics 13 September 2022, 18:43:59 UTC
32f4a15 Merge branch 'main' into xtensa-codegen 13 September 2022, 18:29:31 UTC
27b8a7d Add one-sided widening intrinsics. (#6967) * implement widen_right_ ops * update HVX patterns with one-sided widening intrinsics * remove unused HVX pattern flags * strengthen logic for finding rounding shifts Co-authored-by: Steven Johnson <srj@google.com> 13 September 2022, 18:14:42 UTC
69b50af [HVX] Simplify constant factor before distributing (#7009) * simplify constant factor before distributing * add simd_op_check test 12 September 2022, 23:51:33 UTC
797b5f1 Merge branch 'main' into xtensa-codegen 12 September 2022, 22:31:39 UTC
ff47ab0 Fix some bugs in div_round_to_zero (#7008) * Fix some bugs in div_round_to_zero ... and fast_integer_divide_round_to_zero These were never adequately tested, and there were a few issues. * Add missing print 12 September 2022, 16:25:26 UTC
a4f86de Fix Python handling of boolean buffers (#7006) The Python Extension code didn't handle boolean buffers correctly, making it impossible to construct one in Python and pass it thru to Halide-generated code. This fixes that, and also fixes the test that just expected it to fail (!). 11 September 2022, 23:57:33 UTC
c98f193 Couple small fixes to update RISC V to current LLVM flags and enable vscale use. (#6995) Couple small fixes to update RISC V to current LLVM flags and enable vscale use. Co-authored-by: Steven Johnson <srj@google.com> 10 September 2022, 16:02:48 UTC
a0a1d09 Prohibit C99 VLA usage in runtime code (#7005) * Prohibit C99 VLA usage in runtime code AFAICT we aren't doing this in Halide at present, but some experimental code in Google runtime was doing so; this caused some issues with some experimental Clang patches, but also was never really intended to be used in the first place. Adding the flag here to be sure no unintended use creeps back in. While I was there, took the time to ensure that the flags for runtime are unified across CMake and Make. * oops 10 September 2022, 00:07:12 UTC
4e352d3 Clean up Adams2019 CMake file (#7003) 09 September 2022, 16:49:26 UTC
bd74f94 Log target info in performance_fast_pow (#6997) (#6998) Try to gather info to track down heisenbug 08 September 2022, 00:40:05 UTC
e5069ef Apply _Halide_place_dll() to _Halide_gengen (#6999) (#7000) 07 September 2022, 21:15:58 UTC
1644e64 [Codegen] Adapt ModuleAddressSanitizerPass/ModuleSanitizerCoveragePass renaming (#6996) https://github.com/llvm/llvm-project/commit/93600eb50ceeec83c488ded24fa0fd25f997fec6 renamed ModuleAddressSanitizerPass to AddressSanitizerPass. https://github.com/llvm/llvm-project/commit/4c18670776cd6ac31099a455b2b22b38b0408006 renamed ModuleSanitizerCoveragePass. 07 September 2022, 18:28:08 UTC
cbe2e63 Fix compiler warnings in Elf.cpp (#6992) * Fix compiler warnings in Elf.cpp Some versions of GCC will complain that there is a possible use of uninitialized field `Sym<>::st_info` here; that's technically true, in that it is a bitfield that we previously set via two calls, so it temporarily could use uninitialized bits, but those would immediately be overwritten by well-defined bits. That said, the API could have been misused, so I collapsed Sym::set_type and Sym::set_bindings into a single call to avoid this warning. While I was there, I did a little hygiene on Rel<> and Rela<> as well, as there was an unused-but-similarly-dubious API there. Also added some C++17 `if constexpr` love. * Removed constexpr 06 September 2022, 17:18:50 UTC
8b9c081 Fixes for Xcode "new" build system. (#6993) 1. TargetExportScript was running into an Xcode bug with its handling of linker flags. Now using XCODE_ATTRIBUTE_EXPORTED_SYMBOLS_LIST as a workaround. 2. Added a missing dependency in Python module definition code. Fixes #6987 02 September 2022, 01:55:43 UTC
d4a61b3 Merge branch 'main' into xtensa-codegen 01 September 2022, 20:53:40 UTC
ce2e7f3 Refactor buffer-unpacking code in PythonExtensionGen (#6991) This moves most of the interesting code into the common module block, so we don't risk duplicating code for extensions that contain multiple function definitions. 01 September 2022, 17:58:57 UTC
d76278e Merge branch 'main' into xtensa-codegen 01 September 2022, 00:19:30 UTC
95e37ee Improve error-handling in Python Extensions (#6986) * Improve error-handling in Python Extensions Currently, Python Extensions don't make any effort to override `halide_error`, so the default (which aborts) is generally used... this is very unfriendly. This modifies the standard Python Extension glue code to hook halide_error, saving the text in a thread local, and then throwing a Python exception after the extension's AOT call is finished (if an error occurred, of course). Also does a drive-by default hooking of `halide_print` to ensure that it goes to whatever Python thinks that `stdout` is. (Note that it would be really nice if we could use closures of some sort for halide_error, halide_print, etc so that we could save context in the actual Python module, rather than in a thread-local global var, but this currently isn't possible without nontrivial refactoring in the Halide runtime.) * Make Windows happy * Remove dangling code bits * Allow defeating of error-handler via HALIDE_PYTHON_EXTENSION_OMIT_ERROR_AND_PRINT_HANDLERS 31 August 2022, 21:44:17 UTC
e531e24 Fix markdown links (#6988) 31 August 2022, 16:55:51 UTC
386e1ee Add `add_halide_runtime` rule (#6985) Fixes #6981 30 August 2022, 20:19:58 UTC
e7c1c86 Add test for _Halide_target_export_single_symbol (#6983) Add test for _Halide_target_export_single_symbol 29 August 2022, 23:09:20 UTC
c24b406 Add add_halide_python_extension_library() rule (#6979) * Add add_halide_python_extension_library() rule This adds a rule to create a single Python extension library from one (or more) halide_library rules. This allows you to package multiple Halide filters into a single Python module, which is nice because (1) being able to organize is good, and (2) all the filters in a single Python extension module share the same Halide runtime, including (e.g.) thread pools and method overrides. (It also removes the just-recently-added PYTHON_EXTENSION_LIBRARY option from the add_halide_library rule, as this new rule is better and more flexible in pretty much every way.) This modifies the content of our `python_extension` output in such a way that existing uses should be completely unaffected, but defining the right preprocessor macros allows us to split the function wrappers up from the method-definition declaration, so we don't have to generate any new code artifacts to make this work. Partially addresses #6956. * Omits -D in target_compile_definitions * be explicit about setting to empty * Add quotes * Add comments re BUILD_INTERFACE * Add MODULE_NAME comment * Remove "defined in HalideGeneratorHelpers.cmake" * Add comment re add_halide_runtime() * osx, macos, darwin, oh m * blankity blank blank * Use OBJECT library instead * Add comment about X-macros * Update HalideGeneratorHelpers.cmake 29 August 2022, 20:48:23 UTC
5a09dda Fix XCode by wrapping weights in an OBJECT library (#6977) The XCode "new build system" doesn't like generated source files to be associated with more than one target. Going through an OBJECT library like this fixes that problem, but also saves us a compilation, so it's a good thing to do anyway. Fixes #6976 26 August 2022, 00:42:26 UTC
50018c4 Small refactor to remove confusion between CodeGen_LLVM and CodeGen_Internal. (#6973) * Small refactor to remove confusion between CodeGen_LLVM and CodeGen_Internal. 25 August 2022, 23:01:39 UTC
96d0c94 Merge branch 'main' into xtensa-codegen 25 August 2022, 21:03:13 UTC
4877b9f Lower saturating_cast in bounds inference (#6970) * lower saturating_cast in bounds inference * openGL fix to saturating_cast 24 August 2022, 16:21:24 UTC
068f1e2 Don't cache Halide_ASAN_ENABLED (#6969) check_cxx_symbol_exists saves its output in the cache and does not run if its destination variable is defined. This is OK when used to test something that necessitates a totally fresh configuration, like any property of the target architecture, which would require changing the toolchain file. However, the result here can change if someone just modifies CMAKE_CXX_FLAGS, so it can get out of sync in some cases. 24 August 2022, 00:35:10 UTC
eacadb6 Use CMake target to handle vendored SPIRV headers (#6968) 24 August 2022, 00:35:00 UTC
940a596 Merge branch 'main' into xtensa-codegen 23 August 2022, 23:21:01 UTC
2f0957b CMake packaging fixes (#6966) * Add Halide_ASAN_ENABLED to package. * Fix handling of optional components. Make PNG/JPEG optional. * Make it easier to find HalideHelpers Before this change, users would either need to set Halide_ROOT to a Halide installation path or add said path to CMAKE_PREFIX_PATH. If they tried to use a different mechanism, like setting Halide_DIR or directly annotating their find_package call with HINTS or PATHS, then it would fail to find HalideHelpers.cmake. Adding this hint inside HalideConfig.cmake makes the package more robust, while still respecting the more powerful Halide_ROOT and CMAKE_PREFIX_PATH variables. * Delete undocumented variables in HalideConfig Some of our package's internal variables and macros leak out into user builds. We don't want users to use any of these. We might hit Hyrum's Law here, but I hope not. Users of these variables and macros should seek other means. 23 August 2022, 19:38:51 UTC
9bb5f63 Merge branch 'main' into xtensa-codegen 23 August 2022, 17:00:10 UTC
ca6319b Some minor top-level CMakeLists.txt reorganization (#6957) * Disables the usage warning when CMAKE_BUILD_TYPE is defined, but explicitly empty. * Overrides C++ standard variables using the cache (CMake 3.21+) * Allows including projects to build our tests, etc. but disables by default via PROJECT_IS_TOP_LEVEL (CMake 3.21+) * Removes misleading distrib target. * Removes deprecated clang-format target (use ./run-clang-format.sh instead) 23 August 2022, 04:07:09 UTC
37f7514 Python: don't crash for repr(Expr()) (#6962) 23 August 2022, 00:40:17 UTC
671b26d Enable deprecations warnings (#6555) * Enable deprecations warnings We currently disable deprecation warnings inside Halide. This re-enables them there, and also inside add_halide_generator(). 23 August 2022, 00:39:28 UTC
f7a30e0 Fix RPATH for Python wheels on macOS (#6958) 22 August 2022, 18:35:26 UTC
dca289e Merge branch 'main' into xtensa-codegen 22 August 2022, 17:28:45 UTC
fd3bec3 [HVX] Fix state_var issue (#6894) * fix HVX state_var issue * abort if host is nullptr 22 August 2022, 16:08:34 UTC
4bcd6fa Remove add_python_stub_extension(), adding the functionality to add_halide_generator() instead (#6952) * Remove add_python_aot_extension() rule in CMake Move it into `add_halide_library` instead, as another output option. (add_python_stub_extension will likely be moved as well, in a subsequent PR) * Update README_cmake.md * Skip Python tests when compiling for WASM * PyStubs * Update README_cmake.md * Update Generator.h * Update CMakeLists.txt * Revert "Update CMakeLists.txt" This reverts commit ed5bb00283f0e4fbdbea74adf497d4ff93b0c8d1. * fixes * fixes * Update CMakeLists.txt * fixes * fixes * fixes * Remove LIBRARY DESTINATION * Update CMakeLists.txt * fixup packaging Co-authored-by: Alex Reinking <reinking@google.com> 19 August 2022, 21:17:21 UTC
a1cd71c Build fixes for manylinux2014 (#6953) 19 August 2022, 03:55:04 UTC
1068403 Remove add_python_aot_extension() rule in CMake (#6949) * Remove add_python_aot_extension() rule in CMake Move it into `add_halide_library` instead, as another output option. (add_python_stub_extension will likely be moved as well, in a subsequent PR) * Update README_cmake.md * Skip Python tests when compiling for WASM 17 August 2022, 23:20:43 UTC
dd5fe8d Two quick build fixes (#6950) * ASLog is linked to autoscheduler MODULES; needs PIC * Out-of-source Python bindings just need libHalide, not imageio * Fixes to setup.py 17 August 2022, 22:17:40 UTC
807d988 Handle saturating_cast in compute_expr_cost() (#6947) 17 August 2022, 19:33:38 UTC
5100ad6 Don't throw an exception from generate_filter_main (#6946) 17 August 2022, 01:06:52 UTC
4f5c53c Add/update Python Readme (#6939) * Add/update Python Readme This moves the Python README to the toplevel and reworks it considerably, adding details and updating various bits. Note that the Python documentation here is still incomplete; this is intended as a prelude to adding documentation for Python Generators in a future PR. 16 August 2022, 22:15:53 UTC
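Several commits in this log (41663c3, 6f3b7d4, de33792, 053002a) concern making Halide::round break ties to even by using rint or nearbyint and, in the C backend, by setting the rounding mode explicitly. The following is only a minimal standalone sketch of that tie-breaking rule in standard C++, not Halide's actual lowering; the helper names `round_half_to_even` and `round_half_away_from_zero` are hypothetical and chosen for illustration.

```cpp
#include <cfenv>
#include <cmath>

// Hypothetical helper (not part of Halide): round to the nearest integer,
// breaking ties toward the even neighbor.
// Note: strictly conforming code would also need `#pragma STDC FENV_ACCESS ON`,
// which not every compiler honors; this sketch omits it.
double round_half_to_even(double x) {
    const int saved_mode = std::fegetround();  // remember the caller's rounding mode
    std::fesetround(FE_TONEAREST);             // nearbyint follows the current mode
    const double result = std::nearbyint(x);
    std::fesetround(saved_mode);               // restore the previous mode
    return result;
}

// For contrast: std::round always breaks ties away from zero,
// regardless of the current rounding mode.
double round_half_away_from_zero(double x) {
    return std::round(x);
}
```

With these helpers, `round_half_to_even(2.5)` gives 2.0 and `round_half_to_even(3.5)` gives 4.0, while `round_half_away_from_zero(2.5)` gives 3.0. The two functions differ only on exact halves.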