https://github.com/halide/Halide

Revision Message Commit Date
108dcea Add missing print 11 September 2022, 21:32:42 UTC
6dc8d8b Fix some bugs in div_round_to_zero ... and fast_integer_divide_round_to_zero These were never adequately tested, and there were a few issues. 11 September 2022, 00:15:05 UTC
c98f193 Couple small fixes to update RISC V to current LLVM flags and enable vscale use. (#6995) Couple small fixes to update RISC V to current LLVM flags and enable vscale use. Co-authored-by: Steven Johnson <srj@google.com> 10 September 2022, 16:02:48 UTC
a0a1d09 Prohibit C99 VLA usage in runtime code (#7005) * Prohibit C99 VLA usage in runtime code AFAICT we aren't doing this in Halide at present, but some experimental code in Google runtime was doing so; this caused some issues with some experimental Clang patches, but also was never really intended to be used in the first place. Adding the flag here to be sure no unintended use creeps back in. While I was there, took the time to ensure that the flags for runtime are unified across CMake and Make. * oops 10 September 2022, 00:07:12 UTC
4e352d3 Clean up Adams2019 CMake file (#7003) 09 September 2022, 16:49:26 UTC
bd74f94 Log target info in performance_fast_pow (#6997) (#6998) Try to gather info to track down heisenbug 08 September 2022, 00:40:05 UTC
e5069ef Apply _Halide_place_dll() to _Halide_gengen (#6999) (#7000) 07 September 2022, 21:15:58 UTC
1644e64 [Codegen] Adapt ModuleAddressSanitizerPass/ModuleSanitizerCoveragePass renaming (#6996) https://github.com/llvm/llvm-project/commit/93600eb50ceeec83c488ded24fa0fd25f997fec6 renamed ModuleAddressSanitizerPass to AddressSanitizerPass. https://github.com/llvm/llvm-project/commit/4c18670776cd6ac31099a455b2b22b38b0408006 renamed ModuleSanitizerCoveragePass. 07 September 2022, 18:28:08 UTC
cbe2e63 Fix compiler warnings in Elf.cpp (#6992) * Fix compiler warnings in Elf.cpp Some versions of GCC will complain that there is a possible use of uninitialized field `Sym<>::st_info` here; that's technically true, in that it is a bitfield that we previously set via two calls, so it temporarily could use uninitialized bits, but those would immediately be overwritten by well-defined bits. That said, the API could have been misused, so I collapsed Sym::set_type and Sym::set_bindings into a single call to avoid this warning. While I was there, I did a little hygiene on Rel<> and Rela<> as well, as there was an unused-but-similarly-dubious API there. Also added some C++17 `if constexpr` love. * Removed constexpr 06 September 2022, 17:18:50 UTC
8b9c081 Fixes for Xcode "new" build system. (#6993) 1. TargetExportScript was running into an Xcode bug with its handling of linker flags. Now using XCODE_ATTRIBUTE_EXPORTED_SYMBOLS_LIST as a workaround. 2. Added a missing dependency in Python module definition code. Fixes #6987 02 September 2022, 01:55:43 UTC
ce2e7f3 Refactor buffer-unpacking code in PythonExtensionGen (#6991) This moves most of the interesting code into the common module block, so we don't risk duplicating code for extensions that contain multiple function definitions. 01 September 2022, 17:58:57 UTC
95e37ee Improve error-handling in Python Extensions (#6986) * Improve error-handling in Python Extensions Currently, Python Extensions don't make any effort to override `halide_error`, so the default (which aborts) is generally used... this is very unfriendly. This modifies the standard Python Extension glue code to hook halide_error, saving the text in a thread local, and then throwing a Python exception after the extension's AOT call is finished (if an error occurred, of course). Also does a drive-by default hooking of `halide_print` to ensure that it goes to whatever Python thinks that `stdout` is. (Note that it would be really nice if we could use closures of some sort for halide_error, halide_print, etc so that we could save context in the actual Python module, rather than in a thread-local global var, but this currently isn't possible without nontrivial refactoring in the Halide runtime.) * Make Windows happy * Remove dangling code bits * Allow defeating of error-handler via HALIDE_PYTHON_EXTENSION_OMIT_ERROR_AND_PRINT_HANDLERS 31 August 2022, 21:44:17 UTC
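The deferred-error pattern described in 95e37ee (record the error text in a thread-local instead of aborting, then raise once the AOT call returns) can be sketched in plain Python. This is only an illustration of the control flow, not the generated glue code: `error_hook`, `call_filter`, and the choice of `RuntimeError` are hypothetical names and assumptions, not Halide APIs.

```python
import threading

# Per-thread slot for the most recent error message (hypothetical name).
_state = threading.local()


def error_hook(message: str) -> None:
    """Stand-in for a hooked error callback: record the text, do not abort."""
    _state.error = message


def call_filter(filter_fn, *args):
    """Run a filter, then surface any recorded error as a Python exception."""
    _state.error = None
    result = filter_fn(*args)
    err = _state.error
    _state.error = None  # clear so the next call starts clean
    if err is not None:
        # Assumption: RuntimeError is used here purely for illustration.
        raise RuntimeError(err)
    return result
```

Because the slot is thread-local, concurrent calls on different threads do not see each other's errors, which matches the limitation the commit message notes about not being able to store context in the module itself.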
e531e24 Fix markdown links (#6988) 31 August 2022, 16:55:51 UTC
386e1ee Add `add_halide_runtime` rule (#6985) Fixes #6981 30 August 2022, 20:19:58 UTC
e7c1c86 Add test for _Halide_target_export_single_symbol (#6983) Add test for _Halide_target_export_single_symbol 29 August 2022, 23:09:20 UTC
c24b406 Add add_halide_python_extension_library() rule (#6979) * Add add_halide_python_extension_library() rule This adds a rule to create a single Python extension library from one (or more) halide_library rules. This allows you to package multiple Halide filters into a single Python module, which is nice because (1) being able to organize is good, and (2) all the filters in a single Python extension module share the same Halide runtime, including (e.g.) thread pools and method overrides. (It also removes the just-recently-added PYTHON_EXTENSION_LIBRARY option from the add_halide_library rule, as this new rule is better and more flexible in pretty much every way.) This modifies the content of our `python_extension` output in such a way that existing uses should be completely unaffected, but defining the right preprocessor macros allows us to split the function wrappers up from the method-definition declaration, so we don't have to generate any new code artifacts to make this work. Partially addresses #6956. * Omits -D in target_compile_definitions * be explicit about setting to empty * Add quotes * Add comments re BUILD_INTERFACE * Add MODULE_NAME comment * Remove "defined in HalideGeneratorHelpers.cmake" * Add comment re add_halide_runtime() * osx, macos, darwin, oh m * blankity blank blank * Use OBJECT library instead * Add comment about X-macros * Update HalideGeneratorHelpers.cmake 29 August 2022, 20:48:23 UTC
5a09dda Fix XCode by wrapping weights in an OBJECT library (#6977) The XCode "new build system" doesn't like generated source files to be associated with more than one target. Going through an OBJECT library like this fixes that problem, but also saves us a compilation, so it's a good thing to do anyway. Fixes #6976 26 August 2022, 00:42:26 UTC
50018c4 Small refactor to remove confusion between CodeGen_LLVM and CodeGen_Internal. (#6973) * Small refactor to remove confusion between CodeGen_LLVM and CodeGen_Internal. 25 August 2022, 23:01:39 UTC
4877b9f Lower saturating_cast in bounds inference (#6970) * lower saturating_cast in bounds inference * openGL fix to saturating_cast 24 August 2022, 16:21:24 UTC
068f1e2 Don't cache Halide_ASAN_ENABLED (#6969) check_cxx_symbol_exists saves its output in the cache and does not run if its destination variable is defined. This is OK when used to test something that necessitates a totally fresh configuration, like any property of the target architecture, which would require changing the toolchain file. However, the result here can change if someone just modifies CMAKE_CXX_FLAGS, so it can get out of sync in some cases. 24 August 2022, 00:35:10 UTC
eacadb6 Use CMake target to handle vendored SPIRV headers (#6968) 24 August 2022, 00:35:00 UTC
2f0957b CMake packaging fixes (#6966) * Add Halide_ASAN_ENABLED to package. * Fix handling of optional components. Make PNG/JPEG optional. * Make it easier to find HalideHelpers Before this change, users would either need to set Halide_ROOT to a Halide installation path or add said path to CMAKE_PREFIX_PATH. If they tried to use a different mechanism, like setting Halide_DIR or directly annotating their find_package call with HINTS or PATHS, then it would fail to find HalideHelpers.cmake. Adding this hint inside HalideConfig.cmake makes the package more robust, while still respecting the more powerful Halide_ROOT and CMAKE_PREFIX_PATH variables. * Delete undocumented variables in HalideConfig Some of our package's internal variables and macros leak out into user builds. We don't want users to use any of these. We might hit Hyrum's Law here, but I hope not. Users of these variables and macros should seek other means. 23 August 2022, 19:38:51 UTC
ca6319b Some minor top-level CMakeLists.txt reorganization (#6957) * Disables the usage warning when CMAKE_BUILD_TYPE is defined, but explicitly empty. * Overrides C++ standard variables using the cache (CMake 3.21+) * Allows including projects to build our tests, etc. but disables by default via PROJECT_IS_TOP_LEVEL (CMake 3.21+) * Removes misleading distrib target. * Removes deprecated clang-format target (use ./run-clang-format.sh instead) 23 August 2022, 04:07:09 UTC
37f7514 Python: don't crash for repr(Expr()) (#6962) 23 August 2022, 00:40:17 UTC
671b26d Enable deprecations warnings (#6555) * Enable deprecations warnings We currently disable deprecation warnings inside Halide. This re-enables them there, and also inside add_halide_generator(). 23 August 2022, 00:39:28 UTC
f7a30e0 Fix RPATH for Python wheels on macOS (#6958) 22 August 2022, 18:35:26 UTC
fd3bec3 [HVX] Fix state_var issue (#6894) * fix HVX state_var issue * abort if host is nullptr 22 August 2022, 16:08:34 UTC
4bcd6fa Remove add_python_stub_extension(), adding the functionality to add_halide_generator() instead (#6952) * Remove add_python_aot_extension() rule in CMake Move it into `add_halide_library` instead, as another output option. (add_python_stub_extension will likely be moved as well, in a subsequent PR) * Update README_cmake.md * Skip Python tests when compiling for WASM * PyStubs * Update README_cmake.md * Update Generator.h * Update CMakeLists.txt * Revert "Update CMakeLists.txt" This reverts commit ed5bb00283f0e4fbdbea74adf497d4ff93b0c8d1. * fixes * fixes * Update CMakeLists.txt * fixes * fixes * fixes * Remove LIBRARY DESTINATION * Update CMakeLists.txt * fixup packaging Co-authored-by: Alex Reinking <reinking@google.com> 19 August 2022, 21:17:21 UTC
a1cd71c Build fixes for manylinux2014 (#6953) 19 August 2022, 03:55:04 UTC
1068403 Remove add_python_aot_extension() rule in CMake (#6949) * Remove add_python_aot_extension() rule in CMake Move it into `add_halide_library` instead, as another output option. (add_python_stub_extension will likely be moved as well, in a subsequent PR) * Update README_cmake.md * Skip Python tests when compiling for WASM 17 August 2022, 23:20:43 UTC
dd5fe8d Two quick build fixes (#6950) * ASLog is linked to autoscheduler MODULES; needs PIC * Out-of-source Python bindings just need libHalide, not imageio * Fixes to setup.py 17 August 2022, 22:17:40 UTC
807d988 Handle saturating_cast in compute_expr_cost() (#6947) 17 August 2022, 19:33:38 UTC
5100ad6 Don't throw an exception from generate_filter_main (#6946) 17 August 2022, 01:06:52 UTC
4f5c53c Add/update Python Readme (#6939) * Add/update Python Readme This moves the Python README to the toplevel and reworks it considerably, adding details and updating various bits. Note that the Python documentation here is still incomplete; this is intended as a prelude to adding documentation for Python Generators in a future PR. 16 August 2022, 22:15:53 UTC
63d563f Export HalidePythonExtensionHelpers.cmake for installs (#6941) * Export HalidePythonExtensionHelpers.cmake for installs * oops * fixes * Fix broken code in target_export_script() * oops #2 * Add WITH_SOABI to stubs as well as AOT * More fixes * Update CMakePresets.json * Update CMakePresets.json 16 August 2022, 21:38:42 UTC
52b91a4 Add minimal useful implementation of extracting and concatenating bits (#6928) * Minimal approach to making Deinterleave correct for Reinterpret * Add minimal useful implementation of extracting and concatenating bits * clang-tidy * More clang-tidy fixes * Add missing error message * Add low-bit-depth noise test * Add test to cmake build * Fix power-of-two check * Remove dead object * Add little-endian comment to reinterpret IR node * Simplify concat_bits of single arg * Add missing second arg * Fix concat_bits call Co-authored-by: Andrew Adams <anadams@adobe.com> 14 August 2022, 17:24:30 UTC
f60a8fb Fix badly-merged CMakePresets.json file (#6936) 12 August 2022, 20:45:37 UTC
5e8f97b Add ASAN support to CMake via toolchain file (#6920) Add ASAN support Co-authored-by: Alex Reinking <reinking@google.com> 11 August 2022, 22:57:54 UTC
b734957 Add build & test presets for release and debug CMake builds (#6934) Also, drive-by rename of 'default' to 'base' to better imply the inheritance 11 August 2022, 19:35:28 UTC
4cdc2a1 Tutorial 10 needs to be skipped for Python when targeting Wasm (just as non-Python does) (#6932) * Tutorial 10 needs to be skipped for Python when targeting Wasm (just as non-Python does) * fixes * Update CMakeLists.txt 11 August 2022, 19:30:57 UTC
43e6a26 Rework internal PYTHONPATH maintenance (#6922) * Rework PYTHONPATH * Move pure-Python file copying logic to build time. * Use TARGET_RUNTIME_DLLS to copy all DLLs instead of just Halide. * Ensure that the last path component for Halide_Python is always `halide` * Simplify __init__.py now that it's copied to build tree * Add helper to de-duplicate PYTHONPATH test logic Fixes #6870 Co-authored-by: Alex Reinking <alex.reinking@gmail.com> Co-authored-by: Alex Reinking <reinking@google.com> 10 August 2022, 22:05:11 UTC
92de4a1 Halide::Error should not extend std::runtime_error (#6927) * Halide::Error should not extend std::runtime_error Unfortunately, the std error/exception classes aren't marked for DLLEXPORT under MSVC; we need our Error classes to be DLLEXPORT for libHalide (and python bindings). The current situation basically causes MSVC to generate another version of `std::runtime_error` marked for DLLEXPORT, which can lead to ODR violations, which are bad. AFAICT we don't really rely on this inheritance anywhere, so this just eliminates the inheritance entirely. (Note that I can't point to a specific malfunction resulting from this, but casual googling based on the many warnings MSVC emits about the current situation has me convinced that it needs addressing.) * noexcept 10 August 2022, 17:30:46 UTC
1bf1599 Make saturating_cast an intrinsic (#6900) * Make saturating_cast an intrinsic * handle saturating_cast in Bounds.cpp + add bounds tests * update saturating_cast CodeGen * with_lanes should work on intrinsics as well * lift to saturating_cast in FindIntrinsics * update intrinsics test for u16_sat * better sat_cast(widen(expr)) handling in find_intrinsics * simplify bounds of saturating_cast + update is_monotonic 08 August 2022, 21:24:16 UTC
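For readers unfamiliar with the operation in 1bf1599: a saturating cast clamps a value to the representable range of the target type instead of wrapping. The sketch below models this for integers only, and shows why bounds are easy to derive for it: the function is monotonically non-decreasing, so an input interval maps endpoint to endpoint. The helper names are hypothetical and this is not the code in Bounds.cpp.

```python
def saturating_cast(value: int, bits: int, signed: bool) -> int:
    """Clamp an integer to the range of a `bits`-wide (un)signed integer type."""
    lo = -(1 << (bits - 1)) if signed else 0
    hi = (1 << (bits - 1)) - 1 if signed else (1 << bits) - 1
    return max(lo, min(hi, value))


def saturating_cast_bounds(vmin: int, vmax: int, bits: int, signed: bool):
    """Bounds of saturating_cast over [vmin, vmax].

    Clamping is monotonic, so the image of an interval is the interval
    between the clamped endpoints.
    """
    return (saturating_cast(vmin, bits, signed),
            saturating_cast(vmax, bits, signed))
```

For example, `saturating_cast(300, 8, False)` yields 255 where a wrapping cast would yield 44.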
8794fac Make use of CMake 3.22 features (#6919) * Remove AddCudaToTarget.cmake * Remove MakeShellPath.cmake * Use CheckLinkerFlag in TargetExportScript * Use DEPFILE for all generators * Use REQUIRED with find_program, where applicable * Use REQUIRED with find_library, where applicable * Use CMake 3.21 cache behavior in HalideTargetHelpers.cmake * Replace uses of get_filename_component with cmake_path * Rework BLAS detection in linear_algebra app * Drive-by: fix autotune_loop.sh install rule. * Fix CBLAS header in linear_algebra test_halide_blas 08 August 2022, 20:44:32 UTC
9ca7560 Fix wrong install path for *.py files (#6921) * Fix wrong install path for *.py files We were looking in a nonexistent dir, so we never copied `__init__.py` as we should have. * Update CMakeLists.txt 05 August 2022, 02:20:29 UTC
256c4d9 Fix bug when realize condition depends on tuple call (#6915) If the realization is tuple-valued, and the condition on the realization uses a tuple call (index != 0), then the condition wasn't getting resolved during the split_tuples pass. The cause was a missing mutate call. 04 August 2022, 20:21:01 UTC
ffa2c36 Fix two warnings found with clang 16 (#6918) - variable 'count' set but not used - warning: use of bitwise '|' with boolean operands 04 August 2022, 20:10:37 UTC
3a04fc0 Remove unused GHA and packaging workflows. (#6917) 04 August 2022, 14:00:45 UTC
cc44ee5 Upgrade CMake minimum version to 3.22 (#6916) Fixes #6910. 04 August 2022, 01:40:22 UTC
857b045 LICENSE.txt: add BLAS license. (#6914) 03 August 2022, 22:24:43 UTC
a893d5e LICENSE.txt: add spirv license (#6913) 03 August 2022, 22:23:44 UTC
0072946 LICENSE.txt: Include full text of Apache 2.0 license (not just the 'header' version) (#6912) 03 August 2022, 22:23:28 UTC
88e7229 Start developing pip package (#6886) Co-authored-by: Lukas Trümper <lukas.truemper@outlook.de> 02 August 2022, 20:55:53 UTC
dd391e6 Fix broken Makefile rules for autoschedulers on OSX (#6906) * Fix broken Makefile rules for autoschedulers on OSX A few issues here: - Make was building the plugins as .dylib on OSX, but they should have been .so to match Linux (and just on general principles) - On OSX, explicitly linking libHalide.dylib into a plugin means that it will load its own copy of libHalide, which is bad, because it means the plugin doesn't share the same set of globals. We need to omit that explicit dependency and allow it to just find the exported symbols at load time. - Add a test to verify the fix; run it everywhere even though it should only have been failing for Make-build OSX builds. Finally, let me add that we really need to set a sunset date for supporting Make in Halide. The Makefiles aren't really maintained properly anymore, and when something subtle goes wrong, it takes an unreasonable amount of time to debug for something that is no longer our canonical build tool. * Use order-only prerequisites * Remove new load_plugin.cpp test Not worth the complexity for the extra test coverage. 02 August 2022, 20:32:46 UTC
2239119 Fix autoscheduling trivial lut wrappers (#6905) * Fix autoscheduling trivial lut wrappers Fixes #6899 * trigger buildbots Co-authored-by: Steven Johnson <srj@google.com> 02 August 2022, 15:24:39 UTC
8871404 Allow AMX instructions with K dimension larger than 4 bytes (#6582) * recognize the patterns used for the RHS matrix * make 1d tile matcher more robust * put getting rhs tile's index into a separate func * expand the tests used in correctness check * add exclamation mark * remove unused vars * run format and tidy * check for null before using IR in the next step * check if the broadcast was found * llvm below 13 is no longer supported * replace single pattern with commutative permutations * check if the stride is an `IntImm`, otherwise reject pattern * apply clang-format-13 * rename wild_i32 -> v2 * check if v1 could be the stride value * add more detail to a receiving a bad type * added short explanation of the right-hand matrix layout * added explanation for where the 4 comes from * provide further documentation as to the layout of AMX * add comments for expected patterns to get_3d_rhs_tile_index * Document the matched pattern Co-authored-by: Steven Johnson <srj@google.com> 01 August 2022, 23:23:28 UTC
703a738 Upgrade clang-format and clang-tidy to v14 (v2) (#6902) 01 August 2022, 20:08:13 UTC
e35654b Don't try to fold saturating_sub of VectorReduce (#6896) don't fold saturating_sub of VectorReduce 01 August 2022, 17:35:48 UTC
e03b0e0 [Codegen_LLVM] Annotate LLVM IR functions with `nounwind`/`mustprogress` attributes (#6897) My reasoning is as follows, please correct me if I'm wrong: 1. Halide-generated code never throws exceptions 2. Halide-generated code always `call`s (as opposed to `invoke`s) the functions, there is no exception-safety RAII 3. Halide loops are meant to have a finite number of iterations, they aren't meant to be endless and side-effect free 4. Halide (IR) assertions *might* abort. 5. Likewise, external callees *might* abort. (???) Therefore, when not in the presence of external calls, it is obvious that (1) no exception will be unwound out of the halide-generated function, (2) none of the loops will end up being endless with no observable side-effects. ... which is the semantics that is being stated by the LLVM IR function attributes `nounwind`+`mustprogress`. I'm less clear as to what the prerequisites on the behavior of the external callees are, but I do believe that they must also at least not unwind. I guess they are also at least required to either return or abort eventually. 01 August 2022, 16:19:30 UTC
0739045 [Simplify] Drop no-op single-input identity shuffles (#6901) 01 August 2022, 16:18:55 UTC
6cc77b2 Add `auto_schedule` label to Adams2019 and Li2018 tests in CMake (#6898) * Add `auto_schedule` label to Adams2019 and Li2018 tests in CMake These were ~never getting tested on the buildbots (and still aren't, I need to update it to run `auto_schedule` tests) but conceptually these tests should be in the same group as for Mullapudi. Also, drive-by fix to broken test_apps_autoscheduler injected in https://github.com/halide/Halide/pull/6861. * trigger buildbots 29 July 2022, 22:47:33 UTC
9c25902 [vulkan phase1] Add SPIR-V IR (#6882) * Import SPIRV-IR from personal branch * Refactor SPIR-V IR into separate header / source files. * Refactor SPIR-V factory methods. Fix SPIR-V interface library and header paths. Add SPIR-V internal test. * Hookup internal SPIRV IR test * Fixes and cleanups to address PR #6882 Refactor logic of SPIR-V dependency to make fetch dependency optional Change SPIR-V fetch dependency to avoid building and just populate contents Change SPIR-V internal test to always link against method ... only enabled if WITH_SPIRV is defined Add missing SPIRV target feature * Update src/CMakeLists.txt Co-authored-by: Alex Reinking <reinking@google.com> * Add missing iostream header when WITH_SPIRV is undefined * Fix declaration ordering for TARGET_SPIRV option so that dependencies get triggered * Turn on FETCH_SPIRV_HEADERS by default to get build to pass for now * Fix path finding logic for SPIR-V header path from populated fetch dependency * Revert back to Halide_SPIRV target name * Don't use imported interface for SPIR-V. Use Halide_SPIRV naming since target is defined before Halide itself. * Add local copy of SPIR-V header file, along with license and readme. Update CMake rules to use local include path by default. * Make SPIR-V include path a system path to avoid clang format/tidy processing * Remove SpirvIR.h header file from being included with Halide.h (since it's only used internally for CodeGen) * Add ./dependencies/spirv to clang format ignore file * Add comment about *not* including internally used headers like SpirvIR.h * Refactor is_defined() asserts into check_defined() for reuse * Add comment to SpirvIR.h header clarifying this file should not be exported. Fix formatting to avoid single line if statements. Use reserve for constructing vector components * Rename hash_* methods to make_*_key methods (since they construct a key and don't actually hash the value) Fix typo on components * Clang format/tidy pass * Fix formatting for more single-line if statements * Disable TARGET_SPIRV by default for now Co-authored-by: Derek Gerstmann <dgerstmann@adobe.com> Co-authored-by: Alex Reinking <reinking@google.com> 29 July 2022, 22:05:35 UTC
b9a3356 Remove (most) of the env var usage from Adams2019 (#6861) * Move ASLog.cpp/.h to common/ * Add trivial Parsing utility & use it * Update ParamParser.h * fixes * wip * fixes * Fixes * clang-format * Update Makefile * Remove may_subtile * Update Cache.cpp * Update Cache.cpp * Update AutoSchedule.cpp * Update AutoSchedule.cpp 27 July 2022, 23:32:14 UTC
3859b36 Add support for generating x86 sum-of-absolute-difference reductions (#6872) 27 July 2022, 22:12:36 UTC
c8b811a Fixes to allow compiling with LLVM16 (#6889) 27 July 2022, 20:20:16 UTC
e3e169d Rewrite PythonExtensionGen to be C++ based (#6888) * Rewrite PythonExtensionGen to be C++ based This is intended as an alternative to #6885 -- this is even *more* gratuitous, but: - We have ~always compiled Python extensions using C++ anyway - This code is arguably terser, cleaner, and safer (the cleanups happen via dtors) - The code size difference is negligible (~300 bytes out of 160k for addconstant.cpython-39-darwin.so) * Update PythonExtensionGen.cpp 27 July 2022, 16:21:44 UTC
7821212 Add set-host-dirty/copy-to-host to PythonExtensionGen (#6869) * Add set-host-dirty/copy-to-host to PythonExtensionGen See https://github.com/halide/Halide/issues/6868: Python Buffers are host-memory-only, so if the AOT-compiled halide code runs on (say) GPU, it may fail to copy the inputs to device and/or the results back to host. This fixes that. (We still need a solution that allows for lazy copies, but that will require adding another protocol that supports it.) * Update PythonExtensionGen.cpp 25 July 2022, 16:32:24 UTC
5e69ad9 [Codegen_LLVM] Define all the things (#6866) Long-term plan for LLVM is to get rid of `undef`, and replace it with zero-initialization, err, `poison`, because it has nicer semantics. Everywhere we use `undef` as a placeholder in shuffle (be it either for a second operand, or undef shuffle mask element), or as a base 'empty' vector we are about to fully override via insertelement, we can just switch those to poison nowadays. The scary part is the `Call::undef` semantics/lowering, perhaps it will need to be `freeze poison`. 25 July 2022, 16:24:56 UTC
11a049c #6863 - Fixes to make address sanitizer happy for internal runtime classes (#6880) * Fixes to make address sanitizer happy. Fixed initialization defects in StringStorage that could cause buffer overruns Fixed memory leaks within RegionAllocator and BlockAllocator Added system memory allocation tracking to all internal runtime tests. * Clang Tidy / Format pass * Fix formatting to use braces around if statements Co-authored-by: Derek Gerstmann <dgerstmann@adobe.com> 22 July 2022, 22:20:47 UTC
4770495 Ensure $CMAKE_{lang}_OUTPUT_EXTENSION is set before using it (#6879) Ensure CMAKE_{lang}_OUTPUT_EXTENSION is set before using it Co-authored-by: Shoaib Kamil <kamil@adobe.com> 22 July 2022, 14:36:50 UTC
c904c53 Refactor/cleanup in Autoscheduler code (#6858) * Move ASLog.cpp/.h to common/ * Add trivial Parsing utility & use it * Update ParamParser.h * fixes * fixes 21 July 2022, 21:24:51 UTC
06fcf94 Fix error in Makefile for Adams2019 on OSX (#6877) We erroneously link in the dylib and also dynamically load it, causing an error. We should skip the linkage and always load dynamically. 21 July 2022, 19:14:39 UTC
04c465b [Codegen] Fail to codegen `Call::undef`, just like `Call::signed_integer_overflow` (#6871) See discussion in https://github.com/halide/Halide/pull/6866. It's not obvious if that codepath is ever hit, let's optimistically assume that it is not. If this turns out to be not true, we'll have to deal with a more complicated question of the proper lowering for it, can it be `poison`, or must it be a `freeze poison`. 21 July 2022, 19:11:22 UTC
8b5486b [Codegen_LLVM] Radically simplify `visit(const Reinterpret *op)` (#6865) 1. LLVM IR `bitcast` happily bitcasts between vectors and scalars: https://godbolt.org/z/9zqx11rna 2. `ptrtoint` already implicitly truncates/zero-extends if the int is larger than the pointer type: https://llvm.org/docs/LangRef.html#ptrtoint-to-instruction 3. `inttoptr` already implicitly truncates/zero-extends if the int is larger than the pointer type: https://llvm.org/docs/LangRef.html#inttoptr-to-instruction So we don't need to do any of that 'special' handling. 21 July 2022, 16:35:40 UTC
9a94756 Use pmaddubsw 8-bit horizontal widening adds (Fixes #6859) (#6873) * use pmaddubsw 8-bit horizontal widening adds * add SSE3 versions too * add pmaddubsw tests 21 July 2022, 15:01:16 UTC
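The trick named in 9a94756 relies on the semantics of the x86 `pmaddubsw` instruction: it multiplies unsigned bytes from one operand by signed bytes from the other, adds adjacent products, and saturates each sum to a signed 16-bit lane. Multiplying by a vector of ones therefore gives a horizontal widening add of byte pairs. The following scalar emulation is a sketch of that idea only; the function names are hypothetical and it says nothing about how Halide pattern-matches the operation.

```python
def pmaddubsw(a_u8, b_i8):
    """Scalar model of pmaddubsw: a holds unsigned bytes, b holds signed bytes."""
    assert len(a_u8) == len(b_i8) and len(a_u8) % 2 == 0
    out = []
    for i in range(0, len(a_u8), 2):
        s = a_u8[i] * b_i8[i] + a_u8[i + 1] * b_i8[i + 1]
        out.append(max(-32768, min(32767, s)))  # saturate to int16
    return out


def horizontal_widening_add_u8(v):
    """Pairwise sums of unsigned bytes: data is the unsigned operand."""
    return pmaddubsw(v, [1] * len(v))


def horizontal_widening_add_i8(v):
    """Pairwise sums of signed bytes: data is the signed operand."""
    return pmaddubsw([1] * len(v), v)
```

With a multiplier of one, the pair sums stay within [-256, 510], so the saturation step never triggers in either helper.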
967c3bf Fix simd_op_check for top-of-tree LLVM (#6874) * Fix simd_op_check for top-of-tree LLVM * clang-format 20 July 2022, 23:57:27 UTC
51c06b7 Python source reorg (#6867) * Move python binding sources to src/halide/halide_ * Rename native module to halide_ * Fix tests * Avoid copying Python sources * Fix installation rules * Make diff smaller * trigger buildbots * Add issue todo Co-authored-by: Steven Johnson <srj@google.com> 20 July 2022, 22:35:45 UTC
359026a Promote Reinterpret Intrinsic into a Reinterpret IR Node (#6853) * Promote Reinterpret Intrinsic into a Reinterpret IR Node As discussed in https://github.com/halide/Halide/issues/6801#issuecomment-1152731683 I don't think this is complete, there are likely a few more places that need to be taught about it still, although I think this is mostly it. Note that this only promotes the intrinsic, this does not adjust its handling, as hinted in: https://github.com/halide/Halide/issues/6801#issuecomment-1155603752 * Silence buildbot warning * Speculative fix for Codegen C failure? * Restore comment * Delete obsolete FIXME * RegionCost: reinterpret is free * LICM: actually adjust the comment 20 July 2022, 00:19:44 UTC
2d907c4 [vulkan phase0] Add adts for containers and memory allocation to runtime (#6829) * Cherry pick runtime internals as standalone commit (preparation work for Vulkan runtime) * Clang format/tidy fixes * Fix runtime test linkage and include paths to not include libHalide * Update test/runtime/CMakeLists.txt Fix typo mismatch for HALIDE_VERSION_PATCH Co-authored-by: Alex Reinking <reinking@google.com> * Add compiler id guard to build options for runtime tests * Avoid building runtime tests on MSVC since Halide runtime headers are not MS compatible Remove CLANG warning flag for runtime test * Change runtime test compile definitions to be PRIVATE. Remove PUBLIC_EXPORTS from runtime test definition. * Add comment about GNU warnings for 'no-builtin-declaration-mismatch' * Change to debug(user_context) for debug messages where context is valid. Wrap verbose debugging with DEBUG_RUNTIME ifdef. Style pass based on review comments. * Add note explaining why we disable the internal runtime tests on MSVC. * Cleanup cmake logic for disabling runtime internal tests for MSVC and add a status message. * Don't use strncpy for prepend since some implementations may insert a null char regardless of the length used * Workaround varying platform str implementations and handle termination directly. * Clang Tidy/Format pass Co-authored-by: Derek Gerstmann <dgerstmann@adobe.com> Co-authored-by: Alex Reinking <reinking@google.com> 15 July 2022, 22:15:18 UTC
b1ca334 Rework autoscheduler API (#6788) (#6838) * Rework autoscheduler API (#6788) * Oops * Update test_function_dag.cpp * clang-tidy * trigger buildbots * Update Generator.h * Minor cleanups * Update README_cmake.md * Check for malformed autoscheduler_params dicts * Add alias-with-autoscheduler code, plus tweaks * Update stubtest_jittest.cpp * Update Makefile * trigger buildbots * fixes * Update AbstractGenerator.cpp * Update stubtest_generator.cpp * Update Makefile * Add deprecation warning for HALIDE_ALLOW_LEGACY_AUTOSCHEDULER_API * Make AutoschedulerParams a real struct * clang-tidy 15 July 2022, 22:13:50 UTC
24913eb Silence Adams2019 Autoscheduler (#6854) * Make aslog() a proper ostream * Ensure that all `dump()` calls take and use an ostream * Progress Bar only draws as LogLevel >= 1 * clang-format * Rework all aslog(0) statements * Update ASLog.cpp * syntax * Update ASLog.cpp * Revert fancy aslog stuff * Update ASLog.h * trigger buildbots 15 July 2022, 17:55:43 UTC
f9c2cdf Add autoscheduling to the generator_aot_stubuser test (#6855) * Add autoscheduling to the generator_aot_stubuser test * fix test_apps * fix test_apps, again 14 July 2022, 23:26:53 UTC
bdd7114 Fix the PLUGINS argument to properly join multiple arguments (#6851) 13 July 2022, 17:13:11 UTC
13a43c0 Add placeholder code for bfloat16 in Python (#6849) (#6850) * Add placeholder code for bfloat16 in Python (#6849) This is a no-op change; I just want to mark the place(s) in the Python bindings that need attention if/when it becomes possible to support bfloat16 in Python buffers. * Update PyBinaryOperators.h 12 July 2022, 23:08:52 UTC
708a320 Deprecate/remove Generator::get_externs_map() and friends (#6844) * Deprecate/remove Generator::get_externs_map() and friends This is a feature of Generator that was added years ago to allow adding external code libraries in LLVM bitcode form (rather than simply as extern "C" or similar). In theory it allows for better codegen for external code modules (since LLVM has access to all the bitcode for optimization); in practice, we only know of one project that ever used it, and that project no longer exists. Additionally, it tended to be fairly flaky in terms of actual use -- e.g., missing symbols tended to crop up unpredictably. The issues with this feature are likely fixable, but since it hasn't (AFAICT) been used in ~years, we're better off deprecating it for Halide 15 and removing for Halide 16. (If anyone out there is still relying on this feature, obviously you should speak up ASAP.) * Also remove ExternalCode.h & friends * Also remove correctness/external_code.cpp * HALIDE_ALLOW_GENERATOR_EXTERNS_MAP -> HALIDE_ALLOW_GENERATOR_EXTERNAL_CODE 11 July 2022, 23:27:49 UTC
d266e4e Remove Generator::value_tracker and friends (#6845) This is an internal-to-Generator helper that is used to try to detect certain classes of errors when using GeneratorStubs. To the best of my knowledge, it has ~never found a useful error in all of its existence; combined with the very limited usage of GeneratorStubs, I think this code no longer pays for itself, and should be removed. (Note that this was never externally visible, thus no deprecation warnings should be necessary.) 11 July 2022, 22:07:54 UTC
8159dd3 Check RDom::where predicates for race conditions (#6842) Fixes #6808 11 July 2022, 18:39:14 UTC
29ebde9 Better lowering of halving_sub and rounding_halving_add (#6827) * Better lowering of halving_sub and rounding_halving_add Previously, lower_halving_sub and lower_rounding_halving_add both used 9 ops. This change redirects halving_sub to use rounding_halving_add, and redirects rounding_halving_add to use halving_add. In the case that none of these instructions exist natively, this reduces it to 7/8 ops for signed/unsigned halving sub and 6 ops for rounding halving add. More importantly, this lets halving_sub make use of pavgw/b on x86 to reduce it to 3 ops for u8 and u16 inputs. * Make signed rounding_halving_add on x86 use pavgb/w too * Cast result back to signed * Add explanatory comment * Fix comment * Add explanation of signed case 11 July 2022, 18:03:55 UTC
23c4cf1 Rearrange subdirectories in python_bindings (#6835) This is intended to facilitate a few things: - Move all Generators used in tests, apps, etc to a single directory to simplify the build rules (this is especially useful for the work in https://github.com/halide/Halide/pull/6764) - Put all the test and apps stuff under a single directory to facilitate adding some Python packaging that can make integration into Bazel/Blaze builds a bit less painful @alexreinking, does this look like the layout we discussed before? 01 July 2022, 17:10:34 UTC
23a1fa8 Disable testing for apps/linear_algebra on x86-32-linux/Make (#6836) * Disable testing for apps/linear_algebra on x86-32-linux/Make This wasn't biting us before because we were disabling *all* apps/ on x86-32-linux (oops); the recent change to remove python testing under Make also re-enabled this test. TL;DR: this can probably be made to work somehow, but it's not worth debugging, since that case is both pretty niche, and already covered under CMake. It's literally not worth the time to fix. * Update Makefile 01 July 2022, 03:48:04 UTC
6838db0 Remove unused function in callable_generator.cpp (#6834) 30 June 2022, 23:11:21 UTC
b2771c1 Scrub Python from Makefile after buildbot update (#6833) 30 June 2022, 21:26:13 UTC
fac313e Add a new, alternate JIT-call convention (#6777) * Prototype of revised JIT-call convention Experiment to try out a way to call JIT code in C++ using the same calling conventions as AOT code. Very much experimental. * Update Pipeline.h * Add Python support for `compile_to_callable` + make empty_ucon static * Update PyCallable.cpp * Update buffer.py * wip * Update callable.py * WIP * Update custom_allocator.cpp * Update Callable.cpp * Add Generator support for Callables * Update Generator.cpp * Update PyPipeline.cpp * Fixes * Update callable.cpp * Update CMakeLists.txt * create_callable_from_generator * More cleanup * Update Generator.cpp * Fix Python bounds inference * Add Python wrapper for create_callable_from_generator() + Add kwarg support for Callable * Add set_generatorparam_values() + usage * Fix auto_schedule/machine_params parsing The recent refactoring that added `execute_generator` accidentally nuked setting these two GeneratorParams. Oops. Fixed. * Move the type-checking code into a constexpr code * Update Callable.h * clang-tidy * CLANG-TIDY * Add `make_std_function`, + more general cleanup * Update example_jittest.cpp * Update Callable.h * Update Callable.h * More tweaking, smaller CallCheckInfo * Still more cleanup * make_std_function now does Buffer type/dim checking where possible * Add tests for calling `AbstractGenerator::compile_to_callable()` directly * enable exports * Various fixes * Improve fill_slot for Halide::Buffer * kill report_if_error * Update callable_bad_arguments.cpp * Update Pipeline.cpp * Revise error handling * Update Callable.cpp * Update callable.py * Update callable_generator.cpp * Update callable.py * HALIDE_MUST_USE_RESULT -> HALIDE_FUNCTION_ATTRS for Callable 30 June 2022, 20:18:42 UTC
60d2b98 Remove Python bindings from Makefiles (#6821) * Remove Python bindings from Makefiles * Restore test_li2018 in Makefile (now C++-only) * Add dummy `test_python` target for buildbots 30 June 2022, 18:23:48 UTC
ece5fb7 Apply CMAKE_C_COMPILER_LAUNCHER to initmod clang calls (#6831) 30 June 2022, 15:55:12 UTC
d36cd04 Change stub module names in Python to be _pystub rather than _stub (#6830) This is a bit finicky, but making this the default nomenclature will make some downstream usages less ambiguous and a bit easier to manage. (Yes, I realize that #6821 removes the Makefile entirely, but until it lands, it needs fixing there too.) 29 June 2022, 23:15:07 UTC
3e142cf Tweak python apps for better Blaze/Bazel compatibility (#6823) * Tweak python apps/tutorials for better Blaze/Bazel compatibility - Don't write to current directory (rely on an env var to say where to write) - Don't read from arbitrary absolute paths (again, rely on an env var) - Drive-by removal of unnecessary #include in Codegen_LLVM.cpp inside a lambda (!) * Recommended fixes * Revert all changes to tutorial * Revise apps * Remove apps_helpers.py 28 June 2022, 20:55:16 UTC
c12f8a5 Fix for top-of-tree LLVM (#6825) 28 June 2022, 16:52:25 UTC
e0a9825 Update presets to format version 3 (#6824) 28 June 2022, 15:46:06 UTC
feba77c Rework .gitignore (#6822) * reorganize .gitignore * Add exclusions for CMake build * .gitignore: comment, drop stale rules * fully and precisely exclude CMake build tree * add debugging directions to .gitignore * ignore CMake install tree * Sort groups 28 June 2022, 15:39:48 UTC